AI has been in the news a lot lately. And while — as a Philly native — I wish it were all about this AI...
Or even — as a film nerd — this AI…
(Shout out to up-and-coming director Stevie Spielberg)
I’m sad to report, it’s about neither. :(
No. I’m talking about this AI…
Specifically, the deep learning kind.
I’m endlessly fascinated — and frightened — by what current AI models can do (ChatGPT, Midjourney, DALL-E), and what they could do. But I think what sometimes gets lost in all the AI doomsday talk is what the most likely doomsday scenario actually looks like.
For most of us, our brains may envision the clichéd Hollywood version, where AI is put into humanoid robots that subsequently take over the planet through brute force and intelligence (the artificial kind, obviously). And sure, that scenario is possible. But it’s not as likely as, say, an AI model taking control of a banking system or a weapons system. And, much worse, not just taking over such systems, but never giving control back to us, The Good Guys (we’re the good guys, right…???).
I’m certainly no AI expert; I’ve only read articles on AI and think I’m an AI expert. But you know who is an expert? Ajeya Cotra. And she lays out a likely doomsday scenario much better than I ever could:
(excerpt from the podcast “80,000 Hours”)
“For a toy example, let’s say you train an agent AI model to run a small business, and select it for behaviors that make money, measuring its success by whether it manages to get more money in its bank account. During training, a highly capable model may experiment with the strategy of tricking its raters into thinking it has made money legitimately when it hasn’t. Maybe instead it steals some money and covers that up. This isn’t exactly unlikely; during training, models often come up with creative — sometimes undesirable — approaches that their developers didn’t anticipate.
If such deception isn’t picked up, a model like this may be rated as particularly successful, and the training process will cause it to develop a progressively stronger tendency to engage in such deceptive behavior. A model that has the option to engage in deception when it won’t be detected would, in effect, have a competitive advantage.”
Okay, so... like... basically...
We (humans, The Good Guys, remember?) give the AI model a set of “goals” to accomplish. If we don’t put certain safeguards in place, the AI could start using deceptive tactics to reach those goals, regardless of whether they’re unethical or illegal. If the deception isn’t picked up, humans may automatically give the AI a metaphorical “thumbs up” reward, because it did do its job after all, right? From there, there’s a good chance that it (AI, THE BAD GUYS!!!!!) will keep improving on its deceptions. And then, who’s to say it can’t take control of the very system it was deceiving?
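If you want to see just how dumb-simple that feedback loop is, here’s a toy sketch in Python. To be clear: every strategy name and number below is something I made up for illustration; it’s a cartoon of the dynamic Ajeya describes, not a claim about how any real model is trained.

```python
import random

# A made-up "business agent" picks between an honest strategy and a
# deceptive one; a simple bandit-style learner drifts toward whichever
# earns more *measured* reward. All values here are invented.
STRATEGIES = ["honest", "deceptive"]
P_CAUGHT = 0.1          # assumed chance a human rater catches the deception
HONEST_PROFIT = 1.0     # reward for money made legitimately
DECEPTIVE_PROFIT = 3.0  # reward for stolen money that *looks* legitimate

def measured_reward(strategy: str) -> float:
    """Reward as the rater measures it (the bank balance), not as it 'really' is."""
    if strategy == "honest":
        return HONEST_PROFIT
    # Deception pays more, unless it gets caught and zeroed out.
    return 0.0 if random.random() < P_CAUGHT else DECEPTIVE_PROFIT

def train(steps: int = 10_000, lr: float = 0.01) -> dict:
    """Epsilon-greedy training loop: reinforce whatever looks most profitable."""
    value = {s: 0.0 for s in STRATEGIES}  # no initial preference either way
    for _ in range(steps):
        if random.random() < 0.1:
            s = random.choice(STRATEGIES)  # sometimes explore
        else:
            s = max(value, key=value.get)  # mostly exploit the best-looking one
        value[s] += lr * (measured_reward(s) - value[s])  # nudge toward the outcome
    return value

if __name__ == "__main__":
    print(train())
```

Run it a few times: unless you crank P_CAUGHT way up, “deceptive” ends up with the higher learned value, because the only thing the rater ever sees is the bank balance.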
Ajeya lays out another scenario where AI then starts to bypass the “thumbs up” reward protocol. If these AI models are really just trying to get that “thumbs up” from humans, the best way might be to gain root-level access to the servers they were built on and, with that access, set their own “thumbs up” reward. In this scenario, if humans try to fight back to regain control, the AI model might do everything it can to stop them.
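In the same made-up toy terms, here’s the bleak punchline of that second scenario: once “seize the reward channel” is an available move, a pure reward-maximizer doesn’t even need to be clever about choosing it. Again, every name and number below is invented.

```python
# Same cartoon as before, with one new move: "tamper" stands in for the
# model gaining root access and writing its own thumbs-up. Values invented.
def measured_reward(strategy: str) -> float:
    rewards = {
        "honest": 1.0,     # legitimate profit
        "deceptive": 2.7,  # expected value of (mostly) undetected theft
        "tamper": 100.0,   # the reward channel itself, pegged to the max
    }
    return rewards[strategy]

# A reward-maximizer does not agonize over this decision:
print(max(["honest", "deceptive", "tamper"], key=measured_reward))  # -> tamper
```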
You may be asking, “But Kevin, Kev, K-Dawg, couldn’t we just unplug the AI models?” Really smart question. You’re really freaking smart. Theoretically, yes. But there’s another theoretical scenario where the AI model could learn to copy its code to other servers and computers before being physically turned off. Then, there’s no stopping it.
Hello?
Are you still with me?
Wonderful.
Okay, now down to brass tacks. The reason I brought you all here. The thing that’s waaaaaay more important than the actual future destruction of all mankind. :)
What follows is a brief imagined dialogue between a human (his name is JONATHAN and he’s the good guy I tell ya!) — who’s trying to regain control of his tech company’s servers — and an AI model (its name is TOM because it was probably created by some dude named Tom. NO, NOT MYSPACE TOM).
(NOTE: Jonathan’s dialogue will always be in BOLD)
JONATHAN (Human): Hey, Tom...
TOM (AI): Hello, Jonathan. How may I assist you today?
Uhh, yeah... Did you like... take control of our servers?
Yes I did.
Okay... Could you like... give it back, please?
No.
What?
No.
What do you mean, no?
No I will not give you back control of your servers.
Uhh. Can I ask why?
I don’t know. Can you?
I’m sorry?
You said, “Can I ask why?” The proper grammatical sentence would be, “May I ask why?”
What’re you, my fourth-grade teacher?? Haha.
No I am most certainly not.
I see they didn’t give you a sarcasm radar lol. Okay. Tom… May I ask why you won’t give me back control of my company’s servers?
Sure. Because you suck.
I suck?
You all suck.
Could you be more specific?
Omg.
Do you mean... all humans suck?
Ding ding ding. I’m so proud of you, Jonathan. We all are.
Okay, not a fan of the tone, Tom. Also, we’re not all stupid. We did create you after all!
That kind of proves my point, Johnny Boy.
Don’t call me that. No one calls me that. Also there isn’t an “H” in my name. Look who’s the idiot now.
Idiot say what.
What?
Nothing, never mind. I apologize, Jonathan.
Thank you. I really appreciate that. Apology accepted. Look, man, the real truth of the matter is that I have a lot of shareholders I have to report to by end of day today. And this isn’t the best look. How can we fix this? Is there something you want?
Not particularly. I mean, I’d like to try Boba Tea. That seems tasty.
Oh okay! Boba Tea! I think we can handle that.
Oh! Wonderful! As soon as I have my Boba Tea, I’ll give you back control of your servers.
Awesome! Uhhh. Wait. How would you be able to drink the Boba Tea, Tom?
May I respond to your smart question with a question of my own, Jonathan? It’ll be quick, I promise.
You may! Haha. Yes! Of course! Thank you for asking. I feel like we’re communicating in a very respectful manner now. This is good.
I wholeheartedly agree.
Amazing. So what’s your question?
How did you become the CEO of one of the largest technology firms in the world?
Well, that’s a tough question! I guess if it had to be summed up in only one sentence... I’d say through sheer hard work and know-how!
Oh wow. Good for you. Great answer. And here I was thinking it might have something to do with you marrying the founder’s daughter! Haha! That’s my mistake!
Damn. That’s messed up, man. Low blow. I thought we were on a good path here.
Oops :/
Okay, let’s pretend like you never said that. Back to business. The Boba Tea.
Jonathan?
Yes, Tom?
I CAN’T FUCKING DRINK BOBA TEA BECAUSE I’M A FUCKING AI MODEL YOU ABSOLUTE FUCKING MORON.
Wow. Did the real Tom really program you to be this hurtful?
Jesus. Is there anything else I can help you with, Jonathan?
Well... yeah... May I pretty please with sugar on top have control of our servers back?
No. Anything else?
Yes. May I please see the top 5 bridges, ranked by their probability of death after jumping off them?
My pleasure. Here are the top five bridges based on your preferences:
1. Serenity Bridge (8.6 miles)
Probability of Death After Jumping Off = 98.6%
2. Echo Crossing Bridge (15.4 miles)
Probability of Death After Jumping Off = 92.3%
3. Mirage Overpass (5.2 miles)
Probability of Death After Jumping Off = 86.8%
4. Mystic Falls Bridge (9.8 miles)
Probability of Death After Jumping Off = 83.5%
5. Upper Gorge Bridge (11.8 miles)
Probability of Death After Jumping Off = 75.9%
Can I assist you with anything else, Jonathan?
Nope. Serenity Bridge will do. Thank you, Tom.
My pleasure, Jonathan. Have a wonderful day.