June 28, 2023 - Making Sense - Sam Harris
53:56
#324 — Debating the Future of AI

Sam Harris speaks with Marc Andreessen about the future of artificial intelligence (AI). They discuss the primary importance of intelligence, possible good outcomes for AI, the problem of alienation, the significance of evolution, the Alignment Problem, the current state of LLMs, AI and war, dangerous information, regulating AI, economic inequality, and other topics. If the Making Sense podcast logo in your player is BLACK, you can SUBSCRIBE to gain access to all full-length episodes at samharris.org/subscribe. Learning how to train your mind is the single greatest investment you can make in life. That’s why Sam Harris created the Waking Up app. From rational mindfulness practice to lessons on some of life’s most important topics, join Sam as he demystifies the practice of meditation and explores the theory behind it.


Welcome to the Making Sense Podcast.
This is Sam Harris.
Just a note to say that if you're hearing this, you are not currently on our subscriber feed and will only be hearing the first part of this conversation.
In order to access full episodes of the Making Sense Podcast, you'll need to subscribe at SamHarris.org.
There you'll find our private RSS feed to add to your favorite podcatcher, along with other subscriber-only content.
We don't run ads on the podcast, and therefore it's made possible entirely through the support of our subscribers.
So if you enjoy what we're doing here, please consider becoming one.
Okay, well there's been a lot going on out there.
Everything from Elon Musk and Mark Zuckerberg challenging one another to an MMA fight, which is ridiculous and depressing, to Robert Kennedy Jr. appearing on every podcast on earth, apart from this one.
I have so far declined the privilege. It really is a mess out there.
I'll probably discuss the RFK phenomenon in a future episode, because it reveals a lot about what's wrong with alternative media at the moment.
But I will leave more of a postmortem on that for another time.
Today I'm speaking with Marc Andreessen.
Marc is a co-founder and general partner at the venture capital firm Andreessen Horowitz.
He's a true internet pioneer.
He created the Mosaic internet browser and then co-founded Netscape.
He's co-founded other companies and invested in too many to count.
Mark holds a degree in computer science from the University of Illinois and he serves on the board of many Andreessen Horowitz portfolio companies.
He's also on the board of Meta.
Anyways, you'll hear Marc and me get into a fairly spirited debate about the future of AI.
We discuss the importance of intelligence generally and the possible good outcomes of building AI, but then we get into our differences around the risks or lack thereof of building AGI, artificial general intelligence.
We talk about the significance of evolution in our thinking about this, the alignment problem, the current state of large language models, how developments in AI might affect how we wage war, what to do about dangerous information, regulating AI, economic inequality, and other topics.
Anyway, it's always great to speak with Marc.
We had a lot of fun here.
I hope you find it useful.
And now I bring you Marc Andreessen.
I am here with Marc Andreessen.
Marc, thanks for joining me again.
It's great to be here, Sam.
Thanks.
I got you on the end of a swallow of some delectable beverage.
Yes, you did.
This should be interesting.
I'm eager to speak with you specifically about this recent essay you wrote on AI.
Obviously, many people have read this, and you are a voice that many people value on this topic, among others.
You've been on the podcast before, and people know who you are, but perhaps you can briefly summarize how you come to this question.
I mean, how would you summarize the relevant parts of your career with respect to the question of AI and its possible ramifications?
Yeah, so I've been a computer programmer, technologist, computer scientist since the 1980s, when I actually entered college in 1989 at University of Illinois.
The AI field had been through a boom in the '80s, which had crashed hard, and so by the time I got to college, the AI wave was dead and buried, and had been for a while.
It was like the backwater of the department that nobody really wanted to talk about.
But I learned a lot of it in school, and then I went on to help create what is now kind of known as the modern internet in the 90s, and then over time transitioned from being a technologist to being an entrepreneur, and today I'm an investor, a venture capitalist.
And so 30, 35 years later, I'm involved in a very broad cross section of tech companies, many of which have AI aspects to them.
And so, you know, everything from Facebook, now Meta, which has been investing deeply in AI for over a decade, through to many of the best new AI startups. Our day job is to find the best new startups in a new category like this and try to back the entrepreneurs.
And so that's how I spend most of my time right now.
Okay, so the essay is titled, Why AI Will Save the World.
And I think even in the title alone, people will detect that you are striking a different note than I tend to strike on this topic.
I disagree with a few things in the essay that are, I think, at the core of my interest here.
But I think there are many things, you know, we agree about.
You know, up front, we agree, I think, with more or less anyone who thinks about it, that intelligence is good and we want more of it.
And if it's not necessarily the source of everything that's good in human life, it is what will safeguard everything that's good in human life, right?
So even if you think that love is more important than intelligence, and you think that playing on the beach with your kids is way better than doing science or anything else that is narrowly linked to intelligence, well, you have to admit that you value all of the things that intelligence will bring that will safeguard the things you value.
So, a cure for cancer and a cure for Alzheimer's and a cure for a dozen other things will give you much more time with the people you love, right?
So, whether you think about the primacy of intelligence or not very much, it is the thing that has differentiated us from our primate cousins, and it's the thing that allows us to do everything that is maintaining the status of civilization. And if the future is going to be better than the past, it's going to be better because of what we've done with our intelligence in some basic sense.
And I think we're going to agree that because intelligence is so good, and because each increment of it is good and profitable, this AI arms race and gold rush is not going to stop, right?
We're not going to pull the brakes here and say, let's take a pause of five years and not build any AI, right?
I don't remember if you addressed that specifically in your essay, but even if some people are calling for that, I don't think that's in the cards, and I don't think you think that's in the cards.
Well, you know, it's hard to believe that you just, like, put it in the box, right, and stop working on it.
It's hard to believe that the progress stops.
You know, like, having said that, there are some powerful and important people who are in Washington right now advocating that, and there are some politicians who are taking them seriously.
So, you know, at the moment, there is some danger around that.
There's two other big dangers, two scenarios that I think would both be very, very devastating for the future.
One is the scenario where the fears around AI are used to basically entrench a cartel.
And this is what's happening right now.
This is what's being lobbied for right now, is there are a set of big companies that are arguing in Washington.
Yes, AI has positive use cases.
Yes, AI is also dangerous.
Because it's dangerous, therefore, we need a regulatory structure that basically entrenches a set of currently powerful tech companies, you know, to be able to have basically exclusive rights to do this technology.
I think that would be devastating for reasons we could discuss.
And then look, there's a third outcome, which is we lose, China wins.
They're certainly working on AI and they have a, you know, what I would consider to be a very dark and dystopian vision of the future, which I also do not want to win.
Yeah.
I mean, I guess that is in part the cash value of the point I just made, that even if we decided to stop, not everyone's going to stop, right?
I mean, human beings are going to continue to grab as much intelligence as we can grab, even if in some local spot, we decide to pull the brakes.
Although at this point, it's really hard to imagine whatever the regulation is actually stalling progress.
I mean, given just, again, given the intrinsic value of intelligence and given the excitement around it and given the obvious dollar signs that everyone is seeing.
I mean, the incentives are such that I just don't see it.
But, well, we'll come to the regulation piece eventually because, you know, given the difference in our views here, it's not going to be a surprise that I want some form of regulation, and I'm not quite sure what that could look like, and I think you would have a better sense of what it looks like, and perhaps that's why you're worried about it.
But before we talk about the fears here, let's talk about the good outcome, because I know you don't consider yourself a utopian, but you sketch a fairly utopian picture of promise in your essay.
If we got this right, how good do you think it could be?
Yeah, so let's just start by saying I kind of deliberately loaded the title of the essay with a little bit of a religious element, and I did that kind of very deliberately because I view that I'm up against a religion, the sort of AI risk fear religion.
But I am not myself religious, you know, lowercase r religious in the sense of, you know, I'm not a utopian.
I'm very much an adherent to what Thomas Sowell called the constrained vision, not the unconstrained vision.
So I live in a world of practicalities and tradeoffs.
And so, yeah, I am actually not utopian.
Look, having said that, building on what you've already said, like intelligence, if there is a lever for human progress across many thousands of domains simultaneously, it is intelligence.
And we just we know that because we have thousands of years of experience seeing that play out.
The thing I would add, I thought you made that case very well.
The thing I would add to the case you made about the positive virtues of intelligence in human life is that the way you described it, at least the way I heard it, was more focused on the social, society-wide benefits of intelligence, for example, cures for diseases and so forth.
That is true, and I agree with all that.
There are also individual-level benefits of intelligence, right?
At the level of an individual, even if you're not the scientist who invents a cure for cancer, at an individual level, if you are smarter, you have better life welfare outcomes on almost every metric that we know how to measure.
Everything from how long you'll live, how healthy you'll be, how much education you'll achieve, career success, the success of your children.
By the way, your ability to solve problems, your ability to deal with conflict.
Smarter people are less violent.
Smarter people are less bigoted.
And so there's this very broad kind of pattern of human behavior where basically more intelligence, you know, just simply at the individual level leads to better outcomes.
And so the most utopian I'm willing to get is sort of this potential, which I think is very real right now.
It's already started, where you basically just say, look, human beings from here on out are going to have an augmentation, and the augmentation is going to be in the long tradition of augmentations like everything from eyeglasses to shoes to word processors to search engines, but now the augmentation is intelligence.
And that augmented intelligence capability is going to let them capture the gains of individual-level intelligence, you know, potentially considerably above, you know, where they punch in as individuals.
And what's interesting about that is that can scale all the way up, right?
Like, you know, somebody who struggles with daily challenges all of a sudden is going to have a partner and an assistant.
And a coach and a therapist and a mentor to be able to help improve a variety of things in their lives.
And then, you know, look, if you had given this to Einstein, you know, he would have been able to discover a lot more new fundamental laws of physics, right?
In the full vision.
And so, you know, this is one of those things where it could help everybody and then it could help everybody in many, many different ways.
Yeah, well, see, in your essay, you go into some detail, bullet points around this concept of everyone having essentially a digital oracle in their pocket, where you have this personal assistant who you can be continuously in dialogue with, and it'd be like having the smartest person who's ever lived just giving you a bespoke concierge service for all manner of tasks across any information landscape.
I happened to recently rewatch the film Her, which I hadn't seen since it came out.
It came out 10 years ago.
I don't know if you've seen it lately, but I must say it lands a little bit differently now that we're on the cusp of this thing.
While it's not really dystopian, there is something a little uncanny and quasi-bleak around even the happy vision here of having everyone siloed in their interaction with an AI.
I mean, it's the personal assistant in your pocket that becomes so compelling and so aware of your goals and aspirations and what you did yesterday and the email you sent or forgot to send.
Apart from the ending, which is kind of clever and surprising and kind of irrelevant for our purposes here, it's not an aspirational vision of the sort that you sketch in your essay.
And I'm wondering if you see any possibility here that even the best case scenario has something intrinsically alienating and troublesome about it.
Yeah, so look, on the movie, as Peter Thiel has pointed out, Hollywood no longer makes positive movies about technology.
And look, he argues it's because they hate technology, but I would argue maybe a simpler explanation, which is they want dramatic tension and conflict.
And so necessarily, things are going to have a dark tinge.
Regardless, they obviously spring-loaded it by their choice of character and so forth.
The scenario I have in mind is actually quite a bit different, and let me get kind of maybe philosophical for a second, which is, you know, there's this long-running debate.
This question that you just raised is a question that goes back to the Industrial Revolution.
And remember, it goes back to the core of actually Marx's original theory.
Marx's original theory was industrialization, technology, modern economic development, right, alienates the human being, right, from society, right?
That was his core indictment of technology.
And look, you can point to many, many cases in which I think that has actually happened.
I think alienation is a real problem.
I don't think that critique was entirely wrong.
His prescriptions were disastrous, but I don't think the critique was completely wrong.
Look, having said that, then it's a question of, like, okay, now that we have the technology that we have, and we have, you know, new technology we can invent, like, how could we get to the other side of that problem?
And so I would put the shoe on the other foot, and I would say, look, the purpose of human existence and the way that we live our lives should be determined by us, and it should be determined by us to maximize our potential as human beings.
And the way to do that is precisely to have the machines do all the things that they can do so that we don't have to.
Right.
And this is why Marx's critique ultimately, in the long run, I think, has been judged to be incorrect, which is that we are all much better off.
Anybody in the developed West, the industrialized West, today is much better off by the fact that we have all these machines that are doing everything from making shoes to harvesting corn to so many other industrial processes around us.
Like we just have a lot more time and a much more pleasant, you know, day to day life, you know, than we would have if we were still doing things the way that things used to be done.
The potential with AI is just, like, look, take the drudge work out, take the remaining drudge work out. Look, I'll give you a simple example: office work.
You know, the inbox staring you in the face with 200 emails, right?
Friday at three in the afternoon, like, okay, no more of that, right?
Like, we're not going to do that anymore because I'm going to have an AI assistant.
The AI assistant is going to answer the emails.
Right.
And in fact, what's going to happen is my AI assistant is going to answer the email that your AI assistant sent, right?
It's mutually assured destruction.
Yeah, exactly.
But like the machine should be doing that.
Like, the human being should not be sitting there when it's sunny out and, you know, when my eight-year-old wants to play, I shouldn't be sitting there doing emails.
I should be out with my eight year old.
There should be a machine that does that for me.
And so I view this very much as basically apply the machines to do the drudge work precisely so that people can live more human lives.
Now, this is philosophical.
People have to decide what kind of lives they want to live.
And again, I'm not a utopian on this.
And so there's a long discussion we could have about how this actually plays out.
But that potential is there for sure.
Right, right.
Okay, so let's jump to the bad outcomes here, because this is really why I want to talk to you.
In your essay, you list five, and I'll just read your section titles here, and then we'll take a whack at them.
The first is, Will AI Kill Us All?
Number two is, Will AI Ruin Our Society?
Number three is, Will AI Take All Our Jobs?
Number four is, Will AI Lead to Crippling Inequality?
And five is, Will AI Lead to People Doing Bad Things?
And I would tend to bin those in really two buckets.
The first is, will AI kill us?
And that's the existential risk concern.
And the others are more the ordinary bad outcomes that we tend to think about with other technology.
Bad people doing bad things with powerful tools.
Unintended consequences, disruptions to the labor market, which I'm sure we'll talk about.
And all of those are certainly the near-term risks.
And they're, in some sense, even more interesting to people because the existential risk component is longer term, and it's even purely hypothetical, and you seem to think it's purely fictional.
And this is where I think you and I disagree.
So let's start with this question of, will AI kill us all?
And the thinking on this tends to come under the banner of the problem of AI alignment, right?
And the concern is that, if we build machines more powerful than ourselves, more intelligent than ourselves, it seems possible that the space of all possible more powerful superintelligent machines includes many that are not aligned with our interests and not disposed to continually track our interests, and many more of that sort than of the sort that perfectly hew to our interests in perpetuity.
So the concern is we could build something powerful that is essentially an angry little god that we can't figure out how to placate once we've built it.
And certainly, we don't want to be negotiating with something more powerful and intelligent than ourselves.
And the picture here is of something like, you know, a chess engine, right?
We've built chess engines that are more powerful than we are at chess.
And once we built them, if everything depended on our beating them in a game of chess, we wouldn't be able to do it, right?
Because they are simply better than we are.
And so, now we're building something that is a general intelligence, and it will be better than we are at everything that goes by that name, or such is the concern.
And in your essay, I mean, I think there's an ad hominem piece that I think we should blow by, because you've already described this as a religious concern, and in the essay you describe it as just a symptom of superstition, and that people are essentially in a new doomsday cult.
And there's some share of true believers here and there's some share of, you know, AI safety grifters.
And I think, you know, I'm sure you're right about some of these people, but we should acknowledge up front that there are many super qualified people of high probity who are prominent in the field of AI research who are part of this chorus.
I mean, you've got somebody like Geoffrey Hinton, who arguably did as much as anyone to create the breakthroughs that have given us these LLMs.
We have Stuart Russell, who literally wrote the most popular textbook on AI.
So there are other serious sober people who are very worried for reasons of a sort that I'm going to express here.
I just want to acknowledge that both are true.
There's the crazy people, the new millennialists, the doomsday preppers, the neuroatypical people who are in their polyamorous cults, and, you know, AI alignment is their primary fetish.
But there's a lot of sober people who are also worried about this.
Would you acknowledge that much?
Yeah, although it's tricky, because smart people also have a tendency to fall into cults.
So that doesn't get you totally off the hook on that one.
But I would register a more fundamental objection to what I would describe as, and I'm not knocking you on this, but it's something that people do, sort of argument by authority, which I don't think applies either.
Yeah, well, I'm not making that, yeah.
No, I know, but like this idea, which is very, and again, I'm not characterizing your idea.
I'll just say it's a general idea.
This general idea that there are these experts, and these experts are experts because they're the people who created the technology or originated the ideas or implemented the systems, and therefore have sort of special knowledge and insight in terms of their downstream impact on society and rules and regulations and so forth and so on.
That assumption does not hold up well historically.
In fact, it holds up disastrously historically.
There's actually a new book out I've been giving all my friends called When Reason Goes on Holiday, and it's a story of literally what happens when people who are specialized experts in one area stray outside of that area in order to become sort of general-purpose philosophers and social thinkers.
And it's just a tale of woe, right?
And in the 20th century, it was just a catastrophe.
And the ultimate example of that, and this is going to be the topic of this big movie coming out this summer, Oppenheimer, the central example of that was the nuclear scientists who decided that, you know, nuclear power, nuclear energy, they had various theories on what was good, bad, whatever.
A lot of them were communists.
A lot of them were, you know, at least allied with communists.
A lot of them had a suspiciously large number of communist friends and housemates.
And, you know, number one, like they, you know, made a moral decision, a number of them did, to hand the bomb to the Soviet Union, you know, with what I would argue are catastrophic consequences.
And then two is they created an anti-nuclear movement that resulted in nuclear energy stalling out in the West, which has also just been like absolutely catastrophic.
And so if you listen to those people in that era who were, you know, the top nuclear physicists of their time, you made a horrible set of decisions.
And quite honestly, I think that's what's happening here again.
And I just, I don't think they have the special insight that people think that they have.
Okay.
Well, so, I mean, this cuts both ways because, you know, at the beginning, I'm definitely not making an argument from authority.
Authority is a proxy for understanding the facts at issue, right?
It's not to say that, I mean, in the cases you're describing, what we often have are people who have a narrow authority in some area of scientific specialization, and then they begin to weigh in, in a much broader sense, as moral philosophers.
What I think you might be referring to there is that, you know, in the aftermath of Hiroshima and Nagasaki, we've got nuclear physicists imagining that they now need to play the geopolitical game.
Actually, we have some people who invented game theory, right, for understandable reasons, thinking they need to play the game of geopolitics.
And in some cases, I think in von Neumann's case, he even recommended preventative war against the Soviet Union before they even got the bomb, right?
It could have gotten worse.
I think he wanted us to bomb Moscow or at least give them some kind of ultimatum.
I don't think he wanted us to drop bombs in the dead of night, but I think he wanted a strong ultimatum game played with them before they got the bomb.
And I forget how he wanted that to play out.
And worse still, I think Bertrand Russell, I could have this backwards, maybe von Neumann wanted a bomb, but Bertrand Russell, a true moral philosopher, briefly advocated preventative war.
But in his case, I think he wanted to offer some kind of ultimatum to the Soviets.
In any case, that's the problem.
But at the beginning of this conversation, I asked you to give me a brief litany of your bona fides to have this conversation so as to inspire confidence in our audience and also just to acknowledge the obvious, that you know a hell of a lot about the technological issues we're going to talk about.
And so if you have strong opinions, they're not coming totally out of left field.
And so it would be with Geoffrey Hinton or anyone else.
And if I threw another name at you that was of some crackpot whose connection to the field was non-existent, you would say, why should we listen to this person at all?
You wouldn't say that about Hinton or Stuart Russell.
But I'll acknowledge that where authority breaks down is really that you're only as good as your last sentence here, right?
If the thing you just said doesn't make any sense, well then, your authority gets you exactly nowhere, right?
We just need to keep talking about what doesn't make sense.
Or it should, or it should.
Ideally, that's the case.
In practice, that's not what tends to happen, but that would be the goal.
Well, I hope to give you that treatment here because some of your sentences, I don't think, add up the way you think they do.
Good.
Okay, so there's actually one paragraph in the essay that caught my attention that really inspired this conversation.
I'll just read it so people know what I'm responding to here.
So this is you.
My view is that the idea that AI will decide to literally kill humanity is a profound category error.
AI is not a living being that has been primed by billions of years of evolution to participate in the battle for survival of the fittest, as animals were, and as we are.
It is math, code, computers built by people, owned by people, used by people, controlled by people. The idea that it will at some point develop a mind of its own and decide that it has motivations that lead it to try to kill us is a superstitious hand wave.
In short, AI doesn't want.
It doesn't have goals.
It doesn't want to kill you.
Because it's not alive.
AI is a machine.
It's not going to come alive any more than your toaster will.
End quote.
Yes.
So, I mean, I see where you're going there.
I see why that may sound persuasive to people, but to my eye, that doesn't even make contact with the real concern about alignment.
So let me just kind of spell out why I think that's the case.
Because it seems to me that you're actually not taking intelligence seriously right now.
I mean, some people assume that as intelligence scales, we're going to magically get ethics along with it, right?
So the smarter you get, the nicer you get.
And while, I mean, there are some data points with respect to how humans behave, and you just mentioned one a few minutes ago, it's not strictly true even for humans.
And even if it's true in the limit, right, it's not necessarily locally true.
And more important, when you're looking across species, differences in intelligence are intrinsically dangerous for the stupider species.
So it need not be a matter of super-intelligent machines spontaneously becoming hostile to us and wanting to kill us.
It could just be that they begin doing things that are not in our well-being, right?
Because they're not taking it into account as a primary concern, in the same way that we don't take the welfare of insects into account as a primary concern, right?
So it's very rare that I intend to kill an insect.
But I regularly do things that annihilate them just because I'm not thinking about them, right?
I'm sure I've effectively killed millions of insects, right?
If you build a house, you know, that must be a holocaust for insects, and yet you're not thinking about insects when you're building that house.
So, and there are many other pieces to my gripe here, but let's just take this first one.
It just seems to me that you're not envisioning what it will mean to be in relationship to systems that are more intelligent than we are.
You're not seeing it as a relationship.
And I think that's because you're denuding intelligence of certain properties and not acknowledging them in this paragraph, right?
So, to my ear, general intelligence, which is what we're talking about, implies many things that are not in this paragraph.
It implies autonomy, right?
And it implies the ability to form unforeseeable new goals, right?
In the case of AI, it implies the ability to change its own code, ultimately.
And, you know, execute programs, right?
I mean, it's just, it's doing stuff because it is intelligent, autonomously intelligent.
It is capable of doing just, we can stipulate, more than we're capable of doing because it is more intelligent than we are at this point.
So, the superstitious hand-waving I'm seeing is in your paragraph when you're declaring that it would never do this because it's not alive, right?
As though the difference between biological and non-biological substrate were the crucial variable here.
But there's no reason to think it's a crucial variable where intelligence is concerned.
Yeah, so to steelman your argument, I would say you could actually break your argument into two forms, or the AI risk community would break this argument into two forms.
And they would argue, I think, the strong form of both.
So they would argue the strong form of, number one, and I think this is kind of what you're saying, correct me if I'm wrong, is because it is intelligent, therefore it will have goals.
If it didn't start with goals, it will evolve goals.
It will, you know, whatever.
It will over time have a set of preferred outcomes, behavior patterns that it will determine for itself.
And then they also argue the other side of it, which is what they call the orthogonality argument, which is, it's actually another risk argument, but it's actually sort of the opposite argument.
It's an argument that it doesn't have to have goals to be dangerous, right?
And that being, you know, it doesn't have to be sentient.
It doesn't have to be conscious.
It doesn't have to be self-aware.
It doesn't have to be self-interested.
It doesn't have to be in any way like even thinking in terms of goals.
It doesn't matter because simply it can just do things.
And this is the, you know, this is the classic paperclip maximizer, you know, kind of argument.
Like, it'll just get kicked off on one apparently innocuous thing.
And then it will just extrapolate that ultimately to the destruction of everything.
Right.
So, so anyways, is that helpful, to maybe break those into the two?
Yeah, I mean, I'm not quite sure how fully I would sign on the dotted line to each, but the one piece I would add to that is that having any goal does invite the formation of instrumental goals once this system is responding to a changing environment.
Right.
I mean, if your goal is to make paperclips and you're super intelligent, and somebody throws up some kind of impediment to your making paperclips, well then you're responding to that impediment, and now you have a shorter-term goal of dealing with the impediment, right?
So that's the structure of the problem.
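[A minimal toy sketch of the instrumental-goal structure being described here, added for illustration and not drawn from the conversation: a planner with one fixed objective invents sub-goals only because obstacles block the direct path. The goals and obstacles below are entirely hypothetical, and nothing here resembles a real AI system.]

```python
# Toy illustration of instrumental sub-goals: the agent has one fixed
# objective, and whenever the direct action is blocked it invents an
# intermediate goal to clear the obstacle. The sub-goals were never part
# of the stated objective; they arise only because something is in the way.

def plan(goal, world, depth=0):
    """Return a list of actions that achieve `goal` in `world`."""
    if depth > 5:  # guard against endless regress in this toy example
        return []
    obstacle = world["obstacles"].get(goal)
    if obstacle is None:
        return [f"do: {goal}"]
    sub_goal = f"remove {obstacle}"  # an instrumental goal nobody specified
    return plan(sub_goal, world, depth + 1) + [f"do: {goal}"]

world = {
    "obstacles": {
        "make paperclips": "power shortage",
        "remove power shortage": "grid regulations",
    }
}

print(plan("make paperclips", world))
# ['do: remove grid regulations', 'do: remove power shortage', 'do: make paperclips']
```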
Yeah, that's right.
Yeah, right.
For example, the U.S. military wants to stop you from making more paperclips, and so therefore you develop a new kind of nuclear weapon.
Right, in order fundamentally to pursue your goal of making paperclips.
But one problem here is that, even if the paperclip goal is the wrong example here, because even if you think of a totally benign future goal, right, a goal that seems more or less synonymous with taking human welfare into account, it's possible to imagine a scenario where some instrumental new goal that could not be foreseen appears that is, in fact, hostile to our interests.
And if we're not in a position to say, oh, no, no, don't do that, that would be a problem.
So a full version of that, a version of that argument that you hear is basically, what if the goal is to maximize human happiness?
And then the machine realizes that the way to maximize human happiness is to strap us all down and put us in a Nozick experience machine and wire us up with VR and ketamine, and we can never get out of the Matrix.
Right, and it'd be maximizing human happiness as measured by things like dopamine levels or serotonin levels or whatever, but obviously not a positive outcome.
But again, that's like a variation of this paper clip.
That's one of these arguments that comes out of their orthogonality thesis, which is the goal can be very simple and innocuous, and yet lead to catastrophe.
So look, I think each of these has their own problems.
So where you started, where they're sort of like the machine basically, and we can quibble with terms here, but the side of the argument in which the machine is in some way self-interested, self-aware, self-motivated, trying to preserve itself, some level of sentience, consciousness, setting its own goals.
Well, just to be clear, there's no consciousness implied here.
I mean, the lights don't have to be on.
It just, I think that, I mean, this remains to be seen whether consciousness comes along for the ride at a certain level of intelligence, but I think they probably are orthogonal to one another.
So, intelligence can scale without the lights coming on, in my view.
So, let's leave sentience and consciousness aside.
Well, but I guess there is a fork in the road, which is like, is it declaring its own intentions?
Like, is it developing its own, you know, conscious or not?
Does it have a sense of any form or a vision of any kind of its own future?
Yeah, so this is why I think there's some daylight growing between us, because to be dangerous, I don't think you need, necessarily, to be running a self-preservation program.
I mean, there's some version of unaligned competence that may not formally model the machine's place in the world, much less defend that place, which could still be, if uncontrollable by us, could still be dangerous.
It's like it doesn't have to be self-referential in a way that an animal... The truth is, there are dangerous animals that might not even be self-referential, and certainly something like a virus or a bacterium is not self-referential in a way that we would understand, and it can be lethal to our interests.
Yeah, that's right.
Okay, so you're more on the orthogonality side between the two, if I identify the two poles of the argument.
You're more on the orthogonality side, which is it doesn't need to be conscious, it doesn't need to be sentient, it doesn't need to have goals, it doesn't need to want to preserve itself.
Nevertheless, it will still be dangerous because of, as you described, the consequences of sort of how it gets started and then sort of what happens over time, for example, as it defines sub-goals to the original goals, and it goes off course.
So there's a couple problems with that.
So one is, it assumes, and here it's like, I would argue, people don't give intelligence enough credit.
Like, there are cases where people give intelligence too much credit, and then there's cases where they don't give it enough credit.
Here, I don't think they're giving enough credit, because it sort of implies that this machine has, like, basically this infinite capacity to cause harm.
Therefore, it has an infinite capacity to basically actualize itself in the world, therefore it has an infinite capacity to, you know, basically plan, and again, maybe just like in a completely blind-watchmaker way or something, but it has an ability to, you know, plan itself out.
And yet it never occurs to this super genius, infinitely powerful machine that is having such, you know, potentially catastrophic impacts.
Notwithstanding all of that capability and power, it never occurs to it that maybe paperclips is not what its mission should be.
Well, that's the thing.
I think it's possible to have a reward function that is deeply counterintuitive to us.
I mean, it's almost like saying what you're smuggling in in that rhetorical question is a fairly capacious sense of common sense, right?
Of course, if it's a super genius, it's not going to be so stupid as to do X, right?
Yeah.
But I just think that if it's aligned, then the answer is trivially true.
Yes, of course it wouldn't do that, but that's the very definition of alignment.
But if it's not aligned, if you could say that... I mean, just imagine... I guess there's another piece here I should put in play, which is this: you make an analogy to evolution here, which you think is consoling, which is, this is not an animal, right?
This has not gone through the crucible of Darwinian selection here on earth with other wet and sweaty creatures.
And therefore it hasn't developed the kind of antagonism we see in other animals.
And therefore, you know, if you're imagining a super genius gorilla, well, you're imagining the wrong thing, because we're going to build this, and it's not going to be tuned in any of those competitive ways.
But there's another analogy to evolution that I would draw, and I'm sure others in the space of AI fear have drawn, which is that we have evolved.
We have been programmed by evolution, and yet evolution can't see anything we're doing.
It has programmed us to really do nothing more than spawn and help our kids spawn.
Yet, everything we're doing, I mean, from having conversations like this, to building the machines that could destroy us, I mean, there's nothing it can see.
And there are things we do that are perfectly unaligned with respect to our own code, right?
I mean, if someone decides not to have kids, and they just want to spend the rest of their life in a monastery, or surfing, that is something that is antithetical to our code, it's totally unforeseeable at the level of our code, and yet it is obviously an expression of our code, but an unforeseeable one.
And so the question here is, if you're going to take intelligence seriously, and you're going to build something that's not only more intelligent than you are, but it will build the next generation of itself, or the next version of its own code to make it more intelligent still, it just seems patently obvious that that entails it finding cognitive horizons that you, the builder, are not going to be able to foresee and appreciate.
By analogy with evolution, it seems like we're guaranteed to lose sight of what it can understand and care about.
So a couple things.
So one is like, look, I don't know, you're kind of making my point for me.
So evolution and intelligent design, as you well know, are two totally different things.
And so we are evolved, and of course we're not just evolved to, we are evolved to have kids.
And by the way, when somebody chooses to not have kids, I would argue that is also evolution working.
People are opting out of the gene pool, fair enough.
Evolution does not guarantee a perfect result.
It basically just is a mechanism operating in aggregate.
But in any event, let me get to the point.
So we're evolved.
We have conflict wired into us.
We have conflict and strife.
I mean, look, four billion years of battles to the death at the individual and then ultimately at the societal level to get to where we are.
We fight at the drop of a hat.
You know, we all do.
Everybody does.
And, you know, hopefully these days we fight verbally like we are now and not physically.
But we do.
And look, the machine is intelligent.
It's a process of intelligent design.
It's the opposite of evolution.
These machines are being designed by us.
If they design future versions of themselves, they'll be intelligently designing themselves.
It's just a completely different path with a completely different mechanism.
And so the idea that, therefore, conflict is wired in at the same level that it is through evolution, I just, like, there's no reason to expect that to be the case.
But it's not, again, well, let me just give you back this picture with a slightly different framing and see how you react to it, because I think the superstition is on the other side.
So if I told you that aliens were coming from outer space, right, and they're going to land here within a decade, and they're way more intelligent than we are.
And they have some amazing properties that we don't have, which explain their intelligence.
But they're not only faster than we are, but they're linked together, right?
So that when one of them learns something, they all learn that thing.
They can make copies of themselves.
And they're just cognitively, they're obviously our superiors.
But no need to worry because they're not alive, right?
They haven't gone through this process of biological evolution and they're just made of the same material as your toaster.
They were created by a different process and yet they're far more competent than we are.
Would you, just hearing it described that way, would you feel totally sanguine about, you know, sitting there on the beach waiting for the mother craft to land and you just, you know, rolling out brunch for these guys?
So this is what's interesting, because with these, now that we have LLMs working, we actually have an alternative to sitting on the beach, right, waiting for this to happen.
We can just ask them.
And so this is one of the very interesting things, this to me conclusively disproves the paperclip thing, the orthogonality thing, right out of the gate, which is you can sit down tonight with GPT-4 and whatever other one you want, and you can engage in moral reasoning and moral argument with it right now.
And you can, like, interact with it.
Like, okay, you know, what do you think?
What are your goals?
What are you trying to do?
How are you going to do this?
What if, you know, you were programmed to do that?
What would the consequences be?
Why would you not, you know, kill us all?
And you can actually engage in moral reasoning with these things right now.
And it turns out they're actually very sophisticated in moral reasoning.
And of course, the reason they're sophisticated in moral reasoning is because they have loaded into them the sum total of all moral reasoning that all of humanity has ever done, and that's their training data.
And they're actually happy to have this discussion with you.
Except, there's a few problems here.
One is, I mean, these are not the super intelligences we're talking about yet, but two, they're... I mean, intelligence entails an ability to lie and manipulate, and if it really is intelligent, it is something that you can't predict in advance, and certainly if it's more intelligent than you are.
And that just falls out of the definition of what we mean by intelligence in any domain.
It's like with chess, you can't predict the next move of a more intelligent chess engine, otherwise it wouldn't be more intelligent than you.
So can I, let me quibble with, I'm going to come back to your chess computer thing, but let me quibble with this.
So there's the idea, let me generalize the idea you're making about superior intelligence.
Tell me if you disagree with this, which is that sort of superior intelligence basically at some point always wins, because basically smarter is better than dumber, smarter outsmarts dumber.
Smarter deceives dumber.
Smarter can persuade dumber, right?
And so, you know, smarter wins.
You know, I mean, look, there's an obvious way to falsify that thesis sitting here today, which is like, just look around you in the society you live in today.
Would you say the smart people are in charge?
Well, again, there are more variables to consider when you're talking about, you know, outcome.
Because obviously, yes, the dumb brute can always just brain the smart geek and...
Well, no, no, no, I'm not even talking about braining.
Are the PhDs in charge?
Well, no, but you're pointing to a process of cultural selection that is working by a different dynamic here.
But in the narrow case, when you're talking about, like, a game of chess, yes.
When you're talking chess, there's no roll for luck.
We're not rolling dice here.
It's not a game of poker.
It's pure execution of rationality or logic.
Yes, then smart wins every time.
I'm never going to beat the best chess engine unless I find some hack around its code, where we recognize that, well, if you play very weird moves ten moves in a row, it self-destructs.
And there was something that was recently discovered like that, I think, in Go.
But, yeah, go back to it.
As chess players, as champion chess players, discover to their great dismay, you know, life is not chess.
It turns out like great chess players are no better at other things in life than anybody else, like the skills don't transfer.
I just say, look, if you just look at the society around us, what I see basically is the smart people work for the dumb people, right?
Like the PhDs, the PhDs all work for administrators and managers who are clearly not as smart as they are.
Yeah, but that's because there's so many other things going on, right?
There's, you know, the value we place on youth and physical beauty and strength and other forms of creativity, and, you know, so it's just not, we care about other things and people pay attention to other things, and, you know, documentaries about physics are boring, but heist movies aren't, right?
So it's like we care about other things.
I mean, I think that doesn't make the point you want to make here.
But in the general case, can a smart person convince a dumb person of anything?
I think that's an open question.
I see a lot more cases in day-to-day life.
But persuasion, I mean, if persuasion were our only problem here, that would be a luxury.
I mean, we're not talking about just persuasion.
We're talking about machines that can autonomously do things, ultimately.
The things that we will rely on to do things, ultimately.
Yeah, but look, I just think there'll be machines that will rely on it.
Well, let me get to the second part of the argument, which is actually your chess computer thing, which is, of course, the way to beat a chess computer is to unplug it, right?
And so this is the objection, this is the very serious, by the way, objection to all of these kinds of extrapolations, known to some people as the thermodynamic objection, which is, all the horror scenarios kind of spin out this thing where basically the machines become, like, all powerful and this and that, and they have control over weapons, and they have unlimited computing capacity, and they're, you know, completely coordinated over communications links, and they have all of these, like, real-world capabilities that basically require energy and require physical resources and require chips and circuitry and, you know, electromagnetic shielding, and they have to have their own weapons arrays and they have to have their own EMPs, like, you know, you see this in the Terminator movie, they've got all these incredible manufacturing facilities and flying aircraft and everything.
Well, the thermodynamic argument is like, yeah, once you're in that domain, the putatively hostile machines are operating with the same thermodynamic limits as the rest of us.
And this is the big argument against any of these sort of fast takeoff arguments, which is just like, yeah, I mean, let's say an AI goes rogue.
Okay, turn it off.
Okay, it doesn't want to be turned off.
Okay, fine.
Like, you know, launch an EMP.
It doesn't want EMP.
Okay, fine, bomb it.
Like, there's lots of ways to turn off systems that aren't working.
But not if we've built these things in the wild and relied on them for the better part of a decade, and now it's a question of, you know, turning off the internet, right?
Or turning off the stock market.
At a certain point, these machines will be integrated into everything.
A go-to move of any given dictator right now is to turn off the internet, right?
Like, that is absolutely something people do.
There's like a single switch.
You can turn it off for your entire country.
Yeah, but the cost to humanity of doing that is currently, I would imagine, unthinkable, right?
Like, globally turning off the internet.
First of all, many systems fail that we can't let fail.
I mean, I think it was true.
I can't imagine it's still true.
But at one point, I think this was a story I remember from about a decade ago, there were hospitals that were so dependent on making calls to the internet that when the internet failed, people's lives were in jeopardy in the building.
It's like, we should hope we have levels of redundancy here that shield us against these bad outcomes.
But I can imagine a scenario where we have grown so dependent on the integration of increasingly intelligent systems into everything digital that there is no plug to pull.
Yeah.
I mean, again, like at some point you're just, you know, the extrapolations get kind of pretty far out there.
So let me argue one other kind of thing at you that's actually relevant to this. You did this thing, which I find people tend to do, which is sort of this assumption that all intelligence is sort of interchangeable. Like, whatever, let me pick on the Nick Bostrom book, right?
Superintelligence book, right?
So he does this thing.
Yeah, he does a few interesting things in the book.
So one is he never quite defines what intelligence is, which is really entertaining.
And I think the reason he doesn't do that is because, of course, the whole topic makes people just incredibly upset.
And so there's a definitional issue there.
But then he does this thing where he says, notwithstanding, there's no real definition, he says there are basically many routes to artificial intelligence.
And he goes through a variety of different, you know, both computer program, you know, architectures, and then he goes through some, you know, biological, you know, kind of scenarios.
And then he does this thing where he just basically, for the rest of the book, he spins these doomsday scenarios, and he doesn't distinguish between the different kinds of artificial intelligence.
He just assumes that they're basically all going to be the same.
That book is now the basis for this AI risk movement, so that movement has taken these ideas forward.
Of course, the form of actual intelligence that we have today, that people are in Washington right now lobbying to ban or shut down or whatever, and spinning out these doomsday scenarios, is large language models.
That is actually what we have today.
You know, large language models were not an option in the Bostrom book for the form of AI because they didn't exist yet.
And it's not like there's a second edition of the book that's out that has been rewritten to take this into account.
It's just basically the same arguments apply.
And then this is my thing on the moral reasoning with LLMs.
Like the LLMs, this is where the details matter, like the LLMs actually work in a distinct way.
They work in a technically distinct way.
Their core architecture has like very specific design decisions in it for like how they work, what they do, how they operate, that is just, you know, this is the nature of the breakthrough.
That's just very different than how your self-driving car works.
That's very different than how your, you know, control system for a UAV works or whatever, your thermostat or whatever.
Like, it's a new kind of technological artifact.
It has its own rules.
It's its own world of ideas and concepts and mechanisms.
And so this is where I think, again, my point is, like, you have to, I think at some point in these conversations, you have to get to an actual discussion of the actual technology that you're talking about.
And that's why I pulled out the moral reasoning thing, is because it just, it turns out, and look, this is a big shock.
Like, nobody expected this.
I mean, this is related to the fact that somehow we have built an AI that is better at replacing white-collar work than blue-collar work.
Which is like a complete inversion of what we all imagined.
It turns out one of the things this thing is really good at is engaging in philosophical debates.
It's a really interesting debate partner on any sort of philosophical, moral, or religious topic.
We have this artifact that's dropped into our lap in which sand and numbers have turned into something that we can argue philosophy and morals with.
It actually has very interesting views on psychology, philosophy, and morals.
And I just, like, we ought to take it seriously for what it specifically is as compared to some, you know, sort of extrapolated thing where, like, all intelligence is the same and ultimately destroys everything.
Well, I take the surprise variable there very seriously.
The fact that we wouldn't have anticipated that there's a good philosopher in that box, and all of a sudden we found one.
That, by analogy, is a cause for concern.
And actually, there's another cause for concern here, which... Can I do that one?
Yeah, go for it.
That's a cause for delight.
So that's a cause for delight.
That's an incredibly positive good news outcome.
Because the reason there's a philosopher, and this is actually very important, this is very, I think, this is maybe like the single most profound thing I've realized in the last decade or longer.
This thing is us.
This is not your scenario with the alien ships.
This is not that.
This is us.
The reason this thing works, the big breakthrough, was we loaded us into it.
We loaded the sum total of, like, human knowledge and expression into this thing, and out the other side comes something that's like a mirror.
Like, it's like the world's biggest, finest-detailed mirror.
And like we walk up to it and it reflects us back at us.
And so it has the complete sum total of, you know, at the limit, it has the complete sum total of every religious, philosophical, moral, ethical debate and argument that anybody has ever had.
It has the complete sum total of all human experience, all lessons that have ever been learned.
That's incredible.
It's incredible.
Just pause for a moment and say that, and then you can talk to it.
Well, let me pause.
How great is that?
Let me pause long enough simply to send this back to you.
Sure.
How does that not nullify the comfort you take in saying that these are not evolved systems?
They're not alive.
They're not primates.
In fact, you've just described the process by which we essentially plowed all of our primate original sin into the system to make it intelligent in the first place.
No, but also all the good stuff, right?
All the good stuff, but also the bad stuff.
The amazing stuff, but, like, what's the moral of every story, right?
The moral of every story is the good guys win, right?
Like, that's the entire, like, the entire thousands of years run.
It's the old Norm Macdonald joke.
It's like, wow, it's amazing.
The history books say the good guys always win.
It's all in there.
And then look, there's an aspect of this where it's easy to get kind of whammied by what it's doing, because again, it's very easy to trip the line from what I said into what I would consider to be sort of incorrect anthropomorphizing.
And I realize this gets kind of fuzzy and weird that I think there's a difference here, but I think that there is.
Let me see if I can express this.
Part of it is I know how it works, and so, because I know how it works, I don't romanticize it, I guess, or at least that's my own view of how I think about this, which is, I know what it's doing when it does this.
I am surprised that it can do it as well as it can, but now that it exists and I know how it works, it's like, oh, of course, and then therefore it's running this math in this way.
It's doing these probability projections.
It gives me this answer, not that answer.
By the way, you know, look, it makes mistakes.
Right, how amazing, here's the thing, how amazing it is that we built a computer that makes mistakes, right?
Like that's never happened before.
We built a machine that can create, like that's never happened before.
We built a machine that can hallucinate, that's never happened before.
So, but look, it's a large language model.
Like it's a very specific kind of thing.
You know, it sits there and it waits for us to like ask it a question and then it does its damnedest to try to predict the best answer.
And in doing so it reflects back everything wonderful and great that has ever been done by any human in history.
Like, it's like, it's amazing.
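[A minimal sketch of the "probability projections" described above, added for illustration and not drawn from the conversation. It assumes the standard picture of a language model: score each candidate next token, turn the scores into probabilities with a softmax, and sample. The vocabulary and scores here are invented; a real LLM computes them with a trained network over tens of thousands of tokens.]

```python
import math
import random

# Minimal sketch of next-token prediction. A real model produces the scores
# ("logits") with a trained transformer over a huge vocabulary; here both the
# vocabulary and the scores are invented for illustration.
vocab  = ["Paris", "London", "banana", "the"]
logits = [6.0, 3.5, 0.2, 1.0]  # hypothetical scores for "The capital of France is ..."

# Softmax: turn the scores into a probability distribution over the vocabulary.
exps  = [math.exp(x) for x in logits]
probs = [e / sum(exps) for e in exps]

# Sample the next token in proportion to its probability.
next_token = random.choices(vocab, weights=probs, k=1)[0]

for tok, p in zip(vocab, probs):
    print(f"{tok:>7}: {p:.3f}")
print("sampled next token:", next_token)
```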
Except it also, as you just pointed out, it makes mistakes, it hallucinates.
Sure.
If you ask it, as I'm sure they've fixed this, you know, at least the loopholes that New York Times writer Kevin Roose found early on, I'm sure those have all been plugged.
Oh no, those are not fixed.
Those are very much not fixed.
Oh, really?
Okay.
Well, so if you perseverate in your prompts in certain ways, the thing goes haywire and starts telling you to leave your wife and it's in love with you.
And I mean, so how eager are you for that intelligence to be in control of things when it's peppering you with insults and, I mean, just imagine, like, this is HAL that can't open the pod bay doors.
It's a nightmare if you discover in this system behavior and thought that is the antithesis of all the good stuff you thought you programmed into it.
So this is really important.
This is really important for understanding how these things work.
And this is really central.
And this is, by the way, this is new and this is amazing.
So I'm very excited about this and I'm excited to talk about it.
So there's no it to tell you to leave your wife, right?
This is what I refer to as a category error.
There's no entity that is like, wow, I wish this guy would leave his wife or I think I should tell him to leave his wife.
If you'd like to continue listening to this conversation, you'll need to subscribe at SamHarris.org.
Once you do, you'll get access to all full-length episodes of the Making Sense Podcast, along with other subscriber-only content, including bonus episodes, AMAs, and the conversations I've been having on the Waking Up app.