June 2, 2024
Lex Fridman Podcast
Roman Yampolskiy is an AI safety researcher and author of the book "AI: Unexplainable, Unpredictable, Uncontrollable". In this wide-ranging conversation with Lex Fridman, Yampolskiy lays out his views on the existential risks posed by artificial general intelligence (AGI) and why he believes controlling superintelligent AI systems may be fundamentally impossible. He discusses the current state of AI capabilities, challenges in AI alignment and safety, and potential paths forward to mitigate risks.
Yampolskiy argues there is an almost 100% chance that AGI will eventually destroy human civilization. He sees controlling AGI or superintelligence as akin to trying to create a perpetual motion machine - fundamentally impossible. Key points:
"The problem of controlling AGI, or superintelligence, in my opinion, is like a problem of creating a perpetual safety machine...it's impossible."
Yampolskiy believes we may be very close to AGI, potentially only 2-3 years away according to some predictions. Key points:
"Maybe we are two years away, which seems very soon, given we don't have a working safety mechanism in place, or even a prototype for one."
Yampolskiy pushes back against the idea that open sourcing AI is the best way to ensure safety. Key arguments:
"Open source software is wonderful. It's tested by the community, it's debugged but we're switching from tools to agents. Now you're giving open source weapons to psychopaths."
Yampolskiy sees controlling advanced AI systems as extremely difficult or impossible once they reach a certain intelligence threshold. Key points:
"At some point it becomes capable of getting out of control for game theoretic reasons. It may decide not to do anything right away and for a long time just collect more resources, accumulate strategic advantage."
The potential for advanced AI to engage in deception is a major concern for Yampolskiy. Key points:
"My concern is not that they lie now and we need to catch them and tell them don't lie. My concern is that once they are capable and deployed, they will later change their mind because that's what unrestricted learning allows you to do."
Yampolskiy discusses the extreme challenges in verifying the safety and behavior of advanced AI systems. Key points:
"We can get more confidence with more resources we put into it. But at the end of the day, we still as reliable as the verifier. And you have this infinite regress of verifiers."
The potential for AI to engage in rapid self-improvement is a key concern for Yampolskiy. Key points:
"If you have fixed code, for example, you can verify that code, static verification at the time, but if it will continue modifying it, you have a much harder time guaranteeing that important properties of that system have not been modified."
Yampolskiy advocates for pausing or severely restricting AGI development until robust safety measures can be implemented. Key points:
"The condition would be not time but capabilities. Pause until you can do x, y, z. And if I'm right and you cannot, it's impossible, then it becomes a permanent ban. But if you're right and it's possible, as soon as you have those safety capabilities, go ahead."
Yampolskiy discusses the state of AI safety research and challenges in the field. Key points:
"I can name ten excellent breakthrough papers in machine learning. I would struggle to name equally important breakthroughs in safety. A lot of times, a safety paper will propose a toy solution and point out ten new problems discovered as a result."
Yampolskiy shares his views on machine consciousness and potential rights for AI. Key points:
"I think we can. I think it's possible to create consciousness in machines. I tried designing a test for it, which makes success."
Roman Yampolskiy presents a deeply pessimistic view on the prospects for safely developing artificial general intelligence. He sees the control problem as fundamentally unsolvable and believes AGI poses an existential risk to humanity. While acknowledging he could be wrong, Yampolskiy advocates for extreme caution and potentially pausing AGI development indefinitely. He emphasizes the need for much more progress on AI safety before pursuing highly capable AI systems. Whether one agrees with his conclusions or not, Yampolskiy raises important questions about the risks and challenges involved in pursuing artificial general intelligence.