Key Takeaways
- Transformers are revolutionizing AI and will likely remain the dominant architecture, becoming a fundamental building block for future AI systems
- Specialized AI chips (ASICs) designed specifically for transformers could dramatically improve performance and efficiency compared to general-purpose GPUs
- Building massive AI data centers may be "the largest infrastructure buildout since the industrial revolution" to support future large language models
- Latency is a key bottleneck for many potential AI applications, such as real-time conversation and robotics; specialized chips could help address it
- The AI chip supply chain is complex, involving model companies, cloud providers, chip designers, fabs, and equipment makers; there are opportunities at multiple levels
- Faster chip design and manufacturing cycles will be crucial to keep pace with AI progress. Current 4-5 year timelines need to be compressed
- Data and compute are the key drivers of AI progress. Models will continue scaling up dramatically in size and capability
- Strategic leadership and technical expertise are both important for driving innovation in AI hardware
Introduction
In this episode, Patrick O'Shaughnessy interviews 21-year-old Gavin Uberti, who dropped out of Harvard to found Etched, a company developing specialized AI chips. They discuss the ongoing revolution in artificial intelligence, focusing on the chips and technology powering large language models. Gavin provides insights into the future of AI hardware, the potential for purpose-built chips, and the massive infrastructure buildout he believes is coming to support next-generation AI systems.
Topics Discussed
The Transformer Architecture and Its Dominance (9:13)
Gavin explains the basics of transformer models, which have become the dominant architecture for large language models like GPT-3 and GPT-4:
- Transformers are sequence-to-sequence models that take in a sequence of tokens and output another sequence
- They are trained on massive amounts of text data to predict the next token in a sequence (see the sketch after this list)
- This "pre-training" allows the model to learn general knowledge and concepts
- Further training with RLHF (Reinforcement Learning from Human Feedback) makes the model more helpful and aligned with human preferences
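As a concrete illustration of the sequence-to-sequence idea, here is a minimal numpy sketch of causal (masked) self-attention, the core operation inside a transformer block. The dimensions and weights are toy values chosen for the example, not any real model's configuration:

```python
# A minimal sketch of the transformer's core operation: scaled dot-product
# self-attention over a token sequence. All shapes and values are toy
# placeholders, not any production model's configuration.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model) token embeddings -> (seq_len, d_model) outputs."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v          # project to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])      # similarity of every token pair
    # Causal mask: each token may only attend to itself and earlier tokens,
    # which is what makes next-token prediction well defined.
    mask = np.triu(np.ones_like(scores), k=1).astype(bool)
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over visible tokens
    return weights @ v                            # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 8, 16
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)    # (8, 16)
```

A full transformer stacks many such attention layers with feed-forward blocks; the masked attention shown here is what lets the same model be trained in parallel on next-token prediction across an entire sequence.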
Gavin believes transformers will remain dominant because:
- They benefit from economies of scale - bigger models perform better
- There is now significant hardware and software optimization for transformers
- The ecosystem and tooling around transformers gives them an advantage
"The transformer is not just the next fad. This is a building block." - Gavin Uberti
The Need for Specialized AI Chips (23:18)
Gavin argues that general-purpose GPUs are not optimal for running large language models and that specialized chips (ASICs) designed specifically for transformers could provide massive performance gains:
- Latency is a key bottleneck for many potential AI applications, such as real-time conversation and robotics
- General-purpose GPUs dedicate only a small portion of their transistors to the core matrix multiplication operations needed for transformers
- A specialized chip could have 10x or more raw compute power by optimizing the entire chip for transformer operations (see the back-of-envelope sketch below)
- This could enable 20x better latency than current GPUs
"We will just have so much more raw compute, substantially more than an order of magnitude. That's not just for compute, but also for latency." - Gavin Uberti
The Future AI Infrastructure Buildout (28:38)
Gavin predicts a massive buildout of AI infrastructure to support future large language models:
- Training GPT-5 and beyond may require 10-100x more compute power than current models
- This could mean 2 gigawatt data centers consuming the entire output of a large power plant (a rough scale calculation follows below)
- New challenges around cooling, redundancy, and reliability will need to be solved
- The scale may rival semiconductor fabs in complexity and cost ($10-20 billion facilities)
"I really think a whole set of industries is going to evolve around building these massive, massive AI models. We've only seen the beginning of it all." - Gavin Uberti
AI Model Training and Inference (31:54)
Gavin explains some key concepts around training and running (inference) of large language models:
- Training requires running the model forward and then backward to compute gradients, which multiplies the compute needed per token compared with inference alone (the backward pass costs roughly twice the forward pass)
- Inference can be optimized by batching multiple queries together to amortize the cost of loading model weights from memory
- Specialized chips could enable much larger batch sizes (2,500 vs. 64) for inference, dramatically improving efficiency (see the sketch after this list)
- Both training and inference may be centralized in massive data centers to leverage economies of scale
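The batching point can be made concrete with a simple roofline-style model: each decode step must stream the full weights from memory once, while compute grows with batch size. The bandwidth, throughput, and model-size numbers below are assumptions for illustration, not any real chip's specs:

```python
# Why batching amortizes weight loads: in autoregressive decoding, every
# generated token requires streaming all model weights from memory once,
# regardless of how many sequences share that pass. Illustrative numbers only.

WEIGHT_BYTES    = 140e9     # assumed 70B params at 2 bytes each
MEM_BANDWIDTH   = 3e12      # assumed 3 TB/s of memory bandwidth
PEAK_FLOPS      = 1e15      # assumed 1 PFLOP/s of matmul throughput
FLOPS_PER_TOKEN = 2 * 70e9  # ~2 FLOPs per parameter per token

def decode_step_time(batch):
    load_time    = WEIGHT_BYTES / MEM_BANDWIDTH          # paid once per step
    compute_time = batch * FLOPS_PER_TOKEN / PEAK_FLOPS  # grows with batch
    return max(load_time, compute_time)                  # bound by the slower

for batch in (1, 64, 2500):
    step = decode_step_time(batch)
    print(f"batch {batch:5d}: {batch / step:12,.0f} tokens/s")
```

With these assumptions, throughput scales almost linearly with batch size until the step becomes compute-bound (here somewhere between batch 64 and 2,500), which is why much larger batches can be dramatically more efficient per token.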
The AI Chip Design and Manufacturing Process (35:59)
Gavin walks through the complex process of designing and manufacturing a new chip:
- Defining the problem and overall chip architecture
- Designing individual functional blocks
- Writing RTL code to define the logic
- Extensive verification and testing
- Physical layout and manufacturing preparation
- Fabrication at a foundry like TSMC
He notes this process typically takes 4-5 years but believes it could be compressed to a few months with automation and AI assistance in chip design.
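Verification in particular commonly begins with a software "golden model" that hardware output is later checked against. Below is a minimal Python sketch of that practice for a hypothetical int8 matrix-multiply tile; it illustrates the general verification idea, not Etched's actual flow:

```python
# A common verification practice: compare hardware RTL output against a
# software "golden model". Here, a golden model of an int8 matrix-multiply
# tile plus a randomized self-check. Hypothetical example, not any real flow.
import numpy as np

def golden_matmul_tile(a, b, acc_bits=32):
    """Reference behavior for an int8 x int8 -> int32 matmul block."""
    acc = a.astype(np.int64) @ b.astype(np.int64)
    lo, hi = -(2 ** (acc_bits - 1)), 2 ** (acc_bits - 1) - 1
    return np.clip(acc, lo, hi).astype(np.int32)  # saturate like the hardware

def run_random_tests(n_tests=1000, tile=16, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(n_tests):
        a = rng.integers(-128, 128, size=(tile, tile), dtype=np.int8)
        b = rng.integers(-128, 128, size=(tile, tile), dtype=np.int8)
        expected = golden_matmul_tile(a, b)
        # In a real flow, this would come from driving the RTL through a
        # simulator; here the golden model stands in, so the test passes
        # by construction.
        actual = golden_matmul_tile(a, b)
        assert np.array_equal(actual, expected)
    print(f"{n_tests} randomized tile tests passed")

run_random_tests()
```

Automating more of this loop, from RTL generation to randomized verification, is one of the levers for compressing the multi-year design cycle he describes.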
The AI Technology Stack and Key Players (56:05)
Gavin outlines the key layers of the AI technology stack and some of the major players:
- AI Models: OpenAI, Anthropic, Google, etc.
- Cloud/Infrastructure: Microsoft, Google, Amazon
- Chips: Nvidia, AMD, startups like Etched
- Chip Manufacturing: TSMC, Samsung
- Equipment: ASML
- End User Applications: Various startups and incumbents
He sees opportunities for innovation at multiple levels of this stack.
Strategic Leadership in AI Hardware (1:07:33)
Gavin reflects on the role of strategic leadership in driving AI hardware innovation:
- Setting the overall vision
- Recruiting top talent
- Securing adequate funding
- Understanding technical details to make informed decisions
"The role of a good leader is to set the vision, get the right people in those chairs, and not run out of money. That's it." - Gavin Uberti
The Future of AI Models and Data (1:16:49)
Gavin shares his thoughts on how AI models may evolve:
- A few massive, general-purpose models trained by leaders like OpenAI
- Companies fine-tuning these base models on proprietary data for specific use cases
- Small edge models for low-latency tasks on devices
- Little market for "medium-sized" models between edge and massive data center models
He also believes video data will be crucial for future AI training, potentially enabling more advanced reasoning and world modeling capabilities.
Conclusion
Gavin Uberti provides a compelling vision for the future of AI hardware, emphasizing the potential for specialized chips to dramatically improve performance and efficiency. He sees a massive buildout of AI infrastructure on the horizon to support increasingly large and capable models. While acknowledging the complexity and challenges involved, Gavin is optimistic about the transformative potential of AI and excited to be working on the hardware that will help enable it. His insights highlight the critical role that advances in computing technology will play in shaping the future of artificial intelligence.