Key Takeaways
- Transformers are revolutionizing AI and will likely remain the dominant architecture, becoming a fundamental building block for future AI systems
- Specialized AI chips (ASICs) designed specifically for transformers could dramatically improve performance and efficiency compared to general-purpose GPUs
- Building massive AI data centers may be "the largest infrastructure buildout since the industrial revolution" to support future large language models
- Latency is a key bottleneck for many potential AI applications, such as real-time conversation and robotics; specialized chips could help address it
- The AI chip supply chain is complex, involving model companies, cloud providers, chip designers, fabs, and equipment makers; there are opportunities at multiple levels
- Faster chip design and manufacturing cycles will be crucial to keep pace with AI progress. Current 4-5 year timelines need to be compressed
- Data and compute are the key drivers of AI progress. Models will continue scaling up dramatically in size and capability
- Strategic leadership and technical expertise are both important for driving innovation in AI hardware
Introduction
In this episode, Patrick O'Shaughnessy interviews 21-year-old Gavin Uberti, who dropped out of Harvard to found Etched, a company developing specialized AI chips. They discuss the ongoing revolution in artificial intelligence, focusing on the chips and technology powering large language models. Gavin provides insights into the future of AI hardware, the potential for purpose-built chips, and the massive infrastructure buildout he believes is coming to support next-generation AI systems.
Topics Discussed
The Transformer Architecture and Its Dominance (9:13)
Gavin explains the basics of transformer models, which have become the dominant architecture for large language models like GPT-3 and GPT-4:
- Transformers are sequence-to-sequence models that take in a sequence of tokens and output another sequence
- They are trained on massive amounts of text data to predict the next token in a sequence (see the sketch after this list)
- This "pre-training" allows the model to learn general knowledge and concepts
- Further training with RLHF (Reinforcement Learning from Human Feedback) makes the model more helpful and aligned with human preferences
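As a concrete illustration of the sequence-to-sequence idea, here is a minimal numpy sketch of causal (masked) self-attention, the core operation inside a transformer block. The dimensions and weights are toy values chosen for the example, not any real model's configuration:

```python
# A minimal sketch of the transformer's core operation: scaled dot-product
# self-attention over a token sequence. All shapes and values are toy
# placeholders, not any production model's configuration.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model) token embeddings -> (seq_len, d_model) outputs."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v          # project to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])      # similarity of every token pair
    # Causal mask: each token may only attend to itself and earlier tokens,
    # which is what makes next-token prediction well defined.
    mask = np.triu(np.ones_like(scores), k=1).astype(bool)
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over visible tokens
    return weights @ v                            # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 8, 16
x = rng.normal(size=(seq_len, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)    # (8, 16)
```

A full transformer stacks many such attention layers with feed-forward blocks; the masked attention shown here is what lets the same model be trained in parallel on next-token prediction across an entire sequence.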
Gavin believes transformers will remain dominant because:
- They benefit from economies of scale - bigger models perform better
- There is now significant hardware and software optimization for transformers
- The ecosystem and tooling around transformers gives them an advantage
"The transformer is not just the next fad. This is a building block." - Gavin Uberti
The Need for Specialized AI Chips (23:18)
Gavin argues that general-purpose GPUs are not optimal for running large language models and that specialized chips (ASICs) designed specifically for transformers could provide massive performance gains:
- Latency is a key bottleneck for many potential AI applications, such as real-time conversation and robotics
- General-purpose GPUs dedicate only a small portion of their transistors to the core matrix multiplication operations needed for transformers
- A specialized chip could have 10x or more raw compute power by optimizing the entire chip for transformer operations (see the back-of-envelope sketch below)
- This could enable 20x better latency than current GPUs
"We will just have so much more raw compute, substantially more than an order of magnitude. That's not just for compute, but also for latency." - Gavin Uberti
The Future AI Infrastructure Buildout (28:38)
Gavin predicts a massive buildout of AI infrastructure to support future large language models:
- Training GPT-5 and beyond may require 10-100x more compute power than current models
- This could mean 2 gigawatt data centers consuming the entire output of a large power plant (a rough scale calculation follows below)
- New challenges around cooling, redundancy, and reliability will need to be solved
- The scale may rival semiconductor fabs in complexity and cost ($10-20 billion facilities)
"I really think a whole set of industries is going to evolve around building these massive, massive AI models. We've only seen the beginning of it all." - Gavin Uberti
AI Model Training and Inference (31:54)
Gavin explains some key concepts around training and running (inference) of large language models:
- Training requires running the model forward and then backward to compute gradients, which multiplies the compute needed per token compared with inference alone (the backward pass costs roughly twice the forward pass)
- Inference can be optimized by batching multiple queries together to amortize the cost of loading model weights from memory
- Specialized chips could enable much larger batch sizes (2,500 vs. 64) for inference, dramatically improving efficiency (see the sketch after this list)
- Both training and inference may be centralized in massive data centers to leverage economies of scale
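The batching point can be made concrete with a simple roofline-style model: each decode step must stream the full weights from memory once, while compute grows with batch size. The bandwidth, throughput, and model-size numbers below are assumptions for illustration, not any real chip's specs:

```python
# Why batching amortizes weight loads: in autoregressive decoding, every
# generated token requires streaming all model weights from memory once,
# regardless of how many sequences share that pass. Illustrative numbers only.

WEIGHT_BYTES    = 140e9     # assumed 70B params at 2 bytes each
MEM_BANDWIDTH   = 3e12      # assumed 3 TB/s of memory bandwidth
PEAK_FLOPS      = 1e15      # assumed 1 PFLOP/s of matmul throughput
FLOPS_PER_TOKEN = 2 * 70e9  # ~2 FLOPs per parameter per token

def decode_step_time(batch):
    load_time    = WEIGHT_BYTES / MEM_BANDWIDTH          # paid once per step
    compute_time = batch * FLOPS_PER_TOKEN / PEAK_FLOPS  # grows with batch
    return max(load_time, compute_time)                  # bound by the slower

for batch in (1, 64, 2500):
    step = decode_step_time(batch)
    print(f"batch {batch:5d}: {batch / step:12,.0f} tokens/s")
```

With these assumptions, throughput scales almost linearly with batch size until the step becomes compute-bound (here somewhere between batch 64 and 2,500), which is why much larger batches can be dramatically more efficient per token.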
The AI Chip Design and Manufacturing Process (35:59)
Gavin walks through the complex process of designing and manufacturing a new chip:
- Defining the problem and overall chip architecture
- Designing individual functional blocks
- Writing RTL code to define the logic
- Extensive verification and testing
- Physical layout and manufacturing preparation
- Fabrication at a foundry like TSMC
He notes this process typically takes 4-5 years but believes it could be compressed to a few months with automation and AI assistance in chip design.
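Verification in particular commonly begins with a software "golden model" that hardware output is later checked against. Below is a minimal Python sketch of that practice for a hypothetical int8 matrix-multiply tile; it illustrates the general verification idea, not Etched's actual flow:

```python
# A common verification practice: compare hardware RTL output against a
# software "golden model". Here, a golden model of an int8 matrix-multiply
# tile plus a randomized self-check. Hypothetical example, not any real flow.
import numpy as np

def golden_matmul_tile(a, b, acc_bits=32):
    """Reference behavior for an int8 x int8 -> int32 matmul block."""
    acc = a.astype(np.int64) @ b.astype(np.int64)
    lo, hi = -(2 ** (acc_bits - 1)), 2 ** (acc_bits - 1) - 1
    return np.clip(acc, lo, hi).astype(np.int32)  # saturate like the hardware

def run_random_tests(n_tests=1000, tile=16, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(n_tests):
        a = rng.integers(-128, 128, size=(tile, tile), dtype=np.int8)
        b = rng.integers(-128, 128, size=(tile, tile), dtype=np.int8)
        expected = golden_matmul_tile(a, b)
        # In a real flow, this would come from driving the RTL through a
        # simulator; here the golden model stands in, so the test passes
        # by construction.
        actual = golden_matmul_tile(a, b)
        assert np.array_equal(actual, expected)
    print(f"{n_tests} randomized tile tests passed")

run_random_tests()
```

Automating more of this loop, from RTL generation to randomized verification, is one of the levers for compressing the multi-year design cycle he describes.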
The AI Technology Stack and Key Players (56:05)
Gavin outlines the key layers of the AI technology stack and some of the major players:
- AI Models: OpenAI, Anthropic, Google, etc.
- Cloud/Infrastructure: Microsoft, Google, Amazon
- Chips: Nvidia, AMD, startups like Etched
- Chip Manufacturing: TSMC, Samsung
- Equipment: ASML
- End User Applications: Various startups and incumbents
He sees opportunities for innovation at multiple levels of this stack.
Strategic Leadership in AI Hardware (1:07:33)
Gavin reflects on the role of strategic leadership in driving AI hardware innovation:
- Setting the overall vision
- Recruiting top talent
- Securing adequate funding
- Understanding technical details to make informed decisions
"The role of a good leader is to set the vision, get the right people in those chairs, and not run out of money. That's it." - Gavin Uberti
The Future of AI Models and Data (1:16:49)
Gavin shares his thoughts on how AI models may evolve:
- A few massive, general-purpose models trained by leaders like OpenAI
- Companies fine-tuning these base models on proprietary data for specific use cases
- Small edge models for low-latency tasks on devices
- Little market for "medium-sized" models between edge and massive data center models
He also believes video data will be crucial for future AI training, potentially enabling more advanced reasoning and world modeling capabilities.
Conclusion
Gavin Uberti provides a compelling vision for the future of AI hardware, emphasizing the potential for specialized chips to dramatically improve performance and efficiency. He sees a massive buildout of AI infrastructure on the horizon to support increasingly large and capable models. While acknowledging the complexity and challenges involved, Gavin is optimistic about the transformative potential of AI and excited to be working on the hardware that will help enable it. His insights highlight the critical role that advances in computing technology will play in shaping the future of artificial intelligence.