Today’s Vergecast: How to train your data.
Training data is the raw material of the AI industry. Claude, ChatGPT, Gemini, and the rest are built on top of oceans of stuff. What is that stuff? Books. Blog posts. YouTube videos. News articles. All of it and more, in virtually incomprehensible quantities. Alex Reisner, a staff writer at The Atlantic who has been investigating training data, explains how AI companies get all this data, why they’d really prefer you not know what’s in it, and whether training data could ever be a fair trade.