One data set doxxed every cab driver in New York

When civic hacker Chris Whong got a full year’s worth of taxi logs from New York’s taxi commission, it seemed like a victory for data visualizers everywhere. But a new piece from Vijay Pandurangan suggests the data set is more revealing than the city initially thought. Diving into the data, Pandurangan shows how easy it is to work back from the pickup and dropoff locations and times to create a comprehensive record of each driver’s medallion number, name and a rough guess at their annual salary. The final lesson: anonymization is hard, and it takes a delicate balancing act to respect privacy in a data set big enough to hold every cab driver in New York City.

