Okay, so I’ve been meaning to dive into some sports data, and the 2017 U.S. Open golf tournament seemed like a good place to start. I’m no pro golfer, just a guy who likes messing around with information.

First, I grabbed the data. I poked around and found some pretty detailed records of the tournament. It was all kind of scattered, though, so I had to spend some time cleaning it up. You know, making sure the names matched, the scores were consistent, that sort of thing. Nothing fancy, just basic data tidying.
Getting My Hands Dirty
- I used a simple spreadsheet program to handle all this.
- I started by making sure all the player names were spelled the same way across all the data sources. You wouldn’t believe how many variations there are!
- Then I double-checked all the scores. It’s easy for a number to get entered wrong, so I wanted to be extra sure.
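I did all of this by hand in the spreadsheet, but if you’d rather script it, here’s a rough Python sketch of the same two checks. Everything in it is my own assumption for illustration: the function names, the made-up sample rows, and the 60 to 95 range I use to flag a round score as a likely typo.

```python
import unicodedata


def normalize_name(raw):
    """Reduce a player name to a comparable key (hypothetical helper)."""
    # Turn "Last, First" into "First Last".
    if "," in raw:
        last, first = raw.split(",", 1)
        raw = f"{first} {last}"
    # Split accented letters into base letter + accent mark, then drop the marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Drop punctuation, lowercase, and collapse repeated spaces.
    text = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    return " ".join(text.lower().split())


def suspicious_scores(rows, low=60, high=95):
    """Return rows whose round score falls outside an assumed plausible range."""
    return [row for row in rows if not (low <= row["score"] <= high)]


# Made-up rows, not real tournament data.
rows = [
    {"player": "Doe, John", "round": 1, "score": 71},
    {"player": "  JOHN   DOE ", "round": 2, "score": 7},   # probably a typo for 70-something
    {"player": "Jöhn Doe", "round": 3, "score": 69},
]

names = {normalize_name(row["player"]) for row in rows}
flagged = suspicious_scores(rows)
```

With these sample rows, all three spellings collapse to the same key, and only the round with a score of 7 gets flagged for a second look. It won’t catch everything (an initial like "J. Doe" still needs a human), but it narrows down what you have to eyeball.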
Next, I started playing with the data. I wanted to see who had the best rounds, who struggled on certain holes, that sort of thing.
It’s pretty cool what you can find when you dig into the numbers.
For example, by comparing each player’s score from one round to the next, I figured out which players had the biggest improvements and stumbles between rounds.
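Here’s roughly what that round-to-round comparison looks like as a small Python sketch. The players and scores are invented placeholders, and `round_swings` is just a name I picked; in golf a lower score is better, so a negative change is an improvement and a positive one is a stumble.

```python
def round_swings(scores):
    """List (player, from_round, to_round, change) for each pair of consecutive rounds."""
    swings = []
    for player, rounds in scores.items():
        for i in range(1, len(rounds)):
            change = rounds[i] - rounds[i - 1]  # negative = better than the previous round
            swings.append((player, i, i + 1, change))
    return swings


# Invented scores, not real tournament results.
sample = {
    "Player A": [72, 68, 75, 70],
    "Player B": [70, 71, 69, 69],
    "Player C": [74, 66],  # fewer rounds, e.g. a player with incomplete data
}

swings = round_swings(sample)
best_improvement = min(swings, key=lambda s: s[3])
worst_stumble = max(swings, key=lambda s: s[3])

print(best_improvement)  # ('Player C', 1, 2, -8)
print(worst_stumble)     # ('Player A', 2, 3, 7)
```

In a spreadsheet this is just one extra column per pair of rounds (round 2 minus round 1, and so on), sorted from smallest to largest.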
Wrapping Up
Honestly, the whole process was more about the journey than the destination. I learned a ton about the tournament and about how to handle data, even if it’s just for fun.

It’s not exactly rocket science, but it’s satisfying to take a bunch of messy information and turn it into something you can understand and play with.