my data projects

where I’m getting the data I’m mining

update 10-23-2016

I’ll have a longer post for what I’ve been working on:

I have a good data set from retrosheet.org that I have converted into an Excel database and a pivot table, and I have been working with it in Jupyter Notebook.

So: I’ve taken the data through scraping with PowerShell, Access, Excel, and Python, and I’m working on R and KNIME next.

—–

update 10-16-2016

While combing through the data, I’ve been looking for the easiest way to add headers. I needed to add quotes and a comma to each header name via VBA for an Excel macro I was working on. I found the easiest way to do that on this page:

Go to Format -> Custom

and enter this in the "Type" field (with straight quotes): \"@\",
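
For anyone doing this step outside Excel, here is a rough Python sketch of what that custom format does to each cell: wrap the text in double quotes and tack on a comma. The function name and the header names are made up for illustration; they are not from my workbook.

```python
def quote_and_comma(values):
    """Wrap each value in double quotes and append a comma,
    mimicking what the Excel custom format \\"@\\", displays."""
    return [f'"{value}",' for value in values]


# Hypothetical header names, used only to show the output shape.
headers = ["date", "visiting_team", "home_team"]
quoted = quote_and_comma(headers)

# Join the pieces into one line that can be pasted into a macro or script.
print("".join(quoted))
```

Like the Excel format, this leaves a trailing comma on the last item, so trim it if whatever you paste into does not accept one.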

————————

The beautiful thing about using Major League Baseball data is that you can cross-reference it across a ton of different variables and fields.

I’ve recently downloaded a bunch of tables from http://www.retrosheet.org/gamelogs/index.html
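
The game log files come without a header row, so here is a minimal sketch of attaching headers while reading one in Python, using only the standard library. The names in ASSUMED_HEADERS are my guess at the first few fields and should be checked against Retrosheet's own field guide. The helper name read_gamelog and the sample row are invented for illustration, not real game data.

```python
import csv
import io

# Assumed names for the first few fields; verify against Retrosheet's field guide.
ASSUMED_HEADERS = ["date", "game_number", "day_of_week", "visiting_team", "visiting_league"]


def read_gamelog(handle, headers=ASSUMED_HEADERS):
    """Read a headerless, comma-delimited game log into a list of dicts.

    Fields beyond the named ones get generic names like field_6, field_7.
    """
    rows = []
    for record in csv.reader(handle):
        extra = [f"field_{i}" for i in range(len(headers) + 1, len(record) + 1)]
        rows.append(dict(zip(list(headers) + extra, record)))
    return rows


# Made-up row standing in for a downloaded file; with a real file, use
# open(path, newline="") instead of io.StringIO.
sample = io.StringIO('"20160403","0","Sun","AAA","NL","extra"\n')
games = read_gamelog(sample)
print(games[0]["visiting_team"])
```

Every value comes back as a string, so scores and dates still need converting before doing any math on them.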

Keep an eye out on what I do with the data via SQL, Python, R, etc.
