As part of my quest to convert all of my Kaleidic functionality over to using MLB’s Gameday files, I needed to write a script to grab the relevant files so that I can use them. I decided to write this script in Python so that I can run it on the web server directly rather than on my own machine. Here is the script, for anyone who may be interested (under the fold).
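For readers who only want the general shape of such a downloader, here is a minimal sketch in Python. It is not the script referred to above. The base URL, the game identifier, and the file names are all placeholders, since the real ones are not given in this post, and the `opener` parameter is just a convenience added here so the function can be tried without a network connection.

```python
import os
import urllib.request

# Hypothetical values: the real base URL and file names are not given in this post.
BASE_URL = "http://example.com/gameday"
FILENAMES = ["boxscore.xml", "players.xml"]


def build_urls(base_url, game_id, filenames):
    """Return one URL per file for a single game directory."""
    base = base_url.rstrip("/")
    return ["{}/{}/{}".format(base, game_id, name) for name in filenames]


def fetch_all(urls, dest_dir, opener=urllib.request.urlopen):
    """Download each URL into dest_dir and return the list of saved paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        # Name the local file after the last path segment of the URL.
        path = os.path.join(dest_dir, url.rsplit("/", 1)[-1])
        with opener(url) as response, open(path, "wb") as out:
            out.write(response.read())
        saved.append(path)
    return saved
```

Calling `fetch_all(build_urls(BASE_URL, "some_game_id", FILENAMES), "downloads")` would save each file under `downloads/`, assuming the placeholder URL were replaced with a real one.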
January 21, 2008
January 13, 2008
For the next season of Kaleidic, I am thinking of using Python to read the XML files that MLB creates for minor league games. To that end, I have discovered PyXML and hope to learn more about it. There is a how-to on the same site, but I’m on the hunt for a good, free source of material describing how, basically, to convert XML data into database-friendly data.
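To give a rough idea of what "XML data into database-friendly data" can mean, here is a minimal sketch. It uses the standard library's `xml.etree.ElementTree` and `sqlite3` rather than PyXML, and the sample document, the element and attribute names, and the `players` table are all invented for illustration; they are not MLB's actual schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up sample; element and attribute names are assumptions, not MLB's schema.
SAMPLE_XML = """
<game id="example-game">
  <player id="1" name="A. Batter" position="CF"/>
  <player id="2" name="B. Pitcher" position="P"/>
</game>
"""


def players_to_rows(xml_text):
    """Flatten each <player> element into a tuple suitable for a table row."""
    root = ET.fromstring(xml_text.strip())
    game_id = root.get("id")
    return [
        (game_id, int(p.get("id")), p.get("name"), p.get("position"))
        for p in root.iter("player")
    ]


def load_rows(conn, rows):
    """Create the (hypothetical) players table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS players "
        "(game_id TEXT, player_id INTEGER, name TEXT, position TEXT)"
    )
    conn.executemany("INSERT INTO players VALUES (?, ?, ?, ?)", rows)
    conn.commit()
```

The idea is simply that each repeated XML element becomes one row and each attribute becomes one column; anything that applies to the whole file (here the game id) gets copied onto every row.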
January 7, 2008
One of the things that I’m very happy about regarding Major League Baseball is how open they are with their Gameday play-by-play stats. This even applies to minor league data, which is important for me.
In previous years, I have used aggregate data for Kaleidic. This was good enough, but it didn’t allow me to do things like left-right splits, determining where balls were hit, and so forth. As a result, I decided that I would try to parse the game logs for each game, which would at least give me a better idea and more granular data. As such, I figured that I would have to write a program to parse these logs, all filled with text but in a somewhat-regular format. But after playing around with things a little bit, I discovered that the minor league information is available in XML format. Here, for example, is the first inning of that game. It’s harder for a person to read, but much easier for me to parse in a program. In addition, I also get pitch-by-pitch data, which I will try to incorporate as much as possible.
I am still in the phase in which I’m trying to figure out what exactly is there and how best to use it, but I have some ideas. I’m really glad that I found this information, though, since the files also contain information on players including things such as pitcher handedness. This is all very exciting stuff, and I plan on writing a fairly detailed guide to everything in these files, but this will make my life much easier.
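As a rough illustration of why per-at-bat XML helps with things like left-right splits, here is a small sketch in Python. The sample inning is invented, and the tag and attribute names (`atbat`, `pitch`, `stand`, `p_throws`, `event`, `type`) are assumptions made for this example; the real files may well be laid out differently.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample; tag and attribute names are assumptions, not a documented schema.
SAMPLE_INNING = """
<inning num="1">
  <top>
    <atbat batter="101" stand="L" p_throws="R" event="Single">
      <pitch type="B"/>
      <pitch type="X"/>
    </atbat>
    <atbat batter="102" stand="R" p_throws="R" event="Strikeout">
      <pitch type="S"/>
      <pitch type="S"/>
      <pitch type="S"/>
    </atbat>
  </top>
  <bottom>
    <atbat batter="201" stand="L" p_throws="L" event="Walk">
      <pitch type="B"/>
      <pitch type="B"/>
      <pitch type="B"/>
      <pitch type="B"/>
    </atbat>
  </bottom>
</inning>
"""


def split_counts(xml_text):
    """Count outcomes keyed by (batter side, pitcher hand, event)."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for atbat in root.iter("atbat"):
        key = (atbat.get("stand"), atbat.get("p_throws"), atbat.get("event"))
        counts[key] += 1
    return counts


def pitches_per_atbat(xml_text):
    """Return (batter id, number of pitches seen) for each at-bat, in order."""
    root = ET.fromstring(xml_text.strip())
    return [(ab.get("batter"), len(ab.findall("pitch"))) for ab in root.iter("atbat")]
```

With aggregate data, none of this is recoverable; with one element per at-bat and one per pitch, a split is just a matter of grouping on whichever attributes are of interest.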
July 9, 2007
I have decided to purchase my own web domain and hosting plan to bring Kaleidic back up from the dead. I went with localdomain.tv for three reasons:
- It is extremely easy to remember.
- It is the granddaddy of all bad UNIX jokes (in fact, my e-mail address there will be email@example.com).
- It combines my love of bad UNIX jokes with my love for Tuvalu.
I just purchased the domain and web hosting, so it probably won’t be up until tomorrow, and after that it’ll be a few days before I have a chance to do anything serious with it, but there you go. My Internets presence has made it to the big time!
For those curious, I decided to go with WebFaction for my hosting. They have Django-based hosting and a two-year plan is only $7.50 a month. In addition, they have some nice folks working there.
June 10, 2007
After a bit of unscheduled downtime, Kaleidic is back up and ready for business. I’m in the process of updating a week and a half worth of stats, and then I have to do a week and a half worth of notes. The stats part isn’t so bad, except that I had to do some manual work which took a while. But inserting all of those notes will take some time…
This may or may not impact my ability to write up a couple of posts today, but I have two good ones that I want to get out there, so we’ll see.
May 21, 2007
Lately, I’ve been including some box score synopses of Braves games over at the Kaleidoscope. Well, here’s a box score that I’m glad I don’t have to cover. If you’re looking at the Lancaster side of things, where’s the positive? Every pitcher gave up at least three runs. Only one guy got on base twice, and he hit two singles. One fellow hit a double. Meanwhile, on the other side, the only guy who didn’t get a hit drew 3 walks, the team smacked 6 home runs and 9 doubles, and scored 30 freaking runs. I guess you can point out the bright side and say that if the game only lasted between innings 5 and 7, Lancaster would have only lost 1-0…
April 14, 2007
Project Kaleidic is now somewhat operational. I have roughly 1/2 of the pages set up, and because I have to write one more program and shell script, this means that I’m at the halfway mark for completion. I am going to celebrate this by going to sleep now knowing that my job is halfway done.
Sadly, I’m not sure how much more I’ll be able to accomplish in the very near term. Tomorrow, I am going to Basel with Jianhong for the day and will not be able to do all that much. Monday is the first day of classes. I have a Hebrew course in the morning and then fencing in the evening. I have some time to program in between, so I hopefully will get a couple of the most important pages done between now and Tuesday morning, but we’ll see. Tuesday I have some free time, though it will be limited because my evening is taken up. And Wednesday I’m going to Greece. So basically, I have to make some more progress between now and Wednesday morning. Root for me.