After the deluge … dealing with changes to the Gameday XML repository at MLB

I retrieve MLB’s Gameday XML files on a daily basis.  Of course, at the start of the season, it’s a good idea to make sure things have worked as expected.  So, when I checked after day 1 of the season, it was a bit of a shock to see that no data had been downloaded.

Turns out, MLB made a very slight change to the syntax of directory listings from:

http://gd.mlb.com/components/game/mlb/year_2018/month_03/day_29/

to:

http://gd.mlb.com/components/game/mlb/year_2018/month_03/day_29

 

The only difference?  The trailing slash is gone, but that broke the construction of all URLs.

If you use anything based on the scripts in Baseball Hacks, you’ll need to change your Perl scripts.

 

Here are the changes I had to make:

$dayurl = "$baseurl/year_$year/month_$mon/day_$mday/";

to

$dayurl = "$baseurl/year_$year/month_$mon/day_$mday";

———-

while($html =~ m/<a href=\"(gid_\w+\/)\"/g ) {

to

while($html =~ m/<a href=\"day\_[0-9]{1,2}\/(gid_\w+\/)\"/g ) {
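
To see what the revised pattern picks up, here is a minimal sketch run against a made-up fragment of a day listing. The sample HTML and the game IDs in it are assumptions for illustration only (the real listing markup may differ); just the regex comes from the change above.

```perl
use strict;
use warnings;

# Hypothetical fragment of a day listing; the real markup may differ.
my $html = <<'HTML';
<li><a href="day_29/gid_2018_03_29_chnmlb_miamlb_1/"> gid_2018_03_29_chnmlb_miamlb_1/</a></li>
<li><a href="day_29/gid_2018_03_29_pitmlb_detmlb_1/"> gid_2018_03_29_pitmlb_detmlb_1/</a></li>
<li><a href="day_29/master_scoreboard.xml"> master_scoreboard.xml</a></li>
HTML

# Collect each game directory name (trailing slash included) from the links.
# Links that are not game directories, such as the .xml file, are skipped.
my @games;
while ($html =~ m/<a href="day_[0-9]{1,2}\/(gid_\w+\/)"/g) {
    push @games, $1;
}

print "$_\n" for @games;
```

The capture group still grabs only the gid_ part, so the rest of a script that loops over game directories should not need to know about the new day_NN/ prefix.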


———-

and then wherever you have:

/filename.xml"

change it to:

" . "filename.xml"
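
Putting the pieces together, a file URL might be assembled as in the sketch below. Everything here is a stand-in: the date values, the $game directory, the $boxurl variable, and boxscore.xml as the file name are assumptions for illustration, and your own script may join the parts differently. The point is that the captured game directory already ends in a slash, so the file name is appended with "." rather than with another "/".

```perl
use strict;
use warnings;

# Assumed values for illustration; substitute your script's own variables.
my $baseurl = "http://gd.mlb.com/components/game/mlb";
my ($year, $mon, $mday) = ("2018", "03", "29");

# No trailing slash on the day URL.
my $dayurl = "$baseurl/year_$year/month_$mon/day_$mday";

# A game directory as captured by the regex; it already ends in a slash.
my $game = "gid_2018_03_29_chnmlb_miamlb_1/";

# Append the file name with "." so no extra "/" is inserted before it.
my $boxurl = "$dayurl/$game" . "boxscore.xml";

print "$boxurl\n";
```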

 

———-

Pretty sure that’s it.  Goodness, I thought the world had ended for a bit, but it’s all good now …

 

Published by

cap56cruncher

Long-time resident of London, Ontario - with an all-too-short diversion to Quebec City. Married to my best friend for 38 years and counting, proud father of the five nicest kids on the face of the planet, and father-in-law to a pretty nice young fellow as well.
