… when different sources (i.e. Retrosheet and MLB) record a play differently.
I’m down to the short strokes with my new parser. Rather than interpreting the myriad text descriptions of plays that don’t involve the batter-runner, I now process the movement of players from base to base using runner events only. I had to figure out how to do a secondary sort of XML child elements to group the two types of runner events in the correct sequence, and how to loop through the analysis without writing to the database unless a batter-involved transition state change had occurred. The effort has been worth it: the results now come from much less code, with little to no ambiguity.
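To make the sort-then-loop idea concrete, here is a minimal Python sketch. It is not the parser's actual code: the element name `runner` and the attributes `seq`, `type`, `start`, and `end` are hypothetical stand-ins for whatever the real feed uses, the `rows` list stands in for the database write, and outs are left out to keep it short.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Hypothetical markup: element and attribute names are illustrative only.
SAMPLE = """
<half_inning>
  <runner seq="2" type="batter" start="" end="1B"/>
  <runner seq="1" type="baserunning" start="1B" end="2B"/>
  <runner seq="2" type="baserunning" start="2B" end="3B"/>
</half_inning>
"""

# Secondary sort key: within one event, baserunning moves come before the batter.
TYPE_ORDER = {"baserunning": 0, "batter": 1}


def seq_of(runner):
    return int(runner.get("seq"))


def sorted_runners(parent):
    """Sort runner elements by event sequence, then by runner type."""
    return sorted(
        parent.findall("runner"),
        key=lambda r: (seq_of(r), TYPE_ORDER[r.get("type")]),
    )


def encode(bases):
    """Encode occupied bases as a string such as '100' (first, second, third)."""
    return "".join("1" if b in bases else "0" for b in ("1B", "2B", "3B"))


def transitions(parent, occupied):
    """Return (before, after) base states, one per batter-involved event."""
    bases = set(occupied)
    rows = []  # stand-in for database writes
    for _, group in groupby(sorted_runners(parent), key=seq_of):
        group = list(group)
        before = encode(bases)
        vacated = {r.get("start") for r in group if r.get("start")}
        reached = {r.get("end") for r in group if r.get("end")}
        bases = (bases - vacated) | reached
        # Only record a row when the batter took part in this event.
        if any(r.get("type") == "batter" for r in group):
            rows.append((before, encode(bases)))
    return rows


root = ET.fromstring(SAMPLE)
print(transitions(root, {"1B"}))
```

In this sample the steal (`seq` 1) changes the base state but writes nothing, so the batter's event is recorded as starting with a man on second.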
However, results are only as good as the original data … so here’s an interesting play:
In the September 17, 2016 game between the Royals and the White Sox, top of the 4th … Todd Frazier steals second on the same pitch on which Jason Coats is called out on strikes. Did Coats strike out with a man on first (transition from 100 1 to 100 2) or with a man on second (010 1 to 010 2)? When I get some time, I’m going to try to wade through the Official Rules at MLB to see if there is a description of how this situation should be handled for scoring.
MLB records it all as one event, resulting in the transition 100 1 to 010 2, which I think is incorrect. Retrosheet has the strikeout occurring with a man at second, 010 1 to 010 2. Doesn’t sound like much, but to me different is different.
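The difference between the two encodings can be sketched in a few lines of Python. This assumes the state notation reads as base occupancy (first, second, third) followed by the number of outs; the `State` type and the helper names are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class State(NamedTuple):
    bases: str  # occupancy of first, second, third, e.g. "100"
    outs: int

    def __str__(self):
        return f"{self.bases} {self.outs}"


def move_runner(state, start, end):
    """Move a runner from base `start` to base `end` (numbered 1 to 3)."""
    bases = list(state.bases)
    bases[start - 1] = "0"
    bases[end - 1] = "1"
    return State("".join(bases), state.outs)


def add_out(state):
    """Record one more out without changing the bases."""
    return State(state.bases, state.outs + 1)


def label(transition):
    before, after = transition
    return f"{before} -> {after}"


start = State("100", 1)

# Steal treated as its own event first, then the strikeout.
after_steal = move_runner(start, 1, 2)
separate_events = (after_steal, add_out(after_steal))

# Steal and strikeout folded into a single event.
single_event = (start, add_out(move_runner(start, 1, 2)))

print(label(separate_events))  # 010 1 -> 010 2
print(label(single_event))     # 100 1 -> 010 2
```

Both paths end in the same state; only the starting state of the batter's transition differs, which is exactly what changes the transition counts.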
I’ve also found a few events where MLB doesn’t appear to have been consistent with using a separate event ID for runner events. These result in transition state changes that aren’t correct. I’d love to tell MLB about them but there doesn’t seem to be any way to do that … at least not yet.