- Aaron, Baseball-Reference is almost certainly more accurate. There are thousands of discrepancies between the official totals, which are largely what the BDB carries, and what actually happened or what the game logs show. B-R uses the Palmer database, which represents a lot of additional work that Pete has performed to clarify and correct the official record.

sean

---

Sean Forman

Sports Reference LLC, President

http://www.sports-reference.com/

On Mon, Apr 25, 2011 at 8:44 AM, aaron_carlisle <aaron_carlisle@...> wrote:

I've googled and checked FAQs, but I can't seem to find the answer to this. If someone could either point me in the direction of an answer, or tell me which one is right, I would appreciate it.

I had a question not easily answered through the baseball-reference.com website, so I downloaded the BDB-sql-2011-03-28 SQL file. There was a discrepancy when I looked on the baseball-reference.com site for details about one of my results, and I narrowed it down to this:

From BDB-sql-2011-03-28:

mysql> select AB from Teams where name = 'New York Yankees' and yearID = '1927';

+------+

| AB   |

+------+

| 5347 |

+------+

1 row in set (0.00 sec)

Looking at http://www.baseball-reference.com/teams/NYY/1927.shtml , I get 5354 total AB for the '27 Yankees (in the Team Totals line).

So, which is correct for the '27 Yankees? Did the team have 5354 or 5347 ABs? I doubt Retrosheet would be any help this far back, so I'm not sure how to find this out.
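A quick way to see whether a Teams row agrees with the player rows is to sum the player at-bats per team and year and compare. The following is only a minimal sketch: it assumes the Teams and Batting tables both carry `yearID`, `teamID` and `AB` columns, it uses Python's built-in `sqlite3` rather than MySQL so it can run anywhere, and the rows loaded in the demo are placeholders, not real statistics.

```python
import sqlite3


def team_ab_mismatches(conn):
    """Return (yearID, teamID, team_ab, player_ab) rows where the totals differ."""
    query = """
        SELECT t.yearID, t.teamID, t.AB AS team_ab, SUM(b.AB) AS player_ab
        FROM Teams AS t
        JOIN Batting AS b
          ON b.yearID = t.yearID AND b.teamID = t.teamID
        GROUP BY t.yearID, t.teamID, t.AB
        HAVING SUM(b.AB) <> t.AB
        ORDER BY t.yearID, t.teamID
    """
    return conn.execute(query).fetchall()


# Tiny in-memory demo. These rows are made-up placeholders (hypothetical
# teams 'AAA' and 'BBB'), only there to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teams (yearID INTEGER, teamID TEXT, AB INTEGER);
    CREATE TABLE Batting (playerID TEXT, yearID INTEGER, teamID TEXT, AB INTEGER);
    INSERT INTO Teams VALUES (1900, 'AAA', 10), (1900, 'BBB', 7);
    INSERT INTO Batting VALUES
        ('p1', 1900, 'AAA', 6), ('p2', 1900, 'AAA', 4),
        ('p3', 1900, 'BBB', 5), ('p4', 1900, 'BBB', 3);
""")

mismatches = team_ab_mismatches(conn)
print(mismatches)  # only team 'BBB' is reported: Teams says 7, players sum to 8
```

Pointed at a real load of the database, the same query would list every team-season where the Teams total and the summed player totals disagree. It does not say which of the two is right.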

- First of all, I think Clem meant to write 5354 below instead of 5434.

And it is correct that the batting splits page for the New York Yankees that year shows one fewer at-bat, or 5353. The difference is in their game on April 20th, where officially Mike Gazella had four at-bats while we think he had only three. Other sources for the game agree that he had three at-bats. If he had had four at-bats in that game, his spot in the lineup would have come up five times, while the spots before and after him had four and three plate appearances. So I'm pretty confident that this is an official error.
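The lineup argument can be written down as a small check. Because the batting order cycles in sequence, consecutive spots (not wrapping from ninth back to first) can never show an increase in plate appearances from one spot to the next, and no two spots can differ by more than one. A minimal sketch, where the function name is made up for illustration and the figure of four plate appearances under the three-at-bat version is inferred from the description above (one more plate appearance than at-bats):

```python
def window_is_plausible(pa_counts):
    """Check plate appearances for consecutive batting-order spots.

    pa_counts lists the spots in batting order, earliest spot first, and
    is assumed not to wrap around from the ninth spot to the first.
    """
    # A later spot can never bat more often than the spot ahead of it.
    non_increasing = all(a >= b for a, b in zip(pa_counts, pa_counts[1:]))
    # No spot can be more than one time through the order ahead of another.
    spread_ok = max(pa_counts) - min(pa_counts) <= 1
    return non_increasing and spread_ok


# Spot before Gazella, Gazella's spot, spot after him.
official_version = window_is_plausible([4, 5, 3])    # four official at-bats
retrosheet_version = window_is_plausible([4, 4, 3])  # three at-bats

print(official_version, retrosheet_version)  # False True
```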

In the pre-convention release, we should be releasing discrepancy files for this (and many other years) and incorporating this data on our game, team and player pages. The line in the discrepancy file for this issue looks like:

"1927A0103","gazem101",1927,"NYA","O",,"AB","PHA192704200",3,4,,,,

It contains a discrepancy ID, a Retrosheet player ID, year, team ID, "O" (for an offensive discrepancy), "AB" (stat ID), game ID, our total, the official total, and some note fields (not filled in at present).
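For anyone who wants to read such a line programmatically, here is a rough sketch using Python's standard `csv` module. The field names are my own labels, not official column names, and the sample line has one empty field after the "O" that is not described above, so the sketch simply skips it. The trailing empty fields are treated as the note fields.

```python
import csv
import io

LINE = '"1927A0103","gazem101",1927,"NYA","O",,"AB","PHA192704200",3,4,,,,'

# Positions of the fields described in the post. The names are illustrative
# labels only. Index 5 is empty in the sample line and undescribed, so it
# is left out here.
FIELDS = {
    "discrepancy_id": 0,
    "player_id": 1,
    "year": 2,
    "team_id": 3,
    "kind": 4,
    "stat_id": 6,
    "game_id": 7,
    "retrosheet_total": 8,
    "official_total": 9,
}


def parse_discrepancy(line):
    """Parse one discrepancy line into a dict keyed by the labels above."""
    row = next(csv.reader(io.StringIO(line)))
    record = {name: row[index] for name, index in FIELDS.items()}
    for key in ("year", "retrosheet_total", "official_total"):
        record[key] = int(record[key])
    # Everything after the official total is assumed to be note fields.
    record["notes"] = [value for value in row[10:] if value]
    return record


record = parse_discrepancy(LINE)
difference = record["official_total"] - record["retrosheet_total"]
print(record["player_id"], record["stat_id"], difference)  # gazem101 AB 1
```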

Anyway, I figured some of you might be interested in this level of detail.

Tom Ruane

--- In baseball-databank@yahoogroups.com, "Clem Comly" <ccomly@...> wrote:

>

> First, if you sum the players ABs from batting.txt they come to 5434 which matches B-R and Retrosheet. The data Retrosheet (and I believe B-R) shows for the team total for 1927 Yankees is the sum of the players stats from Pete Palmer. I've never seen the raw Palmer data set, but I believe Palmer's data set does have separate team data from the official stats which may not show 5434. The question is which is correct total for 1927 NYA. There is no guarantee 5434 is the correct total.

>

> As a matter of fact, if one goes through the individual box scores on Retrosheet one gets 5353 ABs for the 1927 Yankees. For those of us who are lazy, we can accomplish the same thing by looking at the Total line of the Retrosheet batting splits page for the 1927 Yankees. By going through the splits of Yankee batters one by one we find Mike Gazella has 115 ABs per Palmer and 114 per Retrosheet boxes. Retrosheet hopes to soon post identified discrepancies, which would allow the curious to see where the issues are and the industrious to attempt to resolve them.

>

> I suspect the problem is that Teams.txt is not always updated when Fielding.txt, Pitching.txt or Batting.txt are updated. There are only a few stats in Teams.txt that aren't simply the sums of individuals (W, L, ties, ER, ERA, SHO, DP). I see 2 courses: drop the other stat columns from Teams.txt or recalculate them all after revisions to other tables. The ER total was also a simple sum until the team unearned run was created by what is now rule 10.18 back in the 1960s or 1970s.

>

> Clem Comly
