
Re: [baseball-databank] Re: Newbie Question on IDs

  • Sean Forman
    Message 1 of 28, Nov 17, 2004
      > Whether the key identifier for someone in the master table is an integer or
      > a string, it's gotta be one value that never changes. I propose that the
      > master table be simplified to remove the managerID and hofID fields and
      > consolidate them into one playerID field. If anyone disagrees, please tell
      > me what I'm missing.
      >
      > Regards,
      > Sean Lahman


      I think it can be summarized this way: not everyone in the HOF was a
      player (Yawkey, Barrow, etc. never played), and it is uncertain how the
      HOF will run things in the future. The hofID was done to be consistent
      with ideas for adding additional data such as manager data, minor
      league data, Japanese data, Korean league data, Cuban league data,
      independent league data, umpire data, coaches data, executives data,
      broadcaster data, Negro league data, winter league data, spring
      training data, AAGPBL data, college data, high school data, and any
      other possible data sets. HOFers do not fit neatly within any other group.

      Also, why does the hofID violate relational db? If the writers ever
      vote for non-players the system as designed will be fine. The playerID
      is for players who appeared in an MLB game (people in the fielding,
      batting, pitching tables). The hofID is for a person who received a
      HOF vote; there is no guarantee that one set will always be included in
      the other. The managerID is for the people who have managed a game.
      I'd be mildly surprised if a non-player has never appeared on a writer's
      ballot (even if only as an uncounted protest vote).

      We have an imperfect system for imperfect data. And any system we
      choose will have its flaws. The point is not to have a perfectly
      correct system, but to distribute data to the widest possible audience.
      I have asked for feedback on the 2004 data and have received very
      little feedback (thank you emancip8ed for your comments (any nulls are
      data I'm hoping to have filled in)).

      One other thing. I don't really understand the fanaticism for strictly
      numerical keys.

      I know tangotiger (Tom M??) wants to get the parks data into the db
      asap. What other data would you like to see done in the next two
      months? I'll prioritize, and if I don't get anything done in the next
      two months, I'll set up a web account on my server, step aside, and let
      someone else take the reins. I'm under the impression that the
      postseason data is being produced. Is this correct?

      Sorry for the rant.

      Sincerely,
      Sean Forman

      Baseball Stats! http://www.Baseball-Reference.com/
    • Sean Lahman
      Message 2 of 28, Nov 17, 2004
        Sean Forman wrote:
        > I'm under the impression that that
        > postseason data is being produced. Is this correct?

        Yes. I've also done updates for the four awards tables, HOF, AllStars, and
        the SeriesPost table. I fixed the problems with place of birth info for
        people who debuted in 2004. I'll be done and post copies to the site
        sometime Thursday. I believe that will complete all of the updates needed
        for the end of 2004 season.

        I think with that, it's time to prepare the release of the full 2004
        database. Other projects, such as additional data tables and any proposed
        re-designs can be tackled over the winter.

        Regards,
        Sean Lahman
      • Tangotiger
        Message 3 of 28, Nov 17, 2004
          --- Sean Forman <sean-forman@...>
          wrote:
          > Also, why does the hofID violate relational db? If

          It doesn't violate relational DB, but it probably
          violates 3NF. You shouldn't have
          raineti01, and raineti01h
          in the same record.
          Every field must stand on its own, and not be a
          concatenation of some other set of fields in the same
          record. Certainly not the key field. You also want a
          numeric key for any table that has a child table,
          which just makes it convenient to have a numeric key
          in all tables. I wouldn't have BOS as a park key
          either. My rule of thumb is to avoid any possibility
          of a key change. This is all from an admin DB
          viewpoint, which brings me to...
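
A minimal sketch of the redundancy described above, using Python's sqlite3 module. The table layout and the Yawkey ID are assumptions made for illustration (only raineti01 and raineti01h come from the thread). It flags rows where hofID is just playerID with an 'h' appended, i.e. where one field is derivable from another in the same record:

```python
import sqlite3

# In-memory database with an assumed, simplified Master layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Master (playerID TEXT, hofID TEXT, nameLast TEXT)")
conn.executemany(
    "INSERT INTO Master VALUES (?, ?, ?)",
    [
        ("raineti01", "raineti01h", "Raines"),  # IDs quoted in the thread
        (None, "yawketo99h", "Yawkey"),         # hypothetical ID for a non-player
    ],
)

# Rows where hofID carries no information beyond playerID.
derived = conn.execute(
    "SELECT nameLast FROM Master WHERE hofID = playerID || 'h'"
).fetchall()
print(derived)
```

The non-player row is not flagged, because its playerID is NULL and the comparison never evaluates to true.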


          > We have an imperfect system for imperfect data.
          > And any system we
          > choose will have its flaws. The point is not to
          > have a perfectly
          > correct system, but to distribute data to the widest
          > possible audience.

          ...The current proposal:
          http://www.baseball-databank.org/purpose.txt
          specifically says:

          "This databank, once it is fully normalized and
          proofed, will be the standard source for those
          professionals creating new data products... But first
          and foremost, the Baseball Databank is a library of
          authoritative baseball statistics and information
          maintained in a simple-to-access format for
          information providers and baseball researchers."

          Fully normalized is presumably 3NF. Reading the
          above, my impression is that the DB is for
          developers (information providers and researchers) and
          not the widest audience possible.

          It is the information providers (presumably Sean,
          Sean, Westbay, and others) that will make a form of
          this database available to the widest audience
          possible, with likely redundant tables, alphanumeric
          keys, and other things that improve usability, but
          degrade normalization.


          > I have asked for feedback on the 2004 data and
          > have received very
          > little feedback (thank you emancip8ed for your
          > comments (any nulls are
          > data I'm hoping to have filled in)).

          I loaded the data, and have only come across the NULL
          situation. I have not tested the data, but so far so
          good.

          >
          > One other thing. I don't really understand the
          > fanaticism for strictly
          > numerical keys.

          I hope you do now!!

          >
          > I know tangotiger (Tom M??) wants to get the parks
          > data into the db
          > asap. What other data would you like to see done in
          > the next two
          > months? I'll prioritize, and if I don't get

          Probably Ruane's upload, though I haven't verified if
          that was done yet, since I only downloaded the 04
          data.


          > anything done in the next
          > two months, I'll set up a web account on my server,
          > step aside, and let
          > someone else take the reins. I'm under the
          > impression that that
          > postseason data is being produced. Is this correct?

          I supplied KJOK with the fielding data, and presumably
          he will attach IDs to the names.

          >
          > Sorry for the rant.
          >
          > Sincerely,
          > Sean Forman

          Rants are encouraged.

          Tom
          --------------------------------
          http://www.tangotiger.net








        • Derek Adair
          Message 4 of 28, Nov 17, 2004
            Tom/Sean/anyone else who's interested,

            I wasn't sure what I was missing either, so I looked through the archives,
            since we obviously made the changes for a reason.

            There are at least two distinct issues here. I refer to:

            http://sports.groups.yahoo.com/group/baseball-databank/message/568 (and
            the following few messages in the thread)

            First is the issue of numeric vs. string ID's. I honestly haven't seen a
            compelling reason to switch. The two reasons I've seen are that numeric
            ID's won't need to change, and that that's the way DB's are generally
            designed. As previously addressed, I think we have enough differences from
            a typical business DB to need to handle things a bit differently, and a
            retrosheet approach where string ID's are fixed could work for us as well.
            On the flip side, I see visual inspection of the data will be much more
            difficult with numeric keys. I'm completely open to being convinced here,
            so if anyone has concrete benefits I'm all ears.

            Second (and most important) is the issue of what ID's we'll use in the
            "core" BDB tables. Sean L.'s proposal was to collapse the ID's in Master.
            If we have only a single primary key across the core DB, then we end up
            with the problem that post 568 describes.

            I think we're getting enough secondary source input that having the person
            table makes sense, but I'd also propose that we handle our own tables the
            same way - each logical grouping has its own primary key set which is
            linked to through the person table. So instead of collapsing master, I'd
            shift everything over and keep it as is (except for possible changes under
            the first issue).
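
One possible reading of this proposal, sketched with Python's sqlite3 module. All table names, column names, and the Barrow IDs are hypothetical; only raineti01 and raineti01h appear in the thread. Each logical grouping keeps its own key, and a person table ties them together:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (personID INTEGER PRIMARY KEY, nameLast TEXT);
-- One link table per logical grouping, each with its own key.
CREATE TABLE PlayerLink  (playerID  TEXT PRIMARY KEY,
                          personID INTEGER REFERENCES Person(personID));
CREATE TABLE ManagerLink (managerID TEXT PRIMARY KEY,
                          personID INTEGER REFERENCES Person(personID));
CREATE TABLE HofLink     (hofID     TEXT PRIMARY KEY,
                          personID INTEGER REFERENCES Person(personID));

INSERT INTO Person VALUES (1, 'Raines'), (2, 'Barrow');
INSERT INTO PlayerLink  VALUES ('raineti01', 1);
INSERT INTO HofLink     VALUES ('raineti01h', 1);
-- Hypothetical IDs for someone with no player record.
INSERT INTO ManagerLink VALUES ('barroed99m', 2);
INSERT INTO HofLink     VALUES ('barroed99h', 2);
""")

# Each person appears once; groupings they do not belong to come back as NULL.
rows = conn.execute("""
SELECT p.nameLast, pl.playerID, m.managerID, h.hofID
FROM Person p
LEFT JOIN PlayerLink  pl ON pl.personID = p.personID
LEFT JOIN ManagerLink m  ON m.personID  = p.personID
LEFT JOIN HofLink     h  ON h.personID  = p.personID
ORDER BY p.personID
""").fetchall()
for row in rows:
    print(row)
```

Under this layout a non-player never needs a playerID, and a new grouping (umpires, executives) would only add another link table.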

            Regards,
            Derek
          • Tangotiger
            Message 5 of 28, Nov 18, 2004
              Can I suggest that design issues be brought up within
              the BDB Design group? Michael has done some
              tremendous work, and we've gone through many of these
              issues. The group itself has been dormant for a
              while, but it might be time for it to erupt again.

              After we've done a good design, we can propose it
              here, and get feedback as to how to proceed.

              This will also stay in line with Lahman's request that
              we hold off on design issues for the next few months.

              Tom





              --- Derek Adair <dadair@...> wrote:

              > Tom/Sean/anyone else who's interested,
              >
              > I wasn't sure what I was missing either, so I looked
              > through the archives,
              > since we obviously made the changes for a reason.
              >
              > There are at least two distinct issues here. I refer to:
              >
              > http://sports.groups.yahoo.com/group/baseball-databank/message/568
              > (and the following few messages in the thread)
              >
              > First is the issue of numeric vs. string ID's. I
              > honestly haven't seen a
              > compelling reason to switch. The two reasons I've
              > seen are that numeric
              > ID's won't need to change, and that that's the way
              > DB's are generally
              > designed. As previously addressed, I think we have
              > enough differences from
              > a typical business DB to need to handle things a bit
              > differently, and a
              > retrosheet approach where string ID's are fixed
              > could work for us as well.
              > On the flip side, I see visual inspection of the
              > data will be much more
              > difficult with numeric keys. I'm completely open to
              > being convinced here,
              > so if anyone has concrete benefits I'm all ears.
              >
              > Second (and most important) is the issue of what
              > ID's we'll use in the
              > "core" BDB tables. Sean L.'s proposal was to
              > collapse the ID's in Master.
              > If we have only a single primary key across the code
              > DB, then we end up
              > with the problem that post 568 describes.
              >
              > I think we're getting enough secondary source input
              > that having the person
              > table makes sense, but I'd also propose that we
              > handle our own tables the
              > same way - each logical grouping has its own primary
              > key set which is
              > linked to through the person table. So instead of
              > collapsing master, I'd
              > shift everything over and keep it as is (except for
              > possible changes under
              > the first issue).
              >
              > Regards,
              > Derek
              >




            • Sean Forman
              Message 6 of 28, Nov 19, 2004
                > >
                > > One other thing. I don't really understand the
                > > fanaticism for strictly
                > > numerical keys.
                >
                > I hope you do now!!


                I really, really think this is an invitation for transposition errors
                being added into the db.

                For instance, the SABR biographical committee sends out bi-monthly
                files that list about 80 or so players who have died or for whom
                new data has been located. Here are the steps that would need to
                be taken for these changes to be incorporated.

                select ID, nameLast, nameFirst, debut from Master where nameLast='smith'
                and nameFirst='john';

                Then figure out which numeric key is the right one.

                update Master set deathState='NY', deathCountry='USA',
                deathCity='brooklyn' where ID='015324';

                I know with the places table being proposed, I might even have to look
                up numeric values for the state, country and city. Would I have to look
                up new values for the year, month and day as well? ;-)

                I just think the likelihood of errors probably goes up by a factor of
                ten with this scheme.

                A change to the batting table, which often happens when the records
                committee sends out their newsletter, would look like this:

                select ID, nameLast, nameFirst, debut from Master where nameLast='aaron'
                and nameFirst like 'h%';

                select teamID from Teams where teamCity='Milwaukee' and yearID=1961;

                update Batting set HR=12 where yearID=1961 and teamID=23 and ID='000121'
                and stint=1;

                Maybe I'm just being a whiner, but I see this as being less than ideal.

                --
                Sincerely,
                Sean Forman

                Baseball Stats! http://www.Baseball-Reference.com/
              • Tangotiger
                Message 7 of 28, Nov 19, 2004
                  --- Sean Forman <sean-forman@...>
                  wrote:
                  > I really, really think this is an invitation for
                  > transposition errors
                  > being added into the db.

                  Au contraire!

                  UPDATE PERSONBIO p
                  INNER JOIN LOCATION l
                  SET p.stateid = l.stateid
                  where l.statename = 'FL';

                  You could make it even better by having a TRANSACTIONS
                  table, and add a third join, instead of the hardcoded
                  'FL'.
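
A runnable sketch of that idea using Python's sqlite3 module. The UPDATE ... INNER JOIN form above is MySQL-style and, as written, has no ON clause or person filter, so this version swaps in a portable correlated subquery. The Transactions layout and all sample rows are assumptions; it also presumes each incoming correction already carries the right personID:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location  (stateID INTEGER PRIMARY KEY, stateName TEXT UNIQUE);
CREATE TABLE PersonBio (personID INTEGER PRIMARY KEY, nameLast TEXT,
                        stateID INTEGER);
-- Hypothetical staging table: one row per incoming correction.
CREATE TABLE Transactions (personID INTEGER, stateName TEXT);

INSERT INTO Location VALUES (1, 'NY'), (2, 'FL');
INSERT INTO PersonBio VALUES (121, 'Aaron', NULL), (15324, 'Smith', NULL);
INSERT INTO Transactions VALUES (15324, 'NY');
""")

# Resolve the state name to its numeric key inside the statement,
# so nobody types a numeric stateID by hand.
conn.execute("""
UPDATE PersonBio
SET stateID = (
    SELECT l.stateID
    FROM Transactions t
    JOIN Location l ON l.stateName = t.stateName
    WHERE t.personID = PersonBio.personID
)
WHERE personID IN (SELECT personID FROM Transactions)
""")

result = conn.execute(
    "SELECT personID, stateID FROM PersonBio ORDER BY personID"
).fetchall()
print(result)
```

Only the person named in the staging table is touched; identifying the right person for each correction remains a separate step.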

                  Tom


                • Sean Lahman
                  Message 8 of 28, Nov 19, 2004
                    Just noticed today that the stint IDs for Juan Encarnacion's 2004 season are
                    reversed. He was traded from Florida to LA, so Fla should be #1 and LA #2.
                    Currently, it is reversed.

                    Regards,
                    Sean Lahman
                  • Richard Shoults
                    Message 9 of 28, Nov 19, 2004
                      Respectfully, no.  In 2004 he was traded from LA to FLA where he ended the season.
                       
                      Rick

                      Sean Lahman <slahman@...> wrote:

                      Just noticed today that the stint IDs for Juan Encarnacion's 2004 season are
                      reversed. He was traded from Florida to LA, so Fla should be #1 and LA #2.
                      Currently, it is reversed.

                      Regards,
                      Sean Lahman







                    • Sean Lahman
                      Message 10 of 28, Nov 19, 2004
                        I noticed today that there are about 800 players in the master table who
                        don't have a debut date. On closer inspection, most of these players made
                        their debut between 1998 and 2001. Right now, there are fewer than two dozen
                        players in the table with a value in the debut field for that
                        period.

                        I'll take a look at resolving this issue, but just wanted to alert everybody
                        to the problem.

                        Regards,
                        Sean Lahman
                      • Derek Adair
                        Message 11 of 28, Nov 19, 2004
                          Were these in the fixes from Michael Mavriogannis I asked about a couple
                          of days ago?

                          http://sports.groups.yahoo.com/group/baseball-databank/message/2089

                          Regards,
                          Derek

                          P.S. This is where CVS/SVN would come in *very* handy.

                          On Fri, 19 Nov 2004, Sean Lahman wrote:

                          >
                          > I noticed today that there are about 800 players in the master table who
                          > don't have a debut date. On closer inspection, most of these players made
                          > their debut between 1998 and 2001. Right now, there are less than two dozen
                          > players in the table with a value in the debut field for that five year
                          > period.
                          >
                          > I'll take a look at resolving this issue, but just wanted to alert everybody
                          > to the problem.
                          >
                          > Regards,
                          > Sean Lahman
                        • piratefan1@bellsouth.net
                          Message 12 of 28, Nov 19, 2004
                            Richard Shoults wrote:
                            > Respectfully, no. In 2004 he was traded from LA to FLA where he
                            > ended the season.
                            >
                            > Rick
                            >
                            > Sean Lahman <slahman@b...> wrote:
                            >
                            > > Just noticed today that the stint IDs for Juan Encarnacion's 2004
                            > > season are reversed. He was traded from Florida to LA, so Fla
                            > > should be #1 and LA #2. Currently, it is reversed.

                            Rick is correct. Encarnacion was traded by Florida to the Dodgers for
                            a PTBNL before the season started, then traded back to the Marlins in
                            the LoDuca trade.

                            Mike Emeigh
                            piratefan1@...
                          • Sean Lahman
                            Message 13 of 28, Nov 19, 2004
                              Thanks, Mike & Rick, for correcting my mistake.
                              --SL

                              > -----Original Message-----
                              > From: piratefan1@... [mailto:piratefan1@...]
                              > Sent: Friday, November 19, 2004 3:52 PM
                              > To: baseball-databank@yahoogroups.com
                              > Subject: [baseball-databank] Re: ERROR stint ID for Juan Encarnacion
                              >
                              >
                              >
                              >
                              > Richard Shoults wrote:
                              > > Respectfully, no. In 2004 he was traded from LA to FLA where he
                              > ended the season.
                              > >
                              > > Rick
                              > >
                              > > Sean Lahman <slahman@b...> wrote:
                              > >
                              > > Just noticed today that the stint IDs for Juan Encarnacion's 2004
                              > season are
                              > > reversed. He was traded from Florida to LA, so Fla should be #1 and
                              > LA #2.
                              > > Currently, it is reversed.
                              >
                              > Rick is correct. Encarnacion was traded by Florida to the Dodgers for
                              > a PTBNL before the season started, then traded back to the Marlins in
                              > the LoDuca trade.
                              >
                              > Mike Emeigh
                              > piratefan1@...
                              >
                            • Tangotiger
                              Message 14 of 28, Nov 19, 2004
                                Here's one more thought on alphanumerics:

                                What do you do with Japanese, Chinese, and French
                                players? We've always been thinking of this as
                                English and unilingual.

                                This is why having numeric keys is great. Every
                                language accepts the numeral system. Every table will
                                then have translation fields. Say you have a LOCATION
                                table, you'd have something like:
                                locID,locType,engName,japName,chiname,freName
                                673,1,USA,%$#,*&^,Etats Unis
                                or some such.
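
A small sketch of the translation idea with Python's sqlite3 module. It is a variation on the layout above, not the same design: instead of one column per language, it assumes a hypothetical LocationName table with one row per (locID, lang), so a new language needs no schema change. The helper name and the English fallback are also assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location (locID INTEGER PRIMARY KEY, locType INTEGER);
-- Hypothetical: one row per location per language.
CREATE TABLE LocationName (
    locID INTEGER REFERENCES Location(locID),
    lang  TEXT,
    name  TEXT,
    PRIMARY KEY (locID, lang)
);
INSERT INTO Location VALUES (673, 1);
INSERT INTO LocationName VALUES (673, 'en', 'USA'), (673, 'fr', 'Etats Unis');
""")


def location_name(loc_id, lang, fallback="en"):
    """Return the name in `lang`, else in `fallback`, else None."""
    row = conn.execute(
        "SELECT name FROM LocationName "
        "WHERE locID = ? AND lang IN (?, ?) "
        "ORDER BY lang = ? DESC LIMIT 1",  # prefer the requested language
        (loc_id, lang, fallback, lang),
    ).fetchone()
    return row[0] if row else None


print(location_name(673, "fr"))
print(location_name(673, "ja"))  # no Japanese row here, falls back to English
```

The numeric locID stays the same whatever language is displayed, which is the property being argued for.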

                                Same thing with people's names. You could potentially
                                have numerics for the names, which maps to a table of
                                names that are translatable to various languages.

                                What we have the potential for here is a fully
                                normalized DB that is expandable across the world.
                                Imagine World Cup 2006 using our database, because
                                it's the only DB that supports all the languages of
                                the countries in the World Cup.

                                Imagine being a French visitor going to b-r.com or
                                Retrosheet.org, and they click the French link, and in
                                one fell swoop, have the page redisplayed in the
                                language of the visitor. HR becomes CC, SP becomes
                                LP, etc.

                                Tom

                              • Sean Lahman
                                Message 15 of 28, Nov 19, 2004
                                  Derek Adair wrote...
                                  >
                                  > This is where CVS/SVN would come in *very* handy.
                                  >

                                  I'm beginning to see that. I have Subversion software installed on my
                                  server, and I'll take a look over the weekend at setting things up for the
                                  baseball database.

                                  Regards,
                                  Sean Lahman
                                • Sean Lahman
                                  Message 16 of 28, Nov 19, 2004
                                    Those of you on the SABR-L list may have seen recent discussion about
                                    inaccuracies in the vote totals that were released by MLB for the American
                                    League MVP award on Monday. Simply put, the number of votes and number of
                                    points didn't add up. Several of us have been working on getting an
                                    accurate reporting of the data, and Bill Deane was finally able to get a
                                    definitive response from the source. Bill writes:

                                    > MLB.com had omitted both Curt Schilling (14 points) and Travis Hafner (2)
                                    > from their chart, and my guess about Melvin Mora (one 8th and one 9th,
                                    > 5 points) was correct. That left only the problem with Mariano Rivera's
                                    > and Ivan Rodriguez's point totals not adding up based on their vote
                                    > breakdowns. This error was included in every published list I saw.
                                    >
                                    > A couple of you referred me to BBWAA Secretary-Treasurer Jack McConnell,
                                    > who solved the mystery. The point totals for Rivera (59) and
                                    > Rodriguez (36) were correct, but Mo had 3 ninth-place votes (not 4),
                                    > while I-Rod had 5 (not 4).

                                    For the record, what appears below is an accurate reporting of the vote
                                    totals. I don't believe any of the other published sources (ESPN, USA Today,
                                    CNN/SI) have it right. Totals for Rivera and Rodriguez need to be updated
                                    in the AwardSharePlayers table that I posted yesterday.

                                    Regards,
                                    Sean Lahman


                                    2004 AL MVP Voting
                                    Player        1  2  3  4  5  6  7  8  9 10  Points
                                    --------------------------------------------------
                                    V Guerrero   21  5  1  1  0  0  0  0  0  0     354
                                    G Sheffield   5  8  9  4  2  0  0  0  0  0     254
                                    M Ramirez     1 14  9  2  2  0  0  0  0  0     238
                                    D Ortiz       1  0  5  9  2  5  5  0  0  0     174
                                    M Tejada      0  1  0  6  2  7  2  4  2  1     123
                                    J Santana     0  0  2  1  7  5  4  2  2  1     117
                                    I Suzuki      0  0  1  0  6  2  6  4  1  6      98
                                    M Young       0  0  1  4  2  2  4  4  3  0      92
                                    M Rivera      0  0  0  0  2  3  3  3  3  5      59
                                    I Rodriguez   0  0  0  1  1  1  0  2  5  2      36
                                    C Schilling   0  0  0  0  0  0  2  2  0  0      14
                                    J Nathan      0  0  0  0  1  1  0  0  0  1      12
                                    D Jeter       0  0  0  0  1  0  0  1  1  0      11
                                    M Kotsay      0  0  0  0  0  1  0  0  1  1       8
                                    A Rodriguez   0  0  0  0  0  0  0  2  1  0       8
                                    J Damon       0  0  0  0  0  0  0  1  2  0       7
                                    P Konerko     0  0  0  0  0  0  0  0  1  5       7
                                    H Blalock     0  0  0  0  0  1  0  0  0  0       5
                                    M Mora        0  0  0  0  0  0  0  1  1  0       5
                                    M Teixeira    0  0  0  0  0  0  0  1  0  2       5
                                    T Hunter      0  0  0  0  0  0  1  0  0  0       4
                                    V Martinez    0  0  0  0  0  0  1  0  0  0       4
                                    E Durazo      0  0  0  0  0  0  0  1  0  0       3
                                    F Cordero     0  0  0  0  0  0  0  0  1  0       2
                                    L Ford        0  0  0  0  0  0  0  0  1  0       2
                                    C Guillen     0  0  0  0  0  0  0  0  1  0       2
                                    T Hafner      0  0  0  0  0  0  0  0  1  0       2
                                    H Matsui      0  0  0  0  0  0  0  0  1  0       2
                                    C Figgins     0  0  0  0  0  0  0  0  0  2       2
                                    E Chavez      0  0  0  0  0  0  0  0  0  1       1
                                    J Varitek     0  0  0  0  0  0  0  0  0  1       1
                                    --------------------------------------------------
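
For anyone who wants to re-check totals like these, here is a minimal Python sketch of the arithmetic. It assumes the usual BBWAA MVP weighting (14 points for a first-place vote, then 9, 8, 7, and so on down to 1 for tenth). The function name `mvp_points` and the dictionary layout are purely illustrative and not part of the databank. Only a handful of rows from the table are copied in.

```python
# Assumed BBWAA MVP weighting: 14 points for 1st place, then 9 down to 1.
WEIGHTS = [14, 9, 8, 7, 6, 5, 4, 3, 2, 1]


def mvp_points(votes):
    """Return the point total implied by ten vote counts (1st..10th place)."""
    if len(votes) != len(WEIGHTS):
        raise ValueError("expected ten vote counts, one per ballot position")
    return sum(weight * count for weight, count in zip(WEIGHTS, votes))


# A few rows from the corrected table: (votes by place, reported points).
reported = {
    "V Guerrero": ([21, 5, 1, 1, 0, 0, 0, 0, 0, 0], 354),
    "M Rivera": ([0, 0, 0, 0, 2, 3, 3, 3, 3, 5], 59),
    "I Rodriguez": ([0, 0, 0, 1, 1, 1, 0, 2, 5, 2], 36),
    "M Mora": ([0, 0, 0, 0, 0, 0, 0, 1, 1, 0], 5),
}

# Collect any player whose computed total disagrees with the reported one.
mismatches = {
    name: (mvp_points(votes), points)
    for name, (votes, points) in reported.items()
    if mvp_points(votes) != points
}

# The breakdowns as first published had 4 ninth-place votes for each of
# Rivera and Rodriguez, which is why their totals did not add up.
rivera_published = [0, 0, 0, 0, 2, 3, 3, 3, 4, 5]
irod_published = [0, 0, 0, 1, 1, 1, 0, 2, 4, 2]

print(mismatches)
print(mvp_points(rivera_published), mvp_points(irod_published))
```

With the corrected ninth-place counts the sampled rows reconcile, while the 4-and-4 breakdown gives totals that differ from the reported 59 and 36.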
                                  • Sean Forman
                                    Message 17 of 28 , Nov 19, 2004
                                      Sean Lahman wrote:
                                      > I noticed today that there are about 800 players in the master table who
                                      > don't have a debut date. On closer inspection, most of these players made
                                      > their debut between 1998 and 2001. Right now, there are less than two dozen
                                      > players in the table with a value in the debut field for that five year
                                      > period.
                                      >
                                      > I'll take a look at resolving this issue, but just wanted to alert everybody
                                      > to the problem.
                                      >
                                      > Regards,
                                      > Sean Lahman


                                      I've got all of these taken care of. There *are* however about 258
                                      final games that I'm missing. I'm also cleaning up some previously
                                      erroneous debuts and final games as well.
                                      --
                                      Sincerely,
                                      Sean Forman

                                      Baseball Stats! http://www.Baseball-Reference.com/
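
A rough sketch of how the missing debut and final-game counts in this exchange could be checked, using Python's built-in sqlite3 module against a toy in-memory table. The table name `master` and the `debut` field come from the messages here. The `finalGame` column name, the `count_missing` helper, and every row below are placeholders assumed for illustration, not real databank records.

```python
import sqlite3

# Toy in-memory stand-in for the master table (assumed column names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE master (playerID TEXT PRIMARY KEY, debut TEXT, finalGame TEXT)"
)

# Hypothetical placeholder rows, not real players.
rows = [
    ("example01", "1998-04-01", "2003-09-28"),
    ("example02", None, "2001-10-07"),
    ("example03", "2000-06-15", None),
    ("example04", "", ""),
]
conn.executemany("INSERT INTO master VALUES (?, ?, ?)", rows)


def count_missing(connection, column):
    """Count rows where the given date column is NULL or an empty string."""
    if column not in ("debut", "finalGame"):
        raise ValueError("unexpected column name")
    sql = f"SELECT COUNT(*) FROM master WHERE {column} IS NULL OR {column} = ''"
    return connection.execute(sql).fetchone()[0]


missing_debut = count_missing(conn, "debut")
missing_final = count_missing(conn, "finalGame")
print(missing_debut, missing_final)
```

Treating empty strings the same as NULL is a guess about how blanks might be stored; adjust the WHERE clause to match the actual data.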
                                    • Paul Wendt
                                      Message 18 of 28 , Dec 2, 2004
                                        19 Nov 2004, Sean Forman <sean-forman@b...> wrote:
                                        . . .
                                        > There *are* however about 258 final games that I'm missing.
                                        > I'm also cleaning up some previously erroneous debuts and final
                                        > games as well.

                                        At one point this spring, the Biographical Database was missing 96
                                        Last Games, all from 1876-1880 (not counting 2003). I believe that
                                        more than 50 are still missing.

                                        This Summer and Fall, the Biog. newsletter should be reporting
                                        numerous small corrections in debut dates and perhaps lastgame dates
                                        from the 1870s-80s. I don't know the update lag and I am a few months
                                        behind on the other SABR research cmte newsletters.

                                        Paul Wendt