Re: [baseball-databank] Re: 2008 pre-release - data error

  • John H. Rickert
    Message 1 of 30 , Nov 14, 2008
      Tangotiger wrote:
      >
      > MASTER.txt
      >
      > Error 1:
      > 9078,"mccafha01","","",1858,11,25,"USA","MO","St. Louis",1928,4,19,"USA","MO","St.Louis","Harry","McCaffery",,"Harry Charles",,185,,"R","R","1882-06-15","1883-00-00","","mccafha01","mccafha01","mccah101","mccafha01","mccafha01"
      >
      > This field contains an invalid date: "1883-00-00"
      >

      According to Retrosheet his last game was 1883-06-25
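A check like the one that flagged this record can be sketched in a few lines. This is only an illustration, assuming Python and its standard `csv`, `re` and `datetime` modules. The helper name `invalid_dates` is made up here, and the row is the MASTER.txt record quoted above, rejoined onto one line.

```python
import csv
import io
import re
from datetime import date

# Matches fields shaped like YYYY-MM-DD, whether or not the date is real.
DATE_SHAPED = re.compile(r"\d{4}-\d{2}-\d{2}")


def invalid_dates(row):
    """Return the date-shaped fields in row that are not real calendar dates."""
    bad = []
    for field in row:
        if DATE_SHAPED.fullmatch(field):
            try:
                date.fromisoformat(field)
            except ValueError:
                bad.append(field)
    return bad


# The MASTER.txt record from the error report, rejoined onto one line.
line = (
    '9078,"mccafha01","","",1858,11,25,"USA","MO","St. Louis",'
    '1928,4,19,"USA","MO","St.Louis","Harry","McCaffery",,"Harry Charles",'
    ',185,,"R","R","1882-06-15","1883-00-00","","mccafha01","mccafha01",'
    '"mccah101","mccafha01","mccafha01"'
)
row = next(csv.reader(io.StringIO(line)))
print(invalid_dates(row))  # ['1883-00-00']
```

Scanning every date-shaped field, rather than a fixed column index, avoids assuming a particular column order for MASTER.txt.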
    • 4seamer
      Message 2 of 30 , Nov 14, 2008
        > Also, please treat this as an open invitation to everyone out there to
        > deliver whatever data they get their hands on, especially "ID mapping"
        > data across the various data sources. It's important to think of this
        > ground as a dumping ground for any and all data.

        You know Tom, everyone seems to be spinning their wheels with all this
        freely available data. What we need are API's built so we all get on the
        same page.

        Just a thought,
        Jake
      • Mike Emeigh
        Message 3 of 30 , Nov 14, 2008
          Jake wrote:
          >
          >
          > > Also, please treat this as an open invitation to everyone out there to
          > > deliver whatever data they get their hands on, especially "ID mapping"
          > > data across the various data sources. It's important to think of this
          > > ground as a dumping ground for any and all data.
          >
          > You know Tom, everyone seems to be spinning their wheels with all this
          > freely available data. What we need are API's built so we all get on the
          > same page.

          So why don't you build a prototype?
          --
          Mike Emeigh
          piratefan1@...

          "I think that it's high time that a new school of management emerges
          that uses Dilbert as it's cautionary statement - if what you are
          proposing as a manager has ever appeared in a Dilbert cartoon, you need
          to re-think your proposal." -- Jonathan House
        • Tangotiger
          Message 4 of 30 , Nov 14, 2008
            > You know Tom, everyone seems to be spinning their wheels with all this
            > freely available data. What we need are API's built so we all get on the
            > same page.
            >
            > Just a thought,
            > Jake
            >

            Jake, you must be new around here :)

            We've had a long history here of trying to get a better organization.
            Despite some of our best efforts, it just doesn't seem to work out. As it
            is, this BDB yahoo group is a dumping ground for data, however and
            wherever we can get it.

            Tom
          • KJOK
            Message 5 of 30 , Nov 15, 2008
              Although we've had those fields in the pitching table for awhile, I don't believe the database has ever had SH, SF or GIDP for pitching.
               
              THANKS,
              KJOK

              --- On Fri, 11/14/08, Tangotiger <tom@...> wrote:
              From: Tangotiger <tom@...>
              Subject: Re: [baseball-databank] Re: 2008 pre-release
              To: baseball-databank@yahoogroups.com
              Date: Friday, November 14, 2008, 10:33 AM

              The pitching.sql table is not showing SH, SF, GIDP. Is this an oversight,
              or is it correct?

              Tom


            • wyerscj
              Message 6 of 30 , Nov 15, 2008
                I just checked the prior release (I plan on keeping it around for
                another month or so), and the SH, SF and GIDP fields are missing from
                the pitching table there as well.

                --- In baseball-databank@yahoogroups.com, KJOK <kjokbaseball@...>
                wrote:
                >
                > Although we've had those fields in the pitching table for awhile, I
                > don't believe the database has ever had SH, SF or GIDP for pitching.
                >
                > THANKS,
                > KJOK
                >
                > --- On Fri, 11/14/08, Tangotiger <tom@...> wrote:
                >
                > From: Tangotiger <tom@...>
                > Subject: Re: [baseball-databank] Re: 2008 pre-release
                > To: baseball-databank@yahoogroups.com
                > Date: Friday, November 14, 2008, 10:33 AM
                >
                > The pitching.sql table is not showing SH, SF, GIDP. Is this an
                > oversight, or is it correct?
                >
                > Tom
                >
              • Tangotiger
                Message 7 of 30 , Nov 17, 2008
                  The AllStarFull table contains nulls in a key field (GameID) for records
                  in 1945. As a result, I've changed the key for that table to playerID,
                  yearID, gameNum. (I think someone made a note about this already.)

                  Also, the managerID is not required as a key in the Managers table.
                  Uniqueness is established with yearid, teamid, inseason. Unless we expect
                  co-managers, managerID is extraneous.

                  Otherwise, all the data looks good. I will post the shell database on my
                  blog, and note the corrections required to the datafiles pending release
                  of any updated datafiles.

                  Check back later today here:
                  http://www.insidethebook.com/ee/

                  Thanks again to Sean for the very clean data.

                  Tom
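The two key checks described in this message (no nulls in a key column, and uniqueness of the proposed key) can be sketched as follows. This is a minimal illustration in Python, not the script used for the shell database; the helper names and the sample rows are invented for the example.

```python
from collections import Counter


def rows_with_null_key(rows, key_columns):
    """Rows where any proposed key column is missing (None or empty)."""
    return [r for r in rows if any(r.get(c) in (None, "") for c in key_columns)]


def duplicate_keys(rows, key_columns):
    """Key tuples that occur more than once, so the columns are not a key."""
    counts = Counter(tuple(r.get(c) for c in key_columns) for r in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical Managers rows, invented for illustration only.
managers = [
    {"managerID": "aaa01", "yearID": 2008, "teamID": "XXX", "inseason": 1},
    {"managerID": "bbb01", "yearID": 2008, "teamID": "XXX", "inseason": 2},
    {"managerID": "ccc01", "yearID": 2008, "teamID": "YYY", "inseason": 1},
]
key = ("yearID", "teamID", "inseason")
print(rows_with_null_key(managers, key))  # []
print(duplicate_keys(managers, key))      # []
```

If co-managers ever shared the same `inseason` value for a team and year, `duplicate_keys` would report that tuple, which is the case where managerID would be needed in the key after all.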
                • Sean Forman
                  Message 8 of 30 , Nov 17, 2008
                    I'll do an upload later this week as I cull through these.

                    > This field contains an invalid date: "1883-00-00"

                    Fixed with John's correction.

                    > Error 2:
                    > 17310,"hemonro99","","",,,,,,,,,,,,,"Roland","Hemond",,,,,,,,,,,,,,,"hemonro99"
                    >
                    > Roland Hemond is not a player, and should not have a player ID.

                    Problem is that we have an award for players that Hemond got.  I'm open to suggestions as to how to handle this.

                    > ***
                    >
                    > XREF_STATS.txt
                    > Error 3:
                    > "",,
                    >
                    > This record (line 44) has an invalid player id and should be removed from
                    > the file.

                    Deleted.

                    > I will await changes to these records, and a reply to the other one on the
                    > pitching table, prior to releasing my script to load the data into MS
                    > Access.
                    >
                    > Tom


                    --
                    Sean Forman
                    President, Sports Reference LLC
                    http://www.sports-reference.com/
                  • Micke Hovmöller
                    Message 9 of 30 , Nov 17, 2008
                      On 11/17/08, Sean Forman <sean-forman@...> wrote:
                       
                      > Problem is that we have an award for players that Hemond got.  I'm open to suggestions as to how to handle this.
                      According to http://www.branchrickeyaward.org/index.html, the award is for "professionals in Major League baseball", which I interpret as possibly other than players.
                       
                      I'm not enough up to date on the current DB scheme to suggest a specific change, but isn't there a masterID for everyone, independently of the role they have ever had? If so, shouldn't that be the key in this table?
                       
                      (If you are referring to another award, my apologies.)
                       
                      /Micke
                    • Sean Forman
                      Message 10 of 30 , Nov 17, 2008
                        Good point. LahmanID would work.  Perhaps I'll add a column to the AwardPlayers table for LahmanID, or we could add AwardsOther for things like BranchRickey and the Executive of the year awards.

                        sean

                        On Mon, Nov 17, 2008 at 11:57 AM, Micke Hovmöller <micke.hovmoller@...> wrote:



                        On 11/17/08, Sean Forman <sean-forman@...> wrote:
                         
                        > Problem is that we have an award for players that Hemond got.  I'm open to suggestions as to how to handle this.
                        According to http://www.branchrickeyaward.org/index.html, the award is for "professionals in Major League baseball", which I interpret as possibly other than players.
                         
                        I'm not enough up to date on the current DB scheme to suggest a specific change, but isn't there a masterID for everyone, independently of the role they have ever had? If so, shouldn't that be the key in this table?
                         
                        (If you are referring to another award, my apologies.)
                         
                        /Micke



                        --
                        Sean Forman
                        President, Sports Reference LLC
                        http://www.sports-reference.com/
                      • Tangotiger
                        Message 11 of 30 , Nov 17, 2008
                          http://www.insidethebook.com/ee/index.php/site/article/bdb_database_ms_access/

                          Includes data instructions, pending Sean's next release.

                          ***

                          As for Hemond, the Branch Rickey Award is not exclusive to players, so,
                          ideally, we'd have a separate table for "baseball professionals".

                          Realistically, this points to the issue of not having a "Persons" table
                          and a "PersonID" (though the LahmanID functions here, it is not used
                          anywhere), as opposed to what we currently have.

                          So, I'd say for now, you can leave it in there, and just create some
                          "release notes" that point this out, and the user can decide what he
                          wants to do with it.

                          Tom
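The "Persons" table idea can be illustrated with a small in-memory schema. This is a sketch only, assuming Python's standard `sqlite3` module. The table and column names (`Persons`, `Awards`, `personID`, `isPlayer`) are hypothetical and not part of any BDB release; the Hemond row reuses the ID from the MASTER.txt record discussed in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: one row per person, player or not.
    CREATE TABLE Persons (
        personID  TEXT PRIMARY KEY,
        nameFirst TEXT,
        nameLast  TEXT,
        isPlayer  INTEGER NOT NULL
    );
    -- Awards point at a person, so non-players need no player ID.
    CREATE TABLE Awards (
        awardID  TEXT NOT NULL,
        personID TEXT NOT NULL REFERENCES Persons(personID),
        PRIMARY KEY (awardID, personID)
    );
    """
)
conn.execute(
    "INSERT INTO Persons VALUES (?, ?, ?, ?)",
    ("hemonro99", "Roland", "Hemond", 0),
)
conn.execute(
    "INSERT INTO Awards VALUES (?, ?)",
    ("Branch Rickey Award", "hemonro99"),
)

# Awards held by people who are not players.
non_player_awards = conn.execute(
    "SELECT a.awardID, p.nameLast FROM Awards a "
    "JOIN Persons p ON p.personID = a.personID WHERE p.isPlayer = 0"
).fetchall()
print(non_player_awards)  # [('Branch Rickey Award', 'Hemond')]
```

With a person-level key, the award row stays valid whether or not the recipient ever appears in the playing tables.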
                        • robert bluestein
                          Message 12 of 30 , Nov 17, 2008
                            Where can I find stats for Intentional Walks?

                            --- On Mon, 11/17/08, Sean Forman <sean-forman@...> wrote:
                            From: Sean Forman <sean-forman@...>
                            Subject: Re: [baseball-databank] Re: 2008 pre-release - data error
                            To: baseball-databank@yahoogroups.com
                            Date: Monday, November 17, 2008, 11:03 AM

                            Good point. LahmanID would work.  Perhaps I'll add a column to the AwardPlayers table for LahmanID, or we could add AwardsOther for things like BranchRickey and the Executive of the year awards.

                            sean

                            On Mon, Nov 17, 2008 at 11:57 AM, Micke Hovmöller <micke.hovmoller@gmail.com> wrote:

                            On 11/17/08, Sean Forman <sean-forman@baseball-reference.com> wrote:

                            > Problem is that we have an award for players that Hemond got.  I'm open to suggestions as to how to handle this.
                            According to http://www.branchrickeyaward.org/index.html, the award is for "professionals in Major League baseball", which I interpret as possibly other than players.
                             
                            I'm not enough up to date on the current DB scheme to suggest a specific change, but isn't there a masterID for everyone, independently of the role they have ever had? If so, shouldn't that be the key in this table?
                             
                            (If you are referring to another award, my apologies.)
                             
                            /Micke



                            --
                            Sean Forman
                            President, Sports Reference LLC
                            http://www.sports-reference.com/

                          • KJOK
                            Message 13 of 30 , Nov 17, 2008
                              Intentional Walks should be in the BATTING Table, column name IBB, in between SO and HBP.

                              --- On Mon, 11/17/08, robert bluestein <robertbluesteinphotography@...> wrote:
                              From: robert bluestein <robertbluesteinphotography@...>
                              Subject: Re: [baseball-databank] Re: 2008 pre-release - data error
                              To: baseball-databank@yahoogroups.com
                              Date: Monday, November 17, 2008, 11:13 AM

                              Where can I find stats for Intentional Walks?

                              --- On Mon, 11/17/08, Sean Forman <sean-forman@baseball-reference.com> wrote:
                              From: Sean Forman <sean-forman@baseball-reference.com>
                              Subject: Re: [baseball-databank] Re: 2008 pre-release - data error
                              To: baseball-databank@yahoogroups.com
                              Date: Monday, November 17, 2008, 11:03 AM

                              Good point. LahmanID would work.  Perhaps I'll add a column to the AwardPlayers table for LahmanID, or we could add AwardsOther for things like BranchRickey and the Executive of the year awards.

                              sean

                              On Mon, Nov 17, 2008 at 11:57 AM, Micke Hovmöller <micke.hovmoller@gmail.com> wrote:

                              On 11/17/08, Sean Forman <sean-forman@baseball-reference.com> wrote:

                              > Problem is that we have an award for players that Hemond got.  I'm open to suggestions as to how to handle this.
                              According to http://www.branchrickeyaward.org/index.html, the award is for "professionals in Major League baseball", which I interpret as possibly other than players.
                               
                              I'm not enough up to date on the current DB scheme to suggest a specific change, but isn't there a masterID for everyone, independently of the role they have ever had? If so, shouldn't that be the key in this table?
                               
                              (If you are referring to another award, my apologies.)
                               
                              /Micke



                              --
                              Sean Forman
                              President, Sports Reference LLC
                              http://www.sports-reference.com/
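As a rough illustration of KJOK's answer, the IBB column can be read straight from a CSV export of the Batting table. The sketch below assumes Python's standard `csv` module and a header row naming the columns. The sample data is invented and shows only a subset of columns, with IBB sitting between SO and HBP as described.

```python
import csv
import io

# Hypothetical extract of the Batting table; IDs and numbers are invented.
sample = io.StringIO(
    "playerID,yearID,stint,SO,IBB,HBP\n"
    "examp01,2008,1,50,3,2\n"
    "examp01,2008,2,20,1,0\n"
    "examp02,2008,1,80,,4\n"
)


def ibb_by_player_season(handle):
    """Sum intentional walks per (playerID, yearID) across stints."""
    totals = {}
    for row in csv.DictReader(handle):
        key = (row["playerID"], int(row["yearID"]))
        # A blank cell means the stat was not recorded; it is counted as 0 here.
        totals[key] = totals.get(key, 0) + int(row["IBB"] or 0)
    return totals


print(ibb_by_player_season(sample))
```

Treating a blank IBB as zero is a simplification made for this example; for seasons where the statistic was not tracked, keeping it as missing may be more accurate.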


                            • Tangotiger
                              Message 14 of 30 , Nov 17, 2008
                                I updated the DB shell scripts to update the RetroID for all new 2008
                                players.

                                Just follow the revised instructions (which will include the mapping file
                                of retro/BDB IDs for new players).

                                You should then be ready to go.

                                Tom

                                > http://www.insidethebook.com/ee/index.php/site/article/bdb_database_ms_access/
                                >
                                > Includes data instructions, pending Sean's next release.
                                >
                                > ***
                                >
                                > As for Hemond, the Branch Rickey Award is not exclusive to players, so,
                                > ideally, we'd have a separate table for "baseball professionals".
                                >
                                > Realistically, this points to the issue of not having a "Persons" table
                                > and a "PersonID" (though the LahmanID functions here, it is not used
                                > anywhere), as opposed to what we currently have.
                                >
                                > So, I'd say for now, you can leave it in there, and just create some
                                > "release notes" that point this out, and the user can decide what he
                                > wants to do with it.
                                >
                                > Tom
                                >
                                >


                                ---------------------------------------------
                                The Book--Playing The Percentages In Baseball
                                http://www.InsideTheBook.com
                              • Sean Forman
                                Message 15 of 30 , Nov 20, 2008
                                  In a case that no one would have ever imagined, it appears you have a
                                  duplicate in here.

                                  fukuk001 for both fukudko01 and fukumka01

                                  Couple of other dupes as well.
                                  | millj004 | 2 | milleja04--2008-06-22,milleji02--2008-09-01 |
                                  | montl001 | 2 | montalu01--2008-08-05,montzlu01--2008-09-04 |

                                  Dates are debuts.

                                  Looks like
                                  fukumka01 => fukuk002
                                  milleji02 => millj005
                                  montzlu01 => montl002

                                  sean
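The duplicate check Sean describes amounts to grouping BDB player IDs by Retrosheet ID and flagging any Retrosheet ID used more than once. A minimal Python sketch, using only the ID pairs and proposed reassignments given in this message (the function name `shared_retro_ids` is made up for the example):

```python
from collections import defaultdict


def shared_retro_ids(pairs):
    """Map each RetroID used by more than one BDB playerID to those IDs."""
    by_retro = defaultdict(list)
    for bdb_id, retro_id in pairs:
        by_retro[retro_id].append(bdb_id)
    return {retro: ids for retro, ids in by_retro.items() if len(ids) > 1}


# (BDB playerID, RetroID) pairs reported in the message above.
pairs = [
    ("fukudko01", "fukuk001"), ("fukumka01", "fukuk001"),
    ("milleja04", "millj004"), ("milleji02", "millj004"),
    ("montalu01", "montl001"), ("montzlu01", "montl001"),
]
print(shared_retro_ids(pairs))

# Apply the reassignments suggested above and re-check.
fixes = {"fukumka01": "fukuk002", "milleji02": "millj005", "montzlu01": "montl002"}
fixed = [(bdb, fixes.get(bdb, retro)) for bdb, retro in pairs]
print(shared_retro_ids(fixed))  # {}
```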
                                • Tangotiger
                                  Message 16 of 30 , Nov 20, 2008
                                    Hmmm... I made the announcement on my blog with the updated file, I think
                                    I made the announcement at Retrolist, and I guess I overlooked making the
                                    announcement here. Two out of three and all that...

                                    Thanks to Sean for the alert.

                                    The correct IDs, as well as the most up-to-date DB shell script, will
                                    always be found here:

                                    http://tangotiger.net/bdb/

                                    Tom
                                  • Sean Forman
                                    Message 17 of 30 , Nov 20, 2008
                                      Sorry, I just missed that note.

                                      sean




                                      On Thu, Nov 20, 2008 at 11:06 AM, Tangotiger <tom@...> wrote:

                                      Hmmm... I made the announcement on my blog with the updated file, I think
                                      I made the announcement at Retrolist, and I guess I overlooked making the
                                      announcement here. Two out of three and all that...

                                      Thanks to Sean for the alert.

                                      The correct IDs, as well as the most up-to-date DB shell script, will
                                      always be found here:

                                      http://tangotiger.net/bdb/

                                      Tom




                                      --
                                      Sean Forman
                                      President, Sports Reference LLC
                                      http://www.sports-reference.com/
                                    • Tangotiger
                                      Message 18 of 30 , Nov 20, 2008
                                        No need. Like I said, I sent it out somewhere, probably not here. Plus, I
                                        sent out so many notes in those few days, who knows exactly what I was
                                        saying. I should have been more economical with my posts.

                                        In any case, I'm glad that you gave it a second review.

                                        Tom

                                        > Sorry, I just missed that note.
                                        >
                                        > sean
                                        >
                                        > On Thu, Nov 20, 2008 at 11:06 AM, Tangotiger <tom@...> wrote:
                                        >
                                        >> Hmmm... I made the announcement on my blog with the updated file, I think
                                        >> I made the announcement at Retrolist, and I guess I overlooked making the
                                        >> announcement here. Two out of three and all that...
                                        >>
                                        >> Thanks to Sean for the alert.
                                        >>
                                        >> The correct IDs, as well as the most up-to-date DB shell script, will
                                        >> always be found here:
                                        >>
                                        >> http://tangotiger.net/bdb/
                                        >>
                                        >> Tom
                                        >
                                        > --
                                        > Sean Forman
                                        > President, Sports Reference LLC
                                        > http://www.sports-reference.com/
                                        >


                                        ---------------------------------------------
                                        The Book--Playing The Percentages In Baseball
                                        http://www.InsideTheBook.com
                                      • Tangotiger
                                        Message 19 of 30 , Nov 21, 2008
                                          I should highlight that in that folder, I have the primary positions file
                                          for every player/season.

                                          What I did *not* do was have the BDB shell script import that file
                                          automatically. I could, but I didn't. The reason is that I want the shell
                                          script to import only the data that directly corresponds to the "official"
                                          tables in the BDB. (Note: it is very easy to import it manually in
                                          Access: just click NEW/Import Table.)

                                          I could expand, for example, by including wOBA or LWTS, or creating a
                                          "BattingNoStint" table to group the records to get rid of the stint field.
                                          Really, there's no end to what we can do in terms of making the DB more
                                          friendly.

                                          Perhaps I will make an exception for this particular case, simply because
                                          it's a fairly involved process to try to get the primary position. If
                                          there are other useful things that can be generated (that would require a
                                          fairly involved process), please post it, and I'll consider it.

                                          Thanks, Tom


                                          > The correct IDs, as well as the most up-to-date DB shell script, will
                                          > always be found here:
                                          >
                                          > http://tangotiger.net/bdb/
                                          >
                                          > Tom
                                          >
                                          >
                                          >
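The "BattingNoStint" idea mentioned above, grouping batting records so that the stint field disappears, can be sketched as a simple aggregation. This is an illustration in Python rather than the Access query Tom might use. The function name, the choice of stat columns and the sample rows are all invented for the example.

```python
from collections import defaultdict


def collapse_stints(rows, stat_columns):
    """Sum counting stats per (playerID, yearID), dropping the stint field."""
    totals = defaultdict(lambda: dict.fromkeys(stat_columns, 0))
    for row in rows:
        key = (row["playerID"], row["yearID"])
        for col in stat_columns:
            totals[key][col] += row[col]
    return [
        {"playerID": player, "yearID": year, **stats}
        for (player, year), stats in totals.items()
    ]


# Hypothetical batting rows: one player with two stints in a season.
batting = [
    {"playerID": "examp01", "yearID": 2008, "stint": 1, "AB": 200, "H": 55},
    {"playerID": "examp01", "yearID": 2008, "stint": 2, "AB": 150, "H": 40},
    {"playerID": "examp02", "yearID": 2008, "stint": 1, "AB": 500, "H": 140},
]
for line in collapse_stints(batting, ["AB", "H"]):
    print(line)
```

Only counting stats should be summed this way; rate stats would need to be recomputed from the summed totals.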