SF and SH for pitchers?

  • Shane McCarthy
    Message 1 of 24, Nov 26, 2013
      I joined this group to learn more about using baseball statistics for analysis. One of the first stats to catch my eye is BABIP, which I want to calculate for pitchers. Unfortunately, SF and SH are not in BDB-sql-2011-03-28.sql. Can you tell me where I would find this data? I would be willing to update the database to include it if that would be helpful.


      Thank you,

      Shane McCarthy
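
For context on why SF and SH matter here: BABIP is commonly defined as (H - HR) / (AB - SO - HR + SF). A minimal sketch for a pitching line follows, assuming at-bats against are not stored directly and are derived from batters faced as AB = BFP - BB - HBP - SH - SF (catcher's interference is ignored). The function name, parameter names, and sample numbers are illustrative only.

```python
def pitcher_babip(h, hr, so, bfp, bb, hbp, sh, sf):
    """Batting average on balls in play allowed by a pitcher.

    At-bats against are derived from batters faced (an assumption of
    this sketch): AB = BFP - BB - HBP - SH - SF.
    """
    ab = bfp - bb - hbp - sh - sf
    balls_in_play = ab - so - hr + sf
    if balls_in_play <= 0:
        return None  # no balls in play, so BABIP is undefined
    return (h - hr) / balls_in_play


# Made-up season line, for illustration only.
example = pitcher_babip(h=200, hr=20, so=180, bfp=900, bb=60, hbp=5, sh=10, sf=5)
print(round(example, 3))
```

Without SH and SF, both the derived at-bats and the denominator are off, which is why the missing columns block the calculation.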
    • anson2995
      Message 2 of 24, Nov 27, 2013

        We added fields in the pitching table for SF, SH, and GIDP several years ago but never populated them. If someone would like to compile that data, I'm happy to add it.


        FYI Shane, you're working with an older version of the database, as you might have guessed from the filename. More recent versions are available at http://seanlahman.com/baseball-archive/statistics. The newer versions include not only the 2011 and 2012 stats but also other fixes and updates to the historical data.


        And an alert for everyone else: I'll have a beta of the 2013 version available sometime next week. I'll post a note here when it's available and also post a link on Twitter (@SeanLahman). It will include both the 2013 data and a handful of fixes for errors that people have let me know about. If you have any questions in the meantime, let me know.


        Regards,

        Sean Lahman 



      • Shane McCarthy
        Message 3 of 24, Nov 28, 2013
          I currently have time to devote to compilation of this data.  However,
          I will need some guidance.  I have looked and been unable to locate
          this data on the web.  

          Preliminary questions I have are:

          Where do I find the data?  
          How do you want the compiled data formatted?  
          How will the data be checked?

          Shane McCarthy





        • KJOK
          Message 4 of 24, Nov 28, 2013
            1.    The data is in the Retrosheet data (http://www.retrosheet.org/).
            2.    I would think exactly as it is in the pitching table, minus the IP, GS, etc.
            3.    A season's SH and SF batting totals should match that season's SH and SF allowed in the pitching data. I'm not sure you can do much checking beyond that.
             
            THANKS,
            Kevin
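
The check in point 3 can be sketched as follows: sum SH and SF per season on the batting side and on the pitching side, then list the seasons where the totals differ. The row layout (dicts with yearID, SH, and SF keys) and the function names are assumptions for illustration, and the sample rows are made up.

```python
from collections import defaultdict


def season_totals(rows, stats=("SH", "SF")):
    """Sum the given stats per season; missing or empty values count as 0."""
    totals = defaultdict(lambda: dict.fromkeys(stats, 0))
    for row in rows:
        for stat in stats:
            totals[row["yearID"]][stat] += int(row.get(stat) or 0)
    return totals


def season_mismatches(batting_rows, pitching_rows, stats=("SH", "SF")):
    """Return {(year, stat): (batting_total, pitching_total)} where they differ."""
    bat = season_totals(batting_rows, stats)
    pit = season_totals(pitching_rows, stats)
    diffs = {}
    for year in sorted(set(bat) | set(pit)):
        for stat in stats:
            b = bat[year][stat] if year in bat else 0
            p = pit[year][stat] if year in pit else 0
            if b != p:
                diffs[(year, stat)] = (b, p)
    return diffs


# Made-up rows: the batting side has one more SH than the pitching side.
batting = [{"yearID": 2010, "SH": 3, "SF": 2}, {"yearID": 2010, "SH": 1, "SF": 0}]
pitching = [{"yearID": 2010, "SH": 3, "SF": 2}]
mismatches = season_mismatches(batting, pitching)
print(mismatches)
```

A season-level match does not prove that every pitcher's line is right, only that nothing was dropped or double-counted overall.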




          • Shane McCarthy
            Message 5 of 24, Nov 30, 2013

              Thank you for the information, Kevin. I have a follow-up question:

              Is the Retrosheet ID unique? I know it would be mad if it weren't, but I also think it mad that I cannot find such a statement on the Retrosheet website. The script I have written to extract the data depends on it being unique. The SH, SF, and GIDP totals match the batting totals for the 2010-2012 seasons, except that SH in 2010 is off by 1. I will continue checking and may uncover a problem.



            • Mike Emeigh
              Message 6 of 24, Dec 1, 2013
                Retrosheet ID is unique, otherwise a lot of other things would break.

                Mike Emeigh





              • opsahlbr
                Message 7 of 24, Dec 1, 2013
                  Yes, the Retrosheet ID is unique. Baseballheatmaps.com has a SQL file that is a lot easier to download and import, at least it was for me.







                • Shane McCarthy
                  Message 8 of 24, Dec 1, 2013
                    My initial assessment of having a script close to complete was
                    proven incorrect the further back in time I went.  Lahman's Baseball
                    Database is used as the reference.

                    I simply search the seventh field in the play
                    records of Retrosheet event files for /SF, /SH, or /GDP and
                    increment by one the pitcher's count for the appropriate statistic.
                    Going back in time, seasonal sums for SH and SF are off by no more
                    than 1 until 1971; GDP sums are worse but still off by less than 13.
                    Earlier than 1971, the results go off the rails. Around these dates
                    I begin to notice event files with the extension .edn present in
                    the directories.

                    What is incorrect with the pattern I am searching for? The /SF, /SH,
                    and /GDP patterns give yearly results both greater and less than
                    those in Lahman's Baseball Database. This means the pattern is
                    incorrect because it is either too restrictive or too inclusive.

                    What are .edn files?  Should I include them?  Where is the appropriate 
                    place to direct Retrosheet questions? 

                    Thank you,

                    Shane McCarthy
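
The approach described in this message can be sketched roughly as follows. This is a minimal sketch, not the poster's actual script: it assumes the usual Retrosheet event-file layout (start and sub records carry the player ID, team flag, and fielding position; play records carry the event in the seventh field), and the function name and sample IDs are made up. It splits the event text into modifiers and matches them exactly, rather than searching for a substring, so text in the runner-advance part after the first "." is never matched.

```python
import csv
from collections import Counter

TRACKED = {"SH", "SF", "GDP"}


def count_modifiers(lines):
    """Credit SH, SF and GDP play modifiers to the pitcher on the mound."""
    counts = Counter()
    pitcher = {}  # team flag ("0" visitor, "1" home) -> current pitcher id
    for rec in csv.reader(lines):
        if not rec:
            continue
        kind = rec[0]
        if kind == "id":
            pitcher = {}  # a new game starts
        elif kind in ("start", "sub") and len(rec) >= 6 and rec[5] == "1":
            pitcher[rec[3]] = rec[1]  # fielding position 1 is the pitcher
        elif kind == "play" and len(rec) >= 7:
            # Team flag on a play record is the batting team.
            fielding_team = "1" if rec[2] == "0" else "0"
            description = rec[6].split(".", 1)[0]  # drop runner advances
            modifiers = set(description.split("/")[1:])
            for stat in TRACKED & modifiers:
                counts[(pitcher.get(fielding_team), stat)] += 1
    return counts


# Made-up records in event-file style, for illustration only.
sample = [
    'id,XXX201004050',
    'start,pitv001,"Visiting Pitcher",0,0,1',
    'start,pith001,"Home Pitcher",1,0,1',
    'play,1,0,bat001,00,X,S7',
    'play,1,0,bat002,00,X,64(1)3/GDP/G6',
    'play,1,1,bat003,00,X,8/SF.3-H',
    'sub,pith002,"Home Reliever",1,0,1',
    'play,2,0,bat004,00,X,13/SH.1-2',
]
counts = count_modifiers(sample)
print(counts)
```

The sketch charges each play to whoever is pitching at the time and counts only the exact modifiers SH, SF, and GDP; any variant spellings would need to be added to TRACKED.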









                  • Tangotiger
                    Message 9 of 24, Dec 2, 2013
                      RetroList
                      RetroSQL

                      And I HIGHLY recommend you download CWEVENT to parse through the files.
                      Google Chadwick Bureau.

                      Tom


                      ---------------------------------------------
                      The Book--Playing The Percentages In Baseball
                      http://www.TangoTiger.com
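
Once the event files have been expanded to one row per event, the per-pitcher tally is a simple group-and-count. A minimal sketch, assuming the parsed output has been saved as CSV with a header row; the column names PIT_ID, SH_FL, and SF_FL and the T/F flag values are assumptions here and should be checked against the actual output of the tool. The function name and sample rows are made up.

```python
import csv
import io
from collections import Counter


def tally_flags(csv_text, pitcher_col="PIT_ID", flag_cols=("SH_FL", "SF_FL")):
    """Count rows per pitcher where each flag column is 'T'."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        for col in flag_cols:
            if row[col] == "T":
                counts[(row[pitcher_col], col)] += 1
    return counts


# Made-up rows, for illustration only.
csv_text = (
    "PIT_ID,SH_FL,SF_FL\n"
    "pitx001,T,F\n"
    "pitx001,F,T\n"
    "pity001,T,F\n"
    "pitx001,T,F\n"
)
flag_counts = tally_flags(csv_text)
print(flag_counts)
```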
                    • Mike Emeigh
                      Message 10 of 24, Dec 2, 2013

                        On 12/1/2013 7:58 PM, Shane McCarthy wrote:
                         


                        What are .edn files?  Should I include them?  Where is the appropriate 
                        place to direct Retrosheet questions? 


                        Tango's response is excellent: you should be on RetroList and RetroSQL, and you should absolutely use CWEVENT to parse the event files. But since I can answer the query about the .EDN files...

                        Prior to 1973, Retrosheet does not have complete PBPs for every game. To cover for these, Retrosheet has created .EDx, or deduced event, files. They have the same format as the .EVx files, but the flow of these games has been "deduced" from the partial play-by-plays that Retrosheet has, as well as from box scores and newspaper accounts of the games (which usually reconstruct scoring innings accurately and allow researchers to make some conclusions about the events that occurred leading up to that point). SF would very likely be accurate, but SH and GDP may not be. The actual timing of a SH or GDP may be off because the deduced game flow can support multiple occasions where such a play could have occurred, and some DPs may not be correctly identified as GDPs. For example, a 63 DP with no other available information from the partial data and game accounts could be either a GDP or an LDP, but would probably be treated as a GDP in the deduced account.

                        For more information see http://www.retrosheet.org/GameFiles.pdf and http://www.retrosheet.org/datause.txt.
                        -- 
                        Mike Emeigh
                        MWE55inNC@...
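
Since deduced files are less reliable for SH and GDP, it may help to keep their tallies separate from those of full play-by-play files. A minimal sketch that sorts file names by extension prefix, assuming the .EVx/.EDx naming described above; the function names and sample file names are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePath


def file_kind(name):
    """Classify a Retrosheet file name by its extension prefix."""
    suffix = PurePath(name).suffix.upper()
    if suffix.startswith(".EV"):
        return "event"
    if suffix.startswith(".ED"):
        return "deduced"
    return "other"


def group_by_kind(names):
    """Group file names into event, deduced, and other."""
    groups = defaultdict(list)
    for name in names:
        groups[file_kind(name)].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
groups = group_by_kind(["1968AAA.EVA", "1968AAA.edn", "TEAM1968"])
print(groups)
```

Running the extraction once per group makes it easy to see how much of a season's discrepancy comes from deduced games.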
                      • Sean Lahman
                        Message 11 of 24, Dec 3, 2013

                          Just to follow up on what Mike and Tango said, I wanted to post these links for Shane and others.


                          There is a Retrosheet group that has general information, and one called RetroSQL that focuses on working with the Retrosheet data files. Both have a robust group of contributors, and you're more likely to get good help for those types of questions there than here. Links are:


                          http://groups.yahoo.com/neo/groups/RetroList/

                          http://groups.yahoo.com/neo/groups/RetroSQL/



                          The Chadwick tools are also a great resource, and might help prevent you from reinventing the wheel.  They can be downloaded at:

                          http://chadwick.sourceforge.net/doc/index.html


                          Regards,

                          Sean Lahman

                          ---
                          Sean Lahman
                          http://seanlahman.com
                        • Mike Emeigh
                          Message 12 of 24, Dec 3, 2013

                            I created a spreadsheet with the SH/SF/GDP data from 1951 through 2013 extracted from Retrosheet – those seasons have all games included in either event files or deduced event files.

                             

                            https://drive.google.com/file/d/0B83JMDbpXRQsZHY1dU1sZE5wMkU/edit?usp=sharing

                             

                            Mike Emeigh

                            MWE55inNC@...

                             


                          • Sean Lahman
                            Message 13 of 24, Dec 3, 2013
                              Mike,
                              This is fantastic.  I should be able to integrate this with the release I was planning to make this week.  Thanks so much!

                              --Sean

                              ---
                              Sean Lahman
                              http://seanlahman.com




                            • Mike Emeigh
                              Message 14 of 24, Dec 3, 2013

                                Sean, you’re welcome.

                                 

                                GDP may be overstated. I treated every DP with BIP type of “G” as GDP in the query, realizing that some of those might not be “true” GDP but not having the time to do a more detailed check to filter out non-force DPs on GB.

                                 

                                Mike Emeigh

                                MWE55inNC@...


                                 


                                 

                              • Shane McCarthy
                                Message 15 of 24, Dec 3, 2013
                                  Tango, Mike, Rod, and Sean, thank you for the direction to those newsgroups. I have signed up, so pending approval I will see what is happening there.

                                  The tools at Chadwick Bureau definitely allowed for a shorter script to extract the desired data. However, the numerical values did not change, so the same questions arise. I will see if I can find an answer on one of the Retrosheet newsgroups. I certainly didn't think baseball statistics would prove so complex when I thought I had asked a simple question. And don't you fellows slow down in the offseason?

                                  I just wish to point out that the same discrepancy remains between the Lahman Baseball Database and the value provided by Mike for total SH in 2010; I had the same value as Mike. I have not checked the other differences yet, but I will.



                                • Clem Comly
                                  Message 16 of 24 , Dec 3, 2013
                                    Shane, can you use the team totals on Retrosheet (http://www.retrosheet.org/boxesetc/2010/YT_2010.htm) to ID which team
                                    is the problem, then look at that team’s pitchers to see who has the wrong SH count?
                                     
                                    Clem Comly
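A minimal sketch of the team-level check suggested above, assuming the per-team SH totals from the two sources have already been loaded into plain dictionaries keyed by team ID. The helper name `teams_with_mismatch`, the team IDs, and the numbers are hypothetical placeholders, not real 2010 values.

```python
def teams_with_mismatch(source_a, source_b):
    """Return {team: (a_value, b_value)} for teams whose totals differ.

    A team missing from one source is reported with None on that side.
    """
    mismatches = {}
    for team in sorted(set(source_a) | set(source_b)):
        a = source_a.get(team)
        b = source_b.get(team)
        if a != b:
            mismatches[team] = (a, b)
    return mismatches


# Placeholder team totals for illustration only (not real data).
retrosheet_sh = {"AAA": 40, "BBB": 52, "CCC": 31}
database_sh = {"AAA": 40, "BBB": 51, "CCC": 31}

print(teams_with_mismatch(retrosheet_sh, database_sh))
```

Once a team is flagged, the same function can be reused with dictionaries keyed by pitcher ID for that team to find the individual record that differs.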
                                  • Clem Comly
                                    Message 17 of 24 , Dec 3, 2013
                                      Sean, could the release include an updated set of consistency check results?  Thanks.
                                       
                                      Clem Comly
                                    • Shane McCarthy
                                      Message 18 of 24 , Dec 4, 2013
                                        I have compared Mike's values for SH/SF/GDP with what I had.  As Mike
                                        described, his GIDP count includes more than a simple search for GDP would reveal,
                                        but our SH and SF agree for 1973-2012.  I will download the 2013 event
                                        files today, but I would assume 2013 would also agree.  The earlier years
                                        have .edx files present.  I need to learn more about these files before I
                                        include them.

                                        Below is a list of discrepancies between Mike's SH and SF
                                        values and Lahman's Baseball Database:

                                        Year   SH   SF
                                        1973    1    1
                                        1985    1    0
                                        1991    1    0
                                        1995    1    0
                                        1999   -1    0
                                        2010   -1    0

                                        The numbers in the table were calculated as Mike - Lahman.  A positive
                                        number means Lahman's Baseball Database has a larger entry.  Is there a
                                        more convenient abbreviation used for Lahman's Baseball Database?  I would
                                        expect or suggest LBD.
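A yearly comparison like this can be sketched as follows, assuming each source's season totals are already available as a dictionary mapping year to an (SH, SF) pair. The function name `yearly_differences` and all numbers are hypothetical placeholders; the subtraction order (first minus second) is a choice made for this sketch only, not a statement about how the table above was produced.

```python
def yearly_differences(first, second):
    """Return {year: (sh_diff, sf_diff)} for years where the sources disagree.

    Each input maps year -> (SH, SF). Differences are first minus second,
    so a positive value means the first source has the larger total.
    Only years present in both sources are compared.
    """
    diffs = {}
    for year in sorted(set(first) & set(second)):
        sh_diff = first[year][0] - second[year][0]
        sf_diff = first[year][1] - second[year][1]
        if sh_diff != 0 or sf_diff != 0:
            diffs[year] = (sh_diff, sf_diff)
    return diffs


# Placeholder season totals for illustration only (not real data).
source_one = {2001: (100, 80), 2002: (95, 70), 2003: (90, 75)}
source_two = {2001: (100, 80), 2002: (94, 70), 2003: (91, 74)}

for year, (sh_diff, sf_diff) in yearly_differences(source_one, source_two).items():
    print(year, sh_diff, sf_diff)
```

Stating the subtraction order in the docstring keeps the sign of each entry unambiguous when the table is shared.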




                                      • Shane McCarthy
                                        Message 19 of 24 , Dec 4, 2013
                                          Clem, I will see if I can figure out what the problem is using the resource you have pointed to, for 2010 and the five other years as well.

                                          Thanks,

                                          Shane McCarthy



                                        • Shane McCarthy
                                          Message 20 of 24 , Dec 4, 2013
                                            Below is a list of the differences I found between the Retrosheet web pages and the
                                            values from the spreadsheet Mike posted.

                                            I did not understand the naming conventions for the
                                            web pages, so I simply give the address of the page I found the data on.
                                            The values of SH and SF below are the same as in the table Mike posted yesterday.

                                            Year   retroID   SH    SF   web address showing different value

                                            The difference in SH in 1999 was between the yearly totals from LBD and Mike posted.  The

                                            The difference in SF in 1973 was between LBD and Mike. Mike's data agrees

                                            A rather alarming discovery was that in 1985 there were three differing entries,
                                            two of which cancel each other out.  I had only been using yearly
                                            totals to check what was found.  Even if those totals agree, there
                                            could be differences within that season that cancel each other out.  Although
                                            agreement between Mike and the Retrosheet web pages is nice, it does
                                            not ensure that the individual players have the correct SH and SF.
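The cancelling effect described here can be illustrated with a small sketch, assuming each source's SH counts for one season are held in a dictionary keyed by player ID. The function name `player_differences`, the player IDs, and the counts are hypothetical placeholders chosen so that the season totals match while two individual records do not.

```python
def player_differences(first, second):
    """Return {player_id: first - second} for players whose counts differ.

    A player missing from one source is treated as having a count of 0 there.
    """
    diffs = {}
    for player_id in sorted(set(first) | set(second)):
        diff = first.get(player_id, 0) - second.get(player_id, 0)
        if diff != 0:
            diffs[player_id] = diff
    return diffs


# Placeholder per-player SH counts for one season (not real data).
source_one = {"pitcher01": 5, "pitcher02": 3, "pitcher03": 7}
source_two = {"pitcher01": 6, "pitcher02": 2, "pitcher03": 7}

# The season totals agree even though two individual records differ.
totals_agree = sum(source_one.values()) == sum(source_two.values())
print(totals_agree)
print(player_differences(source_one, source_two))
```

Comparing at the player level, rather than only at the season level, is what exposes pairs of errors that offset each other in the totals.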



                                          • Sean Lahman
                                            Message 21 of 24 , Dec 4, 2013
                                              Sure.  Good suggestion.
                                              --S
                                              ---
                                              Sean Lahman
                                              http://seanlahman.com


                                              On Tue, Dec 3, 2013 at 7:07 PM, Clem Comly <ccomly@...> wrote:
                                               

                                              Sean, could the release include an updated set of consistency check results?  Thanks.
                                               
                                              Clem Comly


                                            • Mike Emeigh
                                              Message 22 of 24 , Dec 4, 2013

                                                My data had better agree with Retrosheet; that’s where it came from!!

                                                 

                                                Mike Emeigh

                                                MWE55inNC@...

                                                919-389-1013

                                                mike.emeigh on Skype

                                                http://www.linkedin.com/in/mwemeigh

                                                 


                                              • Aidan Shealy
                                                Message 23 of 24 , Dec 4, 2013
                                                  But couldn't Retrosheet have updated their data? Wouldn't that result in possible miscalculations?

                                                  Sent from my iPhone


                                                • Shane McCarthy
                                                  Message 24 of 24 , Dec 4, 2013
                                                    That is probably the explanation.  The pages I listed previously were most recently updated on 10 December 2012.

                                                    Shane


                                                    On Wed, Dec 4, 2013 at 6:31 PM, Aidan Shealy <aidanshealy@...> wrote:
                                                     

                                                    But couldn't retro sheet update their data? Wouldn't that result in possible miscalculations?

                                                    Sent from my iPhone

                                                    On Dec 4, 2013, at 5:17 PM, Mike Emeigh <mwe55innc@...> wrote:

                                                     

                                                    My data had better agree with Retrosheet; that’s where it came from!!

                                                     

                                                     

                                                    From: baseball-databank@yahoogroups.com [mailto:baseball-databank@yahoogroups.com] On Behalf Of Shane McCarthy
                                                    Sent: Wednesday, December 4, 2013 4:41 PM
                                                    To: baseball-databank@yahoogroups.com
                                                    Subject: Re: [baseball-databank] Re: SF and SH for pitchers?

                                                     

                                                     

                                                    Below is a list of the differences I found between the Retrosheet web pages and the

                                                    values from the spreadsheet Mike posted.

                                                     

                                                    I did not understand the naming conventions for the

                                                    web pages, so I simply give the address of the page where I found the data.

                                                    The SH and SF values below are taken from the table Mike posted yesterday.

                                                     

                                                    Year   retroID   SH    SF   web address showing different value

                                                     

                                                    The difference in SH in 1999 was between the yearly totals from LBD and those Mike posted.  The

                                                     

                                                    The difference in SF in 1973 was between LBD and Mike. Mike's data agrees

                                                     

                                                    A rather alarming discovery was that in 1985 three entries differed,

                                                    two of which cancel each other out.  I had only been using yearly

                                                    statistics to check what was found.  Even if those totals agree, there

                                                    could be differences within that season that cancel each other out.  Although

                                                    agreement between Mike and the Retrosheet web pages is nice, it does

                                                    not ensure that the individual players have the correct SH and SF.
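The point about offsetting differences can be illustrated with a small sketch. It is not the procedure anyone in the thread used; the rows, the retroID-style identifiers, and the function names are all made up for illustration. It shows two sources whose yearly SH and SF totals agree even though two pitchers disagree by one SH in opposite directions, and how a per-pitcher comparison exposes this.

```python
from collections import defaultdict

# Hypothetical rows: (year, retroID, SH, SF). IDs and counts are invented.
source_a = [
    (1985, "pitcA001", 5, 2),
    (1985, "pitcB001", 3, 1),
    (1985, "pitcC001", 4, 0),
]
source_b = [
    (1985, "pitcA001", 6, 2),  # one SH more than in source_a
    (1985, "pitcB001", 2, 1),  # one SH fewer than in source_a
    (1985, "pitcC001", 4, 0),
]

def yearly_totals(rows):
    """Sum SH and SF per year, returning {year: (SH, SF)}."""
    totals = defaultdict(lambda: [0, 0])
    for year, _player, sh, sf in rows:
        totals[year][0] += sh
        totals[year][1] += sf
    return {year: tuple(t) for year, t in totals.items()}

def player_differences(rows_a, rows_b):
    """Return {(year, retroID): (SH diff, SF diff)} where the sources disagree.

    Differences are rows_a minus rows_b; a pitcher missing from one
    source is treated as having zero SH and SF there.
    """
    a = {(y, p): (sh, sf) for y, p, sh, sf in rows_a}
    b = {(y, p): (sh, sf) for y, p, sh, sf in rows_b}
    diffs = {}
    for key in sorted(set(a) | set(b)):
        sh_a, sf_a = a.get(key, (0, 0))
        sh_b, sf_b = b.get(key, (0, 0))
        if (sh_a, sf_a) != (sh_b, sf_b):
            diffs[key] = (sh_a - sh_b, sf_a - sf_b)
    return diffs

totals_match = yearly_totals(source_a) == yearly_totals(source_b)
diffs = player_differences(source_a, source_b)
print("yearly totals match:", totals_match)
print("per-pitcher differences:", diffs)
```

Comparing at the (year, retroID) level rather than by year is the design choice that matters here: the yearly check passes for this data while the per-pitcher check reports two rows.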

                                                     

                                                    On Wed, Dec 4, 2013 at 8:38 AM, Shane McCarthy <shane.mccarthy@...> wrote:

                                                    Clem, I will see if I can figure out what the problem is using the resource you have pointed to for 2010 and the five other years as well.

                                                     

                                                    Thanks,

                                                     

                                                    Shane McCarthy

                                                     

                                                    On Wed, Dec 4, 2013 at 8:36 AM, Shane McCarthy <shane.mccarthy@...> wrote:

                                                    I have compared Mike's values for SH/SF/GDP with what I had.  As Mike

                                                    described, his GIDP counts include more than a simple search for GDP would reveal,

                                                    but our SH and SF agree for 1973-2012.  I will download the 2013 event

                                                    files today, but I would assume 2013 will also agree.  The earlier years

                                                    have .edx files present.  I need to learn more about these files before I

                                                    include them.

                                                     

                                                    Below is a list of discrepancies between Mike's SH and SF

                                                    values and Lahman's Baseball Database:

                                                     

                                                    Year   SH   SF
                                                    1973    1    1
                                                    1985    1    0
                                                    1991    1    0
                                                    1995    1    0
                                                    1999   -1    0
                                                    2010   -1    0

                                                     

                                                    The numbers in the table were calculated as Mike - Lahman.  A positive

                                                    number means Lahman's Baseball Database has a larger entry.  Is there a

                                                    more convenient abbreviation for Lahman's Baseball Database?  I would

                                                    expect or suggest LBD.
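A yearly comparison of this kind could be sketched roughly as follows. The totals below are invented placeholders, not figures from either source, and the function name is hypothetical. The sketch subtracts the second source from the first for each year and keeps only the years where SH or SF differ. Note that with this subtraction order a positive value means the first source's total is larger, so swap the arguments if the opposite convention is intended.

```python
def yearly_discrepancies(first, second):
    """Return {year: (SH diff, SF diff)} for years where the totals differ.

    Both arguments map year -> (SH total, SF total). Differences are
    first minus second; a year missing from one source counts as (0, 0).
    """
    result = {}
    for year in sorted(set(first) | set(second)):
        sh1, sf1 = first.get(year, (0, 0))
        sh2, sf2 = second.get(year, (0, 0))
        diff = (sh1 - sh2, sf1 - sf2)
        if diff != (0, 0):
            result[year] = diff
    return result

# Invented yearly totals, for illustration only.
mike_totals = {1998: (1700, 1400), 1999: (1603, 1500)}
lahman_totals = {1998: (1700, 1400), 1999: (1604, 1500)}

discrepancies = yearly_discrepancies(mike_totals, lahman_totals)
for year, (sh_diff, sf_diff) in discrepancies.items():
    print(year, sh_diff, sf_diff)
```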

                                                     

                                                     

                                                    On Tue, Dec 3, 2013 at 7:59 PM, Clem Comly <ccomly@...> wrote:

                                                     

                                                    Shane, can you use the team totals on Retrosheet (http://www.retrosheet.org/boxesetc/2010/YT_2010.htm) to ID which team

                                                    is the problem, then look at that team’s pitchers to see who has the wrong SH count?

                                                     

                                                    Clem Comly

                                                     

                                                     

                                                     

