The Future

  • Tangotiger
    Message 1 of 16, Mar 30, 2011
      The question is now how to ensure the transition to a new ownership group
      so that the public continues to have access to a quality database.

      I put up a thread on my blog (start at post #5):
      http://www.insidethebook.com/ee/index.php/site/comments/future_baseball_databank_updates/

      And SABR came up as a possible alternative. Please read that thread. In
      responding to FX, I said this:

      "If the issue was just to provide an annual update (say the 2011 season
      data), then dozens of people here can do that in a matter of minutes, by
      using the latest BDB build plus using Retrosheet.

      There are two main areas that SABR provides a huge benefit, and that is
      the bio data and minor league data. Another strong area is that there is
      a (presumably) dedicated machine [note: as in a well-oiled process] in
      place to ensure continued viability.

      What is unclear to me from your post, and perhaps you can elaborate, is
      what is this “community of interest”, and does this guarantee that the
      public will have unfettered access to the data? And if not, what
      limitations are imposed on the public?"

      Here are the questions that you have to answer for yourselves:
      1. Do I really care about getting all the non-Retrosheet (basically
      pre-1950) data corrected in the future? (e.g., some new guy in 1914 was
      just found; some guy's birth year was changed from 1888 to 1890, etc.)

      2. Do I care at all about getting minor league data?

      3. Do I care about having one place that I know will be around for the
      next 5 years?

      If the answer is "no" to all of these, then it seems that the members here
      are happy to just wing it year-by-year, as some fine folks will simply
      supply the annual updates, without caring much about prior seasons.

      If the answer is "yes" to any of these, then it becomes a question of a
      more dedicated organization (like SABR) to control the data.

      What is it that you guys want?

      Tom

      ---------------------------------------------
      The Book--Playing The Percentages In Baseball
      http://www.InsideTheBook.com
    • paulriker
      Message 2 of 16, Mar 30, 2011
        I had a plan to one day build a transformable online database that we could all share. I was thinking of something like Wikipedia, but a database. Transformable because we could add different datasets and relate them to existing datasets with ease, and expand the data to include pitch-by-pitch data through the season. My hope was that, if I built the foundation, everyone would contribute to the administration of the data, just as on Wikipedia. There is a lot more interesting data out there beyond what BDB has provided that I'd love for us to share, specifically contract and team transactional data.

        I think if we build a central database that we can share, we can eliminate duplicate efforts. We can focus on the analysis and tool development instead of the administration of the data.

        If this is something people are interested in, please let me know and I can move forward with it. I'd hate to develop something like this and not have it used.

        Paul


      • anson2995
        Message 3 of 16, Mar 30, 2011
          Sorry I'm late to the conversation. I have many thoughts on this, but foremost is my desire to keep an open source database of baseball stats available to all who want it. That goal is at odds with the interests of commercial stat providers, and I suspect that's why Palmer (and now Forman) aren't active proponents. That's part of the reason why it's been hard to garner much support for the Databank effort: a free database costs them licensing opportunities, and it also creates competitors. That's an entirely reasonable view, just not one that helps the idea of an open source database.

          I've always thought SABR was a natural home for such a project, since an open source database fosters new research, but I have always met with resistance to that idea. How could SABR help the databank project? By providing access to and allowing integration of its datasets, such as the biographical data. The databank doesn't need SABR's help with infrastructure support or storage space.

          The fact is that the databank project can survive without Forman's support. 99% of the work of maintaining it since the mid-1990s has been done by three or four people, and there's no reason that model couldn't continue. Mechanisms already exist for folks who want to integrate outside datasets -- Retrosheet, F/X, etc. I think the core audience for this database is not interested in those things.

          Regards,
          Sean Lahman
        • F. X. Flinn
          Message 4 of 16, Mar 30, 2011
            SABR's thinking has been along the lines of a Creative Commons-type license, where non-commercial use is OK, but if the data is put to commercial use then you have to strike a deal. That approach has facilitated monetizing Pete's efforts and the efforts of various SABR committees working on demographics, minor leagues, etc., while at the same time enabling members to use the data. So it's a win-win all around (the sums aren't large, by the way).

            I guess my bottom line is that sentiment inside SABR exists for providing an equivalent to BDB, and if we could somehow join forces that would be important for the long term. I'd rather see SABR do something with you than suddenly show up online one day saying "hey, guys, here's BDB+" without any prior attempt to bring you aboard.

            BTW I do want to stress that I am only speaking for myself, albeit as a long time board member.

            F. X. Flinn
            802-369-0069 | fb:f.x.flinn | t:fxflinn | fxflinn@...



          • Jacob Drew
            Message 5 of 16, Mar 30, 2011
              I'm interested. If you get rolling, I can help with development, possibly even offer a mirror host.


            • Paul Riker
              Message 6 of 16, Mar 30, 2011

                One thing I want to add to my previous post. I would develop the data structure so it would support multiple sports. Instead of AB being a field, AB would be a record. So if we needed to add a stat or a different sport it would just be a matter of adding records and not redesigning the database. This wouldn’t affect the end user at all because a query or view would exist to present the data to the users in a tabular format.
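
A minimal sketch of the "AB as a record, not a field" idea, for anyone who wants to see it concretely. It uses Python's built-in sqlite3 module purely as a convenient stand-in; the table name stat_value, the view name batting, the column names, and the player IDs are all hypothetical and not part of BDB or of anything Paul has built. Each stat is stored as one row, and a view pivots the rows back into the familiar tabular layout.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one row per (player, season, stat)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical schema: every stat is a record, not a column.
        CREATE TABLE stat_value (
            player_id TEXT    NOT NULL,
            season    INTEGER NOT NULL,
            sport     TEXT    NOT NULL,
            stat      TEXT    NOT NULL,
            value     REAL    NOT NULL,
            PRIMARY KEY (player_id, season, sport, stat)
        );

        -- Hypothetical view: pivots the baseball records into columns
        -- so end users still see one row per player-season.
        CREATE VIEW batting AS
        SELECT player_id,
               season,
               MAX(CASE WHEN stat = 'AB' THEN value END) AS AB,
               MAX(CASE WHEN stat = 'H'  THEN value END) AS H
        FROM stat_value
        WHERE sport = 'baseball'
        GROUP BY player_id, season;
        """
    )
    return conn


def add_stat(conn, player_id, season, sport, stat, value):
    """Adding a stat, or a whole new sport, is just another insert."""
    conn.execute(
        "INSERT INTO stat_value VALUES (?, ?, ?, ?, ?)",
        (player_id, season, sport, stat, value),
    )


conn = build_demo_db()
add_stat(conn, "player01", 2010, "baseball", "AB", 500)
add_stat(conn, "player01", 2010, "baseball", "H", 150)
# A different sport needs no change to the table definition.
add_stat(conn, "skater01", 2010, "hockey", "G", 30)

rows = conn.execute(
    "SELECT player_id, season, AB, H FROM batting ORDER BY player_id"
).fetchall()
print(rows)
```

The trade-off in this sketch is that each new stat still needs a new column in the view (or a new view) before end users see it in tabular form, even though the underlying table never changes.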

                 

                Paul

                 


              • Sean Forman
                Message 7 of 16, Mar 30, 2011
                  On Wed, Mar 30, 2011 at 1:43 PM, anson2995 <slahman@...> wrote:

                  Sorry I'm late to the conversation. I have many thoughts on this, but foremost is to express my desire to keep an open source database of baseball stats available to all who want it. That goal is at odds with the folks who are commercial stat providers, and I suspect that's why Palmer (and now Foreman) aren't active proponents. That's part of the reason why it's been hard to garner much support for the Databank effort, both because a free database costs them licensing opportunities, but also because it creates competitors. That's an entirely reasonable view, just not one that helps the idea of an open source database.


                  Sean (Lahman),

                  It appears you misread my initial note regarding my stepping back from support of the BDB, and took it to mean that I was scared about the BDB creating competition for Baseball-Reference.com.  

                  That isn't the case.  And if it were, I really wish I had come to that conclusion much earlier and saved the countless hours over the last nine years that I spent maintaining the Databank and producing datasets for you to use in the Baseball Archive and for others to use in their iPhone apps and elsewhere.  The reason I'm stopping support is that I'm tired and believe it's a good time to make a change for the future.

                  So let me clarify what my position is vis-à-vis the BDB.  I believe strongly in this project, but I lack the time to manage it single-handedly as I have for the last nine years.

                  What I hope will happen is that SABR and Palmer/Gillette will make a non-commercial, open source version of the major league and bio data available to the general public.  (I currently license that data from them for use on B-R; I don't use core BDB data on B-R. I maintain it separately.)  I will gladly pay my fee to them and gladly point the hobbyist or enthusiast to the full data they can use for free.

                  I have been trying to put this bug in their ear for many years now.  I'm only a lowly SABR member and have no pull beyond my bully pulpit and relationship as a customer of theirs.

                  My personal view (and the view of probably anyone who knows anything about this) is that the Palmer DB is the gold standard for the encyclopedia numbers and the SABR bio committee is the gold standard for bio.  I would urge Palmer and SABR to make the db open source for non-commercial use, and provide a way for the community to create new datasets with this as a platform instead of continually having to stamp out LF-CF-RF discrepancies or update death dates.

                  sean forman
                  and steward of The Baseball DataBank since July 2002
                • Theodore Turocy
                  Message 8 of 16, Mar 30, 2011
                    On 30 Mar 2011, at 19:50 , Sean Forman wrote:

                    > What I hope will happen is that SABR and Palmer/Gillette will make a non-commercial open sourced version of the major league and bio data (which I currently license from them for use on B-R (I don't use core bdb data on b-r. I maintain it separately)) available to the general public. I will gladly pay my fee to them and gladly point the hobbyist or enthusiast to the full data they can use for free.
                    >
                    > I have been trying to put this bug in their ear for many years now. I'm only a lowly SABR member and have no pull beyond my bully pulpit and relationship as a customer of theirs.
                    >
                    > My personal view (and the view of probably anyone who knows anything about this) is that the Palmer DB is the gold standard for the encyclopedia numbers and the SABR bio committee is the gold standard for bio. I would urge Palmer and SABR to make the db open source for non-commercial use, and provide a way for the community to create new datasets with this as a platform instead of continually having to stamp out LF-CF-RF discrepancies or update death dates.


                    I'm not sure how widely it's known, but for several years now I have been serving as SABR's "data czar," with a remit to manage all the various datasets SABR has or licenses.

                    My personal position on the matter matches Sean F.'s. SABR ought to be releasing datasets under, e.g., a Creative Commons license, and ought to be providing resources to maintain datasets for the benefit of the community as a whole.

                    In addition to sharing Sean's assessment about the quality of the SABR demographics and the Palmer/Gillette MLB statistics, a further argument for SABR being involved is logistical. I already maintain the equivalent of basically all of baseball-databank's data - plus significantly more - on a day-to-day basis, as part of a regularized workflow, with tools I've developed over several years of experience. Outputting the data in the format of baseball-databank or similar would take maybe two hours to write and test the queries as a one-off. In other words, the ongoing cost of me managing this data would be essentially zero.

                    Where I have been able to, SABR has already started making datasets available under a CC license, for instance, as part of the Baseball ID Working Group.

                    The only thing stopping me from volunteering to take on providing the baseball-databank under the same terms straightaway is that the MLB statistics, which are no doubt the core of the dataset, aren't currently mine to release. I firmly support Open Source and Open Data principles. It is worth remembering that one of those underlying principles is respect for copyright and licensing terms.

                    I am actively working to try to make what Sean F. is proposing a reality. Those of you in the group who are SABR members, I encourage you to write your Board members and tell them the same. :) The baseball community really needs to be spending its time on doing analysis and discovering new information -- not the grunge work of putting together clean datasets.

                    Ted
                  • Derek Adair
                    Message 9 of 16, Mar 30, 2011
                      Gang,

                      I have been rather quiet in this community for a long time for personal
                      reasons, but this issue is something I feel very strongly about. Plus this
                      thread is full of throw-back names, so I had to chime in :-)

                      Over the years, I have been involved in a number of efforts, and witnessed
                      several more where grass roots labor built up some store of data or a
                      product of value, and then that data or project got rerouted to something
                      commercial and/or closed to the public. Two obvious examples that come to
                      mind are CDDB (a store of compact disc information) and ICS (Internet
                      Chess Server). I don't want that to happen to this data, and I don't think
                      it has to.

                      One of the reasons for the success of this databank is its resilience. The
                      data source has had multiple shepherds over the years, but once it went
                      public, it hasn't looked back. Strides towards more inclusive formats have
                      been the norm, and each of us can currently download the entire dataset to
                      munge to our heart's content. Each new shepherd (usually named Sean) added
                      a layer of support for the data, and it fleshed out over time to what it
                      is today.

                      There's absolutely further work that could be done with the data. Each
                      spring, grand plans and ideas are raised and some bear fruit while others
                      die off. That's fine and good and natural. But it all comes back to the
                      data being available to all of us.

                      A disclaimer is necessary. I am not a fan of SABR, primarily because of
                      the way it has handled its data. The SABR I am familiar with (five-plus
                      years ago, when I was a member) had closed committees with NDAs and datasets
                      only viewable record by record on the web: the exact opposite of the
                      spirit this group has had.

                      True, that was a long while ago. The fact that SABR has a "data czar" with
                      the approach that Ted has goes a long way. They have done some great work
                      releasing data sets. But still, with the history there, I can't help but
                      feel like handing over the proverbial keys to the data, including the
                      ability to determine licensing back to us, is a scary step in the wrong
                      direction. "We" own this data now. I can download the data set and munge
                      away. None of us can say for sure that we will be able to do that in two
                      years if we give that away. If we take that risk, the gain must be
                      overwhelmingly worth it. I personally don't see it.

                      I understand this may be a bit of a doom-and-gloom view of where we're
                      going, but as I mentioned, my viewpoint is one of someone who has seen
                      their contributions to CDDB turn write-only. The impact here would be
                      worse, because of the reporting and research use for the data we have
                      collected over the years.

                      Regards,
                      Derek


                      On Wed, 30 Mar 2011, Theodore Turocy wrote:

                      >
                      > On 30 Mar 2011, at 19:50 , Sean Forman wrote:
                      >
                      >> What I hope will happen is that SABR and Palmer/Gillette will make a non-commercial open sourced version of the major league and bio data (which I currently license from them for use on B-R (I don't use core bdb data on b-r. I maintain it separately)) available to the general public. I will gladly pay my fee to them and gladly point the hobbyist or enthusiast to the full data they can use for free.
                      >>
                      >> I have been trying to put this bug in their ear for many years now. I'm only a lowly SABR member and have no pull beyond my bully pulpit and relationship as a customer of theirs.
                      >>
                      >> My personal view (and the view of probably anyone who knows anything about this) is that the Palmer DB is the gold standard for the encyclopedia numbers and the SABR bio committee is the gold standard for bio. I would urge Palmer and SABR to make the db open source for non-commercial use, and provide a way for the community to create new datasets with this as a platform instead of continually having to stamp out LF-CF-RF discrepancies or update death dates.
                      >
                      >
                      > I'm not sure how widely it's known, but for several years now I have been serving as SABR's "dataczar, with remit to manage all the various datasets SABR has or licenses.
                      >
                      > My personal position on the matter matches Sean F.'s. SABR ought to be releasing datasets under, e.g., a Creative Commons license, and ought to be providing resources to maintain datasets for the benefit of the community as a whole.
                      >
                      > In addition to sharing Sean's assessment about the quality of the SABR demographics and the Palmer/Gillette MLB statistics, a further argument for SABR being involved is logistical. I already maintain the equivalent of basically all of baseball-databank's data - plus significantly more - on a day-to-day basis, as part of a regularized workflow, with tools I've developed over several years of experience. To output the data in the format of baseball-databank or similar, would take maybe two hours to write and test the queries as a one-off. In other words, the ongoing cost of me managing this data would be essentially zero.
                      >
                      > Where I have been able to, SABR has already started making datasets available under a CC license, for instance, as part of the Baseball ID Working Group.
                      >
                      > The only thing stopping me from volunteering to take on providing the baseball-databank under the same terms straightaway is that the MLB statistics, which are no doubt the core of the dataset, aren't currently mine to release. I firmly support Open Source and Open Data principles. It is worth remembering that one of those underlying principles is respect for copyright and licensing terms.
                      >
                      > I am actively working to try to make what Sean F. is proposing a reality. Those of you in the group who are SABR members, I encourage you to write your Board members and tell them the same. :) The baseball community really needs to be spending its time on doing analysis and discovering new information -- not the grunge work of putting together clean datasets.
                      >
                      > Ted
                    • F. X. Flinn
                      Message 10 of 16 , Mar 30, 2011
                        Derek, I never heard of any committee having any kind of NDA, and I've been on the board since July 2001. The lack of accessibility of the data was a problem we first tried to address by contracting with XMLTeam to build out a system that would make the data truly useful to a larger audience without DBA skills, but that didn't work out. Meanwhile bbref had become the de facto place to go, so we felt less compelled to compete with them or with BDB.

                        Bottom line is that SABR could start producing BDB-type products tied to a Creative Commons license in fairly short order, and it's definitely something we have in the hopper once the dust settles on the new move, new staff, and new website that are all rolling out as this discussion takes place. If we just went ahead and did that, would all be forgiven?

                        FXF

                        --
                        F. X. Flinn
                        FXFlinn@gmail | 802-369-0069

                      • Theodore Turocy
                        Message 11 of 16 , Mar 30, 2011
                          On 30 Mar 2011, at 23:21 , Derek Adair wrote:

                          >
                          >
                          > True, that was a long while ago. The fact that SABR has a "data czar" with
                          > the approach that Ted has goes a long way. They have done some great work
                          > releasing data sets. But still, with the history there, I can't help but
                          > feel like handing over the proverbial keys to the data, including the
                          > ability to determine licensing back to us, is a scary step in the wrong
                          > direction.

                          To move the discussion forward: How would having the data licensed under a Creative Commons license not address this concern, remembering that licenses cannot be retroactively changed?

                          TLT
                        • Derek Adair
                          Message 12 of 16 , Mar 30, 2011
                            FX,

                            Well, it's obviously not my call to forgive SABR. The good thing is you
                            can go ahead and release those BDB-like products anyway (at least to my
                            understanding; I-Am-Not-a-Lawyer). There are multiple ways to do what you
                            said successfully, and there are a number of ways you can botch it. There
                            are a half-dozen variants of the Creative Commons license, and there are
                            varying kinds of "products" you could provide back. There's also the open
                            question of whether this data would be hosted just for SABR members, or
                            for everyone. I am honestly hoping for your efforts here to be successful,
                            but that's in your organization's hands.

                            For what it's worth (not much), I would recommend you not get too hung up
                            on meeting the needs of those missing DBA skills, and stick with making
                            the data available as the first pass. Zipped CSV/XLS file storage is much
                            less expensive and much quicker to market than an ambitious contract
                            project.

                            My advice to the current crop of databank contributors is to be open about
                            contributing but guarded about risk. If something does go wrong, and the
                            new maintainer takes it closed, being able to roll back and redo only a
                            year of updates, rather than catch up from this point in time, is
                            critical. I'd like to
                            avoid a fork, but we owe it to current and future consumers of the data to
                            ensure the set is available to the masses.

                            Regards,
                            Derek

                            P.S. On the note re: NDAs, the committee in question was the Negro
                            Leagues committee. In order to access their data, you were asked to sign
                            an NDA because they were working on a book at the time. I can't remember
                            exact dates, but it could have been earlier than your board service.

                          • Jeff Zimmerman
                            Message 13 of 16 , Mar 30, 2011
                              I feel the data should be available with absolutely no restrictions. I am not a lawyer, and others aren't either. I am just a baseball fan who wants to use the data. I don't want to have to find out whether I can use some or all of the data in something that may or may not make a dollar. I would just like to know what the Royals' winning percentage was from 1986 to the present, post it in an article, and not worry about whether any laws were broken. No restrictions, no worries.

                              Jeff Zimmerman




                              To: baseball-databank@yahoogroups.com
                              From: dadair@...
                              Date: Wed, 30 Mar 2011 19:47:51 -0400
                              Subject: Re: [baseball-databank] Re: The Future

                               
                              FX,

                              Well, it's obviously not my call to forgive SABR. The good thing is you
                              can go ahead and release those BDB-like products anyway (at least to my
                              understanding; I-Am-Not-a-Lawyer). There are multiple ways to do what you
                              said successfully, and there are a number of ways you can botch it. There
                              are a half-dozen variants of the creative commons license, and there are
                              varying kinds of "products" you could provide back. There's also the open
                              question of whether this data would be hosted just for SABR members, or
                              for everyone. I am honestly hoping for your efforts here to be successful,
                              but that's in your organization's hands.

                              For what it's worth (not much), I would recommend you not get too hung up
                              on meeting the needs of those missing dba skills, and stick with making
                              the data available as the first pass. Zipped csv/xls file storage is much
                              less expensive and much quicker to market than an ambitious contract
                              project.

                              My advice to the current crop of databank contributors is to be open about
                              contributing but guarded about risk. If something does go wrong, and the
                              new maintainer takes it closed, being able to roll back and only do a year
                              of updates, vs. catch up from this point in time is critical. I'd like to
                              avoid a fork, but we owe it to current and future consumers of the data to
                              ensure the set is available to the masses.

                              Regards,
                              Derek

                              P.S. On the note re: NDA's, the committee in question was the Negro
                              Leagues committee. In order to access their data, you were asked to sign
                              an NDA because they were working on a book at the time. I can't remember
                              exact dates, but it could have been earlier than your board service.

                              On Wed, 30 Mar 2011, F. X. Flinn wrote:

                              > Derek, I never heard of any committee having any kind of NDA agreement, and
                              > I've been on the board since July 2001. The lack of accessibility of the
                              > data was a problem we first tried to address by contracting with XMLTeam to
                              > build out a system that would make the data truly useful to a larger
                              > audience without dba skills, but that didn't work out. Meanwhile bbref had
                              > become the de facto place to go, so we felt less compelled to compete with
                              > them or with BDB.
                              >
                              > Bottom line is that SABR could start producing BDB type products tied to a
                              > creative commons license in fairly short order, and it's definitely
                              > something we have in the hopper once the dust settles on the new move, new
                              > staff, new website that's all rolling out as this discussion takes place. If
                              > we just went ahead and did that, would all be forgiven?
                              >
                              > FXF
                              >
                              > On Wed, Mar 30, 2011 at 6:21 PM, Derek Adair <dadair@...> wrote:
                              >
                              >>
                              >>
                              >> Gang,
                              >>
                              >> I have been rather quiet in this community for a long time for personal
                              >> reasons, but this issue is something I feel very strongly about. Plus this
                              >> thread is full of throw-back names, so I had to chime in :-)
                              >>
                              >> Over the years, I have been involved in a number of efforts, and witnessed
                              >> several more where grass roots labor built up some store of data or a
                              >> product of value, and then that data or project got rerouted to something
                              >> commercial and/or closed to the public. Two obvious examples that come to
                              >> mind are CDDB (a store of compact disc information) and ICS (Internet
                              >> Chess Server). I don't want that to happen to this data, and I don't think
                              >> it has to.
                              >>
                              >> One of the reasons for the success of this databank is its resilience. The
                              >> data source has had multiple shepherds over the years, but once it went
                              >> public, it hasn't looked back. Strides towards more inclusive formats have
                              >> been the norm, and each of us can currently download the entire dataset to
                              >> munge to our heart's content. Each new shepherd (usually named Sean) added
                              >> a layer of support for the data, and it fleshed out over time to what it
                              >> is today.
                              >>
                              >> There's absolutely further work that could be done with the data. Each
                              >> spring, grand plans and ideas are raised and some bear fruit while others
                              >> die off. That's fine and good and natural. But it all comes back to the
                              >> data being available to all of us.
                              >>
                              >> A disclaimer is necessary. I am not a fan of SABR, primarily because of
                              >> the way it has handled its data. The SABR I am familiar with (five plus
                              >> years ago when I was a member) had closed committees with NDAs and datasets
                              >> only viewable record by record on the web, the exact opposite of the
                              >> spirit this group has had.
                              >>
                              >> True, that was a long while ago. The fact that SABR has a "data czar" with
                              >> the approach that Ted has goes a long way. They have done some great work
                              >> releasing data sets. But still, with the history there, I can't help but
                              >> feel like handing over the proverbial keys to the data, including the
                              >> ability to determine licensing back to us, is a scary step in the wrong
                              >> direction. "We" own this data now. I can download the data set and munge
                              >> away. None of us can say for sure that we will be able to do that in two
                              >> years if we give that away. If we take that risk, the gain must be
                              >> overwhelmingly worth it. I personally don't see it.
                              >>
                              >> I understand this may be a bit of a doom-and-gloom view of where we're
                              >> going, but as I mentioned, my viewpoint is one of someone who has seen
                              >> their contributions to CDDB turn write-only. The impact here would be
                              >> worse, because of the reporting and research use for the data we have
                              >> collected over the years.
                              >>
                              >> Regards,
                              >> Derek
                              >>
                              >>
                              >> On Wed, 30 Mar 2011, Theodore Turocy wrote:
                              >>
                              >>>
                              >>> On 30 Mar 2011, at 19:50 , Sean Forman wrote:
                              >>>
                              >>>> What I hope will happen is that SABR and Palmer/Gillette will make a
                              >>>> non-commercial open sourced version of the major league and bio data
                              >>>> (which I currently license from them for use on B-R; I don't use core
                              >>>> BDB data on B-R, I maintain it separately) available to the general
                              >>>> public. I will gladly pay my fee to them and gladly point the hobbyist
                              >>>> or enthusiast to the full data they can use for free.
                              >>>>
                              >>>> I have been trying to put this bug in their ear for many years now. I'm
                              >>>> only a lowly SABR member and have no pull beyond my bully pulpit and
                              >>>> relationship as a customer of theirs.
                              >>>>
                              >>>> My personal view (and the view of probably anyone who knows anything
                              >>>> about this) is that the Palmer DB is the gold standard for the
                              >>>> encyclopedia numbers and the SABR bio committee is the gold standard
                              >>>> for bio. I would urge Palmer and SABR to make the db open source for
                              >>>> non-commercial use, and provide a way for the community to create new
                              >>>> datasets with this as a platform instead of continually having to
                              >>>> stamp out LF-CF-RF discrepancies or update death dates.
                              >>>
                              >>>
                              >>> I'm not sure how widely it's known, but for several years now I have been
                              >>> serving as SABR's "data czar", with remit to manage all the various
                              >>> datasets SABR has or licenses.
                              >>>
                              >>> My personal position on the matter matches Sean F.'s. SABR ought to be
                              >>> releasing datasets under, e.g., a Creative Commons license, and ought to
                              >>> be providing resources to maintain datasets for the benefit of the
                              >>> community as a whole.
                              >>>
                              >>> In addition to sharing Sean's assessment about the quality of the SABR
                              >>> demographics and the Palmer/Gillette MLB statistics, a further argument
                              >>> for SABR being involved is logistical. I already maintain the equivalent
                              >>> of basically all of baseball-databank's data - plus significantly more -
                              >>> on a day-to-day basis, as part of a regularized workflow, with tools I've
                              >>> developed over several years of experience. To output the data in the
                              >>> format of baseball-databank or similar, would take maybe two hours to
                              >>> write and test the queries as a one-off. In other words, the ongoing cost
                              >>> of me managing this data would be essentially zero.
                              >>>
                              >>> Where I have been able to, SABR has already started making datasets
                              >>> available under a CC license, for instance, as part of the Baseball ID
                              >>> Working Group.
                              >>>
                              >>> The only thing stopping me from volunteering to take on providing the
                              >>> baseball-databank under the same terms straightaway is that the MLB
                              >>> statistics, which are no doubt the core of the dataset, aren't currently
                              >>> mine to release. I firmly support Open Source and Open Data principles.
                              >>> It is worth remembering that one of those underlying principles is
                              >>> respect for copyright and licensing terms.
                              >>>
                              >>> I am actively working to try to make what Sean F. is proposing a reality.
                              >>> Those of you in the group who are SABR members, I encourage you to write
                              >>> your Board members and tell them the same. :) The baseball community
                              >>> really needs to be spending its time on doing analysis and discovering
                              >>> new information -- not the grunge work of putting together clean datasets.
                              >>>
                              >>> Ted
                              >>>
                              >>>
                              >>>
                              >>>
                              >>> ------------------------------------
                              >>>
                              >>> http://www.baseball-databank.org/
                              >>>
                              >>>
                              >>>
                              >>>
                              >>
                              >>
                              >
                              >
                              >
                              > --
                              > F. X. Flinn
                              > FXFlinn@gmail | 802-369-0069
                              >

                            • Sean Forman
                              Message 14 of 16 , Mar 31, 2011
                                I think people are talking about two different things here and probably with no real need.

                                1) Can a group of BDB stalwarts (or non-stalwarts) or multiple independent groups take what has been released so far (which is inferior to the SABR/Palmer dataset), maintain and update new databases, and allow people to use them for whatever purpose they see fit?

                                The answer to that is yes on all counts.  SABR can release whatever they want and non-SABR entities can release whatever they want.  There is no license posted for any of the data I produced, so it's out there in the wild for everyone to use.  If people want to produce the Tango-Lahman-Adair DB or the Tango DB, the Lahman DB, and the Adair DB, there is nothing stopping them.

                                2) Can and should SABR release a database that in effect serves as a replacement for the BDB? 
                                   And can they put a licensing agreement on it that allows unlimited personal use and limited commercial use?  

                                The answer to both of those is yes, they can do that, and they don't really need anybody here or anywhere else to be on board or give them permission.  They should just do it if they want to.  And like I said the BDB data is in the wild, so if they want something just take it and use it like anybody else would.

                                So IMO, it's nice of FX and Ted to stop in and say, "We may have a solution for future releases, so don't be too worried," and it is probably good marketing, but there is no need for them to co-opt or bring on board BDB listserv members or incorporate previous BDB work or ethos.

                                SABR can have their discussion and this Yahoo group (which is all it really is) can have its discussion, but at the end of the day whoever ships bits is going to have people using their data.  And everyone in this discussion is free to do whatever they want in terms of shipping bits.

                                sean
                                ---
                                Sean Forman
                                Sports Reference LLC, President
                                http://www.sports-reference.com/

                              • mwe55innc@gmail.com
                                Message 15 of 16 , Mar 31, 2011
                                  On Mar 31, 2011 8:57am, Sean Forman wrote:
                                  (snip)
                                  >
                                  > SABR can have their discussion and this Yahoo group (which is all it really is) can have its discussion, but at the end of the day whoever ships bits is going to have people using their data.  And everyone in this discussion is free to do whatever they want in terms of shipping bits.

                                  I don't normally do this, but...

                                  Well said.

                                  Mike Emeigh
                                  MWE55inNC@...
                                • Tangotiger
                                  Message 16 of 16 , Mar 31, 2011
                                    > I'd rather see SABR do something with you than suddenly show up online
                                    > one day saying "hey, guys, here's BDB+" without any prior attempt to
                                    > bring you aboard.
                                    >

                                    I'd actually be quite happy if SABR did do something and said exactly
                                    that, and then published it for the general public with a non-commercial
                                    license.

                                    Preceding that, SABR could just as well do the same thing with its bio
                                    data, its minor league data, and its HR data. Let SABR have the reins,
                                    and then publish for the general public with the constraint of a
                                    non-commercial license.

                                    As for providing a commercial license, SABR can solicit micropayments like
                                    here:
                                    http://www.donorschoose.org/fallon-colbert-project

                                    And once enough people contribute (whatever SABR decides... $1,000, $2,000,
                                    $5,000), then a commercial license is provided.

                                    Tom