The BdB purpose

  • Peter Kreutzer
    Message 1 of 19 , Mar 31, 2011
      I call everyone's attention to the BdB Statement of Purpose: http://www.baseball-databank.org/purpose.txt. I'm sorry to say that's my main contribution here over many years, but I am very appreciative of the hard work of others.

      It sounds to me like the issue here is how best to advance and protect that purpose for those of us who have served some function at Baseball Databank over the years.

      Are our energies best served maintaining and correcting datasets and records, or are our energies best poured into supporting SABR's efforts to do the same?

      It sounds as if SABR wants to release the major league dataset under a Creative Commons license for non-commercial use. That's more restrictive than the BdB license, I believe. Are we okay with that?

      And more importantly, to my mind: what happens to the other datasets? Will (or can) SABR release, under a similar license, the Negro league and minor league sets? Their biographical database? Is the all-important BBID project going to be available for everyone's free use?

      I don't know the answer to these questions, but it seems to me that SABR is the perfect home for our efforts if our overall goals can be furthered there. But if the restrictions SABR already has and has to live with are going to prevent free public development of and access to ancillary datasets, then we're better off maintaining the core BdB on our own, so that some time in the future developers and volunteers can compile the missing and ancillary data.

      I don't mean this as a slight to SABR, and I think we all hope they release all the data they can whether BdB joins up or not, but we should be as pragmatic as possible protecting the ideal of getting all the data out there, one way or the other.

      Cheers,
      Peter
    • F. X. Flinn
      Message 2 of 19 , Mar 31, 2011
        I'd like to suggest that you all turn the question around in your heads and imagine a scenario where BDB was a research committee or community of interest ("chartered community" is the nomenclature others preferred). It tells SABR's ED and board: look, we need these resources x, y, z, and this is what we do for the baseball research community.

        Then you get to be part of the conversation about the whys and hows of some of the organizational barriers we've been contending with. Instead of SABR appearing as a black box, you're inside helping reduce the size of the black box, providing more push to support those of us in leadership positions who've been trying to get to Yes on all the questions you're asking.

        And if it doesn't work out, you take your marbles and leave.

        Instead BDB has always just assumed it couldn't work out and never bothered to give it the old college try. Like Bobby Kennedy said, see things as they could be and ask, hey, why not?

        FXF

        --
        F. X. Flinn
        FXFlinn@gmail | 802-369-0069

      • Rod Nelson
        Message 3 of 19 , Mar 31, 2011
          On Thu, Mar 31, 2011 at 8:40 AM, Peter Kreutzer <askrotoman@...> wrote:

          I don't know the answer to these questions, but it seems to me that SABR is the perfect home for our efforts if our overall goals can be furthered there.

          Indeed.  Three words: SABR Chartered Community.

          I will note that the SABR Negro League Committee, per se, does not maintain a database for use by SABR members or others. Several of its members did participate in the Baseball Hall of Fame sponsored project that was awarded to the Negro League Researchers and Authors Group (NLRAG), but those members do not own the rights to distribute the dataset. We in SABR would like to see those restrictions lifted as well. It is feasible that the restrictions may someday (soon) be lifted, because SABR (and not Baseball Databank) does have an institutional relationship with The National Baseball Library and Hall of Fame, and for that matter, also Major League Baseball. The fact that the NLRAG database is not among SABR's knowledge assets should be understood by all concerned so that it is not used as a straw man in this discussion.


          Rod Nelson
        • Mike
          Message 4 of 19 , Apr 8 7:10 PM
            Here is Mike Crain's .02 on the issue.

            First, I'd like to say that I enjoyed being part of SABR. But time, money and life have changed to the point where that is no longer possible. I established some friendships (like Rod and FX) that I'll always cherish.

            That being said, the key issue is the free distribution of information. I have a copy of the BDB from last year, integrated with other projects I was working on (Baseball cards, transactions, other hall of fames, etc). I had also slung together coaches, owners and umpires from other locations (mainly Retrosheet). Ideally, I'd love one downloadable database that has all that data, and then yearly updates. I think that we are a big community, but our work is sometimes fragmented because we work in our own worlds. Sure we share information, but it may take some legwork to get to.

            If SABR wants to take it, and release it with other data that "they" have, I say it's a good idea. And if they want to have some data (i.e. Negro leagues, minor leagues) that is available for members only, then place that into a members area until the time comes to publicly release it (but set a date or say "We cannot release this until the Hall of Fame (or MLB, or whoever else) says it's OK"). Two of my main concerns are what the transition would be like from "BDB 2011" to "SABR DB 2012" and how we treat non-members who are doing some of the work for us.

            I'm not in a position to volunteer much time, but I think a joining of forces with the big 3 (BDB, retrosheet, SABR) could be great for everybody.

            Looks like that was more like ten cents. :-)




          • Tangotiger
            Message 5 of 19 , Apr 8 8:11 PM
              > That being said, the key issue is the free distribution of information. I

              I agree.

              I asked a point-blank question on what it would take for SABR to release,
              right now, the bio data or minors data, to the general public. Hopefully
              FX will have an answer to that specific question.

              If that can't be done, for whatever reason, then I don't see how we're
              going to get what we need from SABR in a larger sense.

              Tom
            • F. X. Flinn
              Message 6 of 19 , Apr 9 7:30 AM
                Until I saw this I didn't realize Tom had replied to my message on 4/1. I think I was busy with work and getting ready for my roti league auction and just overlooked it because the thread has a dumb subject line referencing a digest. I responded on the other thread.



                FXF





                --
                F. X. Flinn
                FXFlinn@gmail | 802-369-0069

              • Tangotiger
                Message 7 of 19 , Apr 9 9:22 AM
                  Someone sent me an email asking what the main issue is. It seems that
                  with all the posts going around, that those on the periphery aren't
                  following along too closely. So, this post goes out to all those who want
                  a quick status update.

                  1. Hosting the data is not the issue. My site, Lahman's site, someone
                  else. We'll find a place easily enough at no cost.

                  2. Nor is the annual update an issue. There are plenty of places to get
                  that, not the least of which is Retrosheet's annual update. It's a small
                  issue in that no one has yet decided to take ownership of this.

                  3. Nor is the updating of data that Retrosheet has an issue. Plenty of
                  people here could handle that as well. Again, a somewhat minor issue, in
                  that no one has ownership of this.

                  4. The main issue is the quality of the non-Retrosheet data, and how best
                  to get those updated. And how much added value there is to this data
                  (compared to what's already in BDB).

                  5. A side issue is extending BDB to include minor league and other league
                  data.

                  So, that's where we are.

                  Also, I don't really want to be involved as the point person at all.
                  Ideally, with Forman stepping aside, a group of young volunteers will step
                  up and own this. I'm speaking now only to make sure the conversation
                  happens, so that someone will eventually take over.

                  Tom
                • Matthew Gargano
                  Message 8 of 19 , Apr 9 9:45 AM
                    Can you provide some elucidation on non-Retrosheet data as referred to in:


                    4. The main issue is the quality of the non-Retrosheet data, and how best
                    to get those updated.  And how much added value there is to this data
                    (compared to what's already in BDB).


                    I'd love to contribute however I can, and if the project is broken down into small, digestible pieces I would think the desired end state would be achieved far quicker. That's the project manager in me. :)

                    Thanks,

                    Matthew Gargano







                  • Tangotiger
                    Message 9 of 19 , Apr 9 10:01 AM
                      > Can you provide some elucidation on *non-Retrosheet data *as referred to
                      > in:
                      >

                      Retrosheet can be broken down into 2 parts:
                      1. Play-by-play (PBP) data
                      2. Boxscore data

                      For the PBP data, I think it's considered 100% complete from 1974-onwards,
                      and 95 or 99% complete since 1950.

                      The boxscore data goes back even further, but of course, we won't be able
                      to 100% get the seasonal data like we are used to, to confirm against.
                      We'd be able to confirm RBI, etc, but not innings played, or other things
                      that are harder to infer.

                      So, it becomes a process in terms of using the Retro data to generate what
                      we want, and then fill in the gaps with reliable data somewhere.
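
A minimal sketch of that fill-in-the-gaps process, assuming season totals are keyed by (player, year). The player IDs, the numbers, and the `merge_totals` helper are all invented for illustration and are not taken from BDB or Retrosheet:

```python
# Hypothetical season RBI totals keyed by (player_id, year).
# Every ID and number below is made up for illustration.
retro_derived = {("playerA", 1955): 98, ("playerA", 1956): 104}
gap_source = {("playerA", 1948): 77, ("playerA", 1955): 97}


def merge_totals(primary, fallback):
    """Prefer the primary source; use the fallback only where primary has no row.

    Each merged value is tagged with the source it came from, so a user can
    later decide which rows to trust.
    """
    merged = {}
    for key in sorted(set(primary) | set(fallback)):
        if key in primary:
            merged[key] = (primary[key], "retrosheet")
        else:
            merged[key] = (fallback[key], "gap")
    return merged


merged = merge_totals(retro_derived, gap_source)
for key, (value, source) in merged.items():
    print(key, value, source)
```

Tagging each row with its source keeps the choice of "gold standard" visible rather than baking it silently into the merged table.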

                      As Forman noted, Palmer is the gold standard for the non-Retro (gap) data.

                      Tom
                    • Sean Forman
                      Message 10 of 19 , Apr 11 6:38 AM
                        Tom,

                        I'm not sure that Dave Smith would agree with you.  There are thousands of discrepancies between retrosheet and the official data and you seem to be thinking the retrosheet data is right in every one of those cases.

                        sean
                        ---
                        Sean Forman
                        Sports Reference LLC, President
                        http://www.sports-reference.com/





                      • Tangotiger
                        Message 11 of 19 , Apr 11 6:44 AM
                          > Tom,
                          >
                          > I'm not sure that Dave Smith would agree with you. There are thousands of
                          > discrepancies between retrosheet and the official data and you seem to be
                          > thinking the retrosheet data is right in every one of those cases.
                          >
                          > sean

                          Good point. I suppose the question is what would be the required or gold
                          standard.

                          For my personal preference, and I could very well be in the minority, it's
                          whatever we have a best-record for. Others may prefer something
                          "official".

                          I was chatting with Pete Palmer a couple of weeks back, and he was
                          pointing out how with the Retro event and boxscores, thousands of errors
                          were uncovered, even on something like Lou Gehrig seasonal RBI total.
                          (Something like that.) That the game-by-game totals did not match to the
                          official dailies, but that the overall seasonal total happened (by luck)
                          to match to the overall official totals.

                          Basically, the "official" data is going to be replete with errors, even on
                          something as popular and basic as RBIs. So, I can imagine walks,
                          strikeouts, stolen bases, and the like being wrong in hundreds or
                          thousands of cases in non-Retro years. This is why I don't care too much
                          about non-Retro data, and I'm happy with whatever BDB says on that front.
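
A small sketch of the kind of check described above: season totals can agree even when individual games do not. The rows below are invented, and the two helper functions are hypothetical names, not part of any Retrosheet or BDB tooling:

```python
from collections import defaultdict

# Invented rows of (player_id, game_date, rbi); not real data.
boxscore_games = [("playerA", "04-15", 2), ("playerA", "04-16", 0), ("playerA", "04-17", 1)]
official_dailies = [("playerA", "04-15", 1), ("playerA", "04-16", 1), ("playerA", "04-17", 1)]


def season_totals(rows):
    """Sum RBI per player across all games."""
    totals = defaultdict(int)
    for player, _date, rbi in rows:
        totals[player] += rbi
    return dict(totals)


def game_level_diffs(a, b):
    """Return the (player, date) keys where the two sources disagree."""
    a_map = {(p, d): r for p, d, r in a}
    b_map = {(p, d): r for p, d, r in b}
    return sorted(k for k in a_map.keys() | b_map.keys() if a_map.get(k) != b_map.get(k))


totals_match = season_totals(boxscore_games) == season_totals(official_dailies)
diffs = game_level_diffs(boxscore_games, official_dailies)
print(totals_match, diffs)
```

In this made-up case the season totals match while two individual games differ, which is why a check on seasonal totals alone would not surface the errors.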

                          But, yes, Sean has a good point, and the user of the data has to decide
                          what makes the data the gold standard for his own use.

                          Tom
                        • anson2995
                          Message 12 of 19 , Apr 11 9:33 AM
                            F.X. Flinn wrote...

                            > the avowed purpose of BDB is to just put the data out there
                            > unconditionally. How could we license that kind of use?

                            That's not accurate, first of all, and your focus on licensing illustrates the fundamental divide here.


                            > Why isn't BDB data being used by commercial outfits? Why do
                            > they license stuff through SABR, bb-ref, 24/7, Retrosheet?

                            We all acknowledge that Palmer/Gillette/SABR sets the gold standard for baseball data. But those databases are available through a commercial license, which is great for commercial content providers but shuts out researchers and other value-adders. It has to, by its nature. You can exploit licensing opportunities or you can be open source: you can't do both.

                            The BDB was never intended to supersede or compete with Palmer or STATS or others in the commercial arena, but rather to serve an audience which they did not, by making open datasets. There's no reason that both can't co-exist, and there's no reason to expect that either should fulfill both objectives.

                            So, I think the questions about access to existing proprietary databases are ultimately not germane to a discussion about the future of the Databank. At the end of the day, I don't see how an organization can support the development and maintenance of commercial content while at the same time fostering the development of open source content (or vice versa).

                            Let's not devolve into a discussion of IP theory. The central question with respect to the future of the Databank is whether there exists a community of folks willing to maintain an open-source baseball database, and how they (we) ought to proceed.

                            Regards,
                            Sean Lahman
                          • F. X. Flinn
                            Message 13 of 19 , Apr 11 10:06 AM
                              I think if anyone took a look at the license SABR had for use of the encyclopedia data (which was on the entry page to the SABR encyclopedia the first time a member logged into it on the old site) and looked at our plans for what we asked XML team to develop for us, you would know that SABR had already decided that researchers needed to be able to get full access to data. The fact that that project blew up doesn't change the goal; it just delays it.

                              I've been answering Tom's questions for a couple of weeks now, explaining the situation as best I can. One of the problems is that he's not a member, so he doesn't have ready access to things like the policy manual, meeting minutes, brsp discussion group, things of that nature, and it looks very binary to him when it's fairly nuanced and complicated.

                              Anyway, bottom line is that in an ideal world BDB would be under the SABR umbrella. After all, SABR is owned and operated by baseball researchers. 600 votes over two elections is all you need to control it.

                              F. X. Flinn

                              On Mon, Apr 11, 2011 at 12:33 PM, anson2995 <slahman@...> wrote:
                               

                              F.X. Flinn wrote...

                              > the avowed purpose of BDB is to just put the data out there
                              > unconditionally. How could we license that kind of use?

                              That's not accurate, first of all, and your focus on licensing illustrates the fundamental divide here.

                              > Why isn't BDB data being used by commercial outfits? Why do
                              > they license stuff through SABR, bb-ref, 24/7, Retrosheet?

                              We all acknowledge that Palmer/Gillette/SABR sets the gold standard for baseball data. But those databases are available through a commercial license, which is great for commercial content providers but shuts out researchers and other value-adders. It has to, by its nature. You can exploit licensing opportunities or you can be open source: you can't do both.

                              The BDB was never intended to supersede or compete with Palmer or STATS or others in the commercial arena, but rather to serve an audience which they did not, by making open datasets. There's no reason that both can't co-exist, and there's no reason to expect that either should fulfill both objectives.

                              So, I think the questions about access to existing proprietary databases are ultimately not germane to a discussion about the future of the Databank. At the end of the day, I don't see how an organization can support the development and maintenance of commercial content while at the same time fostering the development of open source content (or vice versa).

                              Let's not devolve into a discussion of IP theory. The central question with respect to the future of the Databank is whether there exists a community of folks willing to maintain an open-source baseball database, and how they (we) ought to proceed.

                              Regards,
                              Sean Lahman




                              --
                              F. X. Flinn
                              FXFlinn@gmail | 802-369-0069

                            • Mike Emeigh
                              Message 14 of 19 , Apr 11 12:44 PM
                                On Mon, 11 Apr 2011 13:06:03 -0400, F. X. Flinn wrote:

                                (snip)
                                >
                                > I've been answering Tom's questions for a couple of weeks now, explaining
                                > the situation as best I can. One of the problems is that he's not a
                                > member,
                                > so he doesn't have ready access to things like the policy manual, meeting
                                > minutes, brsp discussion group, things of that nature, and it looks very
                                > binary to him when it's fairly nuanced and complicated.

                                It's going to look binary to anyone outside the organization - either SABR
                                can and will share under an appropriate licensing agreement, or it won't
                                (or can't). What Tom's trying to find out (as I understand it) is if there
                                are reasons why SABR "can't" share in an open-source model, and whether
                                those issues can be addressed in an open-source model. The perception
                                among people outside isn't that SABR *can't* share in an open-source model
                                - but that it *won't*.

                                >
                                > Anyway, bottom line is that in an ideal world BDB would be under the SABR
                                > umbrella.

                                In *my* ideal world SABR would embrace an open-source model without
                                reservation, and would be prepared to support people who wanted to do it
                                regardless of their status vis-a-vis the organization's so-called
                                "umbrella".

                                We do have this specific objective included in our bylaws, after all: "To
                                encourage further research and literary efforts to establish and maintain
                                the accurate historical record of baseball".
                                --
                                Mike Emeigh
                                MWE55inNC@...
                              • F. X. Flinn
                                Message 15 of 19 , Apr 11 12:46 PM
                                  Mike, SABR can and eventually it will.

                                  FXF

                                  On Mon, Apr 11, 2011 at 3:44 PM, Mike Emeigh <mwe55innc@...> wrote:
                                  On Mon, 11 Apr 2011 13:06:03 -0400, F. X. Flinn wrote:

                                  (snip)
                                  >
                                  > I've been answering Tom's questions for a couple of weeks now, explaining
                                  > the situation as best I can. One of the problems is that he's not a
                                  > member,
                                  > so he doesn't have ready access to things like the policy manual, meeting
                                  > minutes, brsp discussion group, things of that nature, and it looks very
                                  > binary to him when it's fairly nuanced and complicated.

                                  It's going to look binary to anyone outside the organization - either SABR
                                  can and will share under an appropriate licensing agreement, or it won't
                                  (or can't). What Tom's trying to find out (as I understand it) is if there
                                  are reasons why SABR "can't" share in an open-source model, and whether
                                  those issues can be addressed in an open-source model. The perception
                                  among people outside isn't that SABR *can't* share in an open-source model
                                  - but that it *won't*.

                                  >
                                  > Anyway, bottom line is that in an ideal world BDB would be under the SABR
                                  > umbrella.

                                  In *my* ideal world SABR would embrace an open-source model without
                                  reservation, and would be prepared to support people who wanted to do it
                                  regardless of their status vis-a-vis the organization's so-called
                                  "umbrella".

                                  We do have this specific objective included in our bylaws, after all: "To
                                  encourage further research and literary efforts to establish and maintain
                                  the accurate historical record of baseball".
                                  --
                                  Mike Emeigh
                                  MWE55inNC@...






                                  --
                                  F. X. Flinn
                                  FXFlinn@gmail | 802-369-0069

                                • Mike Emeigh
                                  Message 16 of 19 , Apr 11 2:17 PM
                                    On Mon, 11 Apr 2011 15:46:37 -0400, F. X. Flinn <fxflinn@...> wrote:

                                    > Mike, SABR can and eventually it will.

                                    OK, then to come back to Tom's questions:

                                    1. What, specifically, is preventing the bio data from being made
                                    available under an open-source license, and what can this group do, if
                                    anything, to help expedite the process?

                                    2. What, specifically, is preventing non-Retrosheet data from being made
                                    available under an open-source license, and what can this group do, if
                                    anything, to help expedite the process?

                                    3. What, specifically, is preventing minor league data from being made
                                    available under an open-source license, and what can this group do, if
                                    anything, to help expedite the process?

                                    Because this group's goal, as I understand it - and anyone, please feel
                                    free to jump back in if I am misinterpreting or overstating the case (I do
                                    have a tendency to do that) - is to make everything available to
                                    researchers free of charge, as an open-source product. That goal, to me,
                                    is in perfect alignment with SABR's objectives and mission, as I cited
                                    earlier.

                                    I think what this group would like to know is what it takes to remove
                                    "eventually" from that sentence and replace it with a date more certain -
                                    say, three months, or six months, or three years, or ten years - so that
                                    this group can move on with its mission, with or without any support from
                                    SABR. I think it would be desirable for the two groups to work together,
                                    but this group IMO is not going to be terribly patient with a "wait and
                                    see" approach, because then it moves from the realm of the "can't right
                                    now" to the realm of the "won't".
                                    --
                                    Mike Emeigh
                                    MWE55inNC@...
                                  • Theodore Turocy
                                    Message 17 of 19 , Apr 11 3:01 PM
                                      On 11 Apr 2011, at 17:33 , anson2995 wrote:

                                      > We all acknowledge that Palmer/Gillette/SABR sets the gold standard for baseball data. But those databases are available through a commercial license, which is great for commercial content providers but shuts out researchers and other value-adders. It has to, by its nature. You can exploit licensing opportunities or you can be open source: you can't do both.
                                      >
                                      ...
                                      >
                                      > So, I think the questions about access to existing proprietary databases are ultimately not germane to a discussion about the future of the Databank. At the end of the day, I don't see how an organization can support the development and maintenance of commercial content while at the same time fostering the development of open source content (or vice versa).

                                      That's not true. Dual-licensing models are widely and successfully used in the world of open-source software.

                                      Just as a for-instance, I've been maintaining software released under the GPL for going on 16 years now. During that time, I have licensed that same software under different terms to firms and individuals who wanted to make customizations or extensions, but did not want to be bound by the terms of the GPL. There is no legal or conceptual conflict in that practice, and it has a number of benefits overall: essentially, commercial users wind up subsidizing the development of the open resource.

                                      Ted
                                    • Theodore Turocy
                                      Message 18 of 19 , Apr 11 3:13 PM
                                        On 11 Apr 2011, at 18:06 , F. X. Flinn wrote:
                                        >
                                        > Anyway, bottom line is that in an ideal world BDB would be under the SABR umbrella. After all, SABR is owned and operated by baseball researchers. 600 votes over two elections is all you need to control it.
                                        >

                                        With respect to my colleague F.X., having processed the discussions over the last few weeks, I no longer agree with this way forward.

                                        I continue to assert that SABR ought to be taking the lead in providing high-quality datasets, under a dual-licensing model: something like Creative Commons for non-commercial use, with an option to purchase a license/support further development for those who want to use the data commercially, or want greater support, detail, or speed of delivery.

                                        That said, as has been pointed out a few times now, for at least some people, baseball-databank serves a different purpose - a completely unrestricted dataset, in which someone can volunteer to take over some part of it, and feel some sense of pride of ownership in maintaining it. For those purposes, it's not critically important that databank has all the latest revisions to demographics and statistics from the 19th and early 20th century, for instance. (And before anyone takes this as a slam, it's not. My argument is based on revealed preference: if it were critically important, someone would have volunteered to make those updates. It hasn't happened. For most people, the additional value of those fixes isn't worth the price of the effort to compile them.)

                                        So my proposed worldline going forward is this: SABR works out the details and starts publishing data under a dual-license scheme. The community can then evaluate it and decide whether that meets everyone's needs. If it does, great. If it doesn't - and it's hard to imagine that it's possible to meet the needs of every last person - then baseball-databank can continue, in whatever form or way people want to continue it. If SABR's datasets become the de-facto successor to baseball-databank, great; if not, that's great too.

                                        Ted
                                      • jdpeterman
                                        Message 19 of 19 , Apr 11 4:14 PM
                                          I realize that I am not a frequent contributor here, so this is solely my two cents (and may only be worth as much), but I am in agreement with Ted's position. I recall the time years ago when I began my research into the game, and how difficult it was to find data when paying for it was troublesome. Even more troublesome was mining through the arguments (we all remember MLB and its contention that it owned the data, i.e. the facts) about what anyone had the right to use or quote, in either an editorial or a commercial sense. To have the BDB as a completely open database, usable for any purpose, is essential in my eyes, particularly for those starting out in the field. Perhaps today I could find the $ to buy that commercial license, but somebody else may not be able to. And I'm not sure we want to truncate anybody's passion for moving the data and analysis forward, for whatever that is worth. Also, just because somebody would like to use something for commercial purposes does not mean they're necessarily making a significant profit from it. In fact, I'm sure that making little or no profit has been the case for more than a few in the field, and will continue to be.

                                          JD

                                          --- In baseball-databank@yahoogroups.com, Theodore Turocy <drarbiter@...> wrote:
                                          >
                                          >
                                          > On 11 Apr 2011, at 18:06 , F. X. Flinn wrote:
                                          > >
                                          > > Anyway, bottom line is that in an ideal world BDB would be under the SABR umbrella. After all, SABR is owned and operated by baseball researchers. 600 votes over two elections is all you need to control it.
                                          > >
                                          >
                                          > With respect to my colleague F.X., having processed the discussions over the last few weeks, I no longer agree with this way forward.
                                          >
                                          > I continue to assert that SABR ought to be taking the lead in providing high-quality datasets, under a dual-licensing model: something like Creative Commons for non-commercial use, with an option to purchase a license/support further development for those who want to use the data commercially, or want greater support, detail, or speed of delivery.
                                          >
                                          > That said, as has been pointed out a few times now, for at least some people, baseball-databank serves a different purpose - a completely unrestricted dataset, in which someone can volunteer to take over some part of it, and feel some sense of pride of ownership in maintaining it. For those purposes, it's not critically important that databank has all the latest revisions to demographics and statistics from the 19th and early 20th century, for instance. (And before anyone takes this as a slam, it's not. My argument is based on revealed preference: if it were critically important, someone would have volunteered to make those updates. It hasn't happened. For most people, the additional value of those fixes aren't worth the price of the effort to compile them.)
                                          >
                                          > So my proposed worldline going forward is this: SABR works out the details and starts publishing data under a dual-license scheme. The community can then evaluate it and decide whether that meets everyone's needs. If it does, great. If it doesn't - and it's hard to imagine that it's possible to meet the needs of every last person - then baseball-databank can continue, in whatever form or way people want to continue it. If SABR's datasets become the de-facto successor to baseball-databank, great; if not, that's great too.
                                          >
                                          > Ted
                                          >