
2122 Re: [baseball-databank] Re: Means of Updating and Correcting BDB Data

  • Sean Forman
    May 1, 2004
      Derek Adair wrote:
      > On Fri, 30 Apr 2004, tmasc@... wrote:
      > >
      > > --- Derek Adair <dadair@...> wrote:
      > > > I like this table, and I like the sources table. I
      > > > think it makes sense to
      > > > have an admin table as well. Generally that will be
      > > > the person who
      > > > committed the file, but I'd rather keep the data in
      > >
      > > I was considering it, but then figured that CVS should
      > > have the person who checked in that version of the
      > > file.
      > This might be something you'd want to display elsewhere, though - release
      > notes, for example.
      > > > DB format than depend
      > > > on CVS to track it. On the other hand, I'd leave CVS
      > > > version out, since
      > > > CVS will be much better at tracking revision numbers
      > >
      > > I agree. That field I have would come *from* CVS.
      > > This DATACHANGE table, through the CVSversion field,
      > > will map directly to CVS. Hence, no need to have the
      > > admin field in DATACHANGE.
      > This is possible, but a couple of things to keep in mind here. First,
      > versions are file-based. So the version number you'd want is the one
      > associated with the table modified. Second, version numbers aren't added
      > until a file is committed. So you'll need to run an update on that file
      > after making the changes but before committing, check the version number,
      > add one, make the change to the changes table and commit. Not saying it's
      > a bad idea at all, just pointing out there's some overhead.
      > Regards,
      > Derek

      This is starting to look a bit onerous.

      The current conversation is about the mechanics of how to make the
      changes and how to store the changes, which is probably the first thing
      that needs to be discussed.

      I currently have a CVS checkout of the BDB on my computer. I make
      alterations in MySQL, dump the tables out in text format, and then
      run cvs commit to log the changes. Each commit takes a few minutes,
      but it works easily. I think it would be doable to write a script
      that works the other way and loads the db from the tables checked
      out from CVS. Again, the only issue I see would be speed. I also don't
      know how easy it would be to do this sort of thing in Windows. In
      Linux, I can write one script that would update my text files with the
      latest version of the db and commit those text files to CVS, then
      another script to reverse the process. As part of this, we could add a
      CHANGES file to the CVS repository, where it would be incumbent upon
      the updater to explicitly state the changes made to the files.
      Naturally, the cvs log messages would need to be explicit as well.
      We'd probably develop a standard for this.
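
The two scripts described above could be sketched roughly as follows. This is only an illustration under assumptions: the database name, working directory, and function names are hypothetical, and it relies on `mysqldump --tab` (which writes one tab-delimited `.txt` file per table and needs the MySQL server to have write access to that directory) and on `mysqlimport`, which loads each text file into the table named after the file.

```python
import subprocess


def dump_commands(db, workdir, message):
    """Commands for the forward direction: MySQL tables -> text files -> CVS."""
    return [
        # --tab writes <table>.sql (schema) and <table>.txt (data) into workdir
        ["mysqldump", f"--tab={workdir}", db],
        # the commit message should state the change explicitly
        ["cvs", "commit", "-m", message],
    ]


def load_commands(db, table_files):
    """Commands for the reverse direction: CVS checkout -> MySQL tables."""
    return [
        # bring the working copy up to date first
        ["cvs", "update"],
        # --delete empties each table before loading; --local reads client-side files
        ["mysqlimport", "--local", "--delete", db, *table_files],
    ]


def run_all(commands, cwd):
    """Run each command in the CVS working directory, stopping on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)
```

Nothing runs until `run_all` is called, e.g. `run_all(dump_commands("bdb", "/home/me/bdb", "Correct McCarton birthplace"), cwd="/home/me/bdb")`, where the path and database name are placeholders.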

      Also, MySQL does have good update logging tools, so one possibility
      would be to have all changes go through MySQL, then dump the tables
      and cvs commit both the update logs and the tables in order to
      properly capture the changes. It is possible to use comments in MySQL,
      but I don't know whether they are properly recorded in the log files.

      I think the main things we want are to track who is doing what and
      what changes have been made, and to be able to rewind easily when a
      disastrous mistake is made. Are there other data projects that use
      CVS? MySQL? A combination?

      I like Paul's idea of framing this using real data.

      Here is an e-mail I got yesterday. Ignoring for the moment that we
      would probably wait for Bill Carle's approval before making this
      correction, what would be the steps for making this change? Be explicit
      in describing the steps.

      -------- Original Message --------
      Subject: BR-A Suggestion
      Date: Fri, 30 Apr 2004 01:02:08 -0400
      From: Michael Timmons
      To: Baseball-Reference Comment <feedback@...>

      This comment regards:

      Frank McCarton was my great-grandfather. According to the 1880 US
      Census, he was born in New York City, not Middletown, CT. Also, in the
      1920 & 1930 US Census, his daughter Sarah (my grandmother) stated her
      father was born in New York. I just like to see the record
      Michael Timmons
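
One possible shape for the database side of such a correction is sketched below: update the field and record who changed what, from which source, in a single transaction, keeping the old value so the change can be rewound. Everything here is an assumption for illustration: the `master` and `datachange` table layouts, the player ID `mccarfr01`, and the editor name are hypothetical, and SQLite stands in for MySQL so the example is self-contained.

```python
import sqlite3

ALLOWED_COLUMNS = {"birthCity", "birthState"}  # hypothetical editable fields


def apply_correction(conn, player_id, column, new_value, source, editor):
    """Update one field and log the old and new values in the same transaction."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    with conn:  # commits on success, rolls back if anything below raises
        row = conn.execute(
            f"SELECT {column} FROM master WHERE playerID = ?", (player_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no such player: {player_id}")
        old_value = row[0]
        conn.execute(
            f"UPDATE master SET {column} = ? WHERE playerID = ?",
            (new_value, player_id),
        )
        conn.execute(
            "INSERT INTO datachange (playerID, field, oldValue, newValue, source, editor)"
            " VALUES (?, ?, ?, ?, ?, ?)",
            (player_id, column, old_value, new_value, source, editor),
        )
    return old_value


def demo_db():
    """Build a tiny in-memory database with one hypothetical player row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE master (playerID TEXT PRIMARY KEY, nameFirst TEXT,
                             nameLast TEXT, birthCity TEXT, birthState TEXT);
        CREATE TABLE datachange (playerID TEXT, field TEXT, oldValue TEXT,
                                 newValue TEXT, source TEXT, editor TEXT);
        """
    )
    conn.execute(
        "INSERT INTO master VALUES ('mccarfr01', 'Frank', 'McCarton', 'Middletown', 'CT')"
    )
    conn.commit()
    return conn
```

With this layout, the correction above would be two calls (one for `birthCity`, one for `birthState`), followed by the table dump and `cvs commit`. Rewinding means applying the logged `oldValue` back to the same field.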

      Sean Forman

      Baseball Stats! http://www.Baseball-Reference.com/
      Baseball Analysis! http://www.BaseballPrimer.com/