Re: Proposal: Convert current db back to solely numeric keys

  • tjruane
    Message 1 of 8 , Jan 30, 2006
      Sean Forman wrote:

      > I'm not sure about your Brady example Tom. I see only one Brady
      > in your DB and none in mine.
      >
      > Even so, we have pretty close to a unique identifier and that
      > is good enough for what we are doing. Without a doubt someone
      > would have to come along and clean up any redundant entries.

      In the new edition of SABR's Biographical data there are two Bradys,
      about whom no other information is known. One played for Chicago in
      1875 and the other umpired in the National League in 1877. There are
      also two Jones (one played for 1884 WAS a and the other 1885 NY a),
      two Martin Malones, William McLaughlins, John Ryans, Smiths, Sullivans
      and two John Wards. The fact that no other information is known about
      these people doesn't alter the fact that from the perspective of the
      DB, they have the same birthdate and first and last names.
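
To make the collision concrete, here is a minimal sketch in Python. The rows are illustrative stand-ins built from the cases named above, not actual database records; the record ids, the `people` list, and the `find_key_collisions` helper are all hypothetical, and `None` stands for an unknown value.

```python
from collections import defaultdict

# Illustrative rows only, not actual database records:
# (record_id, last_name, first_name, birthdate). None means "unknown".
people = [
    (1, "Brady", None, None),           # played for Chicago in 1875
    (2, "Brady", None, None),           # umpired in the National League in 1877
    (3, "Ward", "John", None),
    (4, "Ward", "John", None),
    (5, "Doe", "Jane", "1850-01-01"),   # hypothetical person with a known birthdate
]

def find_key_collisions(rows):
    """Group record ids by (last, first, birthdate); return keys shared by more than one record."""
    groups = defaultdict(list)
    for record_id, last, first, birthdate in rows:
        groups[(last, first, birthdate)].append(record_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

collisions = find_key_collisions(people)
for key, ids in collisions.items():
    print(key, "->", ids)
```

Any key that appears in `collisions` cannot identify a single person, which is why the combination fails as a unique key even when the missing data is the only reason for the clash.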

      I guess I misunderstood what was being proposed here. I thought that
      this combination of data items was being proposed as a unique key.
      Since "pretty close to a unique identifier" is "good enough for what
      we are doing", something else entirely must be being done with these.
      Most of the time, "pretty close" is not "good enough".

      Tom Ruane
    • Paul Wendt
      Message 2 of 8 , Feb 3, 2006
        baseball-databank article number 3000 is another informative one by
        Tom Ruane.
        >>
        In the new edition of SABR's Biographical data there are two Bradys,
        about whom no other information is known. One played for Chicago in
        1875 and the other umpired in the National League in 1877. There are
        also two Jones (one played for 1884 WAS a and the other 1885 NY a),
        two Martin Malones, William McLaughlins, John Ryans, Smiths, Sullivans
        and two John Wards. The fact that no other information is known about
        these people doesn't alter the fact that from the perspective of the
        DB, they have the same birthdate and first and last names.
        <<

        Right.

        > I guess I misunderstood what was being proposed here.

        KJOK observed that the cost of maintaining or developing the data
        without meaningful person IDs such as the Lahman 5-2-2 is mitigated by
        the fact that people can use a combination such as Last First DOB. I
        am not sure in what contexts, on what scale. One special context,
        almost always tiny or small in scale, is the "handwritten" note to
        this egroup. We use the IDs frequently here.
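
For readers unfamiliar with the format, the following Python sketch shows one way a 5-2-2 style ID could be built, assuming it means up to five letters of the last name, up to two letters of the first name, and a two-digit sequence number that separates people who share the same stem. The `make_ids` and `_letters` helpers are hypothetical, and the real assignment rules may differ in details such as ordering.

```python
from collections import Counter

def _letters(text):
    """Lowercase the text and keep only alphabetic characters."""
    return "".join(ch for ch in text.lower() if ch.isalpha())

def make_ids(names):
    """Assign a 5-2-2 style ID to each (last, first) pair, in input order.

    Assumption: the trailing two digits simply count how many people
    with the same 5+2 letter stem have been seen so far.
    """
    seen = Counter()
    ids = []
    for last, first in names:
        stem = _letters(last)[:5] + _letters(first)[:2]
        seen[stem] += 1
        ids.append(f"{stem}{seen[stem]:02d}")
    return ids

ids = make_ids([("Ward", "John"), ("Ward", "John"), ("Malone", "Martin")])
print(ids)
```

The sequence number is what lets two John Wards coexist, something the Last First DOB combination cannot do on its own.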

        Perhaps more to the point, we commonly quote entire or truncated
        records from one of the database tables. Certainly the data
        maintenance function of this egroup works as well as it does partly
        because almost everyone here accesses bbdb tables directly.

        Paul Wendt