
557 Re: [baseball-databank] DEFINITION - team IDs and ballclub IDs

  • Sean Forman
    Jul 2, 2002
      tangotiger wrote:
      > Let's continue on the discussion of team IDs and ballclub IDs.
      >
      > For those late in the game, a ballclub id (or franchise id or
      > continuity id) is an id that refers to the same ballclub, regardless
      > of where they play. For example, the Angels from 1961 to today have
      > the same ballclub id (whatever its actual id is), whether they are
      > the LA Angels, California Angels or Anaheim Angels. Same thing with
      > teams moving from NY to LA or SF, etc.
      >
      > I believe that Paul, KJOK, and Sean F were able to agree on every
      > ballclub, post-1900.


      The ballclub ID's are pretty solid.


      > Now, the question is what "team id" does each team get?

      > For example, do you keep the same id or change it under the following
      > conditions:
      > 1 - The team decides to change its nickname, but continue playing in
      > the same park
      > 2 - The team decides to change its "city/state" name, but continue
      > playing in the same park
      > 3 - The team keeps the same name, but changes park within the same
      > city
      > 4 - The team keeps the same name, but changes park into another city
      > altogether
      > 5 - Team changes name, and changes park, but inside the same city
      > 6 - The team changes name and changes park into another city
      > 7 - Team changes league
      > 8 - Team changes division
      >
      > Not sure if there's anything else....
      >
      > From my standpoint, I think it's easiest to give a new team id if
      > there is a change only in team city name or team nickname, meaning in:
      > 1,2,5,6


      To be honest, I'm not a big fan of changing the ID because of a nickname
      change.

      I personally would remove the teammaster table and add city name and
      nickname to the teams table. Think of the teams and players as
      parallel data sets: the team ID's don't correspond to the lahman_ID's,
      but the team, lg, year ID's do. We have a name for each player, so we
      should have a name (city and nick, or combined) for each
      team, year, lg combination.

      If you did that, searching for the team leaders of the Colt .45's would
      involve

      SELECT batting.lahman_ID, MAX(batting.HR) AS max_HR
      FROM teams, batting
      WHERE batting.team_ID = teams.team_ID AND batting.year_ID = teams.year_ID
        AND batting.lg_ID = teams.lg_ID AND teams.name = 'Houston Colt .45''s'
      GROUP BY batting.lahman_ID
      ORDER BY max_HR DESC;
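
To make the idea concrete, here is a minimal sketch of that lookup run against an in-memory SQLite database from Python. The table layout is only an assumption based on the columns named above (a `name` column added to `teams`, keyed on team, year, and league), and the batting rows are made-up placeholders, not real player data. The function names `build_demo_db` and `team_hr_leaders` are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with an assumed teams/batting layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE teams (
            team_ID TEXT, year_ID INTEGER, lg_ID TEXT, name TEXT,
            PRIMARY KEY (team_ID, year_ID, lg_ID)
        );
        CREATE TABLE batting (
            lahman_ID TEXT, team_ID TEXT, year_ID INTEGER, lg_ID TEXT,
            HR INTEGER
        );
        """
    )
    # One team ID, two different names depending on the season.
    conn.executemany(
        "INSERT INTO teams VALUES (?, ?, ?, ?)",
        [
            ("HOU", 1962, "NL", "Houston Colt .45's"),
            ("HOU", 1965, "NL", "Houston Astros"),
        ],
    )
    # Placeholder players and home run totals, for illustration only.
    conn.executemany(
        "INSERT INTO batting VALUES (?, ?, ?, ?, ?)",
        [
            ("player_a", "HOU", 1962, "NL", 10),
            ("player_b", "HOU", 1962, "NL", 20),
            ("player_a", "HOU", 1965, "NL", 30),
        ],
    )
    return conn


def team_hr_leaders(conn, team_name):
    """Return (lahman_ID, best single-season HR) for one team name, best first."""
    query = """
        SELECT b.lahman_ID, MAX(b.HR) AS max_hr
        FROM teams AS t
        JOIN batting AS b
          ON b.team_ID = t.team_ID
         AND b.year_ID = t.year_ID
         AND b.lg_ID = t.lg_ID
        WHERE t.name = ?
        GROUP BY b.lahman_ID
        ORDER BY max_hr DESC, b.lahman_ID
    """
    # A bound parameter avoids having to escape the apostrophe in the name.
    return conn.execute(query, (team_name,)).fetchall()
```

Calling `team_hr_leaders(build_demo_db(), "Houston Colt .45's")` only sees the 1962 rows, because the name lives on the team, year, lg row rather than on the team ID alone.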


      > After all, the "ballclub id" that we'll be introducing will have the
      > continuity we need to determine "ballclub/franchise" records, while
      > making the team id in any year very clear as to which team we are
      > talking about.
      >
      > The team id should be unique on its own, so that if the Brooklyn
      > Dodgers start operations next year, their team id will be different
      > from that of 50 years ago.
      >
      > Thanks, Tom

      I like the 3-letter abbrevs for team ID's, but that is probably just an
      emotional attachment.

      We could do something like a 3-1 format: three letters plus one more
      character, with the last character being blank for modern franchises
      and a numeral for older franchises.

      Getting rid of the numbers will be difficult given there are like three
      different Philadelphia Athletics franchises.

      PHA, PHA1, PHA2
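
A rough sketch of how such IDs could be built and taken apart, assuming the format is exactly three uppercase letters followed by an optional single digit, with a trailing blank treated the same as no digit. The helper names `make_team_id` and `split_team_id` are hypothetical, and nothing here settles which franchise would get which numeral.

```python
import re

# Assumed format: three uppercase letters, then an optional single digit.
_ABBREV = re.compile(r"[A-Z]{3}")
_TEAM_ID = re.compile(r"([A-Z]{3})([0-9]?)")


def make_team_id(abbrev, franchise_no=None):
    """Build a team ID such as 'PHA' (no numeral) or 'PHA1' (with numeral)."""
    if not _ABBREV.fullmatch(abbrev):
        raise ValueError(f"expected three uppercase letters, got {abbrev!r}")
    if franchise_no is None:
        return abbrev
    if franchise_no not in range(10):
        raise ValueError(f"expected a single digit, got {franchise_no!r}")
    return f"{abbrev}{franchise_no}"


def split_team_id(team_id):
    """Split a team ID into (abbreviation, numeral or None)."""
    # A blank fourth character is treated as "no numeral".
    match = _TEAM_ID.fullmatch(team_id.rstrip(" "))
    if match is None:
        raise ValueError(f"not a valid team ID: {team_id!r}")
    abbrev, digit = match.groups()
    return abbrev, (int(digit) if digit else None)
```

With this, `split_team_id("PHA2")` gives `("PHA", 2)`, and both `"PHA"` and `"PHA "` give `("PHA", None)`.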

      How should one represent the New York Franchises?

      In these cases I tend to think clarity is the most important
      consideration, along with making the year, team combination unique in
      all cases.
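
One way to check that property is to count how often each year, team pair occurs and report any pair that shows up more than once. This is only a sketch: the row layout `(year, team_id, lg_id)` and the function name `duplicate_year_team_keys` are assumptions, and the IDs in the usage note are placeholders rather than real teams.

```python
from collections import Counter


def duplicate_year_team_keys(rows):
    """Return the sorted (year, team_id) pairs that occur more than once.

    Each row is assumed to be a (year, team_id, lg_id) tuple. The league is
    deliberately ignored, since the goal is for year plus team ID to be
    unique without needing the league to tell teams apart.
    """
    counts = Counter((year, team_id) for year, team_id, _lg_id in rows)
    return sorted(key for key, n in counts.items() if n > 1)
```

For example, `duplicate_year_team_keys([(1900, "AAA", "L1"), (1900, "AAA", "L2")])` reports `[(1900, "AAA")]`, which is the kind of clash a numeral suffix would be meant to resolve.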

      I also agree that we may need to add a level designation as new data is
      added.

      later,
      sean

      Baseball Stats! http://www.Baseball-Reference.com/
      Baseball Analysis! http://www.BaseballPrimer.com/