Loading ...
Sorry, an error occurred while loading the content.

MySQL SQL syntax help

Expand Messages
  • Derek Adair
    I m looking for an MySQL-compatible SQL query that will, given a year range and a stat, provide me with either (ideally) a major league average per year for
    Message 1 of 6 , Jun 14, 2004
    • 0 Attachment
      I'm looking for an MySQL-compatible SQL query that will, given a year
      range and a stat, provide me with either (ideally) a major league average
      per year for that stat or a result set I can calculate that average from
      (PlayerID, Year, Stat). I'm running into a few problems. The average needs
      to be calculated after a grouping by PlayerID, then Year, since I don't
      want a player with two or more stints getting a lower total averaged in.
      I'm not sure how to group by two variables, and I don't know how two
      layers of aggregation can be accomplished (SUM, then AVG). Subqueries
      would seem to help a bit, but I can't see how to get all the way there,
      even if I upgrade to 4.1 alpha.

      Any thoughts?

      Regards,
      Derek
    • Tangotiger
      Derek, I m having a heckuva time following the request here (though, the A/C is out here, and I can barely focus). If you send me an email with your contact
      Message 2 of 6 , Jun 14, 2004
      • 0 Attachment
        Derek,

        I'm having a heckuva time following the request here
        (though, the A/C is out here, and I can barely focus).

        If you send me an email with your contact number, I
        can call you, and I'm sure we can figure it out.

        (Unless Michael M has already figured it out.)

        Tom


        --- Derek Adair <dadair@...> wrote:
        > I'm looking for an MySQL-compatible SQL query that
        > will, given a year
        > range and a stat, provide me with either (ideally) a
        > major league average
        > per year for that stat or a result set I can
        > calculate that average from
        > (PlayerID, Year, Stat). I'm running into a few
        > problems. The average needs
        > to be calculated after a grouping by PlayerID, then
        > Year, since I don't
        > want a player with two or more stints getting a
        > lower total averaged in.
        > I'm not sure how to group by two variables, and I
        > don't know how two
        > layers of aggregation can be accomplished (SUM, then
        > AVG). Subqueries
        > would seem to help a bit, but I can't see how to get
        > all the way there,
        > even if I upgrade to 4.1 alpha.
        >
        > Any thoughts?
        >
        > Regards,
        > Derek
        >





        __________________________________
        Do you Yahoo!?
        Friends. Fun. Try the all-new Yahoo! Messenger.
        http://messenger.yahoo.com/
      • David Kent
        Why not just create a view containing a players yearly counting stats and run another query against that. For example: create or replace view batting_yearly
        Message 3 of 6 , Jun 14, 2004
        • 0 Attachment
          Why not just create a view containing a players yearly counting stats
          and run another query against that. For example:

          create or replace view batting_yearly as
          select playerID,
          yearID,
          sum(G) G,
          sum(AB) AB,
          sum(H) H,
          .
          .
          .
          from batting
          group by playerID,
          yearID;

          select yearID,
          avg(&stat_to_be_averaged)
          from batting_yearly
          where yearID between &start_year and &end_year
          group by yearID;

          I use MS Access, Oracle MS and Sybase SQL Server so the syntax may be a
          little different for MySQL.

          ..Dave



          On Mon, 14 Jun 2004 12:50:00 -0700 (PDT), "Tangotiger"
          <tangotiger@...> said:
          >
          > Derek,
          > I'm having a heckuva time following the request here
          > (though, the A/C is out here, and I can barely focus).
          > If you send me an email with your contact number, I
          > can call you, and I'm sure we can figure it out.
          > (Unless Michael M has already figured it out.)
          > Tom
          > --- Derek Adair <dadair@...> wrote:
          > > I'm looking for an MySQL-compatible SQL query that
          > > will, given a year
          > > range and a stat, provide me with either (ideally) a
          > > major league average
          > > per year for that stat or a result set I can
          > > calculate that average from
          > > (PlayerID, Year, Stat). I'm running into a few
          > > problems. The average needs
          > > to be calculated after a grouping by PlayerID, then
          > > Year, since I don't
          > > want a player with two or more stints getting a
          > > lower total averaged in.
          > > I'm not sure how to group by two variables, and I
          > > don't know how two
          > > layers of aggregation can be accomplished (SUM, then
          > > AVG). Subqueries
          > > would seem to help a bit, but I can't see how to get
          > > all the way there,
          > > even if I upgrade to 4.1 alpha.
          > >
          > > Any thoughts?
          > >
          > > Regards,
          > > Derek
          > >
          >
          >
          > __________________________________
          > Do you Yahoo!?
          > Friends. Fun. Try the all-new Yahoo! Messenger.
          > [1]http://messenger.yahoo.com/
          > [2]http://www.baseball-databank.org/
          >
          > Yahoo! Groups Sponsor
          > ADVERTISEMENT
          > [3]click here
          > ___________________________________________________________
          >
          > Yahoo! Groups Links
          > * To visit your group on the web, go to:
          > [4]http://groups.yahoo.com/group/baseball-databank/
          >
          > * To unsubscribe from this group, send an email to:
          > [5]baseball-databank-unsubscribe@yahoogroups.com
          >
          > * Your use of Yahoo! Groups is subject to the [6]Yahoo! Terms
          > of Service.
          >
          > References
          >
          > 1. http://messenger.yahoo.com/
          > 2. http://www.baseball-databank.org/
          > 3.
          > http://us.ard.yahoo.com/SIG=129spqo17/M=285832.5035992.6164947.1793602/D=groups/S=1705066645:HM/EXP=1087329003/A=2178650/R=0/SIG=14drisgb3/*http://www.householdfinance.com/ln/TrackingServlet?cmd_MediaCode=&fc=APS&mkt=000&mc=01PSYAYA004001B220000U0300L0030000000000&dest=HOME_PAGE
          > 4. http://groups.yahoo.com/group/baseball-databank/
          > 5.
          > mailto:baseball-databank-unsubscribe@yahoogroups.com?subject=Unsubscribe
          > 6. http://docs.yahoo.com/info/terms/
        • Michael Westbay
          ... Hash: SHA1 ... OK, first we need some raw data: PID Year Part Games Hits ABs 150 1994 1 25 7 44 150 1995 1 82 40 200 150 1996 1 101 60 286 150 1997 1 81 37
          Message 4 of 6 , Jun 14, 2004
          • 0 Attachment
            -----BEGIN PGP SIGNED MESSAGE-----
            Hash: SHA1

            Adair-san wrote:

            | I'm looking for an MySQL-compatible SQL query that will, given a year
            | range and a stat, provide me with either (ideally) a major league average
            | per year for that stat or a result set I can calculate that average from
            | (PlayerID, Year, Stat).

            OK, first we need some raw data:

            PID Year Part Games Hits ABs
            150 1994 1 25 7 44
            150 1995 1 82 40 200
            150 1996 1 101 60 286
            150 1997 1 81 37 188
            150 1998 1 60 24 118
            150 1999 1 52 4 42
            150 2000 1 19 1 11
            150 2001 1 31 12 60
            150 2002 1 4 0 1
            150 2002 2 35 20 87
            150 2003 1 10 2 10
            995 1997 1 10 3 16
            995 1998 1 107 80 301
            995 1999 1 36 9 53
            995 2000 1 92 71 221
            995 2001 1 69 14 66
            995 2002 1 31 5 38
            995 2002 2 37 26 101
            995 2003 1 60 19 88

            Note: Part is Stint for my local database, which is MySQL 3.xx.

            Now, if I'm reading this right, you want something like the average hits
            per year for each player:

            select PlayerID, count(year), sum(Hits), avg(Hits),
            sum(Hits)/(count(year)-1)
            from Batting
            where PlayerID in (150,995)
            group by PlayerID
            order by PlayerID

            PID Years Hits Avg RAvg
            150 11 207.0 18.8182 20.7
            995 8 227.0 28.375 32.43

            As you can see by manually checking the original data, the RAvg is the
            correct value, removing the year that the player had data for two
            stints. I believe that this is the problem you're trying to deal with,
            correct?

            Essentially, what you need to do is create a temporary view of the data
            with the stint (Part field in this case) removed.

            create temporary table tmpBatting
            select PlayerID, Year, sum(Games) as Games, sum(Hits) as Hits,
            sum(AtBats) as AtBats
            from Batting
            where PlayerID in (150, 995)
            group by PlayerID, Year

            Now confirm the data:

            PID Year Part Games Hits ABs
            150 1994 25.0 7.0 44.0
            150 1995 82.0 40.0 200.0
            150 1996 101.0 60.0 286.0
            150 1997 81.0 37.0 188.0
            150 1998 60.0 24.0 118.0
            150 1999 52.0 4.0 42.0
            150 2000 19.0 1.0 11.0
            150 2001 31.0 12.0 60.0
            150 2002 39.0 20.0 88.0
            150 2003 10.0 2.0 10.0
            995 1997 10.0 3.0 16.0
            995 1998 107.0 80.0 301.0
            995 1999 36.0 9.0 53.0
            995 2000 92.0 71.0 221.0
            995 2001 69.0 14.0 66.0
            995 2002 68.0 31.0 139.0

            As you can see, the 2002 data for each player has been merged properly.
            ~ Now run the average query against this table (without the final field):

            select PlayerID, count(year), sum(Hits), avg(Hits)
            from tmpBatting
            where PlayerID in (150,995)
            group by PlayerID
            order by PlayerID

            PID Years Hits Avg
            150 10 207.0 20.7
            995 7 227.0 32.4286

            The year count is now correct, as is the average per year.

            You can now drop tmpBatting, or when you log off, it will be dropped
            automatically.

            Now, for specific year ranges, modify the WHERE statement accordingly
            (i.e. "where Year between 1990 and 1999").

            To automate this with a bash script will now be pretty trivial. The
            only thing you have to be careful of is to make this ONE msyql command
            line statement with ";" separating the temporary table creation and the
            selection. Use $1 etc. to specify the field you want to use, limiting
            the size of the temporary table.

            If you're doing this often, make the view a little more permanent,
            perhaps with a cron job updating the view nightly, or a bash script to
            update it manually after new data was input.

            I think that MySQL 4.1 will finally support true views, but I don't know
            for sure. I leave that for you to check.

            Hope this helps.

            - --
            Michael Westbay
            Writer/System Administrator
            http://JapaneseBaseball.com
            Public Key: http://www.japanesebaseball.com/keys/westbaystars.gpgkey
            -----BEGIN PGP SIGNATURE-----
            Version: GnuPG v1.2.1 (Darwin)
            Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

            iD8DBQFAzk8qEGkvdz3fJlcRApsFAKCAqPT8FCJHMgN0GEj99Wj9CDG9IQCcD1Lk
            xBpQlvmIb/YaiTVTMhQ8Ubk=
            =TOEq
            -----END PGP SIGNATURE-----
          • Michael Westbay
            ... Hash: SHA1 ... Views are not yet implemented in MySQL. Transactions, foreign keys, and views were features that MySQL had traditionally stayed away from
            Message 5 of 6 , Jun 15, 2004
            • 0 Attachment
              -----BEGIN PGP SIGNED MESSAGE-----
              Hash: SHA1

              David Kent wrote:

              | Why not just create a view containing a players yearly counting stats
              | and run another query against that. For example:
              |
              | create or replace view batting_yearly as

              Views are not yet implemented in MySQL. Transactions, foreign keys, and
              views were features that MySQL had traditionally stayed away from for
              performance reasons. The target audience was developers who had full
              control over what went into and came out of the database, therefore
              these features weren't considered very important.

              MySQL's popularity, however, has seen a large influx of former Oracle,
              Sybase, etc. users who expect these features. For that reason, they're
              slowly getting implemented.

              For views, here's what the manual currently states:

              ~ Views currently are being implemented, and will shortly
              ~ appear in the 5.0 version of MySQL Server. Unnamed views
              ~ (derived tables, a subquery in the FROM clause of a SELECT)
              ~ are already implemented in version 4.1. Unqualified union
              ~ views (SELECT ... UNION SELECT ... are also available through
              ~ the MERGE table feature. See section 15.2 The MERGE Storage
              ~ Engine.

              The full man page reference is here:

              http://dev.mysql.com/doc/mysql/en/ANSI_diff_Views.html

              Hope this helps.

              - --
              Michael Westbay
              Writer/System Administrator
              http://JapaneseBaseball.com
              Public Key: http://www.japanesebaseball.com/keys/westbaystars.gpgkey
              -----BEGIN PGP SIGNATURE-----
              Version: GnuPG v1.2.1 (Darwin)
              Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

              iD8DBQFAzs3VEGkvdz3fJlcRAhTkAKDRBMd+vMbVrfIv3XC5BSik2Gaz9wCfW5Jr
              uH8GJFkVqzGq4JpUDyaxg1k=
              =XwIe
              -----END PGP SIGNATURE-----
            • Michael Mavrogiannis
              Re: MySQL SQL syntax help On Mon Jun 14, 2004, Derek Adair asked for ... The following worked on my planet. I believe that on planet MySQL, the IIF() function
              Message 6 of 6 , Jun 17, 2004
              • 0 Attachment
                Re: MySQL SQL syntax help

                On Mon Jun 14, 2004, Derek Adair asked for
                >>an MySQL-compatible SQL query that will,
                >>given a year range and a stat, provide
                >>me with either (ideally) a major league
                >>average per year for that stat [...]
                >>I don't want a player with two or more
                >>stints getting a lower total averaged in.

                The following worked on my planet. I believe that on planet MySQL,
                the IIF() function is spelled IF().

                SELECT first(yearID) as Season,
                sum(H) as [Total Hits],
                sum(iif(stint=1,1,0)) as Players,
                sum(H)/sum(iif(stint=1,1,0)) as [Average Hits]
                FROM batting
                WHERE yearID between 1872 and 2002
                GROUP BY yearID;
              Your message has been successfully submitted and would be delivered to recipients shortly.