
Re: [agileDatabases] Efficiency of Count queries?

  • Clifford Heath
    Message 1 of 17 , Dec 21, 2008
      On 22/12/2008, at 9:20 AM, Joseph Fall wrote:
      > I've got a real puzzler here: two COUNT queries that do exactly the
      > same thing, but with orders of magnitude difference in efficiency.

      You're using a database system with a crap optimiser. Probably MySQL,
      which is known to be back in about 1985 when it comes to optimizer
      technology. Update to a decent DBMS like PostgreSQL or any of the
      commercial products, and you'll find you aren't playing with a toy any
      more, and both these queries work as well as each other.

      Clifford Heath.
    • Gabriel Tanase
      Message 2 of 17 , Dec 22, 2008
        Hello,

        Whichever DBMS this is, I think we can guess that the table
        ORGANIZATIONS has (many) fewer rows than the FILES and CREDITS
        tables.
        The first query joins ORGANIZATIONS *twice* with the (filtered) join of
        FILES and CREDITS.
        If we assume that the number of rows in the filtered join of FILES and
        CREDITS is N, it is likely that the full intermediate result set of
        the double join has a size of the order of N squared (N^2).
        This N^2 result set must then be sorted so that the count(distinct ...)
        can be obtained (I am assuming here that the DBMS uses some form of
        sorting to do count(distinct ...)).
        I believe that the main driver of the (lack of) performance for the first
        query is the sorting of the large intermediate result set (N^2).

        In the second query, the two intermediate result sets from joining FILES and
        CREDITS are each sorted immediately after (or simultaneously with) the join,
        so that the count(distinct ...) results are obtained. Therefore the DBMS
        must do two sorts over result sets of size N each, so the performance of the
        second query is likely to be proportional to 2*N, rather than N^2.
        The results of the inner queries, once GROUP BY'd, are - I believe - much,
        much smaller, i.e. each would have a number of rows at most equal to the
        number of rows in ORGANIZATIONS, say O. Therefore the two left joins between
        ORGANIZATIONS (O rows) and two intermediate results of at most O rows each,
        considering that O is much smaller than N, are not likely to add
        significantly to the time taken by the query to execute.

        I believe that, basically, the performance difference is between 2*N and
        N^2, where N is the size of the result set of the join between FILES and
        CREDITS.
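
The size argument above can be checked on a toy database. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than the poster's actual DBMS, the table layout is guessed from the column names in the queries, every organization is given the same number of files per category, and the helper names (build_db, big_join_rows, per_category_rows) are made up for illustration. It counts the rows each approach has to process before any counting happens. Note that the blow-up is per organization: an organization with n1 files in category 1 and n2 files in category 2 contributes n1 * n2 rows to the one-big-join form, but only n1 + n2 rows to the two separate joins.

```python
import sqlite3

def build_db(orgs, files_per_category):
    """Toy in-memory database: every org gets the same number of files
    in category 1 and in category 2 (schema guessed from the queries)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE organizations (orgid INTEGER PRIMARY KEY);
        CREATE TABLE credits (creditid INTEGER PRIMARY KEY, category INTEGER);
        CREATE TABLE files (fileid INTEGER PRIMARY KEY, orgid INTEGER, creditid INTEGER);
        INSERT INTO credits VALUES (1, 1), (2, 2);
    """)
    fileid = 0
    for orgid in range(1, orgs + 1):
        conn.execute("INSERT INTO organizations VALUES (?)", (orgid,))
        for creditid in (1, 2):  # credit 1 is category 1, credit 2 is category 2
            for _ in range(files_per_category):
                fileid += 1
                conn.execute("INSERT INTO files VALUES (?, ?, ?)",
                             (fileid, orgid, creditid))
    return conn

# The filtered FILES/CREDITS join for one category, written as a derived table.
CATEGORY_JOIN = """
    (SELECT f.orgid, f.fileid
       FROM files f INNER JOIN credits c
         ON c.creditid = f.creditid AND c.category = {cat})
"""

def big_join_rows(conn):
    """Rows in the intermediate result of the first query, before GROUP BY."""
    sql = f"""
        SELECT COUNT(*)
          FROM organizations o
          LEFT JOIN {CATEGORY_JOIN.format(cat=1)} f1 ON o.orgid = f1.orgid
          LEFT JOIN {CATEGORY_JOIN.format(cat=2)} f2 ON o.orgid = f2.orgid
    """
    return conn.execute(sql).fetchone()[0]

def per_category_rows(conn):
    """Total rows the second query aggregates: one filtered join per category."""
    total = 0
    for cat in (1, 2):
        sql = f"SELECT COUNT(*) FROM {CATEGORY_JOIN.format(cat=cat)}"
        total += conn.execute(sql).fetchone()[0]
    return total

for n in (10, 100):
    conn = build_db(orgs=3, files_per_category=n)
    print(n, big_join_rows(conn), per_category_rows(conn))
# prints:
# 10 300 60
# 100 30000 600
```

Multiplying the files per category by 10 multiplies the one-big-join intermediate result by 100, while the per-category joins grow only by 10. Each extra category joined the same way would add another multiplicative factor, which is consistent with the original poster's report that the gap widens with four more categories.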


        HTH,
        Gabriel



        2008/12/21 Joseph Fall <jfall@...>

        > Hello group,
        > first off - thanks for all the helpful and interesting items that come
        > across this list.
        > I've got a real puzzler here: two COUNT queries that do exactly the
        > same thing, but with orders of magnitude difference in efficiency.
        > These queries create a summary of the entire DB - I have pared them
        > down here to their essential elements:
        >
        > The first simply counts results from one big join:
        > SELECT o.orgid,
        >        count(distinct f1.fileid) AS COUNT_CAT1,
        >        count(distinct f2.fileid) AS COUNT_CAT2
        > FROM organizations o
        > LEFT JOIN (files f1 INNER JOIN credits c1
        >            ON c1.creditid = f1.creditid AND c1.category = 1)
        >        ON o.orgid = f1.orgid
        > LEFT JOIN (files f2 INNER JOIN credits c2
        >            ON c2.creditid = f2.creditid AND c2.category = 2)
        >        ON o.orgid = f2.orgid
        > GROUP BY o.orgid;
        >
        > The second uses a nested query to do the count for each category and
        > joins the results:
        > SELECT o.orgid,
        >        COALESCE(f1.count, 0) AS COUNT_CAT1,
        >        COALESCE(f2.count, 0) AS COUNT_CAT2
        > FROM organizations o
        > LEFT JOIN
        >     (SELECT orgid, COUNT(DISTINCT fileid) count
        >      FROM files f INNER JOIN credits c
        >           ON c.creditid = f.creditid AND c.category = 1
        >      GROUP BY orgid) f1
        >     ON o.orgid = f1.orgid
        > LEFT JOIN
        >     (SELECT orgid, COUNT(DISTINCT fileid) count
        >      FROM files f INNER JOIN credits c
        >           ON c.creditid = f.creditid AND c.category = 2
        >      GROUP BY orgid) f2
        >     ON o.orgid = f2.orgid;
        >
        > The second query (as shown) runs an order of magnitude faster. My
        > actual query adds 4 additional join/count categories, and in that
        > case, the second approach is several orders of magnitude faster.
        >
        > I suspect this is because in the first query, the COUNT operations
        > each need to run through the entire result set, or perhaps because in
        > the first case the entire result set (which is big) needs to be
        > assembled in its entirety? If the queries are pared down further to
        > join / count only a single category, both queries perform equally, so
        > either explanation makes sense... but I might be completely off track...
        >
        > I'd really like to understand what is going on here so I can structure
        > more efficient queries in future - thanks in advance for any light you
        > might shed on this mystery.
        >
        > ...Joseph
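
To make the claim that the two queries "do exactly the same thing" concrete, here is a small check on made-up data. It is a sketch, not the poster's setup: it runs on SQLite through Python's sqlite3 module, the schema and sample rows are assumptions, the parenthesized joins of the first query are rewritten as equivalent derived tables for portability, and the subquery alias count is renamed to n to avoid reusing a function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organizations (orgid INTEGER PRIMARY KEY);
    CREATE TABLE credits (creditid INTEGER PRIMARY KEY, category INTEGER);
    CREATE TABLE files (fileid INTEGER PRIMARY KEY, orgid INTEGER, creditid INTEGER);
    INSERT INTO organizations VALUES (1), (2), (3);
    INSERT INTO credits VALUES (10, 1), (20, 2);
    -- org 1: two category-1 files and three category-2 files
    -- org 2: one category-1 file only; org 3: no files at all
    INSERT INTO files VALUES
        (1, 1, 10), (2, 1, 10),
        (3, 1, 20), (4, 1, 20), (5, 1, 20),
        (6, 2, 10);
""")

# First form: join everything, then count distinct file ids per organization.
ONE_BIG_JOIN = """
    SELECT o.orgid,
           COUNT(DISTINCT f1.fileid) AS count_cat1,
           COUNT(DISTINCT f2.fileid) AS count_cat2
      FROM organizations o
      LEFT JOIN (SELECT f.orgid, f.fileid
                   FROM files f INNER JOIN credits c
                     ON c.creditid = f.creditid AND c.category = 1) f1
        ON o.orgid = f1.orgid
      LEFT JOIN (SELECT f.orgid, f.fileid
                   FROM files f INNER JOIN credits c
                     ON c.creditid = f.creditid AND c.category = 2) f2
        ON o.orgid = f2.orgid
     GROUP BY o.orgid
     ORDER BY o.orgid
"""

# Second form: count per category first, then join the small summaries.
PRE_AGGREGATED = """
    SELECT o.orgid,
           COALESCE(f1.n, 0) AS count_cat1,
           COALESCE(f2.n, 0) AS count_cat2
      FROM organizations o
      LEFT JOIN (SELECT f.orgid, COUNT(DISTINCT f.fileid) AS n
                   FROM files f INNER JOIN credits c
                     ON c.creditid = f.creditid AND c.category = 1
                  GROUP BY f.orgid) f1
        ON o.orgid = f1.orgid
      LEFT JOIN (SELECT f.orgid, COUNT(DISTINCT f.fileid) AS n
                   FROM files f INNER JOIN credits c
                     ON c.creditid = f.creditid AND c.category = 2
                  GROUP BY f.orgid) f2
        ON o.orgid = f2.orgid
     ORDER BY o.orgid
"""

rows_big = conn.execute(ONE_BIG_JOIN).fetchall()
rows_agg = conn.execute(PRE_AGGREGATED).fetchall()
print(rows_big)  # [(1, 2, 3), (2, 1, 0), (3, 0, 0)]
print(rows_agg)  # [(1, 2, 3), (2, 1, 0), (3, 0, 0)]
```

The DISTINCT in the first form is what keeps the answer correct: for organization 1 the join produces 2 * 3 = 6 rows, so each file id appears several times. In the second form each file already appears once per category, so the aggregation works on far fewer rows.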
      • Joseph Fall
        Message 3 of 17 , Dec 22, 2008
          Thanks Curt - I'll take that advice.
          We're using MySQL - so I'll ask on their forum.

          We had looked at the EXPLAIN for both queries, but neither turned up
          anything useful. In fact, the slow query makes better use of indexes
          because it does not use any nested queries?

          Sorry for posting an inappropriate question.
          ...Joseph

          On 21-Dec-08, at 5:48 PM, Curt Sampson wrote:

          > On 2008-12-21 14:20 -0800 (Sun), Joseph Fall wrote:
          >
          > > I've got a real puzzler here: two COUNT queries that do exactly the
          > > same thing, but with orders of magnitude difference in efficiency.
          >
          > I think that this really isn't the place for this question. This is
          > going to be specific to the particular DBMS you're using, and the way
          > it optimizes queries. You want to ask about this on a list related to
          > optimization of queries for your particular RDBMS system.
          >
          > Just as a hint, though, you probably want to use a command called
          > EXPLAIN to help see what's going on. However, if you didn't even know
          > about that command, you'll need a lot of background to use it
          > effectively.
          >
          > > I'd really like to understand what is going on here so I can
          > > structure more efficient queries in future - thanks in advance for
          > > any light you might shed on this mystery.
          >
          > If you're serious about this, I recommend Dan Tow's _SQL Tuning_, from
          > O'Reilly:
          >
          > http://www.amazon.com/SQL-Tuning-Dan-Tow/dp/0596005733
          >
          > cjs
          > --
          > Curt Sampson <cjs@...> +81 90 7737 2974
          > Functional programming in all senses of the word:
          > http://www.starling-software.com
          >
          >



        • Joseph Fall
          Message 4 of 17 , Dec 22, 2008
            Thanks for the detailed analysis Gabriel - that confirms my suspicions
            and is very clear.
            Much appreciated.
            ...Joseph
          • Joseph Fall
            Message 5 of 17 , Dec 22, 2008
              Thanks for the advice Clifford.
              Often (as in my case) we don't have much say in which DB our clients /
              employers use - we just have to make do. But thanks for the insights
              about MySQL - I'll keep them in mind when I can make the choice.
              ...Joseph

              On 21-Dec-08, at 10:38 PM, Clifford Heath wrote:

              > Update to a decent DBMS



            • Curt Sampson
              Message 6 of 17 , Dec 22, 2008
                On 2008-12-22 10:54 -0800 (Mon), Joseph Fall wrote:

                > We had looked at the EXPLAIN for both queries, but neither turned up
                > anything useful.

                Well, never having used the MySQL EXPLAIN facility, I can't say whether
                that's because you're not interpreting it properly or just because it's
                as bad as most of the rest of MySQL.

                As an aside that actually is on topic for this group, having the
                customer dictate which tools you use is an agile anti-pattern. When
                the customer does that, and forces you to use poor tools, essentially
                he's saying, "please be less productive." (Which some development shops
                interpret as, "please let me pay you more money," I suppose. :-/)

                > In fact, the slow query makes better use of indexes
                > because it does not use any nested queries?

                Do you mean "makes better use by using more indices" or "makes better
                use by ignoring them and doing more table scans"? Either can be the
                case, depending on the situation.

                If you're not nodding your head right now and saying, "of course,"
                you definitely need to read Dan Tow's _SQL Tuning_. In fact, having a
                developer who understands disk I/O stop developing for two or three days
                and just read that will probably pay itself back within a few weeks.

                As for whether it uses nested queries at the SQL level, that's
                irrelevant, if you have any kind of half-way decent optimizer. It should
                all get turned into a single compiled access pattern of some sort, and
                there's no reason that nested or not on the SQL level should change the
                access pattern.
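
As a rough illustration of reading a plan for index use versus table scans, here is a sketch that uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. This is a stand-in only: MySQL's EXPLAIN is a different command with a different output format, and the schema, the index name files_orgid, and the helper query_plan are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE credits (creditid INTEGER PRIMARY KEY, category INTEGER);
    CREATE TABLE files (fileid INTEGER PRIMARY KEY, orgid INTEGER, creditid INTEGER);
    CREATE INDEX files_orgid ON files (orgid);
""")

# The inner per-category query from the second form of the count query.
QUERY = """
    SELECT f.orgid, COUNT(DISTINCT f.fileid)
      FROM files f INNER JOIN credits c
        ON c.creditid = f.creditid AND c.category = 1
     GROUP BY f.orgid
"""

def query_plan(conn, sql):
    """Return the human-readable detail text of each plan step.
    The detail string is the last column of every EXPLAIN QUERY PLAN row."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan = query_plan(conn, QUERY)
for step in plan:
    # In SQLite's wording, SCAN visits every row of a table or index,
    # while SEARCH looks up only a subset of rows through an index or key.
    print(step)
```

The exact wording of the steps depends on the SQLite version, so no output is shown here. The point is only the habit of checking, for each table in the join, whether the plan walks the whole table or looks rows up through an index.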

                cjs
                --
                Curt Sampson <cjs@...> +81 90 7737 2974
                Functional programming in all senses of the word:
                http://www.starling-software.com
              • Cameron Laird
                Message 7 of 17 , Dec 24, 2008
                  On Tue, Dec 23, 2008 at 10:26:28AM +0900, Curt Sampson wrote:
                  .
                  .
                  .
                  > As an aside that actually is on topic for this group, having the
                  > customer dictate which tools you use is an agile anti-pattern. When
                  > the customer does that, and forces you to use poor tools, essentially
                  > he's saying, "please be less productive." (Which some development shops
                  > interpret as, "please let me pay you more money," I suppose. :-/)
                  .
                  .
                  .
                  ? That gives me the impression that agile is narrower
                  than I otherwise thought. "Use MySQL" (for example)
                  looks to me like a business requirement, just as much
                  as any other business requirement a project might
                  encounter.

                  For these purposes, all my instincts are as a front-line
                  coder; I very much want customers to back off, and
                  let me choose the most personally-rewarding toolset.
                  Are you saying there's more to agility's stance in
                  this matter than self-indulgence?
                • Joseph Fall
                  Message 8 of 17 , Dec 27, 2008
                    Useful discussion, but in fact, in the case that motivated it, I
                    arrived on a project in-progress. While it is always my habit in such
                    cases to back up and re-evaluate requirements, priorities, and
                    technologies before attempting to move forward, switching DB systems
                    mid-way through the project did not seem to make a lot of sense in
                    light of the top priority of my client, which was "we are months
                    behind schedule, we need this release out the door yesterday".
                    Especially given that there were no issues with the DB technology in
                    terms of the project goals.

                    I agree in principle with Curt's view that "having the customer
                    dictate which tools you use is an agile anti-pattern", but I also
                    think there are situations where pragmatism trumps principle. I
                    think this discussion might benefit from a distinction between
                    productivity and tool choice on a new project vs. a legacy system.

                    ...Joseph


                  • Scott Ross
                    Message 9 of 17 , Dec 27, 2008
                      There are valid business concerns that go into choosing technology -
                      Maintenance, infrastructure for instance.

                      Therefore, I'm not sure I agree it's an anti-pattern. It might be for the
                      consultant, but then you have chosen the wrong person to accomplish the
                      job in the first place.




                    • Curt Sampson
                      Message 10 of 17 , Dec 28, 2008
                        On 2008-12-24 17:00 +0000 (Wed), Cameron Laird wrote:

                        > ? That gives me the impression that agile is narrower than I otherwise
                        > thought. "Use MySQL" (for example) looks to me like a business
                        > requirement, just as much as any other business requirement a project
                        > might encounter.

                        It could be, as others have pointed out, that there are good business
                        reasons to want to use a piece of technology, but there's still no
                        question in my mind that the choice between, say, MySQL and PostgreSQL
                        has a very strong technical component. These two products are certainly
                        not at all the same thing, nor interchangeable unless one avoids using
                        their most productivity-enhancing features.

                        > For these purposes, all my instincts are as a front-line coder; I
                        > very much want customers to back off, and let me choose the most
                        > personally-rewarding toolset. Are you saying there's more to
                        > agility's stance in this matter than self-indulgence?

                        Absolutely. In the agile world, it's the development team's
                        responsibility to work in a cost-effective manner, and part of that
                        responsibility means choosing appropriate tools. Every case will need
                        to be judged on its own merits, of course, but it's generally not
                        self-indulgent to convince a customer not to start a new project in COBOL.
                        Why would other technical decisions be different?

                        There is quite likely nothing wrong with wanting to use a personally
                        rewarding toolset, either. It's generally accepted that happier
                        developers are more productive developers, and for most of us being more
                        productive is why we find certain tools more rewarding than others.

                        On 2008-12-27 22:49 -0500 (Sat), Scott Ross wrote:

                        > There are valid business concerns that go into choosing technology -
                        > Maintenance, infrastructure for instance.

                        Indeed. And developers should be taking that into account as well, when
                        they choose tools. I would think a project where the developers are not
                        building a maintenance plan is in trouble.

                        On 2008-12-27 12:29 -0800 (Sat), Joseph Fall wrote:

                        > Useful discussion, but in fact, in the case that motivated it, I
                        > arrived on a project in-progress. While it is always my habit in such
                        > cases to back up and re-evaluate requirements, priorities, and
                        > technologies before attempting to move forward, switching DB systems
                        > mid-way through the project did not seem to make a lot of sense in
                        > light of the top priority of my client, which was "we are months
                        > behind schedule, we need this release out the door yesterday".

                        So that sounds like a reasonable decision. But I'll note that *you* made
                        this decision after a careful evaluation of the client's business goals;
                        thus, it's not the antipattern I was discussing.

                        cjs
                        --
                        Curt Sampson <cjs@...> +81 90 7737 2974
                        Functional programming in all senses of the word:
                        http://www.starling-software.com
                      • Matt
                        Message 11 of 17 , Dec 29, 2008
                          Curt,

                          --- In agileDatabases@yahoogroups.com, Curt Sampson <cjs@...> wrote:
                          >
                          > On 2008-12-22 10:54 -0800 (Mon), Joseph Fall wrote:
                          >
                          > > We had looked at the EXPLAIN for both queries, but neither turned up
                          > > anything useful.
                          >
                          > Well, never having used the MySQL EXPLAIN facility, I can't say
                          > whether that's because you're not interpreting it properly or just
                          > because it's as bad as most of the rest of MySQL.
                          >
                          > As an aside that actually is on topic for this group, having the
                          > customer dictate which tools you use is an agile anti-pattern. When
                          > the customer does that, and forces you to use poor tools, essentially
                          > he's saying, "please be less productive." (Which some development
                          > shops interpret as, "please let me pay you more money," I suppose. :-/)

                          I can think of a number of reasons why a customer *should* be allowed to
                          choose the tools:

                          1) The customer is paying not just for development but also for
                          deployment and maintenance... which could have a different cost
                          structure than initial development.

                          2) The customer probably has a decent knowledge of the cost and
                          availability of the skill sets in the market place. If they have to
                          replace the developer, one technology might make the replacement task
                          easier.

                          3) The customer is less likely to get involved in "religious wars" over
                          whether MySQL is better than PostgreSQL :)

                          Matt
                        • Chris Holmes
                          Message 12 of 17 , Dec 29, 2008
                            > 3) The customer is less likely to get involved in "religious wars"
                            > over whether MySQL is better than PostgreSQL :)
                            >
                            > Matt

                            Not necessarily. If a customer is dictating technology, then they
                            already have an opinion.

                            If anything, the customer is going to be uninformed/ignorant and grab on
                            to whatever the latest tech magazine says is hot, or whatever Microsoft
                            recommends, or whatever they internally believe to be gospel. (Or
                            they're just going to say, "Use MySQL because it's what our one
                            programmer fresh out of college knows how to use" - is that a great way
                            to make decisions or what?)

                            Forcing a technology decision on a development team - a good development
                            team that already knows what they are doing - is exactly as Curt says it
                            is: you're asking them, whether you realize it or not, to be less
                            productive.

                            Now, if you, as the customer, are willing to accept that and are willing
                            to pay for that hit in productivity to get the technology you desire,
                            then by all means - dictate the technology. But know ahead of time what
                            the results of that decision will be. Be informed about it.

                            I've got no problem with a customer telling me to use XYZ technology -
                            as long as they understand that my preference would be ABC technology
                            and I'd get the work done faster with a more reliable codebase if
                            they let me use the tools that I prefer and think are better tools. It's
                            when that discussion doesn't even take place that gets me riled.

                            -Chris
                          • Curt Sampson
                            Message 13 of 17 , Dec 29, 2008
                              On 2008-12-29 15:34 -0000 (Mon), Matt wrote:

                              > I can think of a number of reasons why a customer *should* be allowed to
                              > choose the tools:
                              >
                              > 1) The customer is paying not just for development but also for
                              > deployment and maintenance... which could have a different cost
                              > structure than initial development.

                              Right, and if the developers are also responsible for the deployment and
                              maintenance, and they're any good, they'll be taking all of this into
                              account.

                              If the developers are not responsible for doing deployment and
                              maintenance, you're in trouble in other ways. The developers are
                              certainly in the best position to do maintenance, since they know
                              the system the best, and they are already doing deployment work for
                              their functional test setups that should be re-used for production
                              deployments.

                              And, putting on my manager hat, splitting the development and
                              deployment/maintenance teams creates economic externalities guaranteed
                              to waste time and money and cause problems. Much as software not
                              developed to be tested with automated systems is very hard to test,
                              software not developed with production deployment in mind is hard to
                              deploy. A development team will not do the extra work to understand,
                              much less solve, production deployment issues if they don't have to, and
                              in fact have pressure on them to go do more "development" work instead.

                              This situation should be familiar to everyone here: it's the common
                              situation between development groups and DBMS groups in large companies,
                              and is another agile anti-pattern.

                              > 2) The customer probably has a decent knowledge of the cost and
                              > availability of the skill sets in the market place. If they have to
                              > replace the developer, one technology might make the replacement task
                              > easier.

                              It depends on the sort of development you're doing, of course. I suppose
                              that if you work in the "it's all Java developers, and they're all
                              fungible" world, that's true enough. It's not a world I work in, and I think
                              technical managers that do that are short-sighted and won't perform as
                              well as those who understand that there are tremendous differences in
                              development productivity amongst different developers.

                              But on the other hand, it's probably less risky in many situations, so
                              I can see why it would be preferred in some business situations where
                              getting anything that sort of works is more important than having a
                              chance of getting really good software. These are, needless to say, not
                              the kind of people who tend toward agile development (which is also, at
                              the moment, comparatively risky), so they're not going to be worried
                              about agile anti-patterns, either. In fact, some of them they ought to
                              be practicing.

                              > 3) The customer is less likely to get involved in "religious wars" over
                              > whether MySQL is better than PostgreSQL :)

                              True, and there is perhaps a larger grain of truth in this than you
                              might think; it's hard to avoid the religious wars and get a good
                              technical comparison of two products. But it's something you must be
                              able to do, if you want to lay any claim to being a good developer: to
                              insist that these two things (to give just one example) are the same
                              thing and it doesn't matter which you use is patently absurd to those
                              who understand at least one DBMS in great depth and have a reasonably
                              good knowledge of those two in question.

                              cjs
                              --
                              Curt Sampson <cjs@...> +81 90 7737 2974
                              Functional programming in all senses of the word:
                              http://www.starling-software.com
                            • Matt
                              Message 14 of 17 , Dec 30, 2008
                                Chris,

                                --- In agileDatabases@yahoogroups.com, "Chris Holmes" <ChrisHolmes@...>
                                wrote:

                                > I've got no problem with a customer telling me to use XYZ technology -
                                > as long as they understand that my preference would be ABC technology
                                > and I'd get the work done faster with a more reliable codebase if
                                > they let me use the tools that I prefer and think are better tools.
                                > It's when that discussion doesn't even take place that gets me riled.

                                I hear what both you and Curt are saying... but here's the thing. You
                                are both assuming that you are the pinnacle of knowledge and have all
                                the information necessary to make the decision.

                                I prefer to think that I am pretty darned good at what I do and I don't
                                mind letting the customers and management know this... but I also am
                                willing to "let them tell me what to do" if they make a good case for
                                it.

                                To each their own though.

                                Matt
                              • Curt Sampson
                                Message 15 of 17 , Dec 31, 2008
                                  On 2008-12-31 01:25 -0000 (Wed), Matt wrote:

                                  > both you and Curt are saying... but here's the thing. You are both
                                  > assuming that you are the pinnacle of knowledge and have all the
                                  > information necessary to make the decision.

                                  I think that's a slight misinterpretation. We--or at least I--are
                                  assuming that we can become the pinnacle of knowledge (well, more like
                                  a local maximum) and have as much information as anyone to make the
                                  decision.

                                  Honestly, and MBAs may not like this, but it's a lot more likely that a
                                  smart software developer can learn a lot about business than it is that
                                  a smart businessman will learn a lot about developing software. This is
                                  nothing particular to the computer field; just look around over the last
                                  eighty years or so and consider how many smart engineers and scientists
                                  have become managers, and how many smart managers have become engineers
                                  and scientists.

                                  And this is not to say that one doesn't take direction from the business
                                  folks. I may think (and have often thought, in fact) that working on
                                  project X is going to produce little return and the company should focus
                                  on project Y instead, and I'll make as strong a case as I can for that.
                                  But if the managers still insist on doing project X, I go with it.
                                  That's a situation where there may be technical implications, but in the
                                  end it's a business decision.

                                  But choosing your RDBMS, well, I don't think you can argue that there's
                                  not a huge, even project-changing, technical component there, and while
                                  there will also be business factors you have to take into account, in
                                  the end, this falls on the developer side, not the business side.

                                  cjs
                                  --
                                  Curt Sampson <cjs@...> +81 90 7737 2974
                                  Functional programming in all senses of the word:
                                  http://www.starling-software.com