Re: [scrumdevelopment] Re: A good release workflow
- Hi Mark,
It's great that you worked hard refining your tests, but it seems that end-to-end test execution speed isn't the real bottleneck in your case. I would start by revisiting the test strategy.
Many people think end-to-end is better, but that's not quite true. End-to-end tests validate whether the layers of your application integrate with each other and with the environment in place.
Things start to go bad, IMO, when you use them to validate all your business rules and application behavior. Most likely you will end up with slow and unstable tests.
A good start is to check which end-to-end tests can be replaced by unit and integration tests. Those are much faster, easier to write, and easier to maintain. This usually comes down to your code base and how testable it is.
Even if you end up with many end-to-end tests, if they're well written (that is, independent of each other), you can parallelize them across different machines.
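To make the parallelization idea concrete, here is a minimal sketch in Python (rather than any particular test framework; every name is illustrative) of fanning independent tests out across a pool of workers. The same shape scales from threads on one box to jobs on different machines:

```python
# Minimal sketch of fanning independent tests out across workers.
# Assumes the tests share no state; all names here are illustrative.
import time
from concurrent.futures import ThreadPoolExecutor

def run_test(name):
    """Stand-in for one slow, self-contained end-to-end test."""
    time.sleep(0.1)  # simulates the slow part (network, setup, etc.)
    return (name, "pass")

def run_suite(tests, workers):
    """Run the given tests concurrently and report results and wall time."""
    start = time.time()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = dict(pool.map(run_test, tests))
    return results, time.time() - start

tests = ["test_%02d" % i for i in range(8)]
serial_estimate = 0.1 * len(tests)  # roughly 0.8s if run one by one
results, elapsed = run_suite(tests, workers=4)
print(len(results), elapsed < serial_estimate)
```

The precondition is exactly the one stated above: each test must be independent, or the parallel run will be flaky for a new reason.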
Another way to speed things up is to modularize your application and release modules instead of the whole application. Then, for example, when a bug fix takes place, you only need to test the changed module and its integrations.
The list goes on. Improvements like these can put you on the way to better test speed. Most of the time it's worth the investment.
On 30/12/2009 14:35, woynam wrote:
--- In scrumdevelopment@yahoogroups.com, Ron Jeffries <ronjeffries@...> wrote:
> Hello, woynam. On Wednesday, December 30, 2009, at 10:23:28 AM,
> you wrote:
> > As the test suite grows, so does the execution time. You can
> > optimize it, but it will generally continue to grow.
> > Our 18,000+ acceptance tests take roughly 3 days to run. In some
> > instances, the business decides it can't wait 3 days before
> > applying a critical fix.
> True. Those tests run too long.
Really? Wow, you're incredible, Ron. Where can I get a crystal ball like the one you have?
We're talking end-to-end acceptance tests on a very large distributed system here, not unit tests. There are hundreds of services distributed across dozens of types of processes.
This is an electronic trading system with complex algorithms for handling everything from numerous allocation algorithms, flash orders, contingency orders, spread orders, buy-writes, options, indexes, futures, and stock sessions, in-crowd manual handling, firm-level order routing, inter-market linkage, market data dissemination, trade busts, etc.
So, feel free to come by and volunteer your time to help us out. We've been refining the test suite for the past 7+ years, and I'm sure we're doing it all wrong.
> > Testing is ultimately about risk management. You can never test
> > everything, so you pick the areas where you have a higher
> > probability of discovering defects.
> True. We can skip some tests, or we can make the tests more
> efficient, or both.
> > Now, that's not to say I don't agree with you in general. There's
> > always room for improvement.
> Yes. And I'm thinking there's a room near us all right now.
> Ron Jeffries
> www.XProgramming.com
> www.xprogramming.com/blog
> Debugging is twice as hard as writing the code in the first place.
> Therefore, if you write the code as cleverly as possible, you are,
> by definition, not smart enough to debug it. -- Brian Kernighan
- On Wed, Dec 30, 2009 at 7:23 AM, woynam <woyna@...> wrote:
> As the test suite grows, so does the execution time. You can optimize it, but it will generally continue to grow.
More or less. It should be feasible to run tests in parallel. It may only
require an investment in hardware to see a several-fold increase.
> Our 18,000+ acceptance tests take roughly 3 days to run. In some instances, the business decides it can't wait 3 days before applying a critical fix.
That's too long. Split the system into functional modules that you can
test independently and/or in parallel.
18,000 tests in 3 days is an average of 14.4 seconds per test. I've
seen worse for acceptance tests, but I would consider doing more fast
unit tests and fewer slow acceptance tests.
For a system of moderate complexity a few hundred acceptance tests and
10K or so unit tests usually does the trick. As a system gets more
complex than that it should be decomposed into modules for the sake of
everyone's sanity. It should be possible to test the modules
independently or in parallel.
> Testing is ultimately about risk management. You can never test everything, so you pick the areas where you have a higher probability of discovering defects.
Everything is about risk management.
- On Wed, Dec 30, 2009 at 9:49 AM, Ron Jeffries <ronjeffries@...> wrote:
> Hello, And. On Wednesday, December 30, 2009, at 8:38:43 AM, you wrote:
>> This said, when I have asked the dev team to see if there is any possibility
>> to have more and better unit tests, they said that they do as much as they
>> can, but that doing unit tests on a .NET GUI is not so simple for them. I
>> have to trust them, as I think our dev team is very competent at their work.
>> I've tried to use some tools to record the screen clicks to reproduce the
>> GUI tests once they're done, but I was never happy with the results; in any
>> case I keep searching for a better solution than the current one, without
>> much luck though. Well, I must say that arriving at the current status took
>> us about 1 year of learning, actually the time since we've been agile; who
>> knows where we'll be in another year.
> Yes. Testing thru the GUI seems always to be slow and inconvenient.
> What I like to do is to build a reasonably impermeable membrane
> between the GUI and the domain objects, and then test the heck out
> of the domain objects using nUnit and Fit/FitNesse. Then some simple
> GUI tests tend to be enough to fill in the gaps.
I am working on a blog post about this which should be available
shortly. The problem, as I see it, is that most UIs are poorly
designed, and that is what makes them hard to test.
UI is code. Think about the typical web UI:
Some template code on the server side generates HTML which is sent to
the browser. The browser renders a button. When the user clicks on the
button the server is called to grab another template and generate some
more HTML.
The problem with that is that if you look at the HTML as code (which
it is), then it should become apparent that it has low cohesion and
high coupling. The HTML is dumb (low cohesion); it depends on the
server both to know how to put it together and to know how it should
behave (high coupling).
Web 2.0 technologies like Ajax make it possible to create web UIs that
are cohesive and couple to the server only through a simple service
API. However, most websites and nearly all frameworks still employ the
model where code is generated on the server side in response to
requests from the UI. Doing it that way means that you can never know
what the UI is supposed to look like or how it is supposed to behave
without multiple round trips to the server (and probably manipulating
server-side state as well).
The approach that I advocate is to write pure DHTML that is coupled to
the server only through a simple service layer. If you follow this
approach then the UI is highly cohesive (its correctness can be
determined without talking to the server) and its coupling across the
network is low (Ajax callbacks return simple data objects, and they
can be mocked). Testing such a UI is actually quite painless, and fast
(not as fast as server-side code, but much faster than typical GUI
testing). Also, there is no need to test "through" the GUI, because
the GUI can be tested in isolation.
A similar approach can be used for native UIs. The key is to treat the
UI as its own highly cohesive module and to decouple interaction with
the domain by exposing a useful domain API. Nothing really new here;
this is what folks have been saying for decades, but for some reason
no one follows the rules when it comes to UIs (one reason is that UI
frameworks almost universally suck and encourage poor design).
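The "useful domain API" advice can be sketched the same way (again in Python, with invented names): the widget layer stays dumb and only forwards events, while everything worth testing lives in a plain object that needs no GUI toolkit at all:

```python
# Sketch of the "useful domain API" idea for native UIs: validation
# and state live in a plain object that is easy to unit-test; the
# actual widget layer would only forward events to it and render its
# state. All names are invented for illustration.
class OrderFormModel:
    def __init__(self):
        self.quantity = 0
        self.errors = []

    def set_quantity(self, text):
        """Logic that would otherwise hide inside a click handler."""
        self.errors = []
        try:
            qty = int(text)
        except ValueError:
            self.errors.append("quantity must be a number")
            return
        if qty <= 0:
            self.errors.append("quantity must be positive")
        else:
            self.quantity = qty

# Unit tests hit the domain object directly; no GUI, no clicks:
form = OrderFormModel()
form.set_quantity("10")
assert form.quantity == 10 and not form.errors
form.set_quantity("abc")
assert form.errors == ["quantity must be a number"]
```

With the logic out of the handlers, the remaining GUI tests only have to confirm that clicks reach the model, which is the "fill in the gaps" role described above.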
- --- In scrumdevelopment@yahoogroups.com, Adam Sroka <adam.sroka@...> wrote:
> On Wed, Dec 30, 2009 at 7:23 AM, woynam <woyna@...> wrote:
> > As the test suite grows, so does the execution time. You can optimize it, but it will generally continue to grow.
> More or less. It should be feasible to run tests in parallel. May only
> require an investment in hardware to see a several-fold increase.
One of our test environments runs ~$500,000. That quickly adds up to a whole lot of $$$.
> > Our 18,000+ acceptance tests take roughly 3 days to run. In some instances, the business decides it can't wait 3 days before applying a critical fix.
> That's too long. Split the system into functional modules that you can
> test independently and/or in parallel.
The system *is* decomposed. There are hundreds of services running in ~50 different process types, i.e. subsystems.
> 18,000 tests in 3 days is an average of 14.4 seconds per test. I've
> seen worse for acceptance tests, but I would consider doing more fast
> unit tests and fewer slow acceptance tests.
> For a system of moderate complexity a few hundred acceptance tests and
> 10K or so unit tests usually does the trick. As a system gets more
> complex than that it should be decomposed into modules for the sake of
> everyone's sanity. It should be possible to test the modules
> independently or in parallel.
Well, the system isn't even close to moderately complex. We're talking massive complexity here. A single "use case" can have a hundred test cases alone.
We run 6 different financial exchanges/trading sessions on our platform, including the world's largest options exchange. Each session has a completely different set of business rules.
Yes, you can test things in isolation, but at the end of the day the business doesn't like outages, and we've been bitten by enough integration bugs not to skip end-to-end testing.
There are days like these when I'd like to trade jobs with someone for a day or two. We're not talking about some low-volume e-commerce web site here. We're talking about a financial exchange that processes several billion quotes on a good day, with peak message rates exceeding 1 million messages *a second*, all while executing complex business logic.
I'm sure we can milk a bit more performance out of our acceptance test suite. However, that's not the real bottleneck. Our performance/load/capacity/rollout testing runs in parallel with our functional testing. The perf tests take ~1 month to complete. A single load/capacity test may require a full day's run.
Since the system is not flash cut in a single day, we simulate a single day of rollout, and perform a subset of tests on the mixed-version system. For each rollout, we also have to test the fallback procedures, and verify the system functions after a fallback.
With over 800 servers in the production environment, the rollout can take anywhere from 3 to 4 weeks. We do this because we've been bitten in the past by incompatible versions of subsystems. It's insufficient to test the new version of the system only, as there will be a period when you have multiple different versions live at the same time.
We've managed to cut 50% of the time from the performance tests recently with the addition of some extra hardware. However, since the performance tests run on production-grade hardware (e.g. 16-way, 64GB top-of-the-line clusters), you're looking at a lot of money to run additional parallel sessions.
> > Testing is ultimately about risk management. You can never test everything, so you pick the areas where you have a higher probability of discovering defects.
> Everything is about risk management.
Yup. When a single outage can cost the business hundreds of thousands of dollars, you take your testing seriously. Of course, we're not talking about life-critical systems, but there are days when it certainly feels that way. :-) Over the years you tend to get a bit conservative when the cost of failure approaches 7 figures.
- Hello, Mark.
Something (in my inbox) tells me you didn't like my answer. I'm
sorry about that but not about the answer. Your tests take three
days, the business needs less time. Therefore something needs doing.
More below ...
On Wednesday, December 30, 2009, at 11:35:56 AM, you wrote:
>> > Our 18,000+ acceptance tests take roughly 3 days to run. In some
>> > instances, the business decides it can't wait 3 days before
>> > applying a critical fix.
>> True. Those tests run too long.
> Really? Wow, you're incredible, Ron. Where can I get a crystal
> ball like the one you have?
> We're talking end-to-end acceptance tests on a very large
> distributed system here, not unit tests. There are hundreds of
> services distributed across dozens of types of processes.
Yes. Adam Sroka has already raised some options for improving the
situation. There are others.
One main thrust of his note was the replacement of large numbers of
end-to-end tests with faster unit tests. Given the large amount of
time (14 seconds, did he estimate?) for a test, I'd guess that a lot
of that time is in inter-system communication, which suggests that
replacing end-to-end with unit tests could have a profound impact,
and also makes one think about the judicious use of test doubles.
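As a concrete (and entirely hypothetical) illustration of that trade-off: when a rule such as a pro-rata allocation is extracted into a pure function, it can be verified in microseconds with no inter-system communication at all, leaving the slow end-to-end suite to confirm only the wiring around it.

```python
# Hypothetical example of logic pulled out of an end-to-end path:
# a pro-rata allocation as a pure function. Because it touches no
# other system, it unit-tests in microseconds; the slow end-to-end
# tests then only need to confirm the wiring around it.
def pro_rata_allocate(qty, sizes):
    """Split qty across quoters in proportion to their quoted sizes,
    giving leftover lots to the largest quoters first."""
    total = sum(sizes)
    shares = [qty * s // total for s in sizes]
    leftover = qty - sum(shares)
    by_size = sorted(range(len(sizes)), key=lambda i: -sizes[i])
    for i in by_size[:leftover]:
        shares[i] += 1
    return shares

print(pro_rata_allocate(100, [60, 30, 10]))  # [60, 30, 10]
print(pro_rata_allocate(10, [3, 4, 5]))      # [2, 3, 5]
```

This is not claimed to be any exchange's actual allocation rule, only an example of the kind of logic that a test double can isolate from its slow surroundings.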
> This is an electronic trading system with complex algorithms for
> handling everything from numerous allocation algorithms, flash
> orders, contingency orders, spread orders, buy-writes, options,
> indexes, futures, and stock sessions, in-crowd manual handling,
> firm-level order routing, inter-market linkage, market data
> dissemination, trade busts, etc.
Yes, it sounds very large. It also sounds to me as if it's pretty
clear that not every change is likely to break every module
everywhere. That would suggest that some strong interface tests
serving as firewalls could allow the test space to be partitioned
with high safety.
All this would of course be dependent on your organization wanting
to address this issue. If you're perfectly happy with three days,
then why change, as it will surely be costly in time and work to do
so. If faster test times would pay off in important quality
improvements or cost reductions or faster deployment, then it might
be worth doing. There's no way to tell from here, of course, whether
it would be.
> We've been refining the test suite for the past 7+ years, and I'm
> sure we're doing it all wrong.
I don't think so. On the other hand, I've been doing significant and
varied software for seven times that long and I've quite commonly
had useful ideas, which makes me confident that I could help if it
made sense for both of us.
Based on recent communications, of course, it probably doesn't.
> So, feel free to come by and volunteer your time to help us out.
I do actually do pro-bono work, though not usually in your industry.
However, I can do what I say, and I can help other people learn to
do it as well. I would be willing in principle to undertake a
preliminary planning session with the right people from your
organization, and then to put together a program of -- at an
estimate -- about a month, on a contingent basis roughly like this:
We, your right people and I, propose some things to do and predict
the improvements we expect to get. If the effort is approved, we
work through the program. If we don't accomplish what we set out
to, I'd charge no fee.
I don't expect to hear from you, since something tells me you think
I'm an *******. But I can do what I say, and I'd put money behind it.
But let's be realistic. I'm sure you already know as well as I do
that there are things that could be done to substantially improve
the testing aspects of your system, if only you could invest the time.
My purpose was to remind you of that, on the assumption, pretty well
founded, I think, based on the things you've posted over time, that
you really already know it. I'm sorry that ticked you off, but
that's the price I pay for saying what I think.
The main reason that testing at the end of a development cycle finds
problems is not that problems were put in near the end, it is that
testing was put off until then.