## Re: Velocity Experiment [80 hours solo = ?? hours with PP (Re: Productive 80 hour week - was Re: [XP] Re: Weaknesses of XP)]

Message 1 of 21, Dec 30, 2001
> But I'd like to have the facts. They're clearly team, company, and
> individual dependent. And we don't have them, as far as I know, for
> anyone.
> Ron Jeffries

Who's up for an experiment?

Here's how it works:

1) Track velocity for at least two iterations keeping the team's hours
constant to establish a baseline. The team needs to be practicing XP instead
of going through the learning phases for the data to be useful.
2) After the baseline is established, cut 5-10 hours off the workweek, e.g.
"we're now doing 60 hour weeks, let's cut back to 50 hours". Work this way
for at least two iterations. Does velocity go up, down, or stay the same?
3) Bring the hours back up for at least two more iterations. What happens to
velocity then?

When you have enough data post the results to the list.
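
The three steps above can be sketched as a small bookkeeping script. This is only an illustration: the `iterations` log, its numbers, and the `phase_means` helper are hypothetical and do not come from any real team.

```python
from statistics import mean

# Hypothetical log: (weekly_hours, velocity in story points) per iteration.
# The numbers are invented purely to show the bookkeeping.
iterations = [
    (60, 20), (60, 22),   # 1) baseline at constant hours
    (50, 21), (50, 23),   # 2) hours cut by 10
    (60, 19), (60, 21),   # 3) hours brought back up
]

def phase_means(log, per_phase=2):
    """Average velocity for each consecutive block of `per_phase` iterations."""
    result = []
    for start in range(0, len(log), per_phase):
        block = log[start:start + per_phase]
        hours = block[0][0]  # hours are held constant within a phase
        result.append((hours, mean(v for _, v in block)))
    return result

for hours, avg in phase_means(iterations):
    print(f"{hours} h/week: mean velocity {avg}")
```

Comparing the three means answers the "up, down, or the same" question for one team only; it says nothing about whether the difference is more than noise.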

Best Regards,

Chris

Christopher Hart
President & CTO
Hart Edwards Corporation, Inc.
Tel: 303-402-9883 ext 117
Mobile: 720-231-6616

Enterprise and Network Management Solutions

Proprietary and Confidential Correspondence
Message 2 of 21, Dec 31, 2001
--- In extremeprogramming@y..., "Christopher Hart" <hart@h...> wrote:
> "we're now doing 60 hour weeks, let's cut back to 50 hours". Work
> this way for at least two iterations. Does velocity go up, down, or
> stay the same?

The problem is, you're boiling the frogs to begin with, and then
seeing if they start to swim again when you turn down the heat.
Start the experiment at 40 hours, and go up to 50 and then 60, and it
would be a fairer experiment, although still not a humane one.

John Brewer
Jera Design

Extreme Programming FAQ: http://www.jera.com/techinfo/xpfaq.html
Message 3 of 21, Dec 31, 2001
> The problem is, you're boiling the frogs to begin with, and then
> seeing if they start to swim again when you turn down the heat.
> Start the experiment at 40 hours, and go up to 50 and then 60, and it
> would be a fairer experiment, although still not a humane one.
> John Brewer

John,

What I'm saying in response to Ron's comments about facts is to baseline
velocity in a project and then start experimenting with different hours.
That way there is some hard data to back up productivity claims. It's not
about being humane/inhumane and it's not like they are being asked to work
more than what the team considers to be normal. Also the reason to track the
velocity changes over at least two iterations is because the first
"transition" iteration may have a shock factor to it that throws off the
data too much to be useful.

The team can experiment back and forth until they determine how many hours
maximizes their productivity. Maybe your team works best at 60 hours. Maybe
37.5. You won't know until you test it. Plus once you have some hard data,
you can take that to customers/management to justify the team's hourly
commitment.
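
A minimal sketch of that search, assuming "productivity" simply means mean velocity per iteration and that the first iteration after each change of hours is discarded as the "transition" iteration. The `best_hours` name and the sample log are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def best_hours(log):
    """log: list of (weekly_hours, velocity) in chronological order.

    Skips the first iteration after every change of hours (including the
    very first iteration in the log), then returns the (hours, mean velocity)
    pair with the highest mean velocity.
    """
    by_hours = defaultdict(list)
    previous = None
    for hours, velocity in log:
        if hours == previous:      # not a transition iteration, so count it
            by_hours[hours].append(velocity)
        previous = hours
    means = {h: mean(vs) for h, vs in by_hours.items()}
    top = max(means, key=means.get)
    return top, means[top]

# Invented data: three iterations at 60 hours, three at 50, three at 60 again.
log = [(60, 18), (60, 20), (60, 22),
       (50, 15), (50, 23), (50, 25),
       (60, 17), (60, 19), (60, 21)]
print(best_hours(log))
```

Dropping only one iteration per change is itself an assumption; a team that sees a longer adjustment period would need to discard more.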

Best Regards,

Chris

Message 4 of 21, Dec 31, 2001
Christopher Hart:
>What I'm saying in response to Ron's comments about facts is to baseline
>velocity in a project and then start experimenting with different hours.

Leaving aside the humaneness question, this whole idea strikes me as very
pseudo-scientific. Here's why:

How do you know that two iterations (or 20) is statistically significant?
How do you control for other variables such as holidays, awareness of
experimentation, difficulty of stories worked, inaccuracies in estimates,
political changes, etc.?
How do you account for long-term tradeoffs such as design debt, untested
defects found downstream, health, and morale effects?
How many projects should you do this on? One, ten, 100?
How do you account for individual preference?

It seems to me that all this experiment would do is arm one of the two
camps with a straw man argument that could then be attacked by the opposing
camp based largely on chance.

Best,
Bill
Message 5 of 21, Dec 31, 2001
Around Monday, December 31, 2001, 12:30:58 PM, wecaputo@... wrote:

> It seems to me that all this experiment would do is arm one of the two
> camps with a straw man argument that could then be attacked by the opposing
> camp based largely on chance.

What would be a better thing to do?

Ron Jeffries
www.XProgramming.com
Logic is overrated as a system of thought.
Message 6 of 21, Dec 31, 2001
Ron Jeffries:
>What would be a better thing to do?

Baseline the velocity of a real world project. Use these measurements to
review story estimates for accuracy at the end of the project. Select a
representative sample of those stories, hand them to a researcher who can
set up the experiment in a controlled environment, and run the necessary
trials.

The problems of course are having a research facility handy, a grant, a set
of stories that a real corporation would allow the use of, and a real XP
project to do the work so as to measure the initial story estimates. But
then that's probably why it's not been done yet. :-)

Best,
Bill
Message 7 of 21, Dec 31, 2001
> Leaving aside the humaneness question, this whole idea strikes me as very
> pseudo-scientific. Here's why:
>
> How do you know that two iterations (or 20) is statistically significant?

They're not. It would be better to run this over a much longer period, but my
hope is that if enough teams tried it and posted the results we might be
able to come up with some good general stats.

> How do you control for other variables such as holidays, awareness of
> experimentation, difficulty of stories worked, inaccuracies in estimates,
> political changes, etc.

It's accounted for in team velocity. Velocity is pretty abstract, but it's a
good metric for this type of variation.

> How do you account for long-term tradeoffs such as design debt, untested
> defects found downstream, health, and morale effect?

Other metrics that should be watched are the success/failure ratios for
acceptance tests and integration. If velocity is up, but the quality of the
code is suffering then you should pull back the hours.
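
As a rough sketch of that rule: the function name, the dictionary keys, and the 2% tolerance below are illustrative assumptions, not an established XP metric.

```python
def hours_decision(before, after, tolerance=0.02):
    """Compare two periods worked at different weekly hours.

    `before` and `after` are dicts with 'velocity' (story points per
    iteration) and 'acceptance_pass_rate' (fraction between 0 and 1).
    The tolerance is an arbitrary allowance for normal fluctuation.
    """
    velocity_up = after["velocity"] > before["velocity"]
    quality_down = (after["acceptance_pass_rate"]
                    < before["acceptance_pass_rate"] - tolerance)
    if velocity_up and quality_down:
        return "pull back hours"      # faster, but the code is suffering
    if velocity_up:
        return "keep new hours"
    return "no gain from new hours"

print(hours_decision({"velocity": 20, "acceptance_pass_rate": 0.95},
                     {"velocity": 24, "acceptance_pass_rate": 0.85}))
```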

> How many projects should you do this on? One, ten, 100, etc.

IMHO the optimum number of hours needs to be tailored to each team and each
project. Reducing or adding an hour a day for fine tuning shouldn't be that
much of a burden. Plus once you convince yourself, your team, the customer,
and management that N number of hours maximizes productivity it's a fairly
safe bet that the value of N will be effective in the future.

> How do you account for individual preference?

Not sure what you mean since this is a team effort. I like putting in long
hours, but I'm an entrepreneur with a direct stake in the success of a
project. Does that mean my team should work my hours? Certainly not if that
doesn't enable them to perform at peak.

> It seems to me that all this experiment would do is arm one of the two
> camps with a straw man argument that could then be attacked by the
> opposing camp based largely on chance.
>
> Best,
> Bill

Maybe, but we don't have stats now. The XP literature glosses over the whole
concept, and the existing research on time-commitment/productivity for
creative endeavors is sparse. Just because Kent said 40 hours is a good
thing doesn't make it gospel. The whole concept of the 40-hour Week or
Sustained Pace is wide open to attack.

If you had to justify reducing the number of hours committed to a project to
executive management, could you? My guess is that most of the list could
not because there isn't enough ammunition there. Does anybody have a better
method that would help every team on this list?

Best Regards,

Chris

Message 8 of 21, Dec 31, 2001
Hi Bill,

> Baseline the velocity of a real world project. Use these measurements
> to review story estimates for accuracy at the end of the project.

I'm not sure what you mean by accuracy. Are you suggesting to
determine whether story points correlate to something like effort or
duration?

Dale
Message 9 of 21, Dec 31, 2001
Christopher Hart:
>If you had to justify reducing the number of hours committed to a project
>to executive management, could you?

Only in the way I just stated in my response to Ron. It's based on
principles. Balance between work and life. Honesty about our true effort
and where we are. That consensus building (including consensus on hours
worked) is best for group morale which in turn is best for success. That
forcing people to do things they don't want to is a greater risk to project
failure than any technology risk.

The specific arguments I would use would be situationally dependent, but I
am convinced that working at maximum burn rate for any sustained length of
time is a serious risk, and risk indicator.

When I am convinced of something I will argue for it on any project. The
result will be that project's optimum, never perfect.

As always Martin Fowler summed it up beautifully to me one day:
"We have to be completely inflexible on where we want to go, but completely
flexible on how we get there."

For me quality of project life is inflexible because it is one of the things
that ships software on time and at high quality. Not spending every waking
moment at work is an important (and often missing) aspect of our quality of
life.

If you can figure out a way to confirm that hypothesis experimentally, I
would love to have the data, but until then I will rely on more
philosophical and historical arguments. :-)

Best,
Bill
Message 10 of 21, Dec 31, 2001
Dale Emery:

>I'm not sure what you mean by accuracy. Are you suggesting to
>determine whether story points correlate to something like effort or
>duration?

I am imperfectly stating the fact that our knowledge of our estimates is
usually perfect -- after the project is complete. If we did estimates and
used velocity to track our progress across an entire project, we should
have a good idea which stories were initially estimated accurately i.e. our
predicted result was close to actual result.

By judiciously picking stories, and then using them experimentally, we
would have some control over that variable. This would be easier with
estimates that are relative to each other rather than to calendar time,
because relative difficulty should be similar among different groups -- or
at least as similar as I can figure out how to get.

Since velocity is more useful for prediction on our real projects when
estimates are seen as relative weights (more anecdotal evidence here) this
seems reasonable to me.

The whole idea is conjecture of course, but that's never stopped me before.
:-)

Best,
Bill
Message 11 of 21, Dec 31, 2001
Hi Bill,

> I am imperfectly stating the fact that our knowledge of our
> estimates is usually perfect -- after the project is complete.

I've never been able to get very good estimates even after the project
is done. ;-)

> If we did estimates and used velocity to track our progress across
> an entire project, we should have a good idea which stories were
> initially estimated accurately i.e. our predicted result was close
> to actual result.

I'm asking for specific units of measure. The reason I ask is that
the key unit of measure in XP seems to be story points. I have no
idea how to determine "actual" story points, so I'm guessing that you
are thinking of some other units of measure.

So what are you proposing to estimate and measure? Effort for each
story? Duration from start to finish for each story? Something else?

Dale
Message 12 of 21, Dec 31, 2001
Dale Emery:
>So what are you proposing to estimate and measure? Effort for each
>story? Duration from start to finish for each story? Something else?

Ahh, good point Dale, as always when I listen to you I see things. Let's
walk through this and see what we can use:

(This example assumes all stories were estimated at the beginning of the
project)

Story   Estimate (points)
A       1
B       2
C       3
D       1
E       2
F       3

Say we do stories A, B, and C in an iteration, and we get them done. Our
velocity is 6.
The next week we do stories D, E, and F, and we do not finish D. We have two
options:

1) We adjust our velocity to 5 and move on.
2) We say D was a two and keep velocity at 6

Now when doing an XP project, we are encouraged to take option 1 for several reasons
including the fact that this abstracts the many variables of a project as
Chris Hart suggested in an earlier posting today. We adjust velocity and
move on.
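
Option 1 amounts to summing the estimates of the stories that were actually finished. A minimal sketch using the estimates from the example; the `velocity` helper is a hypothetical name, and option 2 would instead edit the `estimates` table.

```python
# Estimates from the example, in story points.
estimates = {"A": 1, "B": 2, "C": 3, "D": 1, "E": 2, "F": 3}

def velocity(planned, finished, points=estimates):
    """Sum the estimates of the planned stories that were actually finished."""
    return sum(points[story] for story in planned if story in finished)

first = velocity(["A", "B", "C"], finished={"A", "B", "C"})
second = velocity(["D", "E", "F"], finished={"E", "F"})  # D was not finished

print(first)   # velocity after the first iteration
print(second)  # option 1: this becomes the velocity we plan with next
```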

Now at an arbitrary iteration (we are again at velocity 6), we have stories
G, H, and I, again at 1, 2, and 3 points.
Again we get 5 points done, and again the 1-point story (G) is unfinished.
Again we have our choice, and again I would change velocity.

Now, it is a recommended practice to review estimates periodically and
change those you feel are truly off. If we were to review the iterations
between the first one and the one in which stories G, H, and I were worked,
we see that A, D & G are all 1 point, and that 2 of those were involved in
velocity reductions.

Maybe we take all stories like D & G and revise them to 2. (in practice I
might want more than 2 unless I really knew from working on them that they
were the cause of the velocity miss).
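
The review described above can be sketched as a tally of how often stories of each size were left unfinished. The `history` layout and the `miss_rate_by_size` helper are hypothetical; the data is just the A through I example, and a high rate is only a hint to look closer, not proof that the estimate was wrong.

```python
from collections import Counter

# (stories planned in the iteration, story left unfinished or None)
history = [
    (["A", "B", "C"], None),
    (["D", "E", "F"], "D"),
    (["G", "H", "I"], "G"),
]
estimates = {"A": 1, "B": 2, "C": 3,
             "D": 1, "E": 2, "F": 3,
             "G": 1, "H": 2, "I": 3}

def miss_rate_by_size(history, estimates):
    """Fraction of stories of each point size that went unfinished."""
    planned, missed = Counter(), Counter()
    for stories, unfinished in history:
        for story in stories:
            planned[estimates[story]] += 1
        if unfinished is not None:
            missed[estimates[unfinished]] += 1
    return {size: missed[size] / planned[size] for size in sorted(planned)}

print(miss_rate_by_size(history, estimates))
```

With this data, two of the three 1-point stories were unfinished, while no 2-point or 3-point story was.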

(BTW, to anyone following who is curious, new stories could be compared to
existing stories and estimated accordingly, this is how XP'rs get good at
estimates)

If we did this on the entire project -- Tracking original estimates, and
velocity, and revising estimates and commenting on those that appeared off,
would we not have stories that, regardless of actual velocity, were
appropriately estimated *relative to each other*?

I think so, what do you see?

Best,
Bill
Message 13 of 21, Dec 31, 2001
Around Monday, December 31, 2001, 4:47:58 PM, wecaputo@... wrote:

> If we did this on the entire project -- Tracking original estimates, and
> velocity, and revising estimates and commenting on those that appeared off,
> would we not have stories that, regardless of actual velocity, were
> appropriately estimated *relative to each other*?

I think we would, with one important caveat for the experiment: the
relative difficulty depends on the team's plan/vision/architecture for
the implementation. A concern might be that other teams would do it
all some way that would change the _relative_ rankings.

But if they did, that should show up in their actuals. It'd still be
interesting ...

Ron Jeffries
www.XProgramming.com
Logic is overrated as a system of thought.
Message 14 of 21, Dec 31, 2001
Hi Bill,

> Now, it is a recommended practice to review estimates periodically
> and change those you feel are truly off. If we were to review the
> iterations between 1 and when G, H, and I were done, we see that A, D
> & G are all 1 point, and that 2 of those were involved in velocity
> reductions.
>
> Maybe we take all stories like D & G and revise them to 2.

I'm not sure how you decided that it was the 1 point stories that
were off. Why not change the 2 point stories to 3, or the 3 point
stories to 4? Perhaps you're basing your decision on information you
haven't told me about. What's the information?

It isn't clear to me that a drop in velocity means that the individual
story estimates were off. You know that your velocity went down, but
maybe that had nothing to do with errors in estimating how challenging
each story would be. Maybe it was due to a couple of people having a
couple of slow days.

> I think so, what do you see?

I'm not yet seeing what information you will use to determine which
estimates were off, or even whether the estimates were off at all.

Dale
Message 15 of 21, Jan 1, 2002
Dale:

>Perhaps you're basing your decision on information you
>haven't told me about. What's the information?

First off this example has been hypothetical, so it assumed that there is
something, but in general, if I revise an estimate, it's because I have
reason to in the situation based on my experience with the stories and the
system being built.

>I'm not yet seeing what information you will use to determine which
>estimates were off, or even whether the estimates were off at all.

In my example, the people doing the estimates felt that the 1 point story
was the one that was more difficult than expected.

Try looking at it this way:
Me: "See those bags of flour over there? How much do you think they weigh
relative to each other?"
You: "Hmmm, let's see, the second one looks twice as heavy as the first,
and the third looks three times as heavy."
Me: "OK, could you now estimate these other 50 bags."
(Imaginary Dale does so)
Me: "OK, now go pick some of them up, if you would please."
You: "Sure"
(Imaginary Dale goes over and hefts bags of flour.)
Me: "Now what do you think about their relative weights?"
You: "I think there is something weird about that 1 point bag over there,
it seemed almost as heavy as those two point bags."
Me: "Maybe you just imagined it. Here are some more bags, give 'em a heft."
(imaginary Dale kindly obliges)
Me: "Well? What did you find out?"
You: "There is definitely something weird about those little 1 point bags,
I think they are wider than the others or something, because they
definitely are shorter, but they weigh about as much as those 2 point
bags."
Me: "Would you like to revise your estimates?"

Now Dale, perhaps you can answer these questions for me:
a) what information would imaginary you be using to determine the bags'
relative weights?
b) Why is imaginary you doing the re-estimating rather than imaginary me?
c) Why did imaginary you think his estimates were off?
d) How did he "know" that it was the 1 point bag and not the 2 point bag
that was "wrong"?
e) What does imaginary you's revised estimates do for Ron when he comes
along to heft flour?
f) What if they were rated "really heavy", "twice of a pain", and "triple
back surgery", how would that affect the estimates' usefulness to Ron?
g) What if we rated them by how long it would take you to carry them 50
meters, would that change anything?

Best,
Bill
Message 16 of 21, Jan 1, 2002
Around Tuesday, January 01, 2002, 6:07:08 AM, wecaputo@... wrote:

> First off

Have you no f5g life? Do you know what time it is???

Ron Jeffries
www.XProgramming.com
It is not because things are difficult that we do not dare,
it is because we do not dare that they are difficult. --Seneca
Message 17 of 21, Jan 1, 2002
Around Tuesday, January 01, 2002, 6:07:08 AM, wecaputo@... wrote:

> First off this example has been hypothetical, so it assumed that there is
> something, but in general, if I revise an estimate, it's because I have
> reason to in the situation based on my experience with the stories and the
> system being built.

Yes. One typical revision will be because with that class of story we
forgot something that has to be done. (Ohh, we forgot we have to
refactor the database for these.)

Another occurs when refactoring or tool-making makes something
formerly difficult really easy. My favorite example of that was when
George announced at one planning meeting that he would do all the rest
(25 or something) of things that he could formerly do a few of per
iteration.

The basic idea, of course, is that these stories are different in
proportion to the others, from what we thought (or from what they
actually were). In that case, it makes sense to adjust them rather
than hope that velocity will handle it, because ... velocity won't
handle it very well.

Ron Jeffries
www.XProgramming.com
Sigs are like I Ching or Tarot. They don't mean anything,
but sometimes if you think about them you'll get a useful idea.
Message 18 of 21, Jan 1, 2002
Yeah, I was up late playing Empire Earth, what was your excuse??

From: Ron Jeffries <ronjeffries@acm.org>
Date: 01/01/2002 05:17 AM
To: extremeprogramming@yahoogroups.com
Subject: Re: Velocity Experiment [80 hours solo = ?? hours with PP (Re: Productive 80 hour week - was Re: [XP] Re: Weaknesses of XP)]

Around Tuesday, January 01, 2002, 6:07:08 AM, wecaputo@...
wrote:

> First off

Have you no f5g life? Do you know what time it is???

Ron Jeffries
www.XProgramming.com
It is not because things are difficult that we do not dare,
it is because we do not dare that they are difficult. --Seneca

Message 19 of 21, Jan 1, 2002
> Start the experiment at 40 hours, and go up to 50 and then 60, and it
> would be a fairer experiment, although still not a humane one.

...but wait... why *start* at 40 ? You'd be missing a third of the
measurable curve outright. If this is to be a proper experiment, you
should insist on starting measurement at 0 hour weeks, and moving up
to 168.
-[Morendil]-
Your email has been returned due to insufficient voltage.
Message 20 of 21, Jan 1, 2002
Hi Bill,

> Now Dale, perhaps you can answer these questions for me:

Actually, I can't. Perhaps the Dale you imagined can answer them.

Dale
Message 21 of 21, Jan 1, 2002
Hi Bill,

> >Perhaps you're basing your decision on information you
> >haven't told me about. What's the information?
>
> First off this example has been hypothetical, so it assumed that
> there is something, but in general, if I revise an estimate, it's
> because I have reason to in the situation based on my experience
> with the stories and the system being built.

Okay. That's good enough for me.

Dale