From: WarrenS <warren.wds@...
> > > b -= d+d+a;
> > > a += d; //the obvious "optimization" of this & previous line... makes it slower!
> > > (...)
> > I can't imagine that
> > a += d ; b -= d+a
> > would be slower.
> --it is slower! On my computer, anyhow.
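It's a sparse enough inner loop that I can easily imagine the increased dependency making it slower: in the rewritten form the update of b has to wait for the new value of a, whereas in the original the two statements are independent. What's the latency of an add nowadays? As far as I know it has crept up to about 6 cycles in the past decade (at least on the SIMD units). A stall that long is a huge bubble in the pipeline, and should definitely be avoided.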
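
To make the dependency argument concrete, here is a minimal sketch of the two orderings side by side. The variable types are not shown in the thread, so uint32_t, the state_t struct and the function names are assumptions made purely for illustration. Both functions compute the same result; the only difference is whether the update of b has to wait for the updated a.

```c
#include <stdint.h>

/* Hypothetical state holder; the thread does not show the variable types. */
typedef struct { uint32_t a, b; } state_t;

/* Original ordering: b is updated from the OLD a, so the two
   statements are independent and can issue in parallel. */
static state_t step_original(state_t s, uint32_t d) {
    s.b -= d + d + s.a;
    s.a += d;
    return s;
}

/* "Optimized" ordering: one add fewer, but the update of b now
   depends on the NEW a, which lengthens the dependency chain. */
static state_t step_rewritten(state_t s, uint32_t d) {
    s.a += d;
    s.b -= d + s.a;
    return s;
}
```

Whether the second form is actually slower will depend on the compiler, the flags and the CPU, so the only way to know is to time both inside the real loop.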