- Am Donnerstag, 15. Mai 2003 08:24 schrieb jhr.walter@...:
> Hi all,

I have to report two things - a typo and a problem:

>

> yet another update.

>

> I applied a similar patch for sparse proxy evaluators. Gunter

> reported another problem in some axpy_prod()'s debug mode, which

> should be fixed by now, too. New download available.

The typo:

The second function in operation.hpp should be

axpy_prod (const compressed_matrix<T1, column_major, 0, IA1, TA1>

&e1, const vector_expression<E2> &e2, V &v, column_major_tag) {

(there was one incorrect row_major inside)

The problem:

The sample69.cpp shows some severe Problem of the axpy-LU for column

major matrices (g++ -DORIGINAL -DCOMPRESSED):

$ ./sample69 4000

column_major - lu_factorize 4.78

row_major - lu_factorize 5.41

axpy_column_major - axpy_lu_factorize 344.48

axpy_row_major - axpy_lu_factorize 4.68

(note the 344 (!) seconds)

Why that? The specialized axpy for compressed matrices does not work if

there is a product like matrix_range< compressed_matrix< ... > > *

vector_expression<...>. That's why the general axpy is called - which

maybe slow for column_major (In the row_major case the general axpy is

called, too, but its surprisingly fast; general column major axpy takes

4 times as long as compressed axpy). I think the problem is, that the

rhs of this axpy is a column of a sparse matrix, so v(index1) always

needs some time to find the element. (The profiler output says: most of

the time is spent in find1 and std::lower_bound).

Ok. Enough debugging for today.

mfg

Gunter

PS: I always use row_major matrices ... - Hi Stephan,

you wrote:

> The following program (Sorry, I had to include the part from lu.hpp, since

Probably a missing

> the new axpy_lu_factorize produced some kind of compiler errore, which I

> did not yet investigate - I use the last experimental version):

#include <boost/numeric/ublas/operation.hpp>

[snip some lu.hpp code]

> using namespace boost::numeric::ublas;

This is the expected behaviour as far as a LAPACK compatible layout

>

> int main(void)

> {

> matrix<double> a(3,3);

> permutation_matrix<std::size_t> p(3);

>

> p[0] = 2;

> p[1] = 1;

> p[2] = 0;

>

> a(0,0) = 1; a(0,1) = 2; a(0,2) = 3;

> a(1,0) = 4; a(1,1) = 5; a(1,2) = 6;

> a(2,0) = 7; a(2,1) = 8; a(2,2) = 9;

>

> std::cout << "before " << a << std::endl;

>

> swap_rows(p, a);

>

> std::cout << "after " << a << std::endl;

> }

>

> The program should (IMHO) simply switch the first and the third line of

> the matrix. What it actually does is the following:

>

> before [3,3]((1,2,3),(4,5,6),(7,8,9))

> after [3,3]((1,2,3),(4,5,6),(7,8,9))

>

> in other words: It first swaps the first line with the third, then the

> second line with the seocnd and last: The third line with the first ->

> NOTHING. I guess, that this is not the expected behaviour.

concerns. On the contrary, my last change to lu_factorize() concerning the

permutations was due to a complete miscomprehension :-(. I've reverted it

and uploaded a new version (containing other actual fixes, too).

> The believe,

[snip a swap_rows() alternative]

> that the swap_rows function is too simple. A better function should looks

> somewhat like this (Sorry, this is only a version for vectors - I have

> not written the matrix version yet :-( ):

> I have not fully tested it, but IMHO it should give better results. Error

Not sure, if we wouldn't leave the right path then ;-)

> reports, corrections and comments are of course very much welcome.

Best,

Joerg

> P.S.: I have written a "sign" member function (also not yet really fully

If I'd only be able to do what I'd like with my time ;-)

> tested) - are you interested?

- Hi Gunter,

you wrote:

> I have to report two things - a typo and a problem:

Fixed. Thanks.

>

> The typo:

>

> The second function in operation.hpp should be

>

> axpy_prod (const compressed_matrix<T1, column_major, 0, IA1, TA1>

> &e1, const vector_expression<E2> &e2, V &v, column_major_tag) {

>

> (there was one incorrect row_major inside)

> The problem:

Yep. I noticed that, too.

>

> The sample69.cpp shows some severe Problem of the axpy-LU for column

> major matrices (g++ -DORIGINAL -DCOMPRESSED):

>

> $ ./sample69 4000

> column_major - lu_factorize 4.78

> row_major - lu_factorize 5.41

> axpy_column_major - axpy_lu_factorize 344.48

> axpy_row_major - axpy_lu_factorize 4.68

>

> (note the 344 (!) seconds)

> Why that? The specialized axpy for compressed matrices does not work if

Yep, I guess you're right. I imagine one could write a column oriented

> there is a product like matrix_range< compressed_matrix< ... > > *

> vector_expression<...>. That's why the general axpy is called - which

> maybe slow for column_major (In the row_major case the general axpy is

> called, too, but its surprisingly fast; general column major axpy takes

> 4 times as long as compressed axpy). I think the problem is, that the

> rhs of this axpy is a column of a sparse matrix, so v(index1) always

> needs some time to find the element. (The profiler output says: most of

> the time is spent in find1 and std::lower_bound).

version of axpy_lu, too, but I'm not sure if it is worth the effort.

Thanks,

Joerg

> PS: I always use row_major matrices ...

... and there's hopefully no real need for axpy_lu ...

- Hej Joerg!

> On the contrary, my last change to lu_factorize() concerning

The change to lu_factorize was also the reason I came up with the change.

> the permutations was due to a complete miscomprehension :-(. I've

> reverted it and uploaded a new version (containing other actual fixes,

> too).

My swap_rows() version would only be appropriate for the incorrect

non-LAPACK version - You would leave the right path ;-) if you would use

it. Sorry, I did not know that the LAPACK definition of a permutation is

the appropriate one - but it does make sense (interoperability).

Thanks for the clarifications!!

Best regards & have a nice weekend

Stephan

-----------------------------------------------------------

Stephan Puchegger

Email: Stephan.Puchegger@...

-----------------------------------------------------------