Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error -9999 with dnaupd in ARPACK 3.6 for a matrix that worked with ARPACK 3.5 #142

Closed
ntamas opened this issue Jul 26, 2018 · 19 comments
Closed

Comments

@ntamas
Copy link

ntamas commented Jul 26, 2018

The easiest way to test this issue is with GNU Octave 4.4.0 linked to ARPACK 3.6, but the same problem can be reproduced from C as well. The matrix is the PageRank matrix of an undirected star graph with 11 vertices and damping = 0.85:

octave:1> m = ones(11, 11) * 0.15 / 11;
octave:2> m(1, 2:11) += 0.85;
octave:3> m(2:11, 1) = (1 - m(1, 1)) / 10;

The matrix has three eigenvalues: 1, -0.85 and 0 (multiplicity = 9). Asking Octave to obtain the dominant eigenvalue with eigs (which in turn uses dnaupd) yields an error:

octave:4> [v, lambda] = eigs(m, 1)
error: __eigs__: eigs: error -9999 in dnaupd
error: called from
    eigs at line 285 column 18

(Related issue: igraph/igraph#1107 )

I know that the multiplicity of the zero eigenvalue could be a problem here; however, note that this matrix worked just fine with ARPACK 3.5 and a simple power iteration also converges to the correct dominant eigenvector. I managed to narrow the source down to this single line of code, but if my understanding is correct, then this line is actually necessary for the referenced patch to work.

Is there anything I am missing here?

(For what it's worth, I have also managed to reproduce the problem by modifying the matVec function in TESTS/bug_1315_double.c to perform the multiplication with the matrix described above. Note that the test case currently does not check the value of info after running dnaupd so the error goes unnoticed by the test case, but I have verified that info is -9999 in the test case. Let me know if you need my modified plain C code to test this).

@sylvestre
Copy link
Contributor

@caliarim rings a bell ? thanks

@caliarim
Copy link
Contributor

I think I know what is happening. Can you modify your plain C file (not octave) and solve a generalized problem with the silly matrix M = identity? I think it will fail, both with 3.5.0 and 3.6.0.

@ntamas
Copy link
Author

ntamas commented Jul 26, 2018

Yes, you are correct; modifying the C source to solve a generalized problem with M = identity makes the computation fail both in 3.5.0 and in 3.6.0.

Attached is my C code (which is a crude modification of bug_1315_double.c so please bear with the formatting and everything).

bug_142_double.c.gz

@caliarim
Copy link
Contributor

I hoped I was wrong. It means there is a long standing bug in Arpack with rank deficient matrices. It cannot compute a Krylov basis of length ncv, since with my patch (or in the generalized case, even before my patch) the initial vector is in the range of the operator (of dimension < ncv). Arpack tries to change the initial vector some times, but it is always in the range of the operator. After some attempts, it failes with info = -9999. I see two possibilities: 1) take the initial vector in the range of the operator only the very first time: any further attempt should not do it or 2) if after some attempts it is not possible to build a basis of length ncv, guess that the matrix is rank deficient and stop to compute the basis, instead of failing. I will need some time to reproduce the problem in fortran77 and try these (or other) possible solutions. Unfortunately, I do not see a workaround in the meantime.

@ntamas
Copy link
Author

ntamas commented Jul 27, 2018

Thanks for the quick response! I am not familiar with the internals of the algorithm implemented in ARPACK, but based on my limited understanding, implementing option 1 would take us back to where we was with ARPACK 3.5 if the matrix is rank-deficient, while keeping the benefits of the proposed change in all other cases. This would suit me but I was wondering whether this would affect the correctness of the result; if the result is likely to be incorrect with this workaround, I'd rather be on the safe side even if it means that I need to wait more.

@caliarim
Copy link
Contributor

caliarim commented Jul 27, 2018

Can you try if the following change

 --- dgetv0_3.6.f	2018-07-27 17:04:22.170831356 +0200
 +++ dgetv0_142.f	2018-07-27 16:59:47.690826426 +0200
 @@ -242,12 +242,14 @@
 c        %----------------------------------------------------------%
 c
          call arscnd (t2)
-         nopx = nopx + 1
-         ipntr(1) = 1
-         ipntr(2) = n + 1
-         call dcopy (n, resid, 1, workd, 1)
-         ido = -1
+         if (itry .eq. 1) then
+            nopx = nopx + 1
+            ipntr(1) = 1
+            ipntr(2) = n + 1
+            call dcopy (n, resid, 1, workd, 1)
+            ido = -1
          go to 9000
+         end if
       end if
 c 
 c     %-----------------------------------------%
@@ -274,7 +276,7 @@
 c
       call arscnd (t2)
       first = .TRUE.
-      call dcopy (n, workd(n+1), 1, resid, 1)
+      if (itry .eq. 1) call dcopy (n, workd(n+1), 1, resid, 1)
       if (bmat .eq. 'G') then
          nbx = nbx + 1
          ipntr(1) = n + 1
  1. fixes the problem
  2. passes the tests

? Thanks

@ntamas
Copy link
Author

ntamas commented Jul 27, 2018

I have tested the patch above - it fixes the problem and the failing test cases in igraph. Thanks a lot!

@caliarim
Copy link
Contributor

All the previous tests in ARPACK pass, too. ntamas, I see in your last post here igraph/igraph#1107 that there is still one failing test. Is it related to my previous commit, too?

@ntamas
Copy link
Author

ntamas commented Jul 30, 2018

Nope, I need to fix that; that particular test case is probably unrelated. ARPACK returns eigenvectors that are -1 times the "old" eigenvectors that previous versions returned, and the test case in igraph is too strict and does not ignore the sign. I only need to update the test case to make everything pass.

@caliarim
Copy link
Contributor

My previous patch has a problem in the generalized case. It should be

--- dgetv0_360.f	2018-07-30 13:23:25.218601879 +0200
+++ dgetv0.f	2018-07-30 13:21:34.558605942 +0200
@@ -242,12 +242,16 @@
 c        %----------------------------------------------------------%
 c
          call arscnd (t2)
-         nopx = nopx + 1
-         ipntr(1) = 1
-         ipntr(2) = n + 1
-         call dcopy (n, resid, 1, workd, 1)
-         ido = -1
-         go to 9000
+         if (itry .eq. 1) then
+            nopx = nopx + 1
+            ipntr(1) = 1
+            ipntr(2) = n + 1
+            call dcopy (n, resid, 1, workd, 1)
+            ido = -1
+            go to 9000
+         else if (itry .gt. 1 .and. bmat .eq. 'G') then
+            call dcopy (n, resid, 1, workd(n + 1), 1)
+         end if
       end if
 c 
 c     %-----------------------------------------%
@@ -274,7 +278,7 @@
 c
       call arscnd (t2)
       first = .TRUE.
-      call dcopy (n, workd(n+1), 1, resid, 1)
+      if (itry .eq. 1) call dcopy (n, workd(n+1), 1, resid, 1)
       if (bmat .eq. 'G') then
          nbx = nbx + 1
          ipntr(1) = n + 1

@ntamas
Copy link
Author

ntamas commented Jul 31, 2018

Re-tested the failing test cases in igraph with the new version and it works equally well, FYI.

@ntamas
Copy link
Author

ntamas commented Aug 1, 2018

Okay, I spoke too soon; I was checking that one single remaining test case failure in igraph, and it turns out that it also fails for ARPACK 3.6 but not for ARPACK 3.5, and the cause of the failure can also be narrowed down to the same commit that caused the original issue reported here.

The test case in igraph attempts to find the three eigenvectors with the largest imaginary part in the following 10x10 matrix (generated randomly with a pre-determined seed so it's the same with every test run):

octave:1> a = [-6 0 10 3 8 1 -4 10 -8 0
 -6 1 0 8 -4 4 -7 1 1 6
 7 -7 8 6 -4 -8 -1 -7 -3 -7
 6 8 -4 -1 10 3 7 7 -3 -8
 1 -7 -4 9 0 5 5 6 -8 10
 -9 10 -5 -9 5 3 -5 7 -7 10
 -3 0 8 -6 -2 -7 1 -3 -8 1
 2 0 9 -3 0 -9 -4 0 10 0
 -9 1 -6 -1 7 10 9 9 8 -2
 -7 1 9 -7 10 -1 -2 -5 7 6]
a =

   -6    0   10    3    8    1   -4   10   -8    0
   -6    1    0    8   -4    4   -7    1    1    6
    7   -7    8    6   -4   -8   -1   -7   -3   -7
    6    8   -4   -1   10    3    7    7   -3   -8
    1   -7   -4    9    0    5    5    6   -8   10
   -9   10   -5   -9    5    3   -5    7   -7   10
   -3    0    8   -6   -2   -7    1   -3   -8    1
    2    0    9   -3    0   -9   -4    0   10    0
   -9    1   -6   -1    7   10    9    9    8   -2
   -7    1    9   -7   10   -1   -2   -5    7    6

Using eig() in Octave (which uses LAPACK), I can verify that the three eigenvalues with the largest imaginary parts are: -3.00735 + 19.2957i, -3.00735 - 19.2957i and 12.1099 + 6.27293i (of course the complex conjugate pair of the latter is also in the set of eigenvalues). ARPACK correctly reports these in the igraph test case. It also correctly reports the eigenvectors for the first two eigenvalues. However, for the third eigenvalue, the eigenvector reported by ARPACK mixes up the real and the imaginary parts and then negates the real part.

The eigenvector reported by LAPACK in Octave is as follows:

octave:2> [vec, val] = eig(a);
octave:3> val = diag(val);
octave:4> val(7)
ans =  12.1099 +  6.2729i
octave:5> vec(:, 7)
ans =

   0.10197 + 0.24665i
   0.29260 - 0.22392i
  -0.24166 + 0.02660i
   0.20076 + 0.08521i
   0.07248 + 0.37699i
   0.24232 - 0.13719i
  -0.48469 + 0.00000i
  -0.04705 + 0.32278i
   0.02209 + 0.19020i
  -0.06435 + 0.28162i

ARPACK reports the following values in the 3rd and 4th columns of the eigenvector memory area (first and second columns are for the correct eigenpair, 3rd and 4th belong to the next eigenpair):

0.21912 0.152389
0.125122 -0.346547
-0.188931 0.153007
0.21496 -0.0368355
0.264759 0.277981
0.129704 -0.246402
-0.40777 0.262001
0.134903 0.296988
0.121394 0.148076
0.0980886 0.271714

This is parallel to the eigenvector reported by LAPACK in Octave only if you swap the two columns and then negate the first column (or, alternatively, if you take the complex conjugate and then swap the real part with the imaginary one).

The same test case in igraph also tests dnaupd() in modes LM and SR, and those are okay.

@caliarim
Copy link
Contributor

caliarim commented Aug 1, 2018

no, the two eigenvectors are parallel, they differ by a complex multiplicative constant.

@ntamas
Copy link
Author

ntamas commented Aug 1, 2018

Indeed; sorry for the noise!

@caliarim
Copy link
Contributor

caliarim commented Aug 2, 2018

@ntamas: do you confirm that all your tests now pass? I'm going to make a pull request.

@ntamas
Copy link
Author

ntamas commented Aug 2, 2018

Yes, all test cases pass now, thanks for your efforts!

caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
caliarim added a commit to caliarim/arpack-ng that referenced this issue Aug 6, 2018
@caliarim
Copy link
Contributor

caliarim commented Aug 8, 2018

Am I write that this change, as others, is not ported to pARPACK?

@sylvestre
Copy link
Contributor

sylvestre commented Aug 12, 2018 via email

@juanjosegarciaripoll
Copy link
Contributor

I am afraid to reopen this issue, but arpack fails once more with identity matrices. One just needs to change TESTS/icb_arpac_c.c to return the same vector and the test fails to run.

void dMatVec(const double* x, double* y) {
  int i;
  for (i = 0; i < 1000; ++i) y[i] = x[i]; /* ((double)(i + 1)) * x[i];*/
};

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants