Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify RNG seed initialization in PARPACK #431

Closed

Conversation

FabienPean
Copy link
Contributor

Following up #424, one can simplify the parpack seed initialization to fulfill the LAPACK requirements. Before, it was already a constant number which could handle "only" 1547 (0.5*(4095-1000-1)) proc id.

@fghoussen
Copy link
Collaborator

I'll have a closer look when possible.
@szhorvat: Sounds right to you?

PARPACK/SRC/MPI/pdgetv0.f Outdated Show resolved Hide resolved
@szhorvat
Copy link
Contributor

Yes, this change makes sense to me.

@fghoussen
Copy link
Collaborator

@FabienPean: BTW, why do you think this is better than the original way of seeding?
@pv: does it help #218?

@FabienPean
Copy link
Contributor Author

FabienPean commented Aug 20, 2023

@FabienPean: BTW, why do you think this is better than the original way of seeding?

It is simpler, no error block if it reaches 1547 MPI proc, and does not make some obfuscating calculation 1000+2*myid+1 is still a constant seed per MPI proc. The only difference is how the value are spread over iseed.
It was yielding

  • 0<= iseed(1)<5
  • 0<=iseed(2)<1000
  • 0<=iseed(3)<100
  • 1<iseed(4)<10

Since there is no recommendation on the spread of the initial seed from LAPACK, the new value values are as good as any. So I copied the initial part from the serial code for iseed(1-3) and used a modulo for iseed(4), which at least guarantee an initial seed uniqueness until 2048 MPI proc, then MPI processes 0 and 2048 will start to reuse the initial seed and therefore random values for their part of the problem.

In brief, this PR yields fewer LOC and reduced seed repetition for high MPI proc counts

@pv: does it help #218?

This PR will not help for this issue, it is more connected to #424 in particular that comment:

Otherwise, the seed could be provided by the user but it means bubbling up the parameter and affecting function signatures.

If someone provides a known residual and it fails for the user, then because of the constant seed, it will always fail (same sequence of numbers will be generated upon getv0 calls). The only way to dodge the constant seed is for the user to modify the initial residual (and hope for the best)

EDIT: yup indeed the last part was effectively discussed in length previously in #218 :)

@fghoussen
Copy link
Collaborator

It is simpler

I guess this is OK and will not trigger regression.

@caliarim: what do you think? Potential reg or fix related to JuliaLinearAlgebra/Arpack.jl#138?

@fghoussen
Copy link
Collaborator

related to JuliaLinearAlgebra/Arpack.jl#138

Just realizing that this was serial (arpack) so parpack will have no impact here

@szhorvat
Copy link
Contributor

I am not an expert on RNGs and can't comment much, but here's a reason why this manner of seeding may not be ideal, depending on the type of RNG used: https://stats.stackexchange.com/a/233091/4764 I believe myid will increase sequentially from thread to thread.

iseed(1) = 1
iseed(2) = 3
iseed(3) = 5
iseed(4) = 1+2*mod(myid,2048)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#431 (comment): Not sure what to do. Does this mean we should let seed([123]) as is (or dependent of rand?), and, initialize RNG seed to myid (with some kind of call random_seed(put=myid)?) before changing iseed(4) to iseed(4) = 1+2*mod(rand(),2048)?

@fghoussen
Copy link
Collaborator

I believe myid will increase sequentially from thread to thread.

Yes, it will.

@szhorvat
Copy link
Contributor

szhorvat commented Sep 2, 2023

Well, this may not be an issue at all here. Whether it is depends on how LAPACK's RNG works, and once again, I'm definitely not an RNG expert, and I do not have a theoretical understanding of why setting only the least significant part of the seed may be an issue.

Also, problems would only occur if iseed(4) (and not iseed(1)) is the least significant part of the 48 bit value it operates with. I am not entirely certain if this is the case.

If we want to be concerned about this, we can make all of iseed(1), iseed(2), iseed(3) and iseed(4) dependent on myid.

But at this point, I'm really hoping to see 3.9.1 released as soon as possible :-)

@FabienPean
Copy link
Contributor Author

FabienPean commented Sep 2, 2023

I am not an expert on RNGs and can't comment much, but here's a reason why this manner of seeding may not be ideal, depending on the type of RNG used: https://stats.stackexchange.com/a/233091/4764 I believe myid will increase sequentially from thread to thread.

I am not entirely sure what you mean here. AFAIU, the link discusses the choice of an initial seed and seed sampling per thread across several run of an application. In the case of ARPACK, the seed was already the same across runs. The risk is more about the lack of sensitivity of the RNG (small seed number variation yields a small spread in the number generated, i≈j ⇒ x≈y). A good characteristic of a RNG is to not have this problem, which I would expect form Lapack.

The documentation of Lapack does not mention advice or any restriction on the choice of seed (dlaran), diving deeper in the theory or testing it with different input might provide further insights

This routine uses a multiplicative congruential method with modulus
248 and multiplier 33952834046453 (see G.S.Fishman,
'Multiplicative congruential random number generators with modulus
2
b: an exhaustive analysis for b = 32 and a partial analysis for
b = 48', Math. Comp. 189, pp 331-344, 1990).

48-bit integers are stored in 4 integer array elements with 12 bits
per element. Hence the routine is portable across machines with
integers of 32 bits or more.

This PR instead of spreading a seed across the 4 elements of the array in an arbitrary manner, it increases linearly in iseed(4). For a fully random seed at each run, I gave a try in #424 , but it made several tests to fail consistently, so I didn't pursue then because of the major change it would be and the lack of visibility on the reason for test failure.

Comparison for a few MPI ranks:

myid Before This PR
1 [1,0,0,3] [1,3,5,3]
3 [1,0,0,7] [1,3,5,7]
5 [1,0,1,1] [1,3,5,11]
404 [1,8,0,9] [1,3,5,809]
         igen = 1000 + 2*myid + 1
         iseed(1) = igen/1000
         igen     = mod(igen,1000)
         iseed(2) = igen/100
         igen     = mod(igen,100)
         iseed(3) = igen/10
         iseed(4) = mod(igen,10)

@szhorvat
Copy link
Contributor

szhorvat commented Sep 4, 2023

I experimented a bit with this specific RNG and it became clear that using sequential seeds is not acceptable.

A good characteristic of a RNG is to not have this problem, which I would expect form Lapack.

There is a misunderstanding here about what the goal of a PRNG is and what makes it good quality. For a given seed, a PRNG will produce a sequence of numbers, which should be uniformly distributed and should appear statistically independent. This does not mean that if we plug in one seed to get the sequence $a_1, a_2, \dots$ and a second seed to get $b_1, b_2, \dots$ then $a_i$ and $b_i$ will be independent. In fact, it is a characteristic of some RNG algorithms that they won't be independent for certain seed choices.

Let's reproduce LAPACK's RNG algorithm. The laruv docs state,

  This routine uses a multiplicative congruential method with modulus
  2**48 and multiplier 33952834046453 (see G.S.Fishman,
  'Multiplicative congruential random number generators with modulus
  2**b: an exhaustive analysis for b = 32 and a partial analysis for
  b = 48', Math. Comp. 189, pp 331-344, 1990).

So we have
$$r_{i+1} = A r_i \mod M$$
where $M=2^{48}$ and $A=33952834046453$.

Note that with laruv, the "seed" is in fact the entire RNG state. Let's take consecutive seeds 1 and 3, and see what we get if we plot $(a_i, b_i)$. Note that the state needs to be an odd number (the Fishman reference explain why). I'll include the Mathematica code I used for this, for reference.

iter[k_] := Mod[33952834046453 k, 2^48]

a = NestList[iter, 1, 1000];
b = NestList[iter, 3, 1000];

ListPlot[Transpose[{a, b}], AspectRatio -> Automatic]
image

It's clear that there are strong correlations between the two sequences:

Let's try 1 and 11:

image

How about 1 and 1001?

image

Now it's looking better.

Conclusion: The parallel threads should not be initialized with consecutive seeds. Instead, the effect of myid should be spread out through the four 12-bit parts of the state/seed.

To be fair, this specific visualization won't reveal the non-independence if we don't use small seeds. And I'll emphasize once more than I am not an RNG expert. But I believe the point I made in the first paragraph does stand.

Here's another experiment: Let's take seeds between 10000 and 11000, iterate each 100 times to get a number. Then plot these numbers in order:

ListPlot[Nest[iter, #, 100] & /@ Range[10000, 11000]]
image

I am not entirely sure what you mean here. AFAIU, the link discusses the choice of an initial seed and seed sampling per thread across several run of an application.

The SE post discusses the same situation we have here. If we have 4 threads, is it okay to init them with consecutive seeds, for example 23, 24, 25, 26. It states that with a Mersenne twister it is not, as the four sequences would not be independent.

@szhorvat
Copy link
Contributor

szhorvat commented Sep 4, 2023

I'll put this here as well: https://stackoverflow.com/q/51184553/695132

@fghoussen
Copy link
Collaborator

Interesting! So as stated here: https://stackoverflow.com/questions/51184553/do-consequtive-rng-seed-yield-independent-random-numbers, one way to go could be: "Take those sequential numbers and run them through a cryptographic hash, then use that as the seed. Completely repeatable yet uncorrelated."? Only for iseed(4)? Also for iseed([123])?

It's clear that there are strong correlations

Never used Mathematica: not sure what the code does! You plot a on the x-axis and b on the y-axis, right?

@szhorvat
Copy link
Contributor

szhorvat commented Sep 5, 2023

"Take those sequential numbers and run them through a cryptographic hash, then use that as the seed. Completely repeatable yet uncorrelated."

That would be one possible option.

Only for iseed(4)?

seed represents a 48-bit number split into four 12-bit parts. So, ideally, populate all of it with a hash function that produces at least 48 bits.

Never used Mathematica

You don't need to read the code but I like to insert it in these situations for my own reference.

You plot a on the x-axis and b on the y-axis, right?

Yes, up to the last plot.

The last plot shows the 100th random number in streams initialized with consecutive seeds, the point being again that consecutive seeds won't produce independent streams.

@FabienPean
Copy link
Contributor Author

Thanks that's clearer and they are compelling arguments. The preexisting code is maybe not optimal, but this PR would theoretically make it worse.

"Take those sequential numbers and run them through a cryptographic hash, then use that as the seed. Completely repeatable yet uncorrelated."

That would be one possible option.

Another option could be to run the dlarnv routine myid times from a common starting seed across all MPI processes, convert the generated doubles to suitable integers and use that as a seed for subsequent calls in parpack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants