Empty sector update tree r gpu #1528
This may be slower in the general case, so it needs to be perf tested before it can be considered. Working on that...
These are the PC2 performance numbers that I got (showing no measurable regression):
This PR branch
EDIT: These numbers were measured using GPU and CUDA. We also need CPU numbers to make sure there are no measurable regressions.
I will probably replace the closures with real methods (to help avoid duplication) and re-test to be sure.
Hrm, with CPU Tree building enabled, a regression is noted:
This PR branch
The latest commit is not pretty, but it reduces the added 'double conversion' during CPU tree building. Times are:
Which is comparable to the measured
b936815 to 4a8cf99
Ooh nice, the latest GPU times improved:
f4f247a to 875f063
ce1836f to e7517a8
0888200 to 2ef0f1f
Odd, the failing CUDA tests all pass locally on my setup :-(
Never mind, I can reproduce now.
I'm confused. Reproduced once, but can't again. Something is going on.
16cb12b to 241fd49
2ef0f1f to ede86b4
Hrm, this rebase is hosed :-( Will see about getting it into better shape.
ede86b4 to f8b6533
Ok, still hosed, but getting there 😓
f8b6533 to 1f83fca
a6dc94c to 9b21f76
* reduce conversions in encode (@dignifiedquire)
* feat: reduce conversions for CPU tree building differently
* feat: use an enum return type to avoid transmute (unsafe)
* fix: properly use GPU if possible
* fix: add doc and additional type check
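The commit messages above mention returning an enum instead of using `transmute`, and reducing conversions for CPU tree building, but the code itself is not shown in this thread. As a rough sketch of the general idea only: the leaf data is converted once into the form the selected builder needs, and the enum variant carries the type, so no `unsafe` reinterpretation is required. Every name here (`PreparedLeaves`, `prepare_leaves`, `leaf_count`) is hypothetical, and `u64` merely stands in for a real field element type; none of this is the rust-fil-proofs API.

```rust
// Hypothetical illustration; these types are NOT from rust-fil-proofs.

/// Leaves prepared for one of two tree-building backends.
#[derive(Debug, PartialEq)]
pub enum PreparedLeaves {
    /// Assumed CPU path: the builder consumes 32-byte nodes directly.
    Cpu(Vec<[u8; 32]>),
    /// Assumed GPU path: the builder consumes field elements
    /// (u64 is only a stand-in for a real field element type).
    Gpu(Vec<u64>),
}

/// Convert raw bytes exactly once, into the form the chosen backend needs.
/// Trailing bytes that do not fill a whole 32-byte node are ignored.
pub fn prepare_leaves(data: &[u8], use_gpu: bool) -> PreparedLeaves {
    let chunks = data.chunks_exact(32);
    if use_gpu {
        PreparedLeaves::Gpu(
            chunks
                .map(|c| {
                    // Placeholder conversion: take the first 8 bytes of the node.
                    let mut b = [0u8; 8];
                    b.copy_from_slice(&c[..8]);
                    u64::from_le_bytes(b)
                })
                .collect(),
        )
    } else {
        PreparedLeaves::Cpu(
            chunks
                .map(|c| {
                    let mut node = [0u8; 32];
                    node.copy_from_slice(c);
                    node
                })
                .collect(),
        )
    }
}

/// Callers match on the variant instead of reinterpreting memory.
pub fn leaf_count(leaves: &PreparedLeaves) -> usize {
    match leaves {
        PreparedLeaves::Cpu(nodes) => nodes.len(),
        PreparedLeaves::Gpu(elems) => elems.len(),
    }
}
```

With this shape, the compiler checks that each backend receives the variant it expects, which is the safety property a `transmute`-based return would give up.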
74d00e8 to 57955b5
Final testing showed comparable performance for CPU tree building and improved performance for GPU tree building.
* feat: re-factor tree_r_last building for external re-use
* reduce conversions in encode (@dignifiedquire)
* feat: use an enum return type to avoid transmute (unsafe)
* fix: keep new crate version consistent with latest release
* fix: add doc and additional type check
* feat: wrap settings for tree builders in a method
* feat: encode/decode/remove_data API with tests
* feat: add Jake's latest circuit code
* feat: added some filecoin-proofs types for updated porep_config
* feat: add a generate update proof stub API call
* feat: start hooking up vanilla merkle proof generation for challenges
* feat: add gathering of apex leaves
* fix: updates required after rebasing against master
* feat: formalize the vanilla update and vanilla proof types
* feat: empty-sector-update ProofScheme and CompoundProof
* feat: update filecoin-proofs with latest changes
* fix: properly calculate phi using comm_r instead of comm_r_last
* refactor: remove lifetime parameter from EmptySectorUpdate struct
* feat: complete vanilla proving and verify through tests
* feat: remove t_aux from being required in vanilla encode
* feat: remove t_aux from vanilla PrivateInputs
* feat EmptySectorUpdateCompound tests
* fix: CompoundProof::circuit partition-index
* feat: expose some required data through filecoin-proofs
* feat: add prove and verify API interfaces
* feat: update benches with new porep_config fields
* feat: generate empty sector update parameters for testing
* feat: update paramcache for generating empty sector update params
* feat: bump parameter cache version to generate new params
* feat: properly wire in prove/verify for empty sector updates
* feat: add apex-por gadget tests
* fix: high bits should include partition-index
* style: rename 'blank' to 'empty'
* style: move por_no_challenge_input gadget to storage_proofs_core
* feat: precompute rhos when encoding/decoding
* feat: pack k and h_select into one circuit pub-input
* fix: apply more feedback
* feat: bump CI parameter cache version
* feat: clean-up p_aux/t_aux usage
* feat: output converted fr bytes directly
* docs: update incorrect comment
* refactor: add structs for EmptySectorUpdateCircuit's public and private inputs
* fix: address review comments
* fix: bump CI parameter cache version
* feat: test challenges against hardcoded vectors
* feat: update parameter cache version for empty sector updates
* feat: isolate testing of empty circuit tests to reduce RAM usage
* fix: update clippy settings due to isolated testing
* fix: updates required after rebase to master
* fix: move comm_c from being a public input and into the proof
* feat: use poseidon dst when generating randomness
* fix: update challenge test vectors
* feat: bump ci param-cache version
* feat: parallelize vanilla challenge-proof validation
* refactor: improve naming and comments
* Empty sector update tree r gpu (#1528)
* feat: re-factor tree_r_last building for external re-use
* reduce conversions in encode (@dignifiedquire)
* feat: use an enum return type to avoid transmute (unsafe)
* fix: keep new crate version consistent with latest release
* fix: add doc and additional type check
* feat: wrap settings for tree builders in a method
* fix: ensure data file is >= sector key file
  This check was loosened since lotus may pad the data file
* feat: add larger commented ignored tests for sector updates
* fix: add partition arg in non-exposed single vanilla partition proof API
* feat: add and expose an API to prove sector updates with vanilla proofs
* fix: apply missed doc and error handling feedback
* feat: expose sector update partition proof type
* fix: add opencl/cuda feature flags to more storage update spots
* feat: consolidate CPU tree building method
* fix: use different method to build binary tree
  From the lotus side, additional data may be stored after the nodes we want in the tree, so using a different builder method will allow that to happen without a panic
* feat: add optional data mapping of a specified length
* fix: use proper node count
* feat: persist p_aux and t_aux into cache dir after encoding/remove
* fix: ensure data is >= replica in remove data call
* Change H to 1024
  Signed-off-by: Jakub Sztandera <kubuxu@protocol.ai>
* feat: try to split lifecycle upgrade tests from others on OpenCL GPU
* fix: remove typo
* docs: add official external audit result to the repo

Co-authored-by: DrPeterVanNostrand <jnz@riseup.net>
Co-authored-by: Jakub Sztandera <kubuxu@protocol.ai>