Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove carv1/carv2 specific semantics from commP calculations #1651

Closed
wants to merge 23 commits into from

Conversation

willscott
Copy link
Collaborator

The commp should be directly over the bytes of the deal, it shouldn't care about car format.

The commp should be directly over the bytes of the deal, it shouldn't care about car format.
@willscott willscott requested a review from dirkmc August 29, 2023 07:44
@willscott willscott marked this pull request as ready for review August 29, 2023 07:44
@nonsense
Copy link
Member

nonsense commented Aug 29, 2023

We should also fix Lotus to calculate commp in this form, at the moment AddPiece is failing because of this change at:

		ppi, err := m.sealer.AddPiece(sealer.WithPriority(ctx.Context(), DealSectorPriority),
			m.minerSector(sector.SectorType, sector.SectorNumber),
			pieceSizes,
			deal.size,
			deal.data)
		if !ppi.PieceCID.Equals(deal.deal.DealProposal.PieceCID) {
			err = xerrors.Errorf("got unexpected piece CID: expected:%s, got:%s", deal.deal.DealProposal.PieceCID, ppi.PieceCID)
			deal.accepted(sector.SectorNumber, offset, err)
			return ctx.Send(SectorAddPieceFailed{err})
		}

which refers to:

storage/sealer/ffiwrapper/sealer_cgo.go

func (sb *Sealer) pieceCid(spt abi.RegisteredSealProof, in []byte) (cid.Cid, error) {
	prf, werr, err := commpffi.ToReadableFile(bytes.NewReader(in), int64(len(in)))
	if err != nil {
		return cid.Undef, xerrors.Errorf("getting tee reader pipe: %w", err)
	}

	pieceCID, err := ffi.GeneratePieceCIDFromFile(spt, prf, abi.UnpaddedPieceSize(len(in)))
	if err != nil {
		return cid.Undef, xerrors.Errorf("generating piece commitment: %w", err)
	}

	_ = prf.Close()

	return pieceCID, werr()
}

@willscott
Copy link
Collaborator Author

I don't see anything in the commp utility / ffi calculations of commp that do more than a direct hash summing over the bytes of the piece/deal/file. Are you pointing to somewhere else where a carv2 would get unwrapped to only commp over the wrapped carv1?

@nonsense
Copy link
Member

I just reviewed CI, and if you search for unexpected you come up with this error, which makes me think that we don't conform to the commp calculation that Lotus does (and the current codebase does so), which makes me think that Lotus also does more than direct hashing over the file.

@willscott
Copy link
Collaborator Author

I believe the remaining rough edge here is in piecedirectory.GetBlockstore, which tries to use the "absolute" index that boost generates with the carv2 blockstore interface - waiting on ipld/go-car#490 to be able to support that.

rvagg added a commit to ipld/go-car that referenced this pull request Sep 1, 2023
* SkipNext's metadata calculation wasn't working properly in the case of a
  CARv2 as it didn't take into account the V1 header in the original offset
  calculation.
* Add a SourceOffset and some docs to be absolutely clear what offsets we're
  returning in the metadata.

Ref: filecoin-project/boost#1651
rvagg and others added 2 commits September 1, 2023 06:29
* fix: use fixed car BlockReader#SkipNext SourceOffset

Ref: ipld/go-car#491

* chore: regenerate cbor-gen types w/ cbor-gen update
rvagg added a commit to ipld/go-car that referenced this pull request Sep 1, 2023
* SkipNext's metadata calculation wasn't working properly in the case of a
  CARv2 as it didn't take into account the V1 header in the original offset
  calculation.
* Add a SourceOffset and some docs to be absolutely clear what offsets we're
  returning in the metadata.

Ref: filecoin-project/boost#1651
@willscott
Copy link
Collaborator Author

flagged by @rvagg that we should make sure there's a test over a carv2 deal to exercise that branch of graphsync GetBlockstore

* fix: offsets to CAR section starts

these are required for carv2 indexes when we pass them back to a Blockstore

* fix: use ReadNode and compare CID we find
@rvagg
Copy link
Member

rvagg commented Sep 1, 2023

we should make sure there's a test over a carv2 deal to exercise that branch of graphsync GetBlockstore

Yep, it was broken. Tested and fixed here: #1662

@dirkmc
Copy link
Contributor

dirkmc commented Sep 5, 2023

@willscott @rvagg nice work! Thanks for diving deep into the code to figure this one out 🙌

Once we merge this PR, our restore functionality will also need to change, because at the moment it assumes CAR files. So we're going to hold off on merging until we've figured out how to modify the restore function.

if err != nil {
return &dealMakingError{
retry: types.DealRetryFatal,
error: fmt.Errorf("failed to open CARv2 file: %w", err),
error: fmt.Errorf("failed to stat CARv1 file: %w", err),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
error: fmt.Errorf("failed to stat CARv1 file: %w", err),
error: fmt.Errorf("failed to stat file: %w", err),

"sort"
)

func NewMultiReaderAt(parts ...ReaderAtSize) io.ReaderAt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a unit test for this functionalty.

@dirkmc
Copy link
Contributor

dirkmc commented Sep 5, 2023

Let's also add an itest with a file that is not a car file to make sure that flow works as expected

@LexLuthr
Copy link
Collaborator

LexLuthr commented Sep 8, 2023

We need to account for this change in our LID recovery tool. We should hold off on merging this PR till we complete the following tasks.

@willscott
Copy link
Collaborator Author

willscott commented Sep 8, 2023

is the podsi itest sufficient for the itest ask?

e.g. we'll merge this first into #1495

@rvagg
Copy link
Member

rvagg commented Sep 8, 2023

also needs to be updated to go-car/v2@v2.13.1 which will break the calculations here due to the "fix"

@LexLuthr
Copy link
Collaborator

LexLuthr commented Sep 8, 2023

@willscott I think we can marge this in data-segment indexing once unti test to NewMultiReaderAt() is done. I am moving the other 2 tasks to respective PRs.
@rvagg Is the carv2 update blocker for this merge?

@rvagg
Copy link
Member

rvagg commented Sep 8, 2023

@LexLuthr yes, please consider it a blocker. I fixed a bug and released 2.12 for this branch but then we realised later we changed the behaviour in a braking way so fixed that in 2.13.1, now the values returned from BlockMetadata#Offset are referencing the wrong thing. We need to sort that out here first.
Either @willscott or myself will do it in the next few days . I'll do it my Monday if it's not don't by then.

@rvagg
Copy link
Member

rvagg commented Sep 11, 2023

@LexLuthr go-car/v2 update done

@rvagg
Copy link
Member

rvagg commented Sep 11, 2023

and a test for NewMultiReaderAt while I'm at it

@LexLuthr
Copy link
Collaborator

I think we are good to merge now. We just need to ensure that we are merging piecedirectory/segmentindex not merging in main. I will leave the rebase strategy upto the authors.

@LexLuthr
Copy link
Collaborator

@willscott Are we good to close this? I already see a commit with merge from this branch on #1495

@willscott
Copy link
Collaborator Author

yes. i think we can consider this 'merged' - there are failing tests in #1495 that i was going to take a look at addressing / identifying as fall-out from this merge or unrelated

@willscott willscott closed this Sep 12, 2023
@LexLuthr LexLuthr deleted the fix/commp branch September 12, 2023 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants