This repository has been archived by the owner on Apr 18, 2024. It is now read-only.

feat: switch to boxo and fix CAR fetch timeouts #68

Merged: 4 commits merged into main from refactor/migrate-to-boxo on Mar 30, 2023

Conversation

@lidel (Contributor) commented Mar 29, 2023

This PR switches to boxo (why: ipfs/boxo#215).

It is possible we will also be adding changes related to ipfs/boxo#176 and ipfs-inactive/bifrost-gateway#61 to this PR, so everything can be reviewed and merged together.

(We usually aim for smaller PRs, but the boxo rename has already produced too many of them; bundling these changes here minimizes the overhead of this unplanned rename.)

TODO

TBD, could be a follow-up PR:

  • clean up Fetch API
  • verify timeouts
  • verify metrics

lidel added a commit to ipfs-inactive/bifrost-gateway that referenced this pull request on Mar 29, 2023
@lidel changed the title from refactor: go-libipfs → boxo to feat: switch to boxo and fix CAR fetch timeouts on Mar 29, 2023
19s is not enough for fetching a CAR stream of unknown length;
every bigger request was failing.

If we need to pick a ceiling, 30m sounds like a good starting point
(this is when CAR streams were timed out on the old ipfs.io).
Comment on lines +153 to +162
// TODO: Ideally, we would have an additional "PerRequestInactivityTimeout",
// which is the amount of time without any NEW data from the server, but
// that can be added later. We need both because a slow trickle of data
// could take a large amount of time.
requestTimeout := DefaultSaturnCarRequestTimeout
if isBlockRequest {
	requestTimeout = DefaultSaturnBlockRequestTimeout
}

reqCtx, cancel := context.WithTimeout(ctx, requestTimeout)
@lidel (Contributor, Author) commented Mar 29, 2023

💭 ⚠️ We may have a DoS vector here: if an L1 is malicious, it could keep a transferless connection open for 30 minutes, which is not great.

Implementing "PerRequestInactivityTimeout" would help a lot – we could then have this 30m timeout as a hard ceiling (or even raise it), but then have the same timeout for block and for CAR when L1 did not send any new bytes for some time.

Comment on lines +83 to +84
const DefaultSaturnBlockRequestTimeout = 19 * time.Second
const DefaultSaturnCarRequestTimeout = 30 * time.Minute
@lidel (Contributor, Author)

ℹ️ Had to add a separate timeout for CARs.
Not feeling strongly about 30m, but that is how long I was able to stream the Wikipedia DAG from the old gateway, so it is a good starting point.

Undef CID was displayed as "b" (the multibase prefix)
@lidel lidel requested review from willscott and aarshkshah1992 and removed request for willscott March 29, 2023 23:53
@lidel lidel marked this pull request as ready for review March 29, 2023 23:53
@@ -122,7 +130,16 @@ type ErrCoolDown struct {
}

func (e *ErrCoolDown) Error() string {
	return fmt.Sprintf("multiple saturn retrieval failures seen for CID %s/Path %s, please retry after %s", e.Cid, e.Path, humanRetry(e.retryAfter))
}
@lidel (Contributor, Author) commented Mar 29, 2023

ℹ️ This was producing very confusing log errors like failures seen for CID b/Path because e.Cid was undefined, and it printed unsanitized input from the user. A sketch of the resulting guard is below.
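For context, a minimal sketch of the kind of guard this fix implies, assuming the cid.Cid type from github.com/ipfs/go-cid; the helper name describeCoolDown and the exact message wording are illustrative, not the PR's actual code.

package fetcher // illustrative package name

import (
	"fmt"
	"time"

	"github.com/ipfs/go-cid"
)

// describeCoolDown guards against cid.Undef, which (per the comment above)
// renders as just the multibase prefix "b", and quotes the user-supplied
// path with %q so raw input is not printed unsanitized.
func describeCoolDown(c cid.Cid, path string, retryAfter time.Duration) string {
	if !c.Defined() {
		return fmt.Sprintf("multiple saturn retrieval failures seen for path %q, please retry after %s", path, retryAfter)
	}
	return fmt.Sprintf("multiple saturn retrieval failures seen for CID %s, path %q, please retry after %s", c, path, retryAfter)
}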

@aarshkshah1992 aarshkshah1992 merged commit 28e66fc into main Mar 30, 2023
@aarshkshah1992 aarshkshah1992 deleted the refactor/migrate-to-boxo branch March 30, 2023 09:47