
Caching #2256

Open
ronag opened this issue Sep 8, 2023 · 8 comments · May be fixed by #3562

Comments

@ronag
Member

ronag commented Sep 8, 2023

I started working on a caching dispatcher + handler. Thought I'd post it here in case anyone is interested in working on something like that.

We only needed to cache redirects, so I kind of stopped there.

import stream from 'node:stream'
import { LRUCache } from 'lru-cache'
import * as cacheControlParser from 'cache-control-parser'

class CacheHandler {
  constructor({ key, handler, store }) {
    this.entry = null
    this.key = key
    this.handler = handler
    this.store = store
    this.abort = null
    this.resume = null
  }

  onConnect(abort) {
    this.abort = abort

    return this.handler.onConnect(abort)
  }

  onHeaders(statusCode, rawHeaders, resume, statusMessage) {
    this.resume = resume

    // TODO (fix): Check if content-length fits in cache...

    let cacheControl
    for (let n = 0; n < rawHeaders.length; n += 2) {
      if (
        rawHeaders[n].length === 'cache-control'.length &&
        rawHeaders[n].toString().toLowerCase() === 'cache-control'
      ) {
        cacheControl = cacheControlParser.parse(rawHeaders[n + 1].toString())
        break
      }
    }

    if (
      cacheControl &&
      cacheControl.public &&
      !cacheControl.private &&
      !cacheControl['no-store'] &&
      // TODO (fix): Support all cache control directives...
      // !opts.headers['no-transform'] &&
      !cacheControl['no-cache'] &&
      !cacheControl['must-understand'] &&
      !cacheControl['must-revalidate'] &&
      !cacheControl['proxy-revalidate']
    ) {
      const maxAge = cacheControl['s-maxage'] ?? cacheControl['max-age']
      const ttl = cacheControl.immutable
        ? 31556952 // 1 year
        : Number(maxAge)

      if (ttl > 0) {
        this.entry = this.store.create(this.key, ttl * 1e3)
        this.entry.statusCode = statusCode
        this.entry.statusMessage = statusMessage
        this.entry.rawHeaders = rawHeaders
        this.entry.on('drain', resume)
        this.entry.on('error', this.abort)
      }
    }

    return this.handler.onHeaders(statusCode, rawHeaders, resume, statusMessage)
  }

  onData(chunk) {
    let ret = true

    if (this.entry) {
      this.entry.size += chunk.byteLength
      if (this.entry.size > this.store.maxEntrySize) {
        this.entry.destroy()
        this.entry = null
      } else {
        ret = this.entry.write(chunk)
      }
    }

    return this.handler.onData(chunk) !== false && ret !== false
  }

  onComplete(rawTrailers) {
    if (this.entry) {
      this.entry.rawTrailers = rawTrailers
      this.entry.end()
    }

    return this.handler.onComplete(rawTrailers)
  }

  onError(err) {
    if (this.entry) {
      this.entry.destroy(err)
    }

    return this.handler.onError(err)
  }
}

// TODO (fix): Filesystem backed cache...
class CacheStore {
  constructor({ maxSize, maxEntrySize }) {
    this.maxSize = maxSize
    this.maxEntrySize = maxEntrySize
    this.cache = new LRUCache({
      maxSize,
      sizeCalculation: (value) => value.size,
    })
  }

  create (key, ttl) {
    const entry = Object.assign(new stream.PassThrough(), {
      statusCode: null,
      statusMessage: null,
      rawHeaders: null,
      rawTrailers: null,
      size: 0,
    }).on('finish', () => {
      this.cache.set(key, entry, { ttl })
    })
    return entry
  }

  get(key, callback) {
    callback(null, this.cache.get(key))
  }
}

export class CacheDispatcher {
  constructor(dispatcher, { maxSize = 0, maxEntrySize = maxSize / 10 } = {}) {
    this.dispatcher = dispatcher
    this.store = new CacheStore({ maxSize, maxEntrySize })
  }

  dispatch(opts, handler) {
    if (opts.headers?.['cache-control'] || opts.headers?.authorization) {
      // TODO (fix): Support all cache control directives...
      // const cacheControl = cacheControlParser.parse(opts.headers['cache-control'])
      // cacheControl['no-cache']
      // cacheControl['no-store']
      // cacheControl['max-age']
      // cacheControl['max-stale']
      // cacheControl['min-fresh']
      // cacheControl['no-transform']
      // cacheControl['only-if-cached']
      this.dispatcher.dispatch(opts, handler)
      return
    }

    // TODO (fix): Support all methods?
    if (opts.method !== 'GET' && opts.method !== 'HEAD') {
      this.dispatcher.dispatch(opts, handler)
      return
    }

    // TODO (fix): Support body?
    opts.body?.resume()

    // TODO (fix): How to generate key?
    const key = `${opts.method}:${opts.path}`

    this.store.get(key, (err, value) => {
      if (err) {
        // TODO (fix): How to handle cache errors?
        handler.onError(err)
      } else if (value) {
        const { statusCode, statusMessage, rawHeaders, rawTrailers } = value
        const ac = new AbortController()
        const signal = ac.signal

        let _resume = null
        const resume = () => {
          _resume?.(null)
          _resume = null
        }
        const abort = () => {
          ac.abort()
          resume()
        }

        try {
          handler.onConnect(abort)
          signal.throwIfAborted()
          handler.onHeaders(statusCode, rawHeaders, resume, statusMessage)
          signal.throwIfAborted()

          stream.pipeline(
            value,
            new stream.Writable({
              signal,
              write(chunk, encoding, callback) {
                try {
                  if (handler.onData(chunk) === false) {
                    _resume = callback
                  } else {
                    callback(null)
                  }
                } catch (err) {
                  callback(err)
                }
              },
            }),
            (err) => {
              if (err) {
                handler.onError(err)
              } else {
                handler.onComplete(rawTrailers)
              }
            }
          )
        } catch (err) {
          handler.onError(err)
        }
      } else {
        this.dispatcher.dispatch(opts, new CacheHandler({ handler, store: this.store, key }))
      }
    })
  }
}
@metcoder95
Member

Is there anything in particular that you'd like to extend/add (besides the comments added)?

@ronag
Member Author

ronag commented Sep 9, 2023

Did a minor update and added more TODOs I could think of.

@ronag
Member Author

ronag commented Sep 9, 2023

I think the two major issues are:

  1. How to generate the key? User passes a function that takes opts?
  2. Filesystem backed cache. Maybe a two layered cache would be nice. In memory for recent and small entries and filesystem for older and/or larger entries.

@metcoder95
Member

  1. How to generate the key? User passes a function that takes opts?

I think this should be sufficient. In the worst case, if not provided, we can create one based on the origin + HTTP method + HTTP path. Although that can lead to collisions, it would highlight the need for the caller to supply a key-generator function.
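
A fallback along those lines could be sketched like this (`buildCacheKey` and the `keyGenerator` argument are hypothetical names for illustration, not undici API):

```javascript
// Sketch of a default key generator, as suggested above: derive the key
// from origin + method + path, but let callers override it with their
// own function. Names here are illustrative only.
function buildCacheKey (opts, keyGenerator) {
  if (typeof keyGenerator === 'function') {
    return keyGenerator(opts)
  }
  // Default: origin + HTTP method + path. Collisions are possible if
  // responses vary on request headers (e.g. Accept), which is exactly
  // why a custom generator can be supplied.
  return `${opts.origin ?? ''}:${opts.method}:${opts.path}`
}
```

For example, `buildCacheKey({ origin: 'https://example.com', method: 'GET', path: '/a' })` yields `'https://example.com:GET:/a'`.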

  2. Filesystem backed cache. Maybe a two layered cache would be nice. In memory for recent and small entries and filesystem for older and/or larger entries.

Some sort of log-structured merge tree?
We also need to be mindful of eviction: some 3xx responses are ephemeral, while permanent redirects can be kept for much longer (if not the whole lifetime of the app). An option to disable the file-backed cache is also a must, as some environments restrict fs access (thinking especially of serverless).
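
The two-layer idea could be sketched roughly like this (everything here is hypothetical: the class, the promotion policy, and the `secondary` store interface; a real implementation would stream large bodies to disk rather than hold values in memory):

```javascript
// Sketch of a two-tier cache: a small in-memory Map for hot/small
// entries, with a slower secondary store (filesystem, Redis, ...)
// behind it. Small entries are promoted into memory on a secondary hit.
class TieredCache {
  constructor (secondary, { maxMemoryEntrySize = 64 * 1024 } = {}) {
    this.memory = new Map()
    this.secondary = secondary // must expose async get(key) / set(key, value)
    this.maxMemoryEntrySize = maxMemoryEntrySize
  }

  async get (key) {
    const hit = this.memory.get(key)
    if (hit !== undefined) return hit
    const value = await this.secondary.get(key)
    // Promote small entries into the memory tier.
    if (value !== undefined && value.size <= this.maxMemoryEntrySize) {
      this.memory.set(key, value)
    }
    return value
  }

  async set (key, value) {
    if (value.size <= this.maxMemoryEntrySize) {
      this.memory.set(key, value)
    }
    // Always write through to the secondary tier.
    await this.secondary.set(key, value)
  }
}
```

Eviction policy (TTL per entry, 3xx lifetimes) would sit on top of this; the sketch only shows the tiering.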

@ronag ronag mentioned this issue Sep 13, 2023
@thewilkybarkid

An option to disable the file-backed cache is also a must, as some environments restrict fs (thinking about serverless especially)

A configurable cache store would be great. We're currently using make-fetch-happen as it's the only Node fetch implementation with a built-in cache (that I'm aware of), but it uses cacache and so is restricted to file systems. (I'd love to be able to use Redis as storage.)

@metcoder95
Member

A configurable cache store would be great.

That sounds great; maybe we should define a set of APIs as a contract that custom cache stores can follow.
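
One way to phrase such a contract, matching the shape of the `CacheStore` in the snippet above (`create` returns a writable entry, `get` retrieves a finished one); nothing here is settled API, just a sketch of the surface a Redis- or fs-backed store would implement:

```javascript
import stream from 'node:stream'

// Minimal in-memory reference implementation of the implied store
// contract. Names follow the sketch in the issue; they are not final.
class InMemoryCacheStore {
  constructor () {
    this.cache = new Map()
  }

  // Returns a writable entry stream; the handler assigns statusCode,
  // rawHeaders, etc. and writes the body. The entry only becomes
  // visible to get() once it has been fully written.
  create (key, ttl) {
    const entry = Object.assign(new stream.PassThrough(), { size: 0 })
    entry.on('finish', () => {
      this.cache.set(key, { entry, expires: Date.now() + ttl })
    })
    return entry
  }

  // Node-style callback so stores backed by async I/O (Redis, fs)
  // fit the same shape.
  get (key, callback) {
    const hit = this.cache.get(key)
    if (!hit || hit.expires < Date.now()) {
      callback(null, undefined)
    } else {
      callback(null, hit.entry)
    }
  }
}
```

A store like this could then be passed to the dispatcher instead of the built-in LRU-backed one, which is what makes a Redis variant possible.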

@mcollina
Member

https://github.com/mcollina/async-cache-dedupe should have all the things we need (including Redis support).

@metcoder95
Member

I'd be working on a PoC soon, I'll open a draft as soon as I have something 🙂

flakey5 added a commit to flakey5/undici that referenced this issue Sep 7, 2024
Implements bare-bones http caching as per rfc9111

Closes nodejs#3231
Closes nodejs#2760
Closes nodejs#2256
Closes nodejs#1146

Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>
@flakey5 flakey5 linked a pull request Sep 7, 2024 that will close this issue
flakey5 added a commit to flakey5/undici that referenced this issue Sep 11, 2024
Implements bare-bones http caching as per rfc9111

Closes nodejs#3231
Closes nodejs#2760
Closes nodejs#2256
Closes nodejs#1146

Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>

Co-authored-by: Carlos Fuentes <me@metcoder.dev>
Co-authored-by: Robert Nagy <ronagy@icloud.com>
flakey5 added a commit to flakey5/undici that referenced this issue Sep 12, 2024
Implements bare-bones http caching as per rfc9111

Closes nodejs#3231
Closes nodejs#2760
Closes nodejs#2256
Closes nodejs#1146

Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>

Co-authored-by: Carlos Fuentes <me@metcoder.dev>
Co-authored-by: Robert Nagy <ronagy@icloud.com>

Co-authored-by: Isak Törnros <isak.tornros@hotmail.com>
flakey5 added a commit to flakey5/undici that referenced this issue Sep 14, 2024
Implements bare-bones http caching as per rfc9111

Closes nodejs#3231
Closes nodejs#2760
Closes nodejs#2256
Closes nodejs#1146

Co-authored-by: Carlos Fuentes <me@metcoder.dev>

Co-authored-by: Robert Nagy <ronagy@icloud.com>

Co-authored-by: Isak Törnros <isak.tornros@hotmail.com>

Signed-off-by: flakey5 <73616808+flakey5@users.noreply.github.com>