setQueryData slow with large data sets #1633

Closed
andrewnaeve opened this issue Jan 12, 2021 · 5 comments

@andrewnaeve

Describe the bug
This may be a performance question, or maybe I need a paradigm shift.

I have a large query that returns several MB of data for a table that is sorted frequently. When I sort the data and call queryClient.setQueryData() with the newly sorted rows, the call takes around half a second to complete. When I move the data into React state instead, sorting takes about 10 ms, but getting React state to sync nicely with React Query is challenging.

In my profiler, I see quite a bit of time being taken by "replaceEqualDeep".

I'm wondering if I'm using react-query incorrectly. Should I be replacing the cached query data like this? Are there any better strategies for large datasets?

Screenshots
calltree

Desktop (please complete the following information):

  • macOS, latest Firefox and Chrome
  • "react-query": "^3.5.5"
@TkDodo
Collaborator

TkDodo commented Jan 12, 2021

This might be due to react-query's structural sharing feature. You can turn that off by setting structuralSharing: false on your query.
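To illustrate why structural sharing can get expensive on large data, here is a simplified sketch of the idea. It is not react-query's actual replaceEqualDeep; the name shareStructure is made up for this example. It walks the old and new values together and reuses old references wherever the contents are equal, so every new result means a full deep traversal:

```javascript
// Simplified illustration of structural sharing. NOT react-query's real
// replaceEqualDeep; `shareStructure` is a hypothetical name for this sketch.
function isPlainObject(value) {
  return (
    value !== null &&
    typeof value === 'object' &&
    Object.getPrototypeOf(value) === Object.prototype
  );
}

function shareStructure(prev, next) {
  if (prev === next) return prev;

  const bothArrays = Array.isArray(prev) && Array.isArray(next);
  const bothObjects = isPlainObject(prev) && isPlainObject(next);
  if (!bothArrays && !bothObjects) return next;

  const nextKeys = Object.keys(next);
  const copy = bothArrays ? [] : {};
  let reused = 0;

  // Every key of the new value is visited recursively.
  for (const key of nextKeys) {
    copy[key] = shareStructure(prev[key], next[key]);
    if (key in prev && copy[key] === prev[key]) reused++;
  }

  // If nothing changed, hand back the old reference so consumers can skip work.
  const sameShape = Object.keys(prev).length === nextKeys.length;
  return sameShape && reused === nextKeys.length ? prev : copy;
}

const prev = { rows: [{ id: 1, name: 'a' }, { id: 2, name: 'b' }] };
// Same rows, different order: no index matches, so everything is walked and copied.
const sorted = { rows: [{ id: 2, name: 'b' }, { id: 1, name: 'a' }] };
const merged = shareStructure(prev, sorted);
```

With re-sorted rows, almost no position matches its previous value, so the walk touches the whole data set and ends up reusing very little. That would be consistent with the time seen under replaceEqualDeep in the profile, though the real implementation differs in detail.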

However, I would advise against mixing client state with server state and manipulating the query in the cache directly. The idea is that the data you see is just a borrowed "view" of the actual server state, so the client does not own it. If a background update happens, or another component mounts that also uses this query, the data will be re-fetched and you will lose your client-side sorting.

I think a good alternative would be to keep server and client state separate. In your case, that would mean:

  • keep the data that you get from the server in the queryClient and do not change it.
  • only store the sorting "choice" that the user made in react state on the frontend.
  • compute the new data in the render function, possibly with useMemo, combining those two states.

This could be nicely abstracted in a custom hook, something like:

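import React from 'react'
import { useQuery } from 'react-query'

// fetchBigData and sortedDataDependingOnStoredSorting are placeholders
// that you would provide yourself.
const useBigData = () => {
    // server state: lives in the query cache and is never modified
    const queryInfo = useQuery('bigData', () => fetchBigData())
    // client state: accepts 'foo' | 'bar' as sorting keys
    const [sorting, setSorting] = React.useState(undefined)

    // derived data: recomputed only when the server data or the sorting changes
    const data = React.useMemo(() => {
        return sortedDataDependingOnStoredSorting(queryInfo.data, sorting)
    }, [queryInfo.data, sorting])

    return {
        query: { ...queryInfo, data },
        setSorting,
        sorting,
    }
}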

You can use this like so in your component:

const { query, sorting, setSorting } = useBigData()

<Button onClick={() => setSorting('foo')}>Sort by Foo</Button>
<Button onClick={() => setSorting('bar')}>Sort by Bar</Button>
<DataTable>{query.data}</DataTable>

If the server data updates, it will be sorted. If the user changes the sorting, it will be sorted. In my experience, this approach works very well in most cases.
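The helper sortedDataDependingOnStoredSorting used in the hook is not defined anywhere in this thread. A minimal sketch of what it could look like, assuming the data is an array of plain objects and the sorting key names a property that can be compared with < and >:

```javascript
// Hypothetical implementation of the placeholder helper from the hook above.
// Assumes `data` is an array of plain objects and `sorting` is a property name.
function sortedDataDependingOnStoredSorting(data, sorting) {
  // While the query is still loading, or no sorting is chosen, pass data through.
  if (!data || sorting === undefined) return data;

  // Copy before sorting so the array owned by the query cache is never mutated.
  return [...data].sort((a, b) => {
    if (a[sorting] < b[sorting]) return -1;
    if (a[sorting] > b[sorting]) return 1;
    return 0;
  });
}

const rows = [
  { foo: 3, bar: 'x' },
  { foo: 1, bar: 'z' },
  { foo: 2, bar: 'y' },
];
const byFoo = sortedDataDependingOnStoredSorting(rows, 'foo');
const byBar = sortedDataDependingOnStoredSorting(rows, 'bar');
```

Copying the array matters here: Array.prototype.sort sorts in place, and mutating the cached array would change the server state behind react-query's back.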

@andrewnaeve
Author

Thanks for this great reply. Thought I might have been thinking about it incorrectly. I'll give that a shot.

@stephen776

We are experiencing this issue on react-native without a particularly large set of data. The dataset is an object with a handful of primitive properties and one property that is an array of objects with maybe 20 entries.

We are calling setQueryData upon receipt of an event over websockets that occurs about every 3 seconds. The JS thread eventually grinds to a halt.

@baughmann

I'm sorry to say this is still an issue, even without structuralSharing. I've had to move away from this library for my app's primary data because of this, though I still use it for data that is not frequently updated via a websocket.

@SebKranz
Contributor

SebKranz commented Jan 27, 2024

Screenshot 2024-01-27 at 22 58 45

It looks like setQueryData uses find internally, which iterates over every value in the cache. This means that if I receive a list from the server and want to prime the cache with every value, the complexity is almost quadratic.

As a workaround, I accessed the Query object directly:

// This returns the query for this key if it exists, or creates a new one with default options.
// Unlike `setQueryData`, it uses the hash table to find the entry.
const query = client.getQueryCache().build(client, { queryKey: /* ... */ })
query.setData(data)

With this, inserting many values into the cache is now as fast as expected.

I believe this should be brought to attention in a separate ticket. The only advantage of this quadratic behaviour is that you can update many existing values in one call, but I doubt that this is needed more often than my use case.
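To make the complexity argument concrete, here is a toy model (not react-query's QueryCache; ToyCache and its methods are invented for this sketch) that counts comparisons when priming a cache with n new entries, once via a linear scan and once via a hash lookup:

```javascript
// Toy cache for illustration only. It is NOT react-query's QueryCache;
// all names here are hypothetical.
class ToyCache {
  constructor() {
    this.entries = [];       // every entry, in insertion order
    this.byHash = new Map(); // hashed key -> entry
    this.comparisons = 0;    // counts work done by the linear scan
  }

  hash(key) {
    return JSON.stringify(key);
  }

  add(hash) {
    const entry = { hash, data: undefined };
    this.entries.push(entry);
    this.byHash.set(hash, entry);
    return entry;
  }

  // Linear scan: looks at every existing entry until one matches.
  setByScan(key, data) {
    const hash = this.hash(key);
    let entry = this.entries.find((e) => {
      this.comparisons++;
      return e.hash === hash;
    });
    if (!entry) entry = this.add(hash);
    entry.data = data;
  }

  // Hash lookup: goes straight to the entry without scanning.
  setByHash(key, data) {
    const hash = this.hash(key);
    const entry = this.byHash.get(hash) ?? this.add(hash);
    entry.data = data;
  }
}

const n = 1000;

const scanned = new ToyCache();
for (let i = 0; i < n; i++) scanned.setByScan(['item', i], { id: i });

const hashed = new ToyCache();
for (let i = 0; i < n; i++) hashed.setByHash(['item', i], { id: i });

console.log(scanned.comparisons); // n * (n - 1) / 2 = 499500
console.log(hashed.comparisons);  // 0
```

In this model the i-th insert scans the i entries already present, so priming n entries costs n(n-1)/2 comparisons, while the hash lookup does none. How closely this matches the real cache depends on the library version.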
