Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump Onyx version to help with IndexedDB slow writes #8053

Merged
merged 6 commits into from
Mar 18, 2022

Conversation

marcaaron
Copy link
Contributor

@marcaaron marcaaron commented Mar 9, 2022

Details

This is a hot fix for a react-native-onyx issue affecting web/desktop related to slow IndexedDB writing blocking the main thread.

Fixed Issues

$ Expensify/react-native-onyx#119

Tests (tested against the production API via local proxy)

  1. Sign out of NewDot
  2. Sign back in with an Expensify account with lots of data
  3. Once the sidebar loads immediately switch to another chat via the sidebar
  4. Verify there is no white screen and the UI doesn't hang for multiple seconds
  • Verify that no errors appear in the JS console

QA Steps

  1. Sign out of NewDot
  2. Sign back in with an Expensify account with lots of data
  3. Once the sidebar loads immediately switch to another chat via the sidebar
  4. Verify there is no white screen and the UI doesn't hang for multiple seconds
  • Verify that no errors appear in the JS console

Screenshots

@marcaaron marcaaron self-assigned this Mar 9, 2022
@marcaaron marcaaron requested a review from a team as a code owner March 9, 2022 16:16
@MelvinBot MelvinBot requested review from tylerkaraszewski and removed request for a team March 9, 2022 16:17
@marcaaron
Copy link
Contributor Author

cc @AndrewGable @Julesssss since y'all approved the react-native-onyx PR

Copy link
Contributor

@Julesssss Julesssss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Soooo. Things do seem quicker, but I'm seeing a new issue that is not occurring on staging. Tapping a chat will temporarily hide it from the LHN...

Chats-Dissapearing.mov

@marcaaron
Copy link
Contributor Author

Hmm... my LHN is in a completely different order on staging than it is on dev (staging looks wrong). But I do see the chat disappearing thing.... let me try and debug why it happens.

@marcaaron
Copy link
Contributor Author

Only seems to happen for certain chats. After clicking through a bunch where it was happening I'm having a hard time reproducing. 🤔

@marcaaron
Copy link
Contributor Author

One thing I'm noticing is that when it happens we make an extra call to PersonalDetails_GetForEmails and we set a huge thing of 230+ details from inside Onyx...

I suspect this is happening here:

const formattedPersonalDetails = formatPersonalDetails(details);
Onyx.merge(ONYXKEYS.PERSONAL_DETAILS, formattedPersonalDetails);

We only set the details for the participants of that report. But whenever we do a merge we always get first and then combine the data together see Onyx code here.

That should be fine to do sync and I can't think of why it would make the option row disappear. The lack of consistent reproduction here makes it hard to debug.

Moving on... the only time this should happen however is when this gets called here:

if (!this.props.report.reportID) {
Report.fetchChatReportsByIDs([this.props.reportID], true);
}

So that must mean that we are missing the report object whenever this happens... ?

Forced the report fetch to run by removing the conditional and verified the sidebar row did not disappear consistently (so it's maybe related to the missing report thing which I'm not able to reproduce).

Not sure what we should do in this case. I think we should maybe just revert the Onyx PR for now. The reported issue does eventually seem to sort itself out and I can't repro it at all with my current data...

@marcaaron marcaaron changed the title Bump Onyx version to help with IndexedDB slow writes [HOLD] Bump Onyx version to help with IndexedDB slow writes Mar 9, 2022
@marcaaron
Copy link
Contributor Author

marcaaron commented Mar 9, 2022

Alright so verified the issue is because the reportID is not on the report object and we are ending up with partial report objects somehow. Which is pretty weird. I thought it could be related to writes happening too slowly (but that seems unlikely because even after all the writes are done we still have partial report data)...

Here's what the report page had for a report when the bug occurs (first one is fine next one is not)

2022-03-09_09-38-33

Going to look at what might cause partial report data to be set. My assumption was that since we have caching in Onyx that it doesn't really matter whether we throttle the writes or not, but data is getting corrupted somehow.

The fact that we only have optimisticActionIds, reportName, and icons as properties is probably a clue into where things are breaking down.

@marcaaron
Copy link
Contributor Author

Full report data appears and we try to set it here:

Onyx.mergeCollection(ONYXKEYS.COLLECTION.REPORT, simplifiedReports);

2022-03-09_09-53-22

But IndexedDB only shows partial data... 🤔
2022-03-09_09-53-35

Something is broken 😅

@marcaaron
Copy link
Contributor Author

marcaaron commented Mar 9, 2022

Got a bit closer to figuring this out, but not sure if I understand the problem entirely. Could use some 👀 from @kidroca here.

Basically, the implementation of Onyx.multiMerge() seems to be getting screwed up now because we are not waiting for previous writes (now synchronous) to finish before we call getItem() here. Even though the cache is synced at this point grabbing old values from storage is possible. And when that happens we will call localforage.setItem() with partial data.

2022-03-09_10-23-17

This is bad. But we should also be updating the cache when we call mergeCollection() regardless of whether we set this bad data or not. And it looks like on a fresh sign in the cache can look like this

2022-03-09_10-33-37

And on a page refresh looks fine... however, the data that is making it to the report page does not seem synced with the cache since we can still end up on the ReportActionsView with partial report data at this point (not sure exactly how that part happens - but will look at it next).

@marcaaron
Copy link
Contributor Author

Ok scratch the part about the cache being fine on refresh. It looks like it can be only a partial cache sometimes and with bad data having been set during mergeCollection() we will run into the problem. This is only a small set of the reports I have and if the data in IndexedDB is getting corrupted then when we add a new connection we'll get a broken report.

2022-03-09_10-47-25

I think if we prevent setting the bad partial data somehow we will be good. Will think about how to do this.

@kidroca
Copy link
Contributor

kidroca commented Mar 9, 2022

I'll take a look as well

@kidroca
Copy link
Contributor

kidroca commented Mar 9, 2022

@marcaaron
I think you've actually found the problem that causes: #7579

When I implemented multiMerge I've tried to make it work the same way AsyncStorage works
It would read existing data from disk and then merge new data to it

Making writing happening in sequence just made the problem easier to recreated:

  1. multiMerge is called and asked to write some keys
  2. It tries to retrieve the stored value for this key, but it's null because it's write is not completed yet
  3. It assumes there's no existing value and just writes the delta
  4. Since this happens after it gets written after and overwrites the real value

Another thing that "helped" review the problem is this

maxCachedKeysCount = 55,

When we hit the limit some keys are removed from cache, when we connect to the same key since there's no cache data is read from storage - the partial data that was written to storage

Maybe we can make multiMerge wait and run after pending setItem calls, this would ensure it can read data from disk and merge delta to it

Also the limit of 55 cached keys seems to small nowadays, we had a plan to cache everything and remove only LRU safe eviction keys from cache

@tylerkaraszewski tylerkaraszewski removed their request for review March 9, 2022 21:46
@kidroca
Copy link
Contributor

kidroca commented Mar 9, 2022

Ok, that's not everything, cache is getting in a partial state due to maxCacheKeysCount and merging partial data to cache here

https://github.com/Expensify/react-native-onyx/blob/032ff54b4374c93ee12676792a6e029c61ea1e06/lib/Onyx.js#L771-L772

  // Merge original data to cache
  cache.merge(collection);
  1. As already established mergeCollection sometimes merges deltas to existing data
  2. When we hit the maxCachedKeysCount and remove some keys from cache
  3. Merging delta to cache would result in having partial data in cache

@kidroca
Copy link
Contributor

kidroca commented Mar 9, 2022

@marcaaron I've fixed the problem for myself locally by making multiMerge wait prior write operations to be over and removing cache cleanup here

// Try to free some cache whenever we connect to a safe eviction key
cache.removeLeastRecentlyUsedKeys();
LocalForage diff
Index: node_modules/react-native-onyx/lib/storage/providers/LocalForage.js
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/node_modules/react-native-onyx/lib/storage/providers/LocalForage.js b/node_modules/react-native-onyx/lib/storage/providers/LocalForage.js
--- a/node_modules/react-native-onyx/lib/storage/providers/LocalForage.js	
+++ b/node_modules/react-native-onyx/lib/storage/providers/LocalForage.js	(date 1646865700134)
@@ -7,6 +7,7 @@
 import localforage from 'localforage';
 import _ from 'underscore';
 import lodashMerge from 'lodash/merge';
+import createDeferredTask from '../../createDeferredTask';
 
 localforage.config({
     name: 'OnyxDB'
@@ -14,6 +15,7 @@
 
 const writeQueue = [];
 let isQueueProcessing = false;
+let queueTask = null;
 
 /**
  * Writing very quickly to IndexedDB causes performance issues and can lock up the page and lead to jank.
@@ -25,16 +27,26 @@
         return;
     }
 
+    if (!queueTask) {
+        queueTask = createDeferredTask();
+    }
+
     isQueueProcessing = true;
 
-    const {
-        key, value, resolve, reject,
-    } = writeQueue.shift();
+    const {key, value, resolve, reject} = writeQueue.shift();
+
     localforage.setItem(key, value)
         .then(resolve)
         .catch(reject)
         .finally(() => {
             isQueueProcessing = false;
+
+            if (writeQueue.length === 0) {
+                queueTask.resolve();
+                queueTask = null;
+                return;
+            }
+
             processNextWrite();
         });
 }
@@ -61,6 +73,11 @@
      * @return {Promise<void>}
      */
     multiMerge(pairs) {
+        // If the write queue is pending wait and perform the multiMerge after
+        if (queueTask) {
+            return queueTask.promise.then(() => this.multiMerge(pairs));
+        }
+
         const tasks = _.map(pairs, ([key, partialValue]) => this.getItem(key)
             .then((existingValue) => {
                 const newValue = _.isObject(existingValue)

I think we're safe removing cache cleanup for the moment as it would be very hard to fill cache to levels that would crash the app
IMO you would have to open 1000 reports one by one and scroll each of them to the beginning (let's say each has 1000 comments)

We can capture a task to add better cache handling that deals only with safe eviction keys as they are the ones to watch out for

@marcaaron
Copy link
Contributor Author

Thanks @kidroca! Gonna start by writing some more tests in react-native-onyx. It feels like we could use some coverage there since this was pretty easy to break.

@marcaaron
Copy link
Contributor Author

marcaaron commented Mar 10, 2022

Hmm so as I'm writing these tests I'm realizing a few things...

  • Tests are only running against the AsyncStorage mock. That would mean the web storage isn't tested when most tests run. And the localforage mock doesn't really do anything (maybe it should more accurately represent how the storage works).

  • Onyx.mergeCollection() seems to have a bug where calling mergeCollection() with partial data may set that partial data in the cache.

Not sure if it's likely in practice, but one idea to prevent this is to make sure we always call get() for all keys whenever we are going to call mergeCollection(). Otherwise, it's possible to see that:

  • mergeCollection() will set the data correctly in storage, but not in the cache.
  • cache will be updated with partial data and not the merged result of old and new data.

@marcaaron
Copy link
Contributor Author

@Julesssss @kidroca pushed up a variation of the make multiMerge sync solution + automated test to try and address the weirdness in this thread. Things are running pretty smooth now (but probably will need to clean up the changes a bit still).

@marcaaron marcaaron changed the title [HOLD] Bump Onyx version to help with IndexedDB slow writes Bump Onyx version to help with IndexedDB slow writes Mar 18, 2022
@marcaaron
Copy link
Contributor Author

Taking HOLD off here and added merge commit. @thienlnam are you able to give this a look ? 🙇

Would recommend testing this against the production API with a main Expensify account full of data to see the improvements.

@thienlnam
Copy link
Contributor

Going to test this locally and will report back

@thienlnam
Copy link
Contributor

Hellll yeah, it's much snappier 🔥 🐎

@thienlnam thienlnam merged commit 0b0a51e into main Mar 18, 2022
@thienlnam thienlnam deleted the marcaaron-fixIDBSlowWrites branch March 18, 2022 20:08
@OSBotify
Copy link
Contributor

✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release.

@OSBotify
Copy link
Contributor

🚀 Deployed to staging by @thienlnam in version: 1.1.46-0 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 success ✅
🕸 web 🕸 success ✅

@OSBotify
Copy link
Contributor

🚀 Deployed to production by @timszot in version: 1.1.46-3 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 failure ❌
🕸 web 🕸 success ✅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants