Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree Sort on first page load is incorrect #307

Closed
yangjuncode opened this issue Apr 2, 2021 · 16 comments · Fixed by #326 or #336
Closed

Tree Sort on first page load is incorrect #307

yangjuncode opened this issue Apr 2, 2021 · 16 comments · Fixed by #326 or #336
Labels
bug Something isn't working enhancement New feature or request

Comments

@yangjuncode
Copy link

I'm submitting a Bug report

Your Environment

git current master branch

Describe the Bug

Steps to Reproduce

  1. little modification of example 5 line 189
      d['id'] = ''+i;
      d['indent'] = indent;
      d['parentId'] = parentId;
      d['title'] = 'Task ' + i +" P:"+parentId;
  1. yarn dev:watch

  2. switch to example 5, the Hierarchical tree is show sort by title, not count the tree parent/child relation
    2021-04-02_17-23

  3. click title column two time to switch sort back to ascending, now the tree show is correct with the parent/child relation .
    2021-04-02_17-24

Expected Behavior

the tree should show the parent/child relation on first show.

Current Behavior

the tree not show the parent/child relation on first show.

@ghiscoding
Copy link
Owner

Hmm it looks ok to me but I just go on the live demo, I don't really have time to look into this but the code that runs the initial sort is in the Sort Service on this function

/** Process the initial sort, typically it will sort ascending by the column that has the Tree Data unless user specifies a different initialSort */
processTreeDataInitialSort() {
// when a Tree Data view is defined, we must sort the data so that the UI works correctly
if (this._gridOptions && this._gridOptions.enableTreeData && this._gridOptions.treeDataOptions) {
// first presort it once by tree level
const treeDataOptions = this._gridOptions.treeDataOptions;
const columnWithTreeData = this._columnDefinitions.find((col: Column) => col && col.id === treeDataOptions.columnId);
if (columnWithTreeData) {
let sortDirection = SortDirection.ASC;
let sortTreeLevelColumn: ColumnSort = { columnId: treeDataOptions.columnId, sortCol: columnWithTreeData, sortAsc: true };
// user could provide a custom sort field id, if so get that column and sort by it
if (treeDataOptions && treeDataOptions.initialSort && treeDataOptions.initialSort.columnId) {
const initialSortColumnId = treeDataOptions.initialSort.columnId;
const initialSortColumn = this._columnDefinitions.find((col: Column) => col.id === initialSortColumnId);
sortDirection = (treeDataOptions.initialSort.direction || SortDirection.ASC).toUpperCase() as SortDirection;
sortTreeLevelColumn = { columnId: initialSortColumnId, sortCol: initialSortColumn, sortAsc: (sortDirection === SortDirection.ASC) } as ColumnSort;
}
// when we have a valid column with Tree Data, we can sort by that column
if (sortTreeLevelColumn && sortTreeLevelColumn.columnId) {
this.updateSorting([{ columnId: sortTreeLevelColumn.columnId || '', direction: sortDirection }]);
}
}
}
}

There's a lot of recursion when dealing with Tree Data

@ghiscoding ghiscoding changed the title tree sort first time is not corrent Tree Sort on first page load is incorrect Apr 8, 2021
@ghiscoding ghiscoding added the bug Something isn't working label Apr 30, 2021
@ghiscoding
Copy link
Owner

@yangjuncode some good news...

I finally got back to the need to use and create a Tree Data grid (we actually have 3 different tree data grids to do) and found your reported issue on Example 5 and even other issues as well. I expect to have fixes sometime next week. The list of issues I have identified so far are:

  1. the issue you report with initial tree sort
    • it seems to be a racing condition and regardless of that, I will make sure the sort is always called when loading the grid
  2. I seem to have found multiple rendering of the data in the grid and also multiple conversions (from hierarchical to flat and vice versa) which would off course impact the performance
    • I found out that on Example 5, it first load the data without the tree, then it sorts and build the tree and then it again re-render the grid. This is obviously not wanted and is also bad for performance, we should only load and render the grid only once.
  3. The Example 5 seems to require the tree level depth (identified as indent in that demo) but that should not be required at all , a parentId property should be more than enough to build the full Tree.
    • it currently throws an error if we remove all the calculated indent, the end goal is to remove the indent and let the lib do the work to build the tree and calculate that tree level depth. It just shouldn't require the developer to provide that extra depth

I believe that I got proof of concept for all 3 but I need to do more testing to make sure it's all good before releasing anything and again I expect sometime next week to be done with all this.

Thanks for the feedback and patience 😉

@yangjuncode
Copy link
Author

thanks for your hard work, looking forward to the new tree data grid.

ghiscoding-SE pushed a commit that referenced this issue Apr 30, 2021
1. initial sort not always working
2. tree level property should not be required while providing a parentId relation
3. tree data was loading and rendering the grid more than once (at least 1x time before the sort and another time after the tree was built) while it should only be rendered once
@ghiscoding
Copy link
Owner

@yangjuncode
I'll do some more test early next week but if you would give it a try, that'd be nice. It's still an open PR #326
If you do have a chance to try, then let me know if it all works.
Thanks

@yangjuncode
Copy link
Author

thanks for the fix, i am just return to work.
when i open the issue, the performance is not good in tree grid, we test 100K row and it operate laggy, i don't know if that problem is fixed by the patch, i will ask my colleague for more details.

@yangjuncode
Copy link
Author

@ghiscoding
i test with master with example 05 with 100k row and init can't finish processing, then chrome show wait or exit page alert.
my colleague told me that he want checkbox function which is not available now in tree grid. fancytree example

@ghiscoding
Copy link
Owner

ghiscoding commented May 6, 2021

hi sorry but I can't really do much about performance with that many data, Tree Data requires a lot of recursive calls and it often have to switch and convert between flat structure to hierarchical (tree) structure (and vice versa) and for that it uses recursion (for example the sort must be done on the hierarchical structure and once it's done it must convert it back to flat structure then show it on the screen because that is what SlickGrid uses, so there's a lot of conversion back and forth). SlickGrid internals (core lib) doesn't actually work with a hierarchical structure, what I've really done is to add all these convert functions to switch between flat to hierarchical and vice versa. I feel that 100k is way too much data to work with a Tree Data structure (some other grid libs might do a better job at that if their code uses hierarchical directly but SlickGrid does not, so it will always be slower). 100k data should be used with a Backend Service (OData, GraphQL) however it work with Tree Data (because it needs the entire dataset to work since it convert flat to hierarchical back and forth). The other problem with 100k rows is that all these conversions are done on your computer by the CPU and for those users who have a slower CPU, it will be worst for them. Our data usage will never go over 5k rows, so it's not too bad for our use case.

Row Selection does work in every grid type and yes it does work with Tree Data as well, you can see below on the grid that I'm currently working on. You should read the Row Selection - Wiki, every information you need to know is in that Wiki.

Also note that I didn't release a new version yet, I will push a new version probably next week

image

@yangjuncode
Copy link
Author

since slickgrid-universal is operating fluently in 1m rows, we thought it may also ok with 100k in tree.
though 100k rows may be too much for slickgrid-universal now, but think twice the data is actually less, maybe some aux index would solve the problem like in database.

@ghiscoding
Copy link
Owner

ghiscoding commented May 7, 2021

no I think you misunderstand a few things, the problem is not SlickGrid itself, here's a few things to note

  • SlickGrid can work with 100k without being too slow with a regular data structure, for example take the Example 2 and click on the button "50k rows" and you will see that it still works very fast and smooth. Tree Data is not a regular structure (commonly known as a flat data structure)
  • SlickGrid (the core library) does not support Tree Data, please remember that Slickgrid-Universal is a wrapper on top of SlickGrid
  • to get Tree Data to work, I had to create a lot of recursive methods to convert between a flat structure into a tree structure and vice versa (on Example 5, click on these 2 buttons to know the difference between the 2 structures)
    image
  • the recursive methods I created, are convertParentChildArrayToHierarchicalView, convertHierarchicalViewToParentChildArray and sortTreeChild

What is really slow is that currently the steps are

  1. take parent/child array (flat structure) and convert it into hierarchical structure (tree structure) --> this is fast
  2. sort the hierarchical (tree) --> this is fast
  3. convert back to flat structure -> this is slow.....
  4. finally we can now display the flat structure into the browser

if you sort, then we need to repeat steps 1 to 4 and so on.. and so on

We do all this because SlickGrid does not support Tree Data directly, it only supports a flat data structure and that is why Example 2 is fast with 50k rows but Example 5 is much slower with 50k rows.

There might be some small improvements I can do (anytime we load a tree data grid it will do step 1 to 4, maybe we can omit step 2 and 3 if it's already sorted?) Some of the process are slow, I'm not sure if my methods are the fastest, I just wanted it to work (performance improvement can come later).

@yangjuncode
Copy link
Author

yangjuncode commented May 7, 2021

@ghiscoding thanks for the details.

  1. the tree structure can be cached for later use so it need create only once,of course,it will introduce sync work. and i see a deepcopy in convertParentChildArrayToHierarchicalView(i don't know if this can be omit, deepcopy for large data is slow)
  2. sort on tree structure should as fast as flat structure
  3. in my opinion, this should be fast,it is only iterate the tree one time on depth(maybe need profile it)

the problem maybe is too many new object generated for tree grid.

@ghiscoding
Copy link
Owner

the deep copy that is there cannot be removed because it's converting by reference and that mutate (change) the in input array. The only alternative I can think of would be to find an external lib that those a better job at it and is more tested.

I believe your point number 2 is wrong, the only way that I know of to sort a flat structure is really to first convert tree structure then sort and finally back to flat. This 3 steps process is obviously quite expensive (CPU), if you find a better way, feel free to contribute.

@ghiscoding
Copy link
Owner

ghiscoding commented May 8, 2021

Good news, I found some libraries that provide, this one in particular un-flatten-tree which is extremely fast compare to my original implementation, now we are back to some normal numbers

Doing some metrics, with 20k rows and printing the stats only for the number 3 that was my slow code (convert tree to flat) and now it's extremely faster and will work with 100k rows (I would never have that many but apparently you do). The test I'm doing is just to sort the "Title" column.

with old code

  • 3334.135009765625 ms
  • 3354.64892578125 ms

with new code

  • 67.64697265625 ms
  • 70.06689453125 ms

...so does it work with 100k rows now? yes it does but it takes couple of seconds (instead of couple of minutes with old code)

  • 357.5859375 ms

here's a full detailed stats when loading the page with 100k rows and doing a sort on "Title" (the tree structure is only executed once and it's fast).

  1. flat parent/child to tree: 159.07421875 ms --> convert to tree then cache it (only executed once, unless you reload the page)
  2. sort tree array: 80.9580078125 ms --> start the sort on the "Title" column
  3. tree to flat parent/child: 346.322021484375 ms
  4. reconvert to flat parent/child: 346.55126953125 ms

That was mainly POC (proof of concept) and I still have to cleanup all the code and add all unit tests but at least now it's back to a speed that is more expected of 🚀

EDIT

Some more stats after I cleanup the code and fixed couple of other small issues found. Below is another test with 100k rows and a sort on the "Title" column, it's even faster than before (maybe because I cleaned up a lot of old code). However, notice that the most expensive task is the SlickGrid sort in the UI (the slowest part is not the tree sort but this call dataView.setItems which is the core lib). Also note this is a new stat that I just added because I find that the UI takes time to finish the sort so I was curious to see its stat too.

  1. sort tree array: 84.18505859375 ms
  2. flatten the tree: 49.78515625 ms
  3. reconvert to flat parent/child: 49.981689453125 ms
  4. sort changed: 5869.675048828125 ms

So 5.8 sec to wait for the sort on 100k items is not bad but it still is something to wait for. With that in mind, I will add extra events so that we can show a spinner if we want while waiting for the sort to finish (maybe 2 event named onBeforeSortChanged and onAfterSortChanged). Also note that the Title column uses a string, so it uses a regular string sort, I say that because if your column is a different field type that requires a different sort comparer, it might be even time (the most expensive of them all is Date sort).

@ghiscoding
Copy link
Owner

@yangjuncode I'm finally done with the 2nd and last PR for Tree Data, I believed that I fixed everything, the performance is improved by a lot (40-50x times like I said in previous post) and it works nicely with 100k rows (I added a button to show 25k rows in the Example 5). I also added a bunch of new events if you're dealing with large dataset then I strongly suggest that you use them to show a spinner as I did in Example 5 (see them all here). For a complete list of everything that got fixed and improved take a look at the full list of changes/fixes/enhancements (there's a lot) in the PR #336

Also for your info, the GitHub demo page always gets rebuild anytime a PR is merged and so Example 5 already as the new code including the 25k row button to test it out. Give it a try, I'm done with it

I have a couple of other things that I want to look into, which are unrelated to Tree Data, so I'm not expecting to release a new version until at least next week.

@yangjuncode
Copy link
Author

wow,excellent work.

in my pc, for example 05: 25k is nice, 50k: 3s, 100k:>=10s for every sort.

there should be something can be improved, but it should be another issue.

@ghiscoding
Copy link
Owner

ghiscoding commented May 12, 2021

Well this is it, I fixed whatever I wanted to fix and I need to focus on other things, so I'm officially done with it (I'm currently doing a project now, well 3 different teams actually, with Tree Data which is also why I did a few other enhancements and fixes and I think I covered everything I had to cover).
100k rows is a not typical use case, at least it will never be that big on our side.
This project is always open for any contributions and so if you find something to improve then feel free to create a Pull Request.

oh and I also created a new Tree Data - Wiki today.

@ghiscoding
Copy link
Owner

it took some time because I worked on other features and it is now release, see the new version 0.14.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
2 participants