Web site should be transferring data compressed #12597

Closed
huonw opened this issue Feb 27, 2014 · 5 comments

Comments

@huonw
Member

huonw commented Feb 27, 2014

Our search doc indexes are huge JS files, which are very amenable to compression.

We're failing pagespeed because of it: http://developers.google.com/speed/pagespeed/insights/?url=http%3A%2F%2Fstatic.rust-lang.org%2Fdoc%2Fmaster%2Fstd%2Findex.html

@lifthrasiir
Contributor

Some quick comparison:

$ curl -s http://static.rust-lang.org/doc/master/std/search-index.js | wc -c
501867
$ curl -s http://static.rust-lang.org/doc/master/std/search-index.js | gzip -c -9 - | wc -c
84310

We can also consider adjusting the index format to exploit deflate/LZ77 algorithms if we want to optimize that further. (Unlikely, but just a possibility)
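
The same comparison can be reproduced offline. The following is a minimal Node.js sketch, assuming a synthetic stand-in for the index (the record shape and the `searchIndex` wrapper are made up for illustration, not rustdoc's real output); it only uses the built-in `zlib` module.

```javascript
const zlib = require("zlib");

// Synthetic stand-in for a search index: many records with repeated keys
// and paths. The field names here are hypothetical.
const records = [];
for (let i = 0; i < 2000; i++) {
  records.push({ type: "fn", name: "item" + i, path: "std::collections", desc: "" });
}

const raw = Buffer.from("var searchIndex = " + JSON.stringify(records) + ";");
// Level 9 corresponds to `gzip -9` in the shell commands above.
const packed = zlib.gzipSync(raw, { level: 9 });

console.log("raw bytes:    ", raw.length);
console.log("gzipped bytes:", packed.length);
console.log("ratio:        ", (packed.length / raw.length).toFixed(3));
```

The exact numbers depend on the input, so none are quoted here; the point is only that highly repetitive JS of this kind shrinks substantially under gzip.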

@huonw
Member Author

huonw commented Feb 27, 2014

It appears that gzipping manually may be the only way to do this: http://stackoverflow.com/a/5447158/1256624

:(

@thestinger
Contributor

@ehsanul
Contributor

ehsanul commented Mar 18, 2014

It doesn't seem like rust-lang.org is using CloudFront; it looks like it is just hosted on S3. In that case, that may not work.

The best solution might be the answer further down, since it won't break on Safari, mobile browsers, and the like.

bors added a commit that referenced this issue Apr 14, 2014
…crichton

This is a series of inter-related commits which depend on #13402 (Prune the paths that do not appear in the index). Please consider this as an early review request; I'll rebase this when the parent PR gets merged and a rebase is required.

----

This PR aims at reducing the size of the search index without removing any actual information. In my measurements with both library and compiler docs, the search index is 52% smaller before gzipping and 16% smaller after gzipping:

```
 1719473 search-index-old.js
 1503299 search-index.js (after #13402, 13% gain)
  724955 search-index-new.js (after this PR, 52% gain w.r.t. #13402)

  262711 search-index-old.js.gz
  214205 search-index.js.gz (after #13402, 18.5% gain)
  179396 search-index-new.js.gz (after this PR, 16% gain w.r.t. #13402)
```

Both the uncompressed and compressed sizes of the search index have been accounted for. While the former would be less relevant once #12597 (Web site should be transferring data compressed) is resolved, the uncompressed index will be around for a while anyway and directly affects the UX of the docs. Moreover, LZ77 (and gzip) can only remove *some* repeated strings (since its search window is limited in size), so optimizing for the uncompressed size often has a positive effect on the compressed size as well.
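
The window limit can be observed directly. This is a small Node.js sketch, not part of the PR: it gzips a random block followed by an exact copy of itself, once with the copy 8 KiB away and once 64 KiB away. DEFLATE back-references reach at most 32 KiB, so only the near copy should be removed. The helper name `gzipSizeOfDoubled` and the block sizes are chosen purely for illustration.

```javascript
const zlib = require("zlib");
const crypto = require("crypto");

// Gzip a random (incompressible) block once, then the same block repeated
// twice, and report both compressed sizes.
function gzipSizeOfDoubled(blockBytes) {
  const block = crypto.randomBytes(blockBytes);
  const once = zlib.gzipSync(block, { level: 9 }).length;
  const twice = zlib.gzipSync(Buffer.concat([block, block]), { level: 9 }).length;
  return { once, twice };
}

// Repeat distance 8 KiB: inside the 32 KiB window, the copy is nearly free.
const near = gzipSizeOfDoubled(8 * 1024);
// Repeat distance 64 KiB: outside the window, the copy costs full price.
const far = gzipSizeOfDoubled(64 * 1024);

console.log("near:", near, "growth x" + (near.twice / near.once).toFixed(2));
console.log("far: ", far, "growth x" + (far.twice / far.once).toFixed(2));
```

In a large index, identical paths that sit far apart behave like the second case, which is why shrinking the uncompressed text can still help after gzip.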

Each commit represents the following incremental improvements, in the order:

1. Parent paths were referred to by their AST `NodeId`s, which tend to be large. We don't need the actual node IDs, so we remap them to smaller sequential numbers. This also means that the list of paths can be a flat array instead of an object.
2. We remap each item type to small predefined numbers. This is strictly intended to reduce the uncompressed size of the search index.
3. We use arrays instead of objects and reconstruct the original objects in the JavaScript code. Since this removes a lot of boilerplate, this affects both the uncompressed and compressed size.
4. (I've found that a centralized `searchIndex` is easier to handle in JS, so I shot one global variable down.)
5. Finally, the repeated paths in the consecutive items are omitted (replaced by an empty string). This also greatly affects both the uncompressed and compressed size.
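
The steps above can be sketched roughly as follows. This is an illustrative JavaScript model only, not the actual rustdoc format: the item fields, the `ITEM_TYPES` table, and the `encodeIndex`/`decodeIndex` helpers are all hypothetical names, and item paths are assumed to be non-empty so that an empty string can mean "same path as the previous item".

```javascript
// Step 2: item kinds become small numbers (illustrative table).
const ITEM_TYPES = ["mod", "struct", "enum", "fn", "method"];

function encodeIndex(items, parentNames) {
  const slotOf = new Map(); // large NodeId -> small sequential slot (step 1)
  const paths = [];         // flat array of parent names, indexed by slot
  let lastPath = "";
  const rows = items.map((item) => {
    let parent = null;
    if (item.parent !== null) {
      if (!slotOf.has(item.parent)) {
        slotOf.set(item.parent, paths.length);
        paths.push(parentNames[item.parent]);
      }
      parent = slotOf.get(item.parent);
    }
    // Step 5: omit a path that repeats the previous item's path.
    const path = item.path === lastPath ? "" : item.path;
    lastPath = item.path;
    // Step 3: a positional array instead of an object with named keys.
    return [ITEM_TYPES.indexOf(item.ty), item.name, path, parent];
  });
  return { paths, rows };
}

// Reconstruct the original objects on the JavaScript side.
function decodeIndex({ paths, rows }) {
  let lastPath = "";
  return rows.map(([ty, name, path, parent]) => {
    if (path !== "") lastPath = path;
    return {
      ty: ITEM_TYPES[ty],
      name,
      path: lastPath,
      parent: parent === null ? null : paths[parent],
    };
  });
}

// Made-up sample data; 90210 stands in for a large AST NodeId.
const items = [
  { ty: "struct", name: "Vec", path: "std::vec", parent: null },
  { ty: "method", name: "push", path: "std::vec", parent: 90210 },
  { ty: "method", name: "pop", path: "std::vec", parent: 90210 },
  { ty: "fn", name: "swap", path: "std::mem", parent: null },
];
const parentNames = { 90210: "Vec" };

const encoded = encodeIndex(items, parentNames);
const decoded = decodeIndex(encoded);
console.log(JSON.stringify(encoded));
console.log(JSON.stringify(items).length, "->", JSON.stringify(encoded).length, "chars");
```

The decoder carries the last seen path forward, so the omitted paths cost nothing at lookup time beyond one pass over the rows when the index is loaded.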

There have been several unsuccessful attempts to reduce the search index. In particular, I explicitly avoided complex optimizations like encoding paths in a compressed form, and only applied an optimization when it had a substantial gain relative to the size of the change. Also, while I've tried to be careful, the lack of proper (non-smoke) tests makes me a bit worried; any advice on testing the search indices would be appreciated.

@alexcrichton
Member

Closing, this is now done through the new official domain, http://doc.rust-lang.org/

bors added a commit to rust-lang-ci/rust that referenced this issue Jul 25, 2022
fix: Fix auto-ref completions inserting into wrong locations

Fixes rust-lang/rust-analyzer#8058