JavaScript heap out of memory #1426

Closed
JulianOrteil opened this issue Jun 10, 2021 · 12 comments
Labels
bug (Something isn't working); fixed in next version (main) (A fix has been implemented and will appear in an upcoming version)

Comments

@JulianOrteil

Environment data

  • Language Server version: 2021.6.0 (also fails on 2021.6.1)
  • OS and version: win32 x64
  • Python version (and distribution if applicable, e.g. Anaconda): 3.8.8; with Anaconda
  • python.analysis.indexing: undefined
  • python.analysis.typeCheckingMode: off

Expected behaviour

No crashes

Actual behaviour

The server crashes with the error FATAL ERROR: invalid array length Allocation failed - JavaScript heap out of memory.
Logs are attached below.

Logs

Python Language Server Log

Code Snippet / Additional information

Because my project is so large (over 11k lines written by me and just under 5 million autogenerated by Qt) and is proprietary, I can't provide code that would help. Instead, I will try to describe the project:

  • The project's primary dependencies include: PyQt5, opencv, numpy, pillow, loguru, and a couple other smaller ones.
  • The project is well over 11k lines, with nearly 5 million more auto-generated by PyQt's pyrcc5 utility (images, GIFs, etc.). That file is basically bytecode, but it is a Python file. I have to load it at application launch, but I seldom open it in VS Code. I expect this file to be the culprit. I can upload it if asked.
  • The language server only fails with this project. I have other, smaller projects that do not encounter this issue.
@erictraut
Contributor

The type checker maintains an internal type cache that it can discard if memory usage becomes too high. It checks the high-water mark after analyzing each file. From what you're saying it sounds like the type checker consumes the full available heap size (2GB!) of memory with just this single file. If that assumption is correct, that's well outside of the bounds of what we designed for, and I wouldn't expect to be able to analyze a 5-million-line file.

Perhaps we should simply refuse to analyze a text file above a certain size. How large is this file on disk?

@jakebailey
Member

Can you try using a pyrightconfig.json to exclude the file that you think is problematic, perhaps?

https://github.com/microsoft/pyright/blob/main/docs/configuration.md

@erictraut
Contributor

Excluding the file won't prevent type checking once the file is open in the editor. Once a ".py" or ".pyi" file is opened, pylance will analyze it.

@JulianOrteil
Author

JulianOrteil commented Jun 10, 2021

Perhaps we should simply refuse to analyze a text file above a certain size. How large is this file on disk?

The file is currently 321MB on disk.

Once a ".py" or ".pyi" file is opened, pylance will analyze it.

Does this mean any ".py" or ".pyi" file, or just the ones importing this large one? If I open a ".py" file that does not import this large file, then the server doesn't crash. Even opening files that import other files that import this large file (i.e. "ui.py" imports this large file, but I import "ui.py" into "control.py" and can open "control.py" without the server crashing).

Can you try using a pyrightconfig.json to exclude the file that you think is problematic, perhaps?

No change. Likely because of this: Note that files in the exclude paths may still be included in the analysis if they are referenced (imported) by source files that are not excluded. This is the case for me, I need to be able to edit "ui.py" (in this case) which imports the large file.

@jakebailey
Member

Excluding the file won't prevent type checking once the file is open in the editor. Once a ".py" or ".pyi" file is opened, pylance will analyze it.

Yes, of course; I was hoping this was some unreferenced file, but that's clearly not the case.

@erictraut
Contributor

erictraut commented Jun 10, 2021

If another file imports this large file, then the large file will be opened, parsed, and "bound" (i.e. lexical scopes are identified and symbol tables populated), but the large file will not be fully analyzed. Full analysis will be done only when the file is opened in the editor. The full analysis is where the vast majority of work takes place (and where most memory is consumed). Theoretically, a file could get so large that even parsing and binding would consume all available memory, but I wouldn't expect that to happen until the file reaches tens of megabytes in size.

Partial type analysis will be done for a file that hasn't been opened, but this is done only for symbols whose types are needed by the importer. It's theoretically possible analysis of a single symbol's type could be problematic — for example if a large file defined a single variable without a type annotation and assigned to it a tuple expression that contains 100,000 items.
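
To make the single-symbol scenario above concrete, here is a minimal sketch that builds such a source file in memory: one variable, no type annotation, assigned a tuple literal with 100,000 items. The helper name `make_large_tuple_source` and the variable name `BIG` are hypothetical, and the standard-library `ast` module is used only to confirm the shape of the expression. It says nothing about how pyright or Pylance represent the inferred type internally.

```python
import ast


def make_large_tuple_source(item_count: int, name: str = "BIG") -> str:
    """Return Python source assigning an unannotated tuple literal to `name`.

    Hypothetical helper for illustration; not part of pyright or Pylance.
    """
    items = ", ".join(str(i) for i in range(item_count))
    # The trailing comma keeps the literal a tuple even when item_count == 1.
    return f"{name} = ({items},)\n"


source = make_large_tuple_source(100_000)

# Parse with the standard library to confirm the shape of the expression.
tree = ast.parse(source)
assign = tree.body[0]
element_count = len(assign.value.elts)

print(f"{len(source):,} characters, {element_count:,} tuple elements")
```

An importer that only needs the type of `BIG` would still force inference over every element of that one expression, which is why a single symbol can be expensive even when the rest of the file is never fully analyzed.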

Is this large file checked in to a public github repo? If not, is there a place where you could upload it so we can look at it? I could do some additional exploration and analysis of the problem.

@JulianOrteil
Author

Here is a link for download. It does have a PyQt5 dependency: https://drive.google.com/file/d/1711ErbTlayYCh7ArybRn8mSh5rTY1N2q/view?usp=sharing

@erictraut
Contributor

OK, thanks for the file.

This file is 335.6MB in size and has 4.9M lines.

If I open the file directly in VS Code, it never sends the language server an "open file" event. I presume VS Code has a cutoff file size beyond which it will not invoke the language server. That would make sense.

The problem is not when the file is opened but when it is imported by another file. Pylance attempts to read the contents of the file into memory, and in doing so it runs out of memory.

I think the correct solution is to add a maximum file size limit. If the file is beyond this limit, pylance should simply refuse to analyze it. This is preferable to crashing.

@erictraut
Contributor

I also verified that if I attempt to type check this large source file (or a source file that imports it) with the pyright CLI tool, it also crashes there with an out-of-memory error. So the problem isn't specific to pylance.

I've added a limit in the code of 16MB. If the source file is larger than that, it doesn't attempt to read or parse the file. If you import such a file, you won't see an error, but none of the imported symbols will have known types. If you open such a file, nothing will happen because VS Code won't even send pylance a message telling it that the file was opened.

With this limit in place, pyright is now able to type check a workspace that includes this large file. It emits a single error because the file limit is exceeded, but it completes without crashing.
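
The size guard described above can be sketched roughly as follows. This is an illustration in Python, not the code from the PR (pyright itself is written in TypeScript). The names `MAX_SOURCE_FILE_BYTES` and `read_source_if_small` are hypothetical, and treating 16MB as 16 * 1024 * 1024 bytes is an assumption. The key point is that the size is checked from file metadata before any content is read.

```python
import os
import tempfile

# Hypothetical constant; assumes "16MB" means 16 * 1024 * 1024 bytes.
MAX_SOURCE_FILE_BYTES = 16 * 1024 * 1024


def read_source_if_small(path, max_bytes=MAX_SOURCE_FILE_BYTES):
    """Return the file's text, or None if it is larger than max_bytes.

    The size comes from file metadata, so an oversized file is never
    loaded into memory.
    """
    if os.path.getsize(path) > max_bytes:
        return None
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Small demonstration using a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "small.py")
    with open(small, "w", encoding="utf-8") as handle:
        handle.write("x = 1\n")

    small_text = read_source_if_small(small)
    # Use a tiny limit so the demo does not need a real 16MB file.
    skipped = read_source_if_small(small, max_bytes=3)
```

A caller that receives `None` would report the file as skipped and leave its symbols with unknown types, which matches the behavior described in this comment.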

Here's the PR: microsoft/pyright#1973

@erictraut added the "bug" and "fixed in next version (main)" labels and removed the "triage" label on Jun 10, 2021
@JulianOrteil
Author

JulianOrteil commented Jun 10, 2021

This file is 335.6MB in size and has 4.9M lines.

My bad! I meant MB not KB lol

But excellent. Thanks for putting in the PR.

@bschnurr
Member

This issue has been fixed in version 2021.6.2, which we've just released. You can find the changelog here: https://github.com/microsoft/pylance-release/blob/main/CHANGELOG.md#202162-16-june-2021

@JulianOrteil
Author

Can confirm this has been fixed; thanks a ton for doing so. Pylance is so much more polished than Jedi.
