When using Google provider + Python the import generated file get so big that the IDEs can't handle it #1606
Comments
We've previously run into this with the AWS provider. There we explicitly carve up the code into submodules, which I believe are more IDE-friendly. We could most likely do the same with the Google provider. |
Just chiming in: I am experiencing a similar nuisance with the Azure provider - albeit due to prospector (python) maxing out a core when parsing the file because it's so big - it doesn't crash, just can't finish in a timely fashion. Pylance in VS Code seems to handle it fairly ok, so it can be used as fallback for certain linting purposes, but then you're dependent on VS Code settings for proper linting, instead of a prospector config file. I'm unsure if carving up into submodules would solve that prospector issue, as it would still need to parse just as much code, maybe more - but maybe it can do it more efficiently if it's split up. |
@thomasschockaert You could try that with the AWS provider (which is already split up) and report back. Would be helpful to know 👍 |
Hi @ansgarm, I'm not experiencing the issue that @fbrodrigorezino mentioned with either azure or aws code when using VS Code, so there is that (making my comment slightly off topic, unless this issue becomes/is relevant for splitting up all provider code). Wall of text below on how/why, imo, splitting up is relevant for all large provider code bases, for dev tools and consequently IDEs.
I tested the aws provider as you requested - the slowness is less present, but it is still there for the larger files - it's definitely a prospector being "slow" issue (more specifically one or more of the tools it runs). I've tried the following files (in linecount order) from the aws package: The first one doesn't complete in a timely manner (+15mins). If I compare that to the azure package we get: => with similar results: too slow.
I'm unsure how effective the splitting-up procedure would be, but if it yields files of up to roughly 40k lines, it's going to be slow but at least functional; anything above that 40k mark is unbearably slow (on an i7-1185G7 - ymmv of course, but ~500k lines just takes ages).
In the name of being thorough, I ran and timed each of the tools manually as well (with the same configs as what prospector would pass) and it seems like it's only pylint and mypy that are causing the biggest slowdowns: for the aws ec2 init file alone, pylint takes 00:01:35 and mypy takes 00:01:17, so even if these are run as efficiently as possible, that's still over a minute to get results. Mypy also drags in a lot of the other files (as that's what it does for type checking), but does so very efficiently. If it needs to draw in that wafv2 init file as well, that number changes drastically though.
All in all I'm leaning towards the idea that splitting up code won't do much for mypy, but will help pylint. So it's ultimately a matter of the tools having to check a library that's 700k+ (azure) or 1.26m+ (aws) lines of code to provide insightful information - and that will inevitably be slow.
Moving the mypy and pylint checks to pre-commit instead of handling them "each time when you save a file in IDE" is a necessary workflow modification in my opinion. I do enjoy the split up version of the generated code more than a single big file, because I don't need to tell the IDE to turn off pretty much every feature it has when opening such a large file. VS Code even automatically disables certain things in such case. So my conclusion is: split up == good, 700k+ lines of code == slow for dev tools and in some cases also slow for IDE; split up helps IDE, but does not always help all dev tools (it does for pylint, but not for mypy). Kind regards, Thomas |
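The manual per-tool timing described above could be reproduced with something like the following. This is only a minimal sketch: the helper name `time_tool` is made up for illustration, and it simply measures the wall-clock time of whatever command you hand it (for example `["mypy", "path/to/generated/__init__.py"]`, assuming the tool is installed and on your PATH).

```python
import subprocess
import time


def time_tool(cmd):
    """Run a command and return (elapsed_seconds, exit_code).

    `cmd` is a list such as ["pylint", "some_file.py"]; the tool itself
    is assumed to be installed separately.
    """
    start = time.perf_counter()
    # check=False: linters exit non-zero when they report findings,
    # which should not be treated as a failure of the measurement.
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return time.perf_counter() - start, result.returncode
```

Timing each tool separately like this makes it easier to see which one dominates before deciding what to move into a pre-commit hook.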
Thank you for your extensive profiling @thomasschockaert! The main problem with the wafv2 resources is that the underlying schema at AWS is recursive, which Terraform does not support. So the workaround in the AWS Terraform provider is to explicitly define the schema up to a certain depth. As CDKTF bases its generated code on that schema, it produces a ridiculous amount of code for it instead of using recursion (which would be possible with JSII, I think). So tackling that issue could improve this problem. |
The current splitting approach is really only viable for a handful of providers (~10 I'd say) since it requires manual effort (maybe could be automated some). I checked a few other providers:
It would be nice to have a metric (number of resources / loc / something else) to use as guidance as to when splitting a provider is necessary for IDE functionality. @thomasschockaert would it be possible to exclude the generated files from the tools? |
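As a rough illustration of a lines-of-code metric, the sketch below lists generated Python files above a line-count threshold. The function names are hypothetical, and the 40,000-line default is only the rule of thumb reported earlier in this thread, not an established limit.

```python
from pathlib import Path


def line_counts(root):
    """Map each .py file under `root` to its number of lines."""
    counts = {}
    for path in Path(root).rglob("*.py"):
        # Read in binary mode so odd encodings cannot break the count.
        with path.open("rb") as handle:
            counts[str(path)] = sum(1 for _ in handle)
    return counts


def oversized(root, threshold=40_000):
    """Return (path, lines) pairs above `threshold`, largest first."""
    big = [(p, n) for p, n in line_counts(root).items() if n > threshold]
    return sorted(big, key=lambda item: item[1], reverse=True)
```

Running `oversized("imports")` against a project's generated bindings directory (the directory name here is an assumption) would give a quick list of the files most likely to cause trouble.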
Hi @jsteinich Both mypy and pylint support ignoring files / patterns, but in the case of mypy this only affects "which files to discover for checking", not "when following imports" (cfr --exclude docs mypy). Pylance for VS Code doesn't seem to have issues with the large files when it comes to providing code hints and completion, so that part of the dev experience is intact when using VS Code. Everything is optional, so maybe it all comes down to guidance on how to use a specific dev environment when dealing with large files. Most tools do just fine, some are just slow, but those things only pose problems in certain workflows like when you want to use the "problems" tab in VS code which implies semi-continuous (on save/edit) running of the tools, versus a workflow that runs the tools only as a pre-commit hook. Maybe this means it's 'just' a "nice to have code split up" as it's not a "full fix" and very dependent on the workflow you choose (i.e. you could change workflow for a project based on the fact that there are a lot of large files). None of this is unique to cdktf; anything with large files will present this very same issue for an IDE, and it's not always fixable per definition. |
JetBrains IDEs by default don't support files larger than 25MB, so the CDKTF experience gets very poor for a variety of providers. Given the number of users of these IDEs, it might be a smart product-management decision to improve support for them. Another thing: although it "works" for me in VS Code, it's been very slow and very memory-consuming, so I'd say that's far from a nice experience. |
@thomasschockaert and @fbrodrigorezino thanks for the information. It makes sense and is very helpful. After doing some more investigation, this seems to only be an issue with how jsii generates Python code. Other languages have a file per resource (need to double-check Go). Except for a couple of edge cases (aws waf), this results in fairly reasonably sized files. I haven't looked into the feasibility of making jsii generate a file per resource for Python, but that seems like a much better route to pursue than just splitting into arbitrary (generally service-level) submodules. |
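To make the "file per resource" idea concrete, here is a minimal sketch that pulls each top-level class out of one large module using Python's standard `ast` module. It is not how jsii works or would implement this; the function name is hypothetical, and a real generator would also have to carry over imports, decorators, and cross-references between classes.

```python
import ast


def split_top_level_classes(source):
    """Return {class_name: class_source} for each top-level class."""
    tree = ast.parse(source)
    pieces = {}
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            # get_source_segment starts at the `class` keyword, so
            # decorators and module-level imports are NOT included
            # in this sketch.
            pieces[node.name] = ast.get_source_segment(source, node)
    return pieces
```

Each value could then be written to its own file, which is roughly the layout the other target languages are described as producing.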
@jsteinich yes, you are correct, you can find this info in the Important Factoids |
For JetBrains you can use Both are in the Help "menu": I am using azurerm + python It really needs to be separated. |
It's not just the IDE. Running mypy on a project that uses these imports (Azure in our case) runs forever (maybe 6 minutes vs ~20 seconds without it). And yes, we're excluding the import folders, but as mentioned above, mypy follows the imports, and we wouldn't have proper typing support if we didn't allow that. For reference, the |
Hi 👋 |
cdktf & Language Versions
Language: Python
Version: All
Affected Resource(s)
Developer experience.
Expected Behavior
When you install the google provider, you can use it like any other Python package.
Actual Behavior
The content of the imported __init__.py is so big that IDEs can't read the symbols, so we don't get any code hints.
Steps to Reproduce
google provider
Important Factoids
This is not a problem in cdktf itself, but in how jsii generates Python packages.
I'm posting it here because I think Terraform would be interested in collaborating on jsii development to solve it.
You can find extra information here: aws/jsii#2436