Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix import-before-module check #55822

Closed
wants to merge 1 commit into from
Closed

Conversation

barucden
Copy link
Contributor

@barucden barucden commented Sep 20, 2024

PR #51635 introduced function check_src_module_wrap which checks whether a module declaration is missing or whether a using/import is declared before a module.

The original implementation, based on manual checks via regexes, did not catch all possible cases. In particular, it did not consider multi-line comments, e.g.,

#=
using Foo
=#
module Bar
end

threw errors even though it should be valid.

Single-line comments, multi-line comments, and docstrings make it quite difficult to reliably check that a module is declared and not preceded by any using/import. Some examples are

#= =# using A
module B end

#= #=
=#
using A
=#
module B end

"docs" module B end

Because of that, this commit avoids all manually-written, specialized conditions and employs Meta.parse to parse the code into Exprs. It then becomes very simple to check the condition of interest. The downside is increased runtime.

[WITHOUT the shortcut; see bellow] I measured the time it takes to check three files of different sizes (the first measurements is the current version, the second is master):

 # 63979 lines of code
@btime Base.check_src_module_wrap(Base.PkgId("dummy"), "LibSCIP.jl")
  102.267 ms (535826 allocations: 36.19 MiB)
  5.630 μs (8 allocations: 528 bytes)

 # 1051 lines of code
@btime Base.check_src_module_wrap(Base.PkgId("dummy"), "Primes.jl")
  3.525 ms (21503 allocations: 1.08 MiB)
  5.946 μs (9 allocations: 656 bytes)

 # 79 lines of code
@btime Base.check_src_module_wrap(Base.PkgId("dummy"), "BenchmarkTools.jl")
  106.170 μs (578 allocations: 46.02 KiB)
  3.240 μs (8 allocations: 560 bytes)

The slowdown is dramatic (most of the time is spent reading the source file into a String).

I added a condition that checks if the file content starts with "module ", in which case the function returns immediatelly. This shortcut helps: LibSCIP.jl takes only 5.865 μs (instead of 102.265 ms). Of course, the shortcut is taken only in some cases...

I am submitting this PR anyway for discussion (CC: original author @IanButterworth).

Fixes #55792

PR JuliaLang#51635 introduced function `check_src_module_wrap` which checks
whether a module declaration is missing or whether a `using`/`import` is
declared before a module.

The original implementation, based on manual checks via regexes, did not
catch all possible cases.  In particular, it did not consider multi-line
comments, e.g.,
```
using Foo
=#
module Bar
end
```
threw errors even though it should be valid.

Single-line comments, multi-line comments, and docstrings make it quite
difficult to reliably check that a module is declared and not preceded
by any using/import.  Some examples are
```
module B end

=#
using A
=#
module B end

using A
=#
module B end

"docs" module B end
```

Because of that, this commit avoids all manually-written, specialized
conditions and employs `Meta.parse` to parse the code into `Expr`s.  It
then becomes very simple to check the condition of interest.  The
downside is increased runtime.

[WITHOUT the shortcut; see bellow] I measured the time it takes to
check three files of different sizes (the first measurements is the
current version, the second is master):
```
 # 63979 lines of code
@Btime Base.check_src_module_wrap(Base.PkgId("dummy"), "LibSCIP.jl")
  102.267 ms (535826 allocations: 36.19 MiB)
  5.630 μs (8 allocations: 528 bytes)

 # 1051 lines of code
@Btime Base.check_src_module_wrap(Base.PkgId("dummy"), "Primes.jl")
  3.525 ms (21503 allocations: 1.08 MiB)
  5.946 μs (9 allocations: 656 bytes)

 # 79 lines of code
@Btime Base.check_src_module_wrap(Base.PkgId("dummy"), "BenchmarkTools.jl")
  106.170 μs (578 allocations: 46.02 KiB)
  3.240 μs (8 allocations: 560 bytes)
```
The slowdown is *dramatic* (most of the time is spent reading the source
file into a String).

I added a condition that checks if the file content starts with "module ",
in which case the function returns immediatelly.  This shortcut helps:
LibSCIP.jl takes only 5.865 μs (instead of 102.265 ms). Of course, the
shortcut is taken only in some cases...

I am submitting this PR anyway for discussion.

Fixes JuliaLang#55792
@IanButterworth
Copy link
Sponsor Member

Thanks for trying to fix this. Maybe there's a way to do this in Base.require, checking that it's not importing packages into Main during precompilation.

@IanButterworth
Copy link
Sponsor Member

See #55824

@barucden barucden closed this Sep 23, 2024
IanButterworth added a commit that referenced this pull request Sep 23, 2024
Fixes #55792
Replaces #55822
Improves what #51635 was trying
to do

i.e.
```
ERROR: LoadError: `using/import Printf` outside of a Module detected. Importing a package outside of a module is not allowed during package precompilation.
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiline comment parsing error on 1.11-rc3
2 participants