Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hot module reloading #122

Open
Dale-Black opened this issue Aug 25, 2023 · 43 comments · May be fixed by #220
Open

Hot module reloading #122

Dale-Black opened this issue Aug 25, 2023 · 43 comments · May be fixed by #220

Comments

@Dale-Black
Copy link

Is there a way to restart the server on file changes within the directory?

I was thinking something like BetterFileWatcher.jl might be useful for something like this but I'm not sure how to implement it

@frankier
Copy link
Contributor

frankier commented Aug 28, 2023

I tried to take a little look at this myself. What I found was that Revise and includet of my main module alone were not sufficient. Possibly invokelatest needs to be used somewhere in the request/response loop? Or else listen for the revise event and restart the server altogether?

You might get some idea by looking at Genie's code:

https://github.com/search?q=repo%3AGenieFramework%2FGenie.jl%20revise&type=code

Happy to help test/review a PR.

@ndortega
Copy link
Member

ndortega commented Aug 28, 2023

Hi @Dale-Black & @frankier

Thanks for opening this ticket and for the suggestions. This is a good next issue to tackle which would definitely help speed up local development.

I'll definitely try to follow what Genie.jl is doing to get this working, but I'm open to any and all suggestions. I did find a similar approach in this issue: JuliaWeb/HTTP.jl#587

Also I'll let you know when I have something worth testing & reviewing @frankier, thanks for the help!

@Dale-Black
Copy link
Author

That http.jl code looks great. So something like HTTP.listen with Revise should work?

@JanisErdmanis
Copy link
Contributor

Is there any progress in this direction?

@ndortega
Copy link
Member

@JanisErdmanis,

There was some progress made in a PR & discussion about a month ago - but I haven't worked on it since. I've been very busy this December.

@walterl
Copy link

walterl commented Jan 3, 2024

@ndortega I see in #134 you mentioned a "server flapping issue". Where is that, and is there anything we can help with?

@asjir
Copy link

asjir commented Feb 1, 2024

I use Revise and to do it I have:

include("routes.jl")
__revise_mode__ = :eval
Revise.track("routes.jl")
function ReviseHandler(handle)
    req -> begin
        Revise.revise()
        invokelatest(handle, req)
    end
end

@Dale-Black
Copy link
Author

Hmm, and that works? Can you share a bit more of your code and structure? Where are you serving the app from?

@ndortega
Copy link
Member

ndortega commented Feb 1, 2024

Hi guys,

Sorry about the inactivity on this issue. This issue is now on the top of my todo list. As far as direction and approaches, I'm all ears. Like most things, this is new territory for me and I'm open to any and all suggestions on how to tackle this problem.

@asjir thanks for your snippet. Id also like to hear more about your project setup and workflow. I really like how you embedded this feature as a middleware function, which would be a super clean way to integrate behind the scenes

@asjir
Copy link

asjir commented Feb 2, 2024

Hmm, and that works? Can you share a bit more of your code and structure? Where are you serving the app from?

Ofc, though it's very basic: in my repo I have sveltekit projects and one folder for julia server X, in X I have 5 files: Project.toml, Manifest.toml, main.jl, routes.jl, util.jl
at the top of main.jl I have cd(@__DIR__) (so imports run from REPL) and then include("routes.jl") etc.
at the bottom of main.jl I have:

run(; port=2227, async=true) = (!isdefined(Main, :s) || !isopen(Main.s)) && (Main.s = serve(middleware=[CorsHandler, ReviseHandler]; port, async))
run()

this function starts an async server that I can close(s) but reruns of it don't try to start a new server which would overwrite my s variable

and inside routes.jl there's

include("util.jl")
Revise.track("util.jl")

so changes to util are also immediately reflected.

While I'm at it I'll show an example route:

post("/make-qch") do req
    data = json(req, MyStruct)
    saved[] = data
    f(data)  # returns NamedTouple
end

which allows me to call f(saved[]) from the REPL and Infiltrator.@inflitrate inside this function (can't use it in an async call of server response).

@Dale-Black
Copy link
Author

It looks like Fons' new project has built in hot module reloading. I haven't worked with it yet but something to look into
https://github.com/JuliaPluto/PlutoPages.jl

@JanisErdmanis
Copy link
Contributor

I can report that I have had some success with ReviseHandler, which use as follows:

using ModuleA
using Oxygen

function ReviseHandler(handle)
    req -> begin
        Revise.revise()
        invokelatest(handle, req)
    end
end

server = ModuleA.serve(port=2345; middleware=[ReviseHandler])

The ModuleA is defined as:

module ModuleA

using Oxygen; @oxidise

@get "/inside" function(req::Request)
    return "Hello from Inside"
end

end 

where the @oxidise macro will be available after the pull request #158 will be merged.

@Dale-Black
Copy link
Author

That looks clean

@JanisErdmanis
Copy link
Contributor

The pull request is now merged on master. It would be great to get a feedback on the Revise workflow before the release.

@Dale-Black
Copy link
Author

Amazing, works very nicely! (https://github.com/Dale-Black/HTMLStrings.jl/tree/main/examples/TodoApp)

Now I have more motivation to keep exploring "full stack" Julia. Excited to build out a better TodoApp with pure Julia as an example

@JanisErdmanis
Copy link
Contributor

JanisErdmanis commented Feb 15, 2024

@Dale-Black, is there any reason why you have chosen to register routes into the todo() function? For me, it looks like a bizarre thing to do, as the routes are registered at runtime.

@Dale-Black
Copy link
Author

Dale-Black commented Feb 15, 2024

No there isn't. I just changed that

@ahjulstad
Copy link
Contributor

I am following this with great interest. Unfortunately I cannot get the approach from #122 (comment) to work, possibly because I don't understand how to setup folder structures and modules properly. Specifically, I am doing:

server.jl:

includet("submodule.jl")

using ..ModuleA
using Oxygen

function ReviseHandler(handle)
    req -> begin
        Revise.revise()
        invokelatest(handle, req)
    end
end

server = ModuleA.serve(port=8080; middleware=[ReviseHandler])

and submodule.jl:

module ModuleA

using Oxygen; @oxidise

@get "/greet" function(req::Request)
    return "Hello from Inside, changed"
end

end 

Server works, but doesn't reload. What is strange, though, is that I need to restart Julia for changes to submodule.jl to register. A simple Ctrl-C and rerun of the server.jl (using run in REPL from in vscode) is not sufficient.

Is it required to have ModuleA in a proper package that is added as a dev dependency?

@JanisErdmanis
Copy link
Contributor

The approach I outlined earlier works only when the module is loaded statically, which Revise then picks up. To get the setup working, use ] generate ModuleA, put the code in src/ModuleA.jl, use @oxidise macro and activate the project with ] activate . which makes ModuleA available. Then, in the REPL, start the service with ReviseHandler. I haven’t managed to get an includet approach to work. Perhaps it is related to the way globals are revised.

@ahjulstad
Copy link
Contributor

Thanks. I got bit by not understanding modules vs packages and the functionality of ] generate.

I also managed to get it working using PkgTemplates, but this is maybe just more work for little gain, compared to your approach.

using PkgTemplate
t = Template(dir=".")
t("ServerPackage")

The development then happens in this ServerPackage.

Separate from this, a new server.jl resides in a project that has ServerPackage as a development dependency. (]dev ./ServerPackage) I have it in the root folder.


using Revise

import ServerPackage

function ReviseHandler(handle)
    req -> begin
        Revise.revise()
        invokelatest(handle, req)
    end
end

server = ServerPackage.serve(port=8080; middleware=[ReviseHandler])

And then, (just because it is possible), I run a client in separate process that repeatedly pings the API.

@frankier
Copy link
Contributor

frankier commented Aug 8, 2024

I've also managed to get the Revise-before-service-request approach working. Thanks a lot! It is very nice in its simplicity, however, it is in a way a "just-too-late" approach. Revising can take time, and it moves the process from when the changes are ready in the source to when you want the page to be rendered, meaning longer waits/increase latency.

I seem to be getting reasonably results with a combined approach where we have Revise watch the file system and runs asynchronously with the server process and then in-addition run Revise before service request. This, in combination with invokelatest appears to make the changes visible in the server thread. It does not appear to be strictly needed to also have Revise running in the server task, but I added this to try and ensure the request is blocked until revising is finished for longer revisions.

using Revise
using Oxygen
using ServerPackage: ServerPackage

function ReviseHandler(handle)
    req -> begin
        if !isempty(Revise.revision_queue)
            @time "Sync revision" Revise.revise()
        end
        invokelatest(handle, req)
    end
end

ServerPackage.serve(; host="127.0.0.1", port=8001, handler=ws_handler, middleware=[ReviseHandler], async=true)
while true
    wait(Revise.revision_event)
    @info "Starting async revision"
    @time "Async revision" Revise.revise()
end

@JanisErdmanis
Copy link
Contributor

JanisErdmanis commented Aug 8, 2024

The asynchronous Revise looks interesting. I see one race condition when Revise.revise() takes a long time, and the source files are modified within that time. For the HTTP request not to reflect the old state of the codebase, it is necessary to put back Revise.revise() in the ReviseHandler. Also, note that you may also miss Revise.revision_event in case it runs on a different thread from your event loop.

I see that it could be useful when developing a web server with live reloading. By sending a refresh event to the browser after the asynchronous revision, it could be a cool feature to have in development.

@frankier
Copy link
Contributor

frankier commented Aug 9, 2024

I see one race condition when Revise.revise() takes a long time, and the source files are modified within that time.

Good catch! To be perfectly honest I wasn't even thinking about whether revision_event was edge or level triggered. This code was fairly carelessly cribbed from Revise.entr(...), so it looks like the problem may also exist in Revise.entr(...). Luckily it is likely to be very rare in practice, and the synchronous Revise papers over it in terms of correctness. (Robustness was my aim in including both anyway.)

Also, note that you may also miss Revise.revision_event in case it runs on a different thread from your event loop.

Can you please explain this one for me a bit more? Revise.revision_event is already fired from another thread. Is it because Condition isn't threadsafe? At least it claims not to be and that Threads.Condition should be used with locking instead. I don't know how big a problem this is, but if it is a problem, it is a problem also with Revise.jl.

In case multi-threaded Revise is not possible, might we be able to force all the Oxygen.jl + Revise.jl code onto a single thread somehow?

By sending a refresh event to the browser after the asynchronous revision, it could be a cool feature to have in development.

This is even a further step of "live" but it's not clear when it's wanted and when not, so anything like this should be somehow "opt-in". Personally I'm already used to refreshing the page so don't condition it a problem. Bonito's interactive_server https://simondanisch.github.io/Bonito.jl/stable/api.html#Bonito.interactive_server is a LiveView-style implementation of this approach.

@frankier
Copy link
Contributor

frankier commented Aug 9, 2024

Another point of information is that just-in-time (or just-too-late) Revise is what is used for REPLs. See:

https://github.com/timholy/Revise.jl/blob/13a5eb7986ee1239ff938a92043c13fee04579cc/src/packagedef.jl#L1326-L1356

So maybe it's not such a problem after all. I may be having getting slow Revise on 1.11rc2 for some reason or it might just be that a slow loading webpage is somehow a lot more irritating than pending output on a REPL.

@frankier
Copy link
Contributor

frankier commented Aug 9, 2024

I don't think thread safety is a problem in practice here since the file watching tasks run in the main thread (they are started as sticky Tasks) which is the same as the file-change trigger Revise loop. I'm probably making the race condition worse by having yield() in my loop (edited it away for now), but I think the interpreter may be able to yield during Revise.revise() so the problem seems to still be possible.

@JanisErdmanis
Copy link
Contributor

Can you please explain this one for me a bit more? Revise.revision_event is already fired from another thread. Is it because Condition isn't threadsafe? At least it claims not to be and that Threads.Condition should be used with locking instead. I don't know how big a problem this is, but if it is a problem, it is a problem also with Revise.jl.

The thread safety simply means that the condition can be waited from another thread without risking putting internal state of the condition in inconsistent state. It doesn't fix race conditions when a Condition is notified from one thread while on other it is no longer suspended on wait and hence is skipped.

This problem is not present when the notifier and waiter are on the same thread. I guess this is the case here if the task for watching file changes runs on the main thread or the notifications are channeled through one.

@frankier
Copy link
Contributor

frankier commented Aug 9, 2024

I guess first entr(...) should be fixed first if this is a problem. See this issue timholy/Revise.jl#837

@mzy2240
Copy link

mzy2240 commented Sep 7, 2024

Any updates on this? Or is there any elegant way to perform some pseudo-hot-reloading (e.g. restart the server by an external script that monitor the git history)

@JanisErdmanis
Copy link
Contributor

@mzy2240 It is, in my opinion, a solved issue, apart from proper documentation. Server restarts are unnecessary, but to use hot reloading, one must (a bit of a strong word here) develop their service in a package together with @oxidise macro. The solution #122 (comment) is still the best way to start.

@mzy2240
Copy link

mzy2240 commented Sep 7, 2024

that sounds great! if that is the case, I have two sincere suggestions:

  1. Can we add this hot reloading approach to the doc? Imo it is so important that should be documented, even if it is something a bit hacky or "experimental". We can always refine the method later.
  2. Programmatic server restart is still a desired feature to improve the overall service reliability. No matter it is a scheduled restart or controlled by external script, it definitely helps improve the experience (and maybe performance) of Oxygen.jl in the production.

@frankier
Copy link
Contributor

frankier commented Sep 7, 2024

It is, in my opinion, a solved issue, apart from proper documentation.

I don't entirely agree with this. It seems to me that if possible it should be possible to get good reloading behavior with a single function call, macro call or keyword option. We could try and design this so that it would only be usable in the case the user code was actually revisable (i.e. their project was correctly structured). I give another go at put together another pull request, but I would like to hear from @ndortega about whether they agree and what type of API they would like.

The solution #122 (comment) is still the best way to start.

Why is this a better place to start than the lower latency version I added? The race condition is benign and exists also in Revise.jl.

@JanisErdmanis
Copy link
Contributor

JanisErdmanis commented Sep 7, 2024

I don't entirely agree with this. It seems to me that if possible it should be possible to get good reloading behavior with a single function call, macro call or keyword option.

I don't really follow. From my perspective, the ReviseHandler approach already supports this. Regarding the additional features and background precompilation for revision, those, in my opinion, belong to separate issues themselves as those are enhancements. In my opinion, it's not clear enough from a user experience standpoint, for instance, whether they want to conserve their laptop battery or prefer to see knowledge delay as an enactment of their hard work. The race condition is also tricky to address, as it would occur more often for long recompilations, which is also the situation where a background recompilation could be useful. It is a tricky thing that needs to be deliberately discussed in a separate issue.

One thing that could though improve user experience is an automatic addition of ReviseHandler to the middleware in case Revise is already loaded. I suppose that could be done with package extensions. This would also make it easier to write documentation part of the Revise workflow.

Also, I forgot there should be a clear way to communicate to the user that Revises no longer works, for instance, when a type is redefined or a constant change is made. The server could exit, return some HTTP error code, or put some colour in the terminal logging as it is already in Julia REPL.

@JanisErdmanis
Copy link
Contributor

Programmatic server restart is still a desired feature to improve the overall service reliability. No matter it is a scheduled restart or controlled by external script, it definitely helps improve the experience (and maybe performance) of Oxygen.jl in the production.

A generic programmatic service seems to belong to a separate issue, but it is an interesting feature to ask for in the scope of this issue. Using Revise in production seems like a wild idea, but why not? A big issue is that restarts erase the internal service state. So, to implement service restarts, there would need to be some savestate and loadstate hooks, which seems doable.

@mzy2240
Copy link

mzy2240 commented Sep 7, 2024

A generic programmatic service seems to belong to a separate issue, but it is an interesting feature to ask for in the scope of this issue. Using Revise in production seems like a wild idea, but why not? A big issue is that restarts erase the internal service state. So, to implement service restarts, there would need to be some savestate and loadstate hooks, which seems doable.

I really like the idea of persistent state but I think it should be page 2 or 3. Do we have an existing restart solution (scheduled or external triggered) even at the risk of losing state? That would be already quite useful in some simple but reliability-first applications.

@JanisErdmanis
Copy link
Contributor

Not one I am aware of. Perhaps @ndortega can comment on the automatic restarts and whether they could be integrated within Oxygen. @mzy2240 I think you should open a new issue on the restarts and perhaps cross-reference with this one when necessary to keep this issue focused.

@frankier
Copy link
Contributor

frankier commented Sep 9, 2024

From my perspective, the ReviseHandler approach already supports this.

In that case I misunderstood you. It seems that we are probably agreed that this will be "solved issue" when it's actually solved by adding this or something similar to Oxygen (i.e. the issue is solved-in-principle).

In my opinion, it's not clear enough from a user experience standpoint, for instance, whether they want to conserve their laptop battery or prefer to see knowledge delay as an enactment of their hard work.

In terms of user experience, a delay in a REPL due to compiling is more expected than a delay in page loading, which has a significantly less responsive feel. Browsers can actually time out too I think for an especially long revise.

Another alternative or parallel solution could be to push some kind of spinner to the browser due to revising and refresh the page when it's done -- making sure to preserve whatever POST/GET data from the request.

The race condition

Maybe I'm misunderstanding you again. Which race condition are you referring to? That the revise event can be lost in case there are changes during a revision? This isn't a big problem since it will get revised later before the request. Maybe you can outline the exact scenario where you foresee problems.

clear way to communicate to the user that Revises no longer works

I agree that it would be a better user experience for those hacking and testing against a browser for some stuff to be pushed "in-band", but this is an argument for a special servedebug(...) mode, which would be not for production, but could in addition show stack traces from exceptions.

@JanisErdmanis
Copy link
Contributor

With the race condition, I meant the effect of the Revise limitation of being unable to cancel running recompilation. Let's say $T$ is the time it takes to recompile the project for some change. The probability that a change will happen within a recompilation time is proportional to $P \propto T$. If such a change happens, the user can wait $2T$ until they see their last change take effect. Hence, there is a tradeoff.

this is an argument for a special servedebug(...) mode, which would be not for production, but could in addition show stack traces from exceptions.

It would be better to make APIs deep by adding a keyword argument. Also, it would be more intuitive for the user if Oxygen recognised that Revise is loaded and added the handler accordingly.

@ndortega
Copy link
Member

I'm not entirely sure which approach would be best. At the end of the day, I think Oxygen needs auto-reload feature that works out of the box with as little "gotchas" as possible.

I personally don't use Revise.jl all that much, but I know a huge portion of the julia community does and has great success with it. I'm definitely open to either adding it as a dependency / extension to get this feature working. On the other hand, most other frameworks do complete server restarts to ensure the current running application is up to date. This has been a tried-and-true method we could use to solve this problem as well.

I haven't used Genie in a while, but I know they have a similar feature where they use Revise to watch all project files and are able to upload the handlers in real time. Granted, it's a lot easier to do this when you have a predefined folder structure to monitor. This package enforces near-zero architectural restrictions on where code comes from so we can't make as many assumptions about where their code will be stored.

I'm not sure which approach would be best, so I'd like to hear more from you all on which approach you'd prefer based on your experience you have with other web frameworks)

@JanisErdmanis
Copy link
Contributor

On the other hand, most other frameworks do complete server restarts to ensure the current running application is up to date. This has been a tried-and-true method we could use to solve this problem as well.

It is my guess that the restarting function would need to contain Revise.revise() and invokelatest to reflect the changes made within a package code. Hence, the handler approach for the code seems significantly more straightforward. It would also have the same behaviour for type redefinitions.

I haven't used Genie in a while, but I know they have a similar feature where they use Revise to watch all project files and are able to upload the handlers in real time. Granted, it's a lot easier to do this when you have a predefined folder structure to monitor.

Genie had to bolt in Revise with a watcher as it cannot develop the code within a package structure due to globals being erased at compilation time, much like Oxygen before the @oxidise macro. I don't think they would have made it that way if they could say the code could be developed with Revise straightforwardly in a package.

This package enforces near-zero architectural restrictions on where code comes from so we can't make as many assumptions about where their code will be stored.

What you may be able to do is to create a sandbox module within Oxygen and then export an includeox command which could put include statement with eval in that module. Then, the server could start from the internal module and get Revise to work that way. It also seems possible to recreate an internal module when type redefinitions happen and combine it with a server restart, but at the cost of erasing the state.

@frankier
Copy link
Contributor

frankier commented Sep 13, 2024

In terms of "complete restart", it may be possible to replace the process with a whole new Julia process with exec. This would be a lot slower probably less smooth than Revise. It might possible to use SO_REUSEADDR to make things a bit smoother. I think for most people this would be a second choice, but it could be combined with the Revise approach when Revise fails (e.g. struct definitions for now). One consequence here is that when we execute everything again and there is an exception introduced in top level code (e.g. a syntax error) it might be difficult to recover and keep things executing when it is fixed without also having a parent process monitoring everything and restarting.

I'm not sure if there's perfect behavior here, more like a convenience which helps improve coding flow a bit 90% of the time, which is why I'm tempted to think Revise is the sweet spot despite a few caveats, especially since it's behaviour is reasonably well understood within the broader Julia community.

My preference would therefore be to have a debug server mode that is either only available when @oxidise is used, or make it available in other cases, but issue a warning that Revise will only affect code in modules and therefore not the main route handler code.

@frankier
Copy link
Contributor

Shall I put together another PR as a strawman, or does someone else want to? It feels like that might get us further at this point. (Asking because @ndortega was the last to express interest.)

@JanisErdmanis
Copy link
Contributor

JanisErdmanis commented Sep 16, 2024

My preference would therefore be to have a debug server mode that is either only available when @oxidise is used, or make it available in other cases, but issue a warning that Revise will only affect code in modules and therefore not the main route handler code.

I would be in favour of implicit behavior using Revise loading in the main as condition under which the handler is added. This condition could be explicitly disabled with keyword argument when needed. A good way to do this seems to be is to check Main.Revise binding to be defined and add the coresponding handler then.

@ndortega
Copy link
Member

Shall I put together another PR as a strawman, or does someone else want to? It feels like that might get us further at this point. (Asking because @ndortega was the last to express interest.)

@frankier, please feel free to open up a pr here. I agree that going with a Revise-first approach makes the most sense right now - if it really isn't sufficient for most workflows then we could look into adding in a server-restart later.

Personally, I'm interested in getting this functionality added directly to the code (instead of through a Revise package extension) so that people could toggle the revision of their code out of the box with a dev / reload / revise flag. Adding revise as a dependency would go well with the next minor version upgrade where I'll be focusing on upgrading the min julia version to 1.10 (new LTS) and upgrading the existing package extensions (without Requires.jl). Any dependency change will require a minor version bump anyways so I might as well do them both at the same time.

Whatever the implementation looks like, having a clean way to toggle this behavior is my biggest priority. So far, I've avoided building in this direction because my early attempts felt super hacky and required way too much work from the user's perspective.

@frankier frankier linked a pull request Sep 18, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
8 participants