jupyter-incubator/kernel_gateway incorporation proposal #12

parente · 2016-02-12T21:38:47Z

This PR proposes graduating the kernel_gateway project from jupyter-incubator. I ran this by @rgbkrk who is the advisor/mentor/sponsor on the incubator proposal before opening. I'd like to open it up for comments.

I'll post this to the jupyter google group soon.

cc'ing folks who commented on the original incubator proposal: @jasongrout @minrk @ellisonbg @sccolbert @blink1073 @freeman-lab @damianavila @Carreau @fperez @ahmadia @lbustelo @vinomaster

(c) Copyright IBM Corp. 2016

rgbkrk · 2016-02-12T21:49:19Z

The kernel gateway is now active and used in a variety of projects while also being the perfect ahem gateway for O'Reilly and others to use for interactive content within the narrative on a page.

jdfreder · 2016-02-14T15:34:51Z

+1

fperez · 2016-02-15T04:01:58Z

jupyter-kernel-gateway-incorporation/jupyter-kernel-gateway-incorporation.md

+1. They spawn kernels using provisioning APIs that run separate, and often remote, from the user-facing web applications themselves.
+2. They communicate with kernels using Websockets rather than directly with ZeroMQ.
+
+Originally, these applications spawned entire Jupyter Notebook servers in order to request kernels using notebook HTTP API and communicate with them over Websockets. This approach is less than ideal, however, for various reasons:


I'm not sure this is 100% correct, I think that thebe was a kernel-only design from the start. @odewahn and his team created it after we had detailed discussions about avoiding the use of the notebook for their needs, and instead building something that would be non-notebook based.

But in any case, it's a minor comment. The main point stands, and is certainly the case for other projects. I'm mostly curious about the answer :)

@odewahn and his team created it after we had detailed discussions about avoiding the use of the notebook for their needs, and instead building something that would be non-notebook based.

I was making a statement about the backend of Thebe, not the frontend JS that lives in the browser, and then a statement only about what I see on the demos site for Thebe, not anywhere else Thebe might be deployed. I can clarify in the text.

Visit https://oreillymedia.github.io/thebe/examples/matplotlib.html

Open the HTML source.

Note the URL used by Thebe: https://oreillyorchard.com:8000

Visit that URL directly.

See tmpnb plus full notebook UIs.

If you use the network inspector in the browser instead of the static source, you can see the exact tmpnb notebook instance URL assigned to you as you run the example on the HTML page. Under the covers, Thebe is only using the programmatic API to request and communicate with kernels. But the full notebook server is still running somewhere.

Tried to make this a bit clearer in the latest push.

fperez · 2016-02-15T04:58:38Z

I'm very supportive of this idea, and would like to commend @parente, @Lull3rSkat3r and team for a very solid job.

At our upcoming dev meeting I'd like to go over this project in more detail, to see how much of what's in here we could reuse further across the entire Jupyter architecture. I would also like to discuss whether it would make sense to have this structured in two layers, one that operates directly over zmq but provides most of this functionality for non-http clients, and then the http layer on top with the REST api. This is only a half-baked idea right now, and I could be wrong in whether it makes sense, but I mention it just to highlight that I see a lot of potential value in this becoming core functionality in the future.

A few comments:

I wish the docstrings across the codebase were documented according to our guidelines (which basically boil down to 'use the [numpy docstring standard]'(https://github.com/numpy/numpy/blob/master/doc/example.py)). The key point there is to describe parameters in function signatures (such as this one).

But this is minor enough that it can be corrected in-project, I am not suggesting holding back the graduation on this.
I also think it would be better to eventually move the docs to the integrated ReadTheDocs deployment; the reason isn't that it doesn't fit in a single markdown file, but rather that eventually we want users to find a comprehensive and coherent set of docs on that site. If key pieces of the architecture only have standalone READMEs on github, it's harder for the users to get the whole project-wide picture.
I'd suggest adding a test code coverage report to the repo. The code looks well tested, but it's best to have the code testing badges and coverage reports directly linked in the main README (like the IPython one does). This both makes the information more readily available and encourages testing coverage improvements over time.

Overall, I am very happy with this. I think it's a great addition to the project. Feel free to make issues of the above points so you don't forget after graduation, but they aren't show stoppers right now.

parente · 2016-02-15T13:15:34Z

At our upcoming dev meeting I'd like to go over this project in more detail, to see how much of what's in here we could reuse further across the entire Jupyter architecture. I would also like to discuss whether it would make sense to have this structured in two layers, one that operates directly over zmq but provides most of this functionality for non-http clients, and then the http layer on top with the REST api.

As we were working on KG, I've quietly wondered if furthering the abstraction inversion that kernel gateway represents is really a good idea. As it stands, kernel gateway is exposing a subset of functionality of the notebook server, and layering on its own. It works, and it avoids disrupting the notebook server project, but from a design standpoint, it's less than ideal.

If we (the entire Jupyter community) were starting from scratch today with 20-20 hindsight about the popularity of the notebook server API, the many use cases for kernels, and the fact that the notebook UI is just one potential client of these APIs, I suspect there would be a project akin to jupyter/kernel-server and another called something like jupyter/notebook-server. The code for exposing the programmatic APIs would live in the former and the latter would contain notebook-specific backend functionality. The tools listed in this proposal could use the former while thejupyter/notebook-server (or other notebook-like tools) could use the handlers, managers, and so on from the latter.

I'm not suggesting that we go hack up the notebook server backend in this manner. In fact, I doubt my strawman split would be as clean as it sounds. But as you say, a discussion of how to make the kernel gateway concept cleaner for reuse in new projects warrants discussion.

EDIT: clarification of we

parente · 2016-02-15T13:17:14Z

I wish the docstrings across the codebase were documented according to our guidelines (which basically boil down to 'use the [numpy docstring standard]'(https://github.com/numpy/numpy/blob/master/doc/example.py)). The key point there is to describe parameters in function signatures (such as this one).

I did not realize there was a docstring standard in use. My bad for not noticing!

Issue opened to correct: jupyter-server/kernel_gateway#100

parente · 2016-02-15T13:21:45Z

I also think it would be better to eventually move the docs to the integrated ReadTheDocs deployment; the reason isn't that it doesn't fit in a single markdown file, but rather that eventually we want users to find a comprehensive and coherent set of docs on that site. If key pieces of the architecture only have standalone READMEs on github, it's harder for the users to get the whole project-wide picture.

Makes sense if that's where all of the doc is ultimately headed.

Are all the other Jupyter projects documented using Sphinx on RTD? Or have you considered the Mkdocs support on RTD as well (http://read-the-docs.readthedocs.org/en/latest/builds.html#mkdocs)? I have no problem using Sphinx, but Mkdocs is attractive (it's just markdown!) when you don't need all the features Sphinx has to offer.

Either way, issue opened: jupyter-server/kernel_gateway#101

parente · 2016-02-15T13:23:30Z

I'd suggest adding a test code coverage report to the repo. The code looks well tested, but it's best to have the code testing badges and coverage reports directly linked in the main README (like the IPython one does). This both makes the information more readily available and encourages testing coverage improvements over time.

Agreed. Tracked as part of jupyter-server/kernel_gateway#14

#12 (diff) (c) Copyright IBM Corp. 2016

minrk · 2016-02-15T13:49:32Z

👍 on promotion. I think this project has gone really well, and provides a good example of derivative applications of the notebook. I look forward to seeing applications like Thebe be simplified by depending on this, especially when based on jupyter-js-services instead of the bundled notebook js.

The only criticism I have is that the notebook-as-service REST API doesn't really seem to be part of the "kernel gateway" idea, but it is part of the app here. That seems like an almost entirely unrelated application. How do you see the notebooks-as-microservices in relation to my understanding of this as "just the kernels service, no notebooks"?

parente · 2016-02-15T14:00:08Z

The only criticism I have is that the notebook-as-service REST API doesn't really seem to be part of the "kernel gateway" idea, but it is part of the app here. That seems like an almost entirely unrelated application. How do you see the notebooks-as-microservices in relation to my understanding of this as "just the kernels service, no notebooks"?

This is the self-criticism I was after, too, with the one con I listed at the bottom:

Con: The kernel gateway does not yet support a clean interface for plugging in additional APIs, transports, and protocols for accessing kernels. The two existing modes are "baked in".

We added the microservice mode here for two reasons:

To show that the kernel gateway could grow to be a place to enable more than one way to access kernels (e.g., Jupyter protocol via Websocket API, RESTful HTTP via notebook-defined API)
Because we were in incubator and it was convenient to just add it here instead of maintaining yet another project while getting things running ;)

I do think a separation of the two is in order, either by splitting them into separate projects or turning the kernel gateway into something with a pluggable API. I wanted to wait on taking any action until having a discussion with the broader community on which approach (or another) makes sense.

minrk · 2016-02-15T15:52:41Z

Makes perfect sense, thanks. I like the goal of making it extensible like that, and the microservice seems like a great test case, whether you do it as a demo of extensibility within the gateway repo, or as a separate package (there's a nonzero cost to splitting repos, as we have learned).

fperez · 2016-02-15T20:18:04Z

@parente, thanks for your thoughtful replies. This sounds indeed like perfect material for our upcoming dev meeting (agenda coming later today for you :). I agree with the line of thought you described above regarding architectural separation, now seems like the right time to dig into that.

I think it's OK to leave that microservice in there for now. I'm with @minrk in having become a bit more hesitant to 'split everything' immediately, given the real-world cost of many repos everywhere. Maybe simply flagging very explicitly in the docs that it's meant mostly as demo/test/illustration code will be enough.
Re. the docs, I think (but @willingc can correct me, she's the one who has worked the most on this) that we're using sphinx throughout, but with Markdown support. That lets us share toolchains more easily across all subprojects, without requiring users to jump into all the complexity of reST when not needed. Does that seem like an OK compromise to you?

willingc · 2016-02-15T20:54:43Z

@fperez Regarding the docs, the highlights are:

Sphinx 1.3 for project documentation with doc files in Markdown and RestructuredText in combination
Project documentation is currently hosted on Read The Docs
Common docs directory structure for Sphinx and suggested table of contents elements (this makes it much simpler across projects and gives a similar look and feel)
Automated builds with Travis or CircleCI and RTD builds
Utilizing Sphinx's tools for checking links and spelling

Sphinx 1.4 alpha1 was just released today and it has some additional flexibility for themes and deploys (as well as an extension for easy deploy to GitHub pages).

Right now, I'm playing around with some Bootstrap/Bootswatch themes that are more visually appealing and flexible than the current Sphinx RTD theme. These themes still use Sphinx for building the documentation but I'm hoping that the themes will be able to give us more flexibility with linking to docs created with other tools (swagger for APIs, JS docs, as well as using notebooks for docs more effectively).

I agree with @parente that mkdocs is visually more appealing than Sphinx out of the box. Yet, the more modern responsive themes in Sphinx will give us the same visual appeal with Sphinx's richer Table of Contents and build tools.

Here's a test version of a Sphinx docs using a Bootswatch theme which is only modified in the Sphinx conf.py configuration settings: http://test-notebook.readthedocs.org/en/sphinx-alpha/index.html

@parente I'm happy to work with you on the docs and iterate ideas which is what @minrk and I did when creating the JupyterHub docs.

fperez · 2016-02-15T21:09:59Z

Thanks for that doc toolchain update, @willingc! Given that setup, I think it makes sense for this project's docs to move in that direction as well. BTW, I'm also planning at the dev meeting on allocating a good chunk of time to doc issues, so @willingc can update us all on this infrastructure, and we can find a good common setup that works across our diverse subprojects (including the python/js differences).

parente · 2016-02-16T20:36:20Z

@fperez @willingc

Re. the docs, I think (but @willingc can correct me, she's the one who has worked the most on this) that we're using sphinx throughout, but with Markdown support. That lets us share toolchains more easily across all subprojects, without requiring users to jump into all the complexity of reST when not needed. Does that seem like an OK compromise to you?

I'm perfectly fine with doing all the docs using Sphinx. I was asking only to see what direction the community is heading. Answer: Sphinx. :)

willingc · 2016-02-16T20:45:12Z

@parente Great! Ping me as needed 🌊

parente · 2016-02-17T03:35:36Z

@willingc Will definitely reach out when we tackle the issue about docs and have a draft in hand to noodle over.

fperez · 2016-02-17T17:53:02Z

@parente, today at the dev meeting I encouraged folks to pitch in with feedback here, so this doesn't linger unnecessarily. Let's give it a few days, but if it stalls and I forget, don't hesitate to ping me!

fperez · 2016-02-24T02:35:01Z

Sorry, I think I said we'd ping the full list too, but I just saw I hadn't (and nobody else did :). I just sent a list-wide ping on this, let's give it a few days and wrap this up. No point in waiting much longer...

parente · 2016-02-24T03:15:53Z

@fperez I think there was a post from @captainsafia (or her IFTTT bot?) when I opened the proposal:

https://groups.google.com/forum/#!topic/jupyter/WxF8M3IFktY

fperez · 2016-02-24T03:21:21Z

Ah, sorry! I missed that, goes to show how swamped I am that I'm forgetting these key things. Yes, @captainsafia has done a terrific job keeping the list appraised of key PRs. This only gives us extra reasons for not waiting too much. We can consider my email a "last call" rather than a first one, and act without unnecessary delay.

parente · 2016-02-24T03:25:46Z

One sidebar that we need to consider: there's a https://github.com/jupyter-incubator/kernel_gateway_demos repo containing examples using the KG proper. If this proposal passes muster, what should happen to that ancillary repo?

I'd hate to pollute the top-level jupyter org with yet-another-repo. I'd also hate to fold them into the kernel_gateway repo and clutter it up. Maybe they stay in incubator?

rgbkrk · 2016-02-24T03:34:46Z

I'd just bring them over, the problem tends to be discovery and not the sheer volume of repositories.

fperez · 2016-02-24T04:07:57Z

I also would bring them over. Even more so than discovery, I think the main
challenge is dependency and deployment management with many interlinked
repos. In this case being just examples, the dependency line is simple
enough not to worry me

On Tue, Feb 23, 2016, 19:34 Kyle Kelley notifications@github.com wrote:

I'd just bring them over, the problem tends to be discovery and not the
sheer volume of repositories.

—
Reply to this email directly or view it on GitHub
#12 (comment)
.

minrk · 2016-02-24T13:35:56Z

👍 to bringing the demos over as well. I don't think more repos on the jupyter org is a big issue.

Eventually, I think we should probably make a few more orgs for things (I've been wishing JuptyerHub was an org for a while)

Address comment #12 (comment) (c) Copyright IBM Corp. 2016

fperez · 2016-03-02T06:23:52Z

OK, I think we've had enough time for feedback on this, and there hasn't been any negative one.

Last call for folks who commented on the original incubator proposal to pitch in. If we don't hear anything further in the next day or two, we should call this done and move forward with the "graduation" process. I feel like we're reaching the point of stalling, and I want to make sure these things move forward. We don't want the incubation process to be a burden.

I'd particularly value an explicit +/-1 from the council folks so we record your opinion directly, at this point we have 4 +1 votes from SC members. We can consider silence as acquiescence, but an explicit word from you all would be even better...

Pinging remaining SC members... @jasongrout @ellisonbg @damianavila @Carreau @jhamrick @takluyver.

jhamrick · 2016-03-02T07:23:18Z

👍 -- looks great to me!

fperez · 2016-03-02T07:33:25Z

Thanks @jhamrick! That puts us at 5 explicit +1's from the SC, plus no explicit negatives after almost three weeks of being open for input.

takluyver · 2016-03-02T10:35:12Z

No objection from me.

fperez · 2016-03-02T17:27:52Z

Thanks @takluyver!

rgbkrk · 2016-03-02T18:12:49Z

Thanks for raising this up @fperez!

ellisonbg · 2016-03-03T20:23:00Z

+1

fperez · 2016-03-04T18:17:30Z

Thanks @ellisonbg!!

OK, I'm calling this one now: we have 7 votes in favor from the SC and zero negative feedback.

@parente and team, congratulations and many thanks!!

Let's now proceed with the incorporation steps as indicated here.

I'm going to be traveling for a couple of days, so I don't want to be the bottleneck on those easy steps. But @Carreau kindly offered to help out with the repo/team transition.

BTW, I want to apologize for the time this took. I think we should set a time limit for feedback/discussion/decision in the future. I've made a PR to that effect.

Again, congrats to the entire IBM team for this, and many thanks to everyone who participated! We look forward to your continued engagement with the project, and welcome to the official team :)

Carreau · 2016-03-04T18:43:43Z

I'm going to be traveling for a couple of days, so I don't want to be the bottleneck on those easy steps. But @Carreau kindly offered to help out with the repo/team transition.

I've create a team with the member of this repo on the Jupyter org. @parente should be able to administrate the team. let me know if I forgot you.

I'll try to see how to move the repo from one organisation to another if no object. Unless any other owner want to do it.

If it is moved the issues and the rest should be at the asme time and GitHub should automatically setup redirect so you shouldn't have to re-clone or change anything to your workflow.

I would really like to thanks all of those who were involved in these discussions, I've been mostly managing things online and I am missing a lot of time to actually contribute code to many project, often even missing discussion altogether.

Would any of the people involved in this like to write a blog post annouce on blog.jupyter.org ?
It's markdown so you can strart drafting something anywhere, and tell me, I can make you a guest account on the blog, so that you can appear as the author when it is published !

Thanks !

parente · 2016-03-04T18:50:16Z

Thanks @fperez and everyone else for moving this along. I'm looking forward to discussing how to continue to mature the KG concept at the dev meeting.

@Carreau:

I'll try to see how to move the repo from one organisation to another if no object. Unless any other owner want to do it.

Thanks and feel free to reassign it in the settings. In the past, I've found that not only will the web URL redirect, but so will the git command line (with a friendly reminder to update remotes.) So it should be pretty seamless.

Would any of the people involved in this like to write a blog post annouce on blog.jupyter.org ?

I can draft a short write-up about what the KG is and where it's (likely) headed based on the text of this proposal if that's what you'd like. If you're looking for something to highlight the graduation from incubator, @fperez is probably the better author.

Carreau · 2016-03-04T19:03:09Z

I can draft a short write-up about what the KG is and where it's (likely) headed based on the text of this proposal if that's what you'd like

That's seem great. Shoudl I make you an account with your gmail address ?

F. will likely not have time to write about graduation process.

rgbkrk · 2016-03-04T19:03:53Z

I'm happy to move them as soon as we merge this. Since I was the sponsor and involved in the project, I assumed it wouldn't be me to merge it in.

Carreau · 2016-03-04T19:15:14Z

I'm happy to move them as soon as we merge this. Since I was the sponsor and involved in the project, I assumed it wouldn't be me to merge it in.

I'll read one last time and merge this afternoon.

Carreau · 2016-03-04T19:15:30Z

(unless somebody beats me to it)

fperez · 2016-03-06T17:51:02Z

OK, I'm merging this now. I'm disappearing for a few days on travel, and I don't want this to linger. Thanks everyone!

Graduate the [Kernel Gateway project](jupyter-incubator/kernel_gateway) from the incubation stage into being an official Jupyter project. Further development will happen in the Jupyter org, once repos have been moved and teams created.

minrk · 2016-03-06T20:28:44Z

🎓 🎉 🍰

rgbkrk · 2016-03-07T00:32:47Z

🎉 Moved the two repositories over 🎉

Thanks all!

parente · 2016-03-07T00:35:32Z

@rgbkrk Thanks! Can you check my perms? I seem to have lost settings and merge privileges.

rgbkrk · 2016-03-07T00:41:03Z

Definitely. I saw a team already existed with kernel gateway so I assigned it to that.

willingc · 2016-03-07T00:42:15Z

Well that was trippy. Submitted a PR and the repo changed. @rgbkrk 😮

rgbkrk · 2016-03-07T00:43:33Z

@parente Fixed the kernel gateway team up and bumped your perms. The default when adding repos to a team is "read" access, which is just about the silliest thing in an open source repository.

@willingc 😲

damianavila · 2016-03-15T20:22:34Z

My belated (post-vacation) 👍 to this, congrats @parente and others who work on this.

parente added 2 commits February 12, 2016 10:37

fperez reviewed Feb 15, 2016
View reviewed changes

parente mentioned this pull request Feb 15, 2016

Rewrite docstrings in numpy syntax jupyter-server/kernel_gateway#100

Closed

parente mentioned this pull request Feb 15, 2016

Create Sphinx docs for ReadTheDocs deploy jupyter-server/kernel_gateway#101

Closed

betatim mentioned this pull request Feb 24, 2016

Firming up the idea everpub/openscienceprize#9

Open

Clarify statement about use of notebook server

8dc7dce

Address comment #12 (comment) (c) Copyright IBM Corp. 2016

fperez mentioned this pull request Mar 4, 2016

Add specific guidance about time for evaluation of incubation proposals. jupyter/governance#15

Merged

fperez merged commit 3e2be10 into jupyter:master Mar 6, 2016

srl295 pushed a commit to srl295/jupyter-enhancement-proposals that referenced this pull request Jun 10, 2016

Fix missing links

48e4b23

jupyter#12 (diff) (c) Copyright IBM Corp. 2016

srl295 pushed a commit to srl295/jupyter-enhancement-proposals that referenced this pull request Jun 10, 2016

Clarify statement about use of notebook server

2234239

Address comment jupyter#12 (comment) (c) Copyright IBM Corp. 2016

jupyter-incubator/kernel_gateway incorporation proposal #12

jupyter-incubator/kernel_gateway incorporation proposal #12

Conversation

parente commented Feb 12, 2016

rgbkrk commented Feb 12, 2016

jdfreder commented Feb 14, 2016

fperez Feb 15, 2016

Choose a reason for hiding this comment

parente Feb 15, 2016

Choose a reason for hiding this comment

parente Feb 25, 2016

Choose a reason for hiding this comment

fperez commented Feb 15, 2016

parente commented Feb 15, 2016

parente commented Feb 15, 2016

parente commented Feb 15, 2016

parente commented Feb 15, 2016

minrk commented Feb 15, 2016

parente commented Feb 15, 2016

minrk commented Feb 15, 2016

fperez commented Feb 15, 2016

willingc commented Feb 15, 2016

fperez commented Feb 15, 2016

parente commented Feb 16, 2016

willingc commented Feb 16, 2016

parente commented Feb 17, 2016

fperez commented Feb 17, 2016

fperez commented Feb 24, 2016

parente commented Feb 24, 2016

fperez commented Feb 24, 2016 via email

parente commented Feb 24, 2016

rgbkrk commented Feb 24, 2016

fperez commented Feb 24, 2016

minrk commented Feb 24, 2016

fperez commented Mar 2, 2016

jhamrick commented Mar 2, 2016

fperez commented Mar 2, 2016

takluyver commented Mar 2, 2016

fperez commented Mar 2, 2016

rgbkrk commented Mar 2, 2016

ellisonbg commented Mar 3, 2016

fperez commented Mar 4, 2016

Carreau commented Mar 4, 2016

parente commented Mar 4, 2016

Carreau commented Mar 4, 2016

rgbkrk commented Mar 4, 2016

Carreau commented Mar 4, 2016

Carreau commented Mar 4, 2016

fperez commented Mar 6, 2016

minrk commented Mar 6, 2016

rgbkrk commented Mar 7, 2016

parente commented Mar 7, 2016

rgbkrk commented Mar 7, 2016

willingc commented Mar 7, 2016

rgbkrk commented Mar 7, 2016

damianavila commented Mar 15, 2016