Unrecognized alias: '--profile=xxx', it will probably have no effect. #309
Sorry, missed that in the examples. Fixed by #310.
Out of curiosity, and to possibly clear up some confusion I have seen on Stack Overflow and elsewhere: how would one now specify startup initialization options for Jupyter? A specific scenario I am thinking of is pySpark.
See the mailing list discussion, but you shouldn't need a profile for that. PySpark can just be a kernel, if you really want it to be.
Hi. I am trying to create a profile for pyspark too. Could you please tell me how to proceed? Thanks
There is no notion of profiles in Jupyter or in the notebook. It's roughly like asking to dual-boot a computer because you want to use both vim and emacs. As stated in the mailing list thread, you can do it if you like. It should auto-create the needed files. You most likely just want a separate kernel, or to just import pySpark as a library. Still, without knowing more about what you want to do, it's hard to give you an answer...
I would like to use pySpark in the IPython notebook, either by calling it...
Ok, here is what I just did during the last half hour, on OS X.

Install apache-spark (`$ brew install apache-spark`), then in a notebook enter the following:

```python
import findspark
import os

findspark.init()
import pyspark

sc = pyspark.SparkContext()
lines = sc.textFile(os.path.expanduser('~/dev/ipython/setup.py'))
lines_nonempty = lines.filter(lambda x: len(x) > 0)
lines_nonempty.count()
```

Execute, and... Yayyyyy!
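As a sanity check of what that demo computes, here is the plain-Python equivalent of the RDD pipeline (a sketch I'm adding for comparison, not part of the original recipe): count the lines of a file that have at least one character.

```python
import os

def count_nonempty_lines(path):
    # Mirrors sc.textFile(...).filter(lambda x: len(x) > 0).count().
    # Spark's textFile strips the trailing newline from each line,
    # so strip it here too before testing the length.
    with open(os.path.expanduser(path)) as f:
        return sum(1 for line in f if len(line.rstrip('\n')) > 0)
```

Running this on the same file should give the same number as `lines_nonempty.count()`.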
(Note: installing/downloading Java took 20 minutes.)
After running `import pyspark` I get this error:
You would get this error if you downloaded the wrong Java (the 60 MB download instead of the 200+ MB one).
I actually got jdk-8u60-macosx-x64.dmg, which is 238.1 MB. Maybe I should restart the machine.
Hum, I did not have to restart, IIRC.
Does the following work?

Python 2.7 or 3?
It works now:

```
In [2]: sc
```

Thanks a lot for your help. I've spent a loooong time trying to fix this.
🍰 🍸 🎉 ! Happy Sparking!
@vherasme What did you do to make it work in the end? Thanks!
I followed the steps @Carreau recommends above (..... enter the following: `import findspark` ...). I also had these two in `.bash_profile`:

```shell
export SPARK_HOME="/Users/victor/Downloads/spark-1.4.1"
```
I hope this change of policy about profiles is mentioned (more explicitly?) in the docs. I tried to put up a server on an Amazon EC2 image, but following the instructions in the IPython docs didn't work, because ipython==4.0 no longer accepts the `--profile` option.
IPython 4.0 still has profiles; you are just mistaking the notebook application for IPython itself. If you want a different configuration for the notebook, you need to set the Jupyter config dir environment variable; if you want a profile for your kernel, you can set it in your kernelspec.
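To make the kernelspec suggestion concrete, here is a hedged sketch of what a PySpark kernelspec could look like (the directory, display name, profile name, and `SPARK_HOME` path are illustrative assumptions, not from this thread). A file such as `~/.local/share/jupyter/kernels/pyspark/kernel.json`:

```json
{
  "display_name": "PySpark",
  "language": "python",
  "argv": [
    "python", "-m", "ipykernel",
    "--profile=pyspark",
    "-f", "{connection_file}"
  ],
  "env": {
    "SPARK_HOME": "/path/to/spark"
  }
}
```

The `env` entry lets the kernel see Spark without shell exports, and `--profile` here applies to the kernel, where it is still supported, not to the notebook server.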
I tried both. I think a separate tutorial for setting up a Jupyter remote server would be helpful.
How did you get `help` to give you hints about profiles? And again, `--profile` does not work with the notebook application, only with `ipython`/`ipython kernel`.
I did. Right, I get it now that `--profile` no longer works with the notebook, but I'm saying the docs should be made clearer, so that in the future people switching from older versions of IPython won't have to look far for an answer. For example, if I google 'set up remote server jupyter', the first result is http://ipython.org/ipython-doc/1/interactive/public_server.html, and nowhere in there does it say that `--profile` no longer works for IPython/Jupyter 4. Indeed, one of the instructions is "You can then start the notebook and access it later by pointing your browser to..." Other top results are about JupyterHub, which requires Python 3. I don't think I saw a single mention that the `--profile` option no longer works for IPython/Jupyter 4 among them. Maybe you guys wrote a doc, but Google is just being dumb for the moment. Nevertheless, I never found it, and I searched for a long time before finding this issue.
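For anyone landing here from the old profile-based public-server tutorial: in notebook 4.x the equivalent setup lives in a config file instead. A hedged sketch (the port and values are illustrative, not a security recommendation) of `~/.jupyter/jupyter_notebook_config.py`, which `jupyter notebook --generate-config` creates for you:

```python
# ~/.jupyter/jupyter_notebook_config.py -- notebook 4.x, no profiles involved.
c = get_config()
c.NotebookApp.ip = '0.0.0.0'        # listen on all interfaces, not just localhost
c.NotebookApp.open_browser = False  # don't try to open a browser on the server
c.NotebookApp.port = 8888
```

For a genuinely public server you would also want a password and TLS, which the same config file handles.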
O_o do you have both IPython 4.x and notebook 4.x?
Well, it's hard to bias Google. For whatever reason people are still referencing the docs for 1.0, and Google puts them on top. We'll try to find a solution.
I had IPython 4 initially, but that kept giving errors as I said, so I...
Is there a way to avoid typing the following code in each notebook, and just make sure that whenever you launch the notebook it is already hooked up to Spark? It isn't too hard, but it feels like jury-rigging, which I hate.
You can add it to a startup file.
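A hedged sketch of that approach, assuming the default profile layout (IPython runs every `*.py` file in `~/.ipython/profile_default/startup/` when a kernel starts; the `00-pyspark.py` file name is my own choice, not from the thread):

```python
# Write the PySpark bootstrap into IPython's startup directory once,
# so every new kernel runs it automatically.
from pathlib import Path

STARTUP_CODE = """\
import findspark
findspark.init()
import pyspark
sc = pyspark.SparkContext()
"""

startup_dir = Path.home() / ".ipython" / "profile_default" / "startup"
startup_dir.mkdir(parents=True, exist_ok=True)
startup_file = startup_dir / "00-pyspark.py"
startup_file.write_text(STARTUP_CODE)
```

After this, `sc` is defined in every notebook with no per-notebook boilerplate.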
I got the same issue, and the steps from @vherasme didn't work. Python 2.7.10.
@wlsherica, I had that same issue. For me, it was caused by a bad Spark configuration. Specifically, I had:

```shell
export PYSPARK_SUBMIT_ARGS="--master local[2]"
```

So I just removed that.
@Carreau amazing work. Thanks so much.
@Carreau thanks for your answer.
Thanks @Carreau for the step-by-step instructions! I stumbled upon this issue when following instructions for IPython 3.x. In case anyone wants more detailed instructions and explanation, I have written http://flummox-engineering.blogspot.com/2016/01/how-to-configure-ipython4-for-apache-spark.html
Using the findspark setup, are you able to use jars which are added via `SparkConf`? The jar gets loaded when the `SparkContext` is started, but still: ...
Using the shell from the Spark tutorial is also a good solution to the issue.
I am running Spark jobs in a Hadoop cluster, triggered from a Jupyter notebook. The problem is that each cell of code consumes the configured number of executors, but they are never released, so after a number of executed cells all the resources of the cluster are blocked. Has anyone had this problem?
The `--profile` option in Jupyter appears to be ignored now when it's run with the `notebook` command. The usage for it still lists:
Examples