Unrecognized alias: '--profile=xxx', it will probably have no effect. #309
Sorry, missed that in the examples. Fixed by #310.
Out of curiosity, and to possibly clear up some confusion I have seen on Stack Overflow and elsewhere: how would one now specify startup initialization options for Jupyter? A specific scenario I am thinking of is pySpark.
See the mailing list discussion, but you shouldn't need a profile for that. PySpark can just be a kernel, if you really want it to be.
Hi. I am trying to create a profile for pyspark too. Could you please tell me how to proceed? Thanks
There is no notion of profiles in Jupyter or in the notebook. It's roughly like asking to dual-boot a computer because you want to use both vim and emacs. As stated in the mailing list thread, you can do it if you like. It should auto-create the needed files. You most likely just want a separate kernel, or to just import pySpark as a library. Still, without knowing more about what you want to do, it's hard to give you an answer...
I would like to use pySpark in the IPython notebook, either by calling it...
Ok, here is what I just did during the last half hour, on OS X.

Install apache-spark (`$ brew install apache-spark`), then in a notebook enter the following:

```python
import findspark
import os

findspark.init()
import pyspark

sc = pyspark.SparkContext()
lines = sc.textFile(os.path.expanduser('~/dev/ipython/setup.py'))
lines_nonempty = lines.filter(lambda x: len(x) > 0)
lines_nonempty.count()
```

Execute, and... Yayyyyy!
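As a sanity check of what that demo computes, here is the plain-Python equivalent of the RDD pipeline (a sketch I'm adding for comparison, not part of the original recipe): count the lines of a file that have at least one character.

```python
import os

def count_nonempty_lines(path):
    # Mirrors sc.textFile(...).filter(lambda x: len(x) > 0).count().
    # Spark's textFile strips the trailing newline from each line,
    # so strip it here too before testing the length.
    with open(os.path.expanduser(path)) as f:
        return sum(1 for line in f if len(line.rstrip('\n')) > 0)
```

Running this on the same file should give the same number as `lines_nonempty.count()`.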
(Note: installing/downloading Java took 20 minutes.)
After running `import pyspark` I get this error:
You would get this error if you downloaded the wrong Java (the 60 MB download instead of the 200+ MB one).
I actually got jdk-8u60-macosx-x64.dmg, which is 238.1 MB. Maybe I should restart the machine.
Hum, I did not have to restart, IIRC.
Does the following work?

Python 2.7 or 3?
It works now:

```
In [2]: sc
```

Thanks a lot for your help. I've spent a loooong time trying to fix this.
🍰 🍸 🎉 ! Happy Sparking!
@vherasme What did you do to make it work in the end? Thanks!
I followed the steps @Carreau recommends above (..... enter the following: `import findspark` ...). I also had these two in `.bash_profile`:

```shell
export SPARK_HOME="/Users/victor/Downloads/spark-1.4.1"
```
I hope this change of policy about profiles is mentioned (more explicitly?) in the docs. I tried to put up a server on an Amazon EC2 image, but following the instructions in the IPython docs didn't work, because ipython==4.0 no longer accepts the `--profile` option.
IPython 4.0 still has profiles; you are just mistaking the notebook application for IPython itself. If you want a different configuration for the notebook, you need to set the Jupyter config dir environment variable; if you want a profile for your kernel, you can set it in your kernelspec.
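To make the kernelspec suggestion concrete, here is a hedged sketch of what a PySpark kernelspec could look like (the directory, display name, profile name, and `SPARK_HOME` path are illustrative assumptions, not from this thread). A file such as `~/.local/share/jupyter/kernels/pyspark/kernel.json`:

```json
{
  "display_name": "PySpark",
  "language": "python",
  "argv": [
    "python", "-m", "ipykernel",
    "--profile=pyspark",
    "-f", "{connection_file}"
  ],
  "env": {
    "SPARK_HOME": "/path/to/spark"
  }
}
```

The `env` entry lets the kernel see Spark without shell exports, and `--profile` here applies to the kernel, where it is still supported, not to the notebook server.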
I tried both. I think a separate tutorial for setting up a Jupyter remote server would be helpful.
How did you get `help` to give you hints about profiles? And again, `--profile` does not work with the notebook application, only with `ipython`/`ipython kernel`.
I did. Right, I get it now that `--profile` no longer works with the notebook, but I'm saying the docs should be made clearer, so that in the future people switching from older versions of IPython won't have to look far for an answer. For example, if I google 'set up remote server jupyter', the first result is http://ipython.org/ipython-doc/1/interactive/public_server.html, and nowhere in there does it say that `--profile` no longer works for IPython/Jupyter 4. Indeed, one of the instructions is "You can then start the notebook and access it later by pointing your browser to..." Other top results are about JupyterHub, which requires Python 3. I don't think I saw a single mention that the `--profile` option no longer works for IPython/Jupyter 4 among them. Maybe you guys wrote a doc, but Google is just being dumb for the moment. Nevertheless, I never found it, and I searched for a long time before finding this issue.
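For anyone landing here from the old profile-based public-server tutorial: in notebook 4.x the equivalent setup lives in a config file instead. A hedged sketch (the port and values are illustrative, not a security recommendation) of `~/.jupyter/jupyter_notebook_config.py`, which `jupyter notebook --generate-config` creates for you:

```python
# ~/.jupyter/jupyter_notebook_config.py -- notebook 4.x, no profiles involved.
c = get_config()
c.NotebookApp.ip = '0.0.0.0'        # listen on all interfaces, not just localhost
c.NotebookApp.open_browser = False  # don't try to open a browser on the server
c.NotebookApp.port = 8888
```

For a genuinely public server you would also want a password and TLS, which the same config file handles.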
O_o do you have both IPython 4.x and notebook 4.x?
Well, it's hard to bias Google. For whatever reason people are still referencing the docs for 1.0, and Google puts them on top. We'll try to find a solution.
I had IPython 4 initially, but that kept giving errors as I said, so I...
Is there a way to avoid typing the following code in each notebook, and just make sure that whenever you launch the notebook it is already hooked up to Spark? It isn't too hard, but it feels like jury-rigging, which I hate.
You can add it to a startup file.
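A hedged sketch of that approach, assuming the default profile layout (IPython runs every `*.py` file in `~/.ipython/profile_default/startup/` when a kernel starts; the `00-pyspark.py` file name is my own choice, not from the thread):

```python
# Write the PySpark bootstrap into IPython's startup directory once,
# so every new kernel runs it automatically.
from pathlib import Path

STARTUP_CODE = """\
import findspark
findspark.init()
import pyspark
sc = pyspark.SparkContext()
"""

startup_dir = Path.home() / ".ipython" / "profile_default" / "startup"
startup_dir.mkdir(parents=True, exist_ok=True)
startup_file = startup_dir / "00-pyspark.py"
startup_file.write_text(STARTUP_CODE)
```

After this, `sc` is defined in every notebook with no per-notebook boilerplate.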
I got the same issue, and the steps from @vherasme didn't work. Python 2.7.10.
@wlsherica, I had that same issue. For me, it was caused by a bad Spark configuration. Specifically, I had:

```shell
export PYSPARK_SUBMIT_ARGS="--master local[2]"
```

So I just removed that.
@Carreau amazing work. Thanks so much.
@Carreau thanks for your answer.
Thanks @Carreau for the step-by-step instructions! I stumbled upon this issue when following instructions for IPython 3.x. In case anyone wants more detailed instructions and explanation, I have written http://flummox-engineering.blogspot.com/2016/01/how-to-configure-ipython4-for-apache-spark.html
Using the findspark setup, are you able to use jars which are added via `SparkConf`? The jar gets loaded when the `SparkContext` is started, but still: ...
Using the shell from the Spark tutorial is also a good solution to the issue.
I am running Spark jobs in a Hadoop cluster, triggered from a Jupyter notebook. The problem is that each cell of code consumes the configured number of executors, but they are never released, so after a number of executed cells all the resources of the cluster are blocked. Has anyone had this problem?
The `--profile` option in Jupyter appears to be ignored now when it's run with the `notebook` command. The usage for it still lists:
Examples