Import the right pandas from conda #4419

Merged (2 commits) on Dec 23, 2021

Commits on Dec 22, 2021

  1. Import the right pandas from conda

    Spark v3.2.0 and later ship a pandas package in the binary dir: python/pyspark/pandas.
        The udf-cudf test includes this dir via PYTHONPATH, so the Spark copy shadows the
        real pandas and the test fails with:
        "AttributeError: partially initialized module 'pandas' has no attribute __version__"
    
    To fix this, put the conda package path ahead of the existing PYTHONPATH, so that pandas
        is imported from conda instead of from the Spark 3.2.0 or later binary path.
    
    Signed-off-by: Tim Liu <timl@nvidia.com>
    NvTimLiu committed Dec 22, 2021
    c0fbbce
  2. conda version of pandas works across all Spark versions

    Signed-off-by: Tim Liu <timl@nvidia.com>
    NvTimLiu committed Dec 22, 2021
    3afbab2
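
The path-ordering fix described in the first commit can be illustrated with a small Python sketch. This is not the change made in the PR; it only shows why search order decides which pandas gets imported. The helper names (prepend_to_pythonpath, resolve_origin, make_fake_package) and the temporary directory layout are hypothetical, and the sketch assumes the Spark directory on PYTHONPATH is the parent python/pyspark directory, which contains the pandas package.

```python
import os
import tempfile
from importlib.machinery import PathFinder


def prepend_to_pythonpath(preferred_dir, existing_pythonpath):
    """Return a PYTHONPATH value in which preferred_dir is searched first."""
    entries = [p for p in existing_pythonpath.split(os.pathsep) if p]
    # Drop a later duplicate so the preferred directory appears exactly once.
    entries = [p for p in entries if p != preferred_dir]
    return os.pathsep.join([preferred_dir] + entries)


def resolve_origin(module_name, pythonpath):
    """Report the file a top-level module would load from, without importing it."""
    search_dirs = [p for p in pythonpath.split(os.pathsep) if p]
    spec = PathFinder.find_spec(module_name, search_dirs)
    return None if spec is None else spec.origin


def make_fake_package(parent_dir, name):
    """Create an empty stand-in package (hypothetical, for the demo only)."""
    pkg_dir = os.path.join(parent_dir, name)
    os.makedirs(pkg_dir)
    init_file = os.path.join(pkg_dir, "__init__.py")
    with open(init_file, "w") as handle:
        handle.write("")
    return init_file


def demo():
    """Show that the first matching directory on the path wins."""
    with tempfile.TemporaryDirectory() as root:
        # Assumed layout: a Spark-like dir and a conda-like dir, each with a pandas package.
        spark_dir = os.path.join(root, "spark", "python", "pyspark")
        conda_dir = os.path.join(root, "conda", "site-packages")
        spark_pandas = make_fake_package(spark_dir, "pandas")
        conda_pandas = make_fake_package(conda_dir, "pandas")

        # With only the Spark dir on the path, the Spark copy is found.
        before = resolve_origin("pandas", spark_dir)
        # With the conda dir prepended, the conda copy is found instead.
        fixed_path = prepend_to_pythonpath(conda_dir, spark_dir)
        after = resolve_origin("pandas", fixed_path)

        return (os.path.samefile(before, spark_pandas),
                os.path.samefile(after, conda_pandas))
```

Calling demo() returns two booleans: whether the Spark-like copy was resolved before the reordering, and whether the conda-like copy was resolved after it. In a real environment, resolve_origin("pandas", os.environ.get("PYTHONPATH", "")) could be used as a quick check of which copy a test run would pick up from PYTHONPATH entries alone.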