
Remove support for Python 3.7 #1988

Merged · 19 commits · Sep 15, 2023

Conversation

SimonYansenZhao
Collaborator

@SimonYansenZhao SimonYansenZhao commented Sep 5, 2023

Description

This PR drops support for Python 3.7, since it reached end of life on June 27, 2023.
This PR is moved from #1937 because of the sign-off requirement.

  • Update docs
  • Update Conda env script

Related Issues

References

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging branch and not to main branch.

@SimonYansenZhao
Collaborator Author

When using Python 3.11, some packages require numpy 1.19.4, which may not support Python 3.11. See https://github.com/recommenders-team/recommenders/actions/runs/6079503723/job/16492093035

@miguelgfierro
Collaborator

When using Python 3.11, some packages require numpy 1.19.4, which may not support Python 3.11. See https://github.com/recommenders-team/recommenders/actions/runs/6079503723/job/16492093035

I see two options:

  1. Start with 3.8-3.10, and leave 3.11 to a future PR
  2. Consider adding different numpy versions depending on the python version as we did with cornac:
    "cornac>=1.1.2,<1.15.2;python_version<='3.7'",
    "cornac>=1.15.2,<2;python_version>='3.8'",

What do you think @SimonYansenZhao?
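
Option 2 could be sketched as environment markers in setup.py, analogous to the cornac pins above. The numpy bounds below are hypothetical placeholders (not tested against the real dependency tree); the helper just mirrors the logic that markers like `numpy>=1.23.2;python_version>='3.11'` would express:

```python
import sys

# Hypothetical sketch of option 2: choose a numpy pin based on the running
# interpreter, mirroring what environment markers such as
#   "numpy>=1.19,<1.24;python_version<'3.11'",
#   "numpy>=1.23.2;python_version>='3.11'",
# would express in setup.py. The version bounds are illustrative only.
def numpy_requirement(py=sys.version_info[:2]):
    if py >= (3, 11):
        # numpy 1.23.2 is assumed here as the first line with 3.11 wheels.
        return "numpy>=1.23.2"
    return "numpy>=1.19,<1.24"
```

In setup.py the same split would simply be two strings with `;python_version` markers in `install_requires`, so pip resolves the right pin at install time.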

@SimonYansenZhao
Collaborator Author

When using Python 3.11, some packages require numpy 1.19.4, which may not support Python 3.11. See https://github.com/recommenders-team/recommenders/actions/runs/6079503723/job/16492093035

I see two options:

  1. Start with 3.8-3.10, and leave 3.11 to a future PR
  2. Consider adding different numpy versions depending on the python version as we did with cornac:
    "cornac>=1.1.2,<1.15.2;python_version<='3.7'",
    "cornac>=1.15.2,<2;python_version>='3.8'",

What do you think @SimonYansenZhao?

@miguelgfierro I tried option 2, but failed, see https://github.com/recommenders-team/recommenders/actions/runs/6079732708/job/16496793669?pr=1988

I'll try Python 3.8-3.10 in this PR and leave 3.11 to a future PR.

@SimonYansenZhao
Collaborator Author

SimonYansenZhao commented Sep 7, 2023

Some notes from the test results:

  1. Some dependencies require an old version of numpy that does not support Python 3.10+, so numpy has to be compiled against Python 3.10+ to build its binaries during the setup of recommenders. We need to find out which dependencies require the old numpy.
  2. scrapbook is old and no longer actively developed; its latest version was published in 2021. Some scrapbook code uses DataFrame.append() (see https://github.com/recommenders-team/recommenders/actions/runs/6108846153/job/16578689728?pr=1988#step:3:2639), which was removed in pandas 2.0. So we should either use an alternative to scrapbook or bypass the scrapbook code that uses DataFrame.append().
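
If we end up bypassing that scrapbook code ourselves, the removed DataFrame.append() call maps directly onto pd.concat, its pandas 2.0 replacement. A minimal sketch (the column names are illustrative, not scrapbook's actual schema):

```python
import pandas as pd

# Illustrative frames standing in for scrapbook's recorded scraps.
scraps = pd.DataFrame({"name": ["rmse"], "data": [0.95]})
new_scrap = pd.DataFrame({"name": ["map"], "data": [0.11]})

# pandas < 2.0:  scraps = scraps.append(new_scrap, ignore_index=True)
# pandas >= 2.0: DataFrame.append() is gone; pd.concat is the replacement.
scraps = pd.concat([scraps, new_scrap], ignore_index=True)
```

The two calls produce the same result, so the change is mechanical wherever append() appears.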

@miguelgfierro
Collaborator

I asked ChatGPT how to replace papermill and scrapbook, and it looks like there is a dependency we already have called nbconvert. This would be the code:

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

# Step 1: Prepare Your Jupyter Notebook

# Ensure your Jupyter Notebook is set up with parameters that can be modified and read.
# Use Markdown cells to specify parameters that need modification and code cells to set parameters that need to be read.

# Step 2: Modify Parameters Programmatically

# Load the Jupyter Notebook
notebook_path = "your_notebook.ipynb"
with open(notebook_path, "r", encoding="utf-8") as notebook_file:
    notebook_content = nbformat.read(notebook_file, as_version=4)

# Define the parameters to modify
new_param1 = "New Value1"
new_param2 = "New Value2"

# Search for and replace parameter values in code cells.
# Note: plain string replacement is fragile; papermill instead targets a cell tagged "parameters".
for cell in notebook_content.cells:
    if cell.cell_type == "code":
        cell.source = cell.source.replace("param1 = 'Default Value'", f"param1 = '{new_param1}'")
        cell.source = cell.source.replace("param2 = 'Default Value'", f"param2 = '{new_param2}'")

# Save the modified notebook
modified_notebook_path = "modified_notebook.ipynb"
with open(modified_notebook_path, "w", encoding="utf-8") as modified_notebook_file:
    nbformat.write(notebook_content, modified_notebook_file)

print("Parameters modified and saved to", modified_notebook_path)

# Step 3: Execute the Modified Notebook with nbconvert

# Load the modified notebook
with open(modified_notebook_path, "r", encoding="utf-8") as modified_notebook_file:
    modified_notebook_content = nbformat.read(modified_notebook_file, as_version=4)

# Create an execution preprocessor
execute_preprocessor = ExecutePreprocessor(timeout=600, kernel_name="python3")

# Execute the notebook
executed_notebook, _ = execute_preprocessor.preprocess(modified_notebook_content, {"metadata": {"path": "./"}})

# Save the executed notebook
executed_notebook_path = "executed_notebook.ipynb"
with open(executed_notebook_path, "w", encoding="utf-8") as executed_notebook_file:
    nbformat.write(executed_notebook, executed_notebook_file)

print("Notebook execution complete. Results saved to", executed_notebook_path)

# Step 4: Read Parameters Programmatically

# Load the executed notebook
with open(executed_notebook_path, "r", encoding="utf-8") as executed_notebook_file:
    executed_notebook_content = nbformat.read(executed_notebook_file, as_version=4)

# Extract parameter values from code cells
param3 = None
for cell in executed_notebook_content.cells:
    if cell.cell_type == "code":
        if "param3" in cell.source:
            # Extract param3 value from code cell
            exec(cell.source)
            break

# Now you can use param3 in your workflow
if param3 is not None:
    print("Value of param3:", param3)
else:
    print("param3 not found in the executed notebook.")

What do you think @SimonYansenZhao?

@miguelgfierro
Collaborator

@SimonYansenZhao @anargyri I'm trying to replicate papermill in miguelgfierro/pybase#73. If I implement the basic functionality, we can remove two dependencies and depend only on nbconvert (a library supported by Jupyter) and re. Still WIP.

@SimonYansenZhao
Collaborator Author

@miguelgfierro I think using nbconvert is doable. But now could you help me with the errors in the test result: https://github.com/recommenders-team/recommenders/actions/runs/6116241630/job/16601154561?pr=1988#step:3:2654

NameError: name 'XDeepFMModel' is not defined

@SimonYansenZhao
Collaborator Author

SimonYansenZhao commented Sep 8, 2023

And I found that ploomber may be used to replace scrapbook. Ploomber is built on top of papermill.

@anargyri
Collaborator

anargyri commented Sep 8, 2023

  1. scrapbook is old and no longer actively developed; its latest version was published in 2021. Some scrapbook code uses DataFrame.append() (see https://github.com/recommenders-team/recommenders/actions/runs/6108846153/job/16578689728?pr=1988#step:3:2639), which was removed in pandas 2.0. So we should either use an alternative to scrapbook or bypass the scrapbook code that uses DataFrame.append().

But you have pandas < 1.6 in setup.py, don't you? Or am I missing something?

@SimonYansenZhao
Collaborator Author

  2. scrapbook is old and no longer actively developed; its latest version was published in 2021. Some scrapbook code uses DataFrame.append() (see https://github.com/recommenders-team/recommenders/actions/runs/6108846153/job/16578689728?pr=1988#step:3:2639), which was removed in pandas 2.0. So we should either use an alternative to scrapbook or bypass the scrapbook code that uses DataFrame.append().

But you have pandas < 1.6 in setup.py, don't you? Or am I missing something?

@anargyri after I found that pandas 2.0 didn't work with scrapbook, I set pandas < 1.6 in setup.py to see if that would work. Now scrapbook works with pandas < 1.6, but we won't be able to use pandas 2.0+ in the future if scrapbook isn't upgraded.

[Screenshot: 2023-09-09 09:06]

@miguelgfierro
Collaborator

miguelgfierro commented Sep 9, 2023

I'm looking into this:

@miguelgfierro I think using nbconvert is doable. But now could you help me with the errors in the test result: https://github.com/recommenders-team/recommenders/actions/runs/6116241630/job/16601154561?pr=1988#step:3:2654

(recommenders) u@unicorn:~/MS/recommenders$ python --version
Python 3.9.16
(recommenders) u@unicorn:~/MS/recommenders$ pip list | grep pandas
pandas                       1.5.3
(recommenders) u@unicorn:~/MS/recommenders$ pytest tests/unit/recommenders/models/test_deeprec_model.py::test_xdeepfm_component_definition --disable-warnings
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.9.16, pytest-7.3.2, pluggy-1.0.0
rootdir: /home/u/MS/recommenders
configfile: pyproject.toml
plugins: typeguard-4.0.0, cov-4.1.0, hypothesis-6.79.1, mock-3.11.1, anyio-3.7.0
collected 1 item

tests/unit/recommenders/models/test_deeprec_model.py .                                                                                                              [100%]

===================================================================== 1 passed, 20 warnings in 3.70s ======================================================================

(recommenders) u@unicorn:~/MS/recommenders$ pip install pandas --upgrade
(recommenders) u@unicorn:~/MS/recommenders$ pip list | grep pandas
pandas                       2.1.0
(recommenders) u@unicorn:~/MS/recommenders$ pytest tests/unit/recommenders/models/test_deeprec_model.py::test_xdeepfm_component_definition --disable-warnings
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.9.16, pytest-7.3.2, pluggy-1.0.0
rootdir: /home/u/MS/recommenders
configfile: pyproject.toml
plugins: typeguard-4.0.0, cov-4.1.0, hypothesis-6.79.1, mock-3.11.1, anyio-3.7.0
collected 1 item

tests/unit/recommenders/models/test_deeprec_model.py .                                                                                                              [100%]

===================================================================== 1 passed, 20 warnings in 3.82s ======================================================================
(recommenders) u@unicorn:~/MS/recommenders$

(reco_gpu) u@unicorn:~/MS/recommenders$ python --version
Python 3.8.13
(reco_gpu) u@unicorn:~/MS/recommenders$ pip list | grep pandas
pandas                        1.4.2
(reco_gpu) u@unicorn:~/MS/recommenders$ pytest tests/unit/recommenders/models/test_deeprec_model.py::test_xdeepfm_component_definition --disable-warnings
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.8.13, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/u/MS/recommenders, configfile: pyproject.toml
plugins: hypothesis-6.46.3, rerunfailures-10.2, cov-3.0.0, mock-3.7.0
collected 1 item

tests/unit/recommenders/models/test_deeprec_model.py .                                                                                                              [100%]

===================================================================== 1 passed, 11 warnings in 14.74s =====================================================================
(reco_gpu) u@unicorn:~/MS/recommenders$ pip install pandas --upgrade
(reco_gpu) u@unicorn:~/MS/recommenders$ pip list | grep pandas
pandas                        2.0.3
(reco_gpu) u@unicorn:~/MS/recommenders$ pytest tests/unit/recommenders/models/test_deeprec_model.py::test_xdeepfm_component_definition --disable-warnings
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.8.13, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/u/MS/recommenders, configfile: pyproject.toml
plugins: hypothesis-6.46.3, rerunfailures-10.2, cov-3.0.0, mock-3.7.0
collected 0 items / 1 error

================================================================================= ERRORS ==================================================================================
__________________________________________________ ERROR collecting tests/unit/recommenders/models/test_deeprec_model.py __________________________________________________
tests/unit/recommenders/models/test_deeprec_model.py:7: in <module>
    from recommenders.datasets import movielens
recommenders/datasets/movielens.py:36: in <module>
    import pandera as pa
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/pandera/__init__.py:48: in <module>
    from . import errors, pandas_accessor, typing
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/pandera/typing/__init__.py:9: in <module>
    from . import dask, fastapi, geopandas, modin, pyspark
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/pandera/typing/dask.py:9: in <module>
    import dask.dataframe as dd
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/__init__.py:3: in <module>
    from dask.dataframe import backends, dispatch, rolling
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/backends.py:20: in <module>
    from dask.dataframe.core import DataFrame, Index, Scalar, Series, _Frame
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/core.py:29: in <module>
    from dask.dataframe import methods
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/methods.py:21: in <module>
    from dask.dataframe.utils import is_dataframe_like, is_index_like, is_series_like
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/utils.py:18: in <module>
    from dask.dataframe import (  # noqa: F401 register pandas extension types
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/_dtypes.py:3: in <module>
    from dask.dataframe.extensions import make_array_nonempty, make_scalar
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/extensions.py:6: in <module>
    from dask.dataframe.accessor import (
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/accessor.py:190: in <module>
    class StringAccessor(Accessor):
../../anaconda/envs/reco_gpu/lib/python3.8/site-packages/dask/dataframe/accessor.py:275: in StringAccessor
    @derived_from(pd.core.strings.StringMethods)
E   AttributeError: module 'pandas.core.strings' has no attribute 'StringMethods'
========================================================================= short test summary info =========================================================================
ERROR tests/unit/recommenders/models/test_deeprec_model.py - AttributeError: module 'pandas.core.strings' has no attribute 'StringMethods'
============================================================================ 1 error in 1.36s =============================================================================
ERROR: not found: /home/u/MS/recommenders/tests/unit/recommenders/models/test_deeprec_model.py::test_xdeepfm_component_definition
(no name '/home/u/MS/recommenders/tests/unit/recommenders/models/test_deeprec_model.py::test_xdeepfm_component_definition' in any of [<Module test_deeprec_model.py>])

(reco_gpu) u@unicorn:~/MS/recommenders$

@SimonYansenZhao @anargyri it seems that with Python 3.8 we need pandas<2, while with 3.9 we can have pandas>1.5.

Trying different pandas versions with different Python versions -> https://github.com/recommenders-team/recommenders/actions/runs/6129325061
[UPDATE] In the logs for Python 3.8 I see pandas 1.5.3 and it fails.
Locally it works:

(reco_gpu) u@unicorn:~/MS/recommenders$ python --version
Python 3.8.13
(reco_gpu) u@unicorn:~/MS/recommenders$ pip list | grep pandas
pandas                        1.5.3
(reco_gpu) u@unicorn:~/MS/recommenders$ pytest tests/unit/recommenders/models/test_deeprec_model.py::test_xdeepfm_component_definition --disable-warnings
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.8.13, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/u/MS/recommenders, configfile: pyproject.toml
plugins: hypothesis-6.46.3, rerunfailures-10.2, cov-3.0.0, mock-3.7.0
collected 1 item

tests/unit/recommenders/models/test_deeprec_model.py .                                                                                                              [100%]

===================================================================== 1 passed, 11 warnings in 6.88s ======================================================================
(reco_gpu) u@unicorn:~/MS/recommenders$

So it seems that the error doesn't come from pandas.

@SimonYansenZhao have you been able to replicate the error NameError: name 'XDeepFMModel' is not defined locally?

@miguelgfierro
Collaborator

I looked at LightGBM https://github.com/microsoft/LightGBM, and they still support Python 3.7. Wouldn't it be easier if we kept 3.7?

Thoughts @SimonYansenZhao @anargyri

@SimonYansenZhao
Collaborator Author

@miguelgfierro I cannot replicate the error NameError: name 'XDeepFMModel' is not defined, which is why I am always waiting for the test results in DevOps.

setup.py Outdated
"Operating System :: POSIX :: Linux",
],
extras_require=extras_require,
keywords="recommendations recommendation recommenders recommender system engine "
"machine learning python spark gpu",
install_requires=install_requires,
package_dir={"recommenders": "recommenders"},
python_requires=">=3.6, <3.10",
python_requires=">=3.8, <=3.9",
Collaborator

Suggested change
python_requires=">=3.8, <=3.9",
python_requires=">=3.6",

@SimonYansenZhao we had a discussion in the reco meeting. There are a couple of things we are proposing:

  1. Leave python_requires at >=3.6 so people can install recommenders with any version of Python; right now, for example, people can't install the PyPI package on 3.10. Then we say that we officially support only 3.8-3.10. That way we avoid issues like [ASK] No module named 'recommenders' after installing recommenders in Colab #1928, where people can't install recommenders on versions not covered by the requires, while we officially test only the versions we support.
  2. Maybe it would be easier to split this PR into two: first remove the support for 3.7, then add 3.10 in another PR.

What do you think?

Collaborator Author


OK.

@SimonYansenZhao SimonYansenZhao changed the title Remove support for Python 3.7, add Python 3.10 and 3.11 Remove support for Python 3.7 Sep 12, 2023
Signed-off-by: Simon Zhao <simonyansenzhao@gmail.com>

is_sequence() is deleted in TensorFlow 2.13.0. See
https://github.com/tensorflow/tensorflow/releases/tag/v2.13.0

…der.tokenize_text()
@SimonYansenZhao
Collaborator Author

@miguelgfierro @anargyri I found that ./tools/generate_conda_file.py is out of sync with setup.py. Is it still needed?

As discussed in #1988 (comment), I only dropped the support for Python 3.7 in this PR, and will create another PR for adding support for 3.10.

@miguelgfierro
Collaborator

@miguelgfierro @anargyri I found that ./tools/generate_conda_file.py is out of sync with setup.py. Is it still needed?

No, we can drop it

Collaborator

@miguelgfierro miguelgfierro left a comment


I think we should change to python_requires=">=3.6", as per this comment: #1988 (comment); or if we don't like 3.6, we can do 3.7+.

Apart from that, this is great

@miguelgfierro miguelgfierro merged commit 8d67ef5 into staging Sep 15, 2023
20 checks passed
@miguelgfierro miguelgfierro deleted the simonz-dep-upgrade-20230905 branch September 15, 2023 11:51