QST: pandas.DataFrame() converts pyarrow.array() to numpy series #54057

Open
2 tasks done
yoonghm opened this issue Jul 9, 2023 · 4 comments
Labels
Arrow (pyarrow functionality), Bug, Constructors (Series/DataFrame/Index/pd.array Constructors), Dtype Conversions (Unexpected or buggy dtype conversions)

Comments

@yoonghm

yoonghm commented Jul 9, 2023

Research

  • I have searched the [pandas] tag on StackOverflow for similar questions.

  • I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

https://stackoverflow.com/questions/76648782/pandas-dataframe-converts-pyarrow-array-to-numpy-series

Question about pandas

No response

yoonghm added the Needs Triage (Issue that has not been reviewed by a pandas team member) and Usage Question labels on Jul 9, 2023
@btparrish

From Stack Overflow: As of pandas 2.0.x, the pandas constructors do not recognize pyarrow objects. To get a pyarrow dtype, you'll need to pass dtype="string[pyarrow]". I expect this will change in an upcoming pandas version.

@lithomas1
Member

I think this should work, and is a bug.

We should be preserving pyarrow dtypes if they are passed in.

cc @phofl

lithomas1 added the Dtype Conversions, Constructors, and Arrow labels on Jul 10, 2023
@mroeschke
Member

Just noting that the currently supported way to make this work is to pass your pyarrow objects to pd.arrays.ArrowExtensionArray: https://pandas.pydata.org/docs/user_guide/pyarrow.html#data-structure-integration

@jbrockmendel
Member

The solution here is to add a check for lib.is_pyarrow_array in sanitize_array. The difficult part is ensuring that we find all the other places that may need the same check (off the top of my head, pd.array).

lithomas1 added the Bug label and removed the Usage Question and Needs Triage labels on Jul 27, 2023