Add missing value handling options #58

Open
sandorkertesz opened this issue Jun 14, 2023 · 6 comments
@sandorkertesz
Collaborator

sandorkertesz commented Jun 14, 2023

Currently, read_bufr offers no control over missing values during extraction, so we have to filter the resulting pandas DataFrame afterwards to remove them.
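
For context, the post-filtering workaround described above might look like the following minimal sketch. The DataFrame here is a hand-built stand-in for the output of read_bufr (the column names and values are invented for illustration), and it assumes missing values show up as NaN in the result.

```python
import numpy as np
import pandas as pd

# Stand-in for the result of pdbufr.read_bufr(...).
# Column names and values are illustrative assumptions, not real BUFR output.
df = pd.DataFrame(
    {
        "latitude": [47.5, 48.1, 46.9],
        "longitude": [19.0, 20.3, 18.2],
        "airTemperature": [281.4, np.nan, 279.9],
    }
)

# Workaround: drop every row that contains at least one missing value.
filtered = df.dropna().reset_index(drop=True)
print(filtered)
```

Any of the options below would move this step into the extraction itself, so the caller would no longer need the extra dropna() call.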

Option 1

Add option missing_value_policy with the following values: "include", "ignore" (default="include")

df = pdbufr.read_bufr(...., missing_value_policy="ignore")

Option 2

Add option skip_missing as a bool (default=False)

df = pdbufr.read_bufr(...., skip_missing=True)

Option 3

Add option skip_na_values as a bool (default=False)

df = pdbufr.read_bufr(...., skip_na_values=True)

@sandorkertesz added the enhancement (New feature or request) label on Jun 14, 2023
@tlmquintino
Member

tlmquintino commented Jun 14, 2023

How about:
df = pdbufr.read_bufr(...., missing_values="ignore")

I think we should express these as policies (identified by strings), but the key does not need the verbose _policy suffix.
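
To make the string-policy idea concrete, here is a rough sketch of how such a keyword could be validated and applied. The helper apply_missing_values is hypothetical and not part of pdbufr; it works on an already-extracted DataFrame, and it assumes "ignore" means dropping rows that contain any missing value, which is only one possible interpretation.

```python
import numpy as np
import pandas as pd

# Hypothetical set of policy names, taken from the values proposed in this issue.
POLICIES = ("include", "ignore")


def apply_missing_values(df: pd.DataFrame, missing_values: str = "include") -> pd.DataFrame:
    """Hypothetical helper: apply a string-identified missing value policy."""
    if missing_values not in POLICIES:
        raise ValueError(
            f"missing_values must be one of {POLICIES}, got {missing_values!r}"
        )
    if missing_values == "ignore":
        # Assumption: "ignore" drops rows with at least one missing value.
        return df.dropna()
    # "include": leave the data untouched.
    return df


# Illustrative data only.
sample = pd.DataFrame({"a": [1.0, np.nan], "b": [2.0, 3.0]})
kept = apply_missing_values(sample, missing_values="include")
dropped = apply_missing_values(sample, missing_values="ignore")
```

One argument for strings over a bool is that further policies could be added later without changing the keyword.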

@sandorkertesz
Collaborator Author

Yes, the shorter the better!

@pmaciel
Member

pmaciel commented Jun 14, 2023

Maybe missing_values=None?

@iainrussell
Member

In the case where the user wants to extract five variables from the data and just one of them is missing for a given row, would missing_values="ignore" remove the row, or would it only remove the row if all five variables are missing? And so, do we need an option to disambiguate this?
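
The two readings of "ignore" can be illustrated with pandas' own dropna(how=...) parameter. This is only a sketch on made-up data with five hypothetical variables; it says nothing about how read_bufr would actually implement either behaviour.

```python
import numpy as np
import pandas as pd

# Five requested variables; names and values are illustrative only.
cols = ["v1", "v2", "v3", "v4", "v5"]
df = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, 4.0, 5.0],     # row 0: complete
        [1.0, np.nan, 3.0, 4.0, 5.0],  # row 1: one of five missing
        [np.nan] * 5,                  # row 2: all five missing
    ],
    columns=cols,
)

# Reading 1: remove a row as soon as any variable is missing.
drop_if_any = df.dropna(how="any")

# Reading 2: remove a row only when every variable is missing.
drop_if_all = df.dropna(how="all")

print(list(drop_if_any.index))
print(list(drop_if_all.index))
```

Under the first reading only row 0 survives; under the second, rows 0 and 1 do, so the choice changes the result whenever a row is partially missing.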

@sandorkertesz
Collaborator Author

> Maybe missing_values=None?

What other values than None could we specify for missing_values? What would they mean?

@shahramn

I prefer option 2
