Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ensemble name filter to get_units_from_file_object #169

Merged
merged 5 commits into from
Nov 2, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion ce/api/data.py
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@ def get_time_value(timeset, time_idx):
run_result = result[data_file_variable.file.run.name] = {
"data": {},
"units": get_units_from_run_object(
data_file_variable.file.run, variable
data_file_variable.file.run, variable, ensemble_name
),
"modtime": data_file_variable.file.index_time,
}
Expand Down
21 changes: 13 additions & 8 deletions ce/api/util.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,12 +13,17 @@
from ce.api.geo import wkt_to_masked_array


def get_files_from_run_variable(run, variable):
return [
file_
for file_ in run.files
if variable in [dfv.netcdf_variable_name for dfv in file_.data_file_variables]
]
def get_files_from_run_variable(run, variable, ensemble_name):
files = []
for file_ in run.files:
for dfv in file_.data_file_variables:
for ensemble in dfv.ensembles:
if (
variable == dfv.netcdf_variable_name
and ensemble.name == ensemble_name
):
files.append(file_)
return files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a tighter and more efficient way of accomplishing this:

return list(filter(
    lambda file_: any(
        variable == dfv.netcdf_variable_name and 
            any(ensemble.name == ensemble_name for ensemble in dfv.ensembles)
        for dfv in file_.data_file_variables
    ),
    run.files
))

Generally speaking, as in R, loops are a code smell, or at least a reason to look at alternatives. Almost always there is a comprehension or a filter expression that will do the same job as a something that builds up a result by weeding through another iterable.

You don't have to wrap the filter in list unless the result must be a list instead of a generic iterable.

Not sure if Black will agree with my spacing there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an alternative that uses a comprehension instead of a filter. It may be easier to read:

return (
    file_ for file_ in run.files if any(...)
)

The parentheses (return (...)) make this a generator. If you used brackets (return [...]) it would be a list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are pretty much identical formulations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the advice! I have just been working on this when my other projects are slow.



def get_units_from_netcdf_file(nc, variable):
Expand All @@ -34,8 +39,8 @@ def get_units_from_file_object(file_, varname):
)


def get_units_from_run_object(run, varname):
files = get_files_from_run_variable(run, varname)
def get_units_from_run_object(run, varname, ensemble_name):
files = get_files_from_run_variable(run, varname, ensemble_name)
units = {get_units_from_file_object(file_, varname) for file_ in files}

if len(units) != 1:
Expand Down