Add section on variable naming #7

Merged
merged 45 commits into from
Jul 3, 2024
Changes from 18 commits
9b9e204
refactor slides to break into separate sections.
jatkinson1000 Jun 18, 2024
9ec93cf
add slide on naming standards
AmyOctoCat Jun 19, 2024
4fb12be
add warning about the use of f strings in logging statements
AmyOctoCat Jun 19, 2024
1c2418f
Adding instructions for naming part of exercise.
AmyOctoCat Jun 19, 2024
7d7aed6
changing some of the naming in the final version of precipitation_cli…
AmyOctoCat Jun 19, 2024
a07e7c4
hopefully made more readable
AmyOctoCat Jun 19, 2024
d273ee7
naming tweeks
AmyOctoCat Jun 19, 2024
8383480
revert changes to pluralise name for array. Not sure what best practi…
AmyOctoCat Jun 19, 2024
99042f7
add a line about boolean naming
AmyOctoCat Jun 20, 2024
1083b88
add a line about boolean naming
AmyOctoCat Jun 20, 2024
3ef0967
formatting
AmyOctoCat Jun 20, 2024
9652abe
Update exercises/00_final/precipitation_climatology.py
AmyOctoCat Jun 20, 2024
c37087c
fix excpetion raising bug introduced in this branch
AmyOctoCat Jun 20, 2024
97cd877
Merge branch 'add_section_on_variable_naming' of github.com:Cambridge…
AmyOctoCat Jun 20, 2024
6752f63
grammar in slide.
AmyOctoCat Jun 20, 2024
3c87e6f
pull naming into it's own section
AmyOctoCat Jun 20, 2024
e88749e
resolve merge conflict
AmyOctoCat Jun 20, 2024
2d2ace5
return matplotlib import to the standard plt and add small fix
AmyOctoCat Jun 20, 2024
6a8fdb0
fix merge conflicts
AmyOctoCat Jun 20, 2024
694c8fe
further renaming and some additional documentation
AmyOctoCat Jun 26, 2024
fa01f3b
further renaming
AmyOctoCat Jun 26, 2024
93caff0
further naming changes
AmyOctoCat Jun 26, 2024
e4c2fd5
further naming changes
AmyOctoCat Jun 26, 2024
09c52cc
revert naming of columns in the netcdf as editing netcdf file is too …
AmyOctoCat Jun 26, 2024
67cb57e
remove comment as have confirmed that this hasn't introduced a runtim…
AmyOctoCat Jun 26, 2024
a55b60c
add some examples into the slides
AmyOctoCat Jun 26, 2024
b60049f
add to example slide
AmyOctoCat Jun 26, 2024
471dfec
finish renaming in exercise 5
AmyOctoCat Jun 26, 2024
40c5912
renumber exercises
AmyOctoCat Jun 26, 2024
c4e37c6
include the naming slides in the main quarto file
AmyOctoCat Jun 26, 2024
7b694b1
add base code for exercise on naming
AmyOctoCat Jun 26, 2024
6710ba1
update naming in exercise 4
AmyOctoCat Jun 26, 2024
e6168aa
update exercise 4 for renaming
AmyOctoCat Jun 26, 2024
0fc9ec8
run black in all the exercises after black
AmyOctoCat Jun 26, 2024
f6ef57b
modified the wrong exercise
AmyOctoCat Jun 26, 2024
c71c31b
missed file naming
AmyOctoCat Jun 26, 2024
28e9359
renumber exercises in slides and a dd a bit of extra detail
AmyOctoCat Jun 27, 2024
fe0c39e
reformatting and splitting black and pylint sections
AmyOctoCat Jun 27, 2024
61e5ced
Update exercises 1 and 2 with blank lines to match changes to later e…
jatkinson1000 Jun 27, 2024
cf759c8
remove redundant use of xr.DataArray wrapping around array multiplica…
AmyOctoCat Jul 3, 2024
99ef3a9
change font size in slides
AmyOctoCat Jul 3, 2024
8e478f2
remove remaining uses of assert in production code
AmyOctoCat Jul 3, 2024
f9a016d
formatting changes to slides on naming
AmyOctoCat Jul 3, 2024
50e4972
Merge pull request #13 from Cambridge-ICCS/jatkinson1000-naming-patch
AmyOctoCat Jul 3, 2024
6539266
Minor typographical updates.
jatkinson1000 Jul 3, 2024
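
One commit above adds a warning about f-strings in logging statements. The slide itself is not part of this diff, so the following is only a sketch of the usual reasoning: an f-string is formatted eagerly, even when the log level means the message is discarded, whereas `%s`-style arguments are interpolated by `logging` only when the record is actually emitted. The logger name and messages are hypothetical.

```python
import io
import logging

# Hypothetical logger and messages; not taken from the course code.
log_stream = io.StringIO()
logger = logging.getLogger("precipitation_example")
logger.setLevel(logging.WARNING)
logger.addHandler(logging.StreamHandler(log_stream))
logger.propagate = False

season = "DJF"

# Eager: the f-string is built even though DEBUG records are discarded here.
logger.debug(f"Plotting season {season}")

# Lazy: the arguments are interpolated only if the record is emitted.
logger.debug("Plotting season %s", season)
logger.warning("Unknown season %s, falling back to DJF", "XYZ")
```

Only the warning reaches the stream, because the two debug calls fall below the configured level.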
183 changes: 88 additions & 95 deletions exercises/00_final/precipitation_climatology.py
@@ -12,34 +12,39 @@
import regionmask


def convert_pr_units(darray):
def convert_precipitation_units(precipitation_in_kg_per_m_squared_s):
"""
Convert precipitation units from [kg m-2 s-1] to [mm day-1].

Parameters
----------
darray : xarray.DataArray
precipitation_in_kg_per_m_squared_s : xarray.DataArray
xarray DataArray containing model precipitation data

Returns
-------
darray : xarray.DataArray
precipitation_in_mm_per_day : xarray.DataArray
the input DataArray with precipitation units modified
"""
# density 1000 kg m-3 => 1 kg m-2 == 1 mm
# There are 60*60*24 = 86400 seconds per day
darray.data = darray.data * 86400
darray.attrs["units"] = "mm/day"
precipitation_in_mm_per_day = xr.DataArray(precipitation_in_kg_per_m_squared_s * 86400)

assert (
darray.data.min() >= 0.0
), "There is at least one negative precipitation value"
assert darray.data.max() < 2000, "There is a precipitation value/s > 2000 mm/day"
precipitation_in_mm_per_day.attrs["units"] = "mm/day"

return darray
if precipitation_in_mm_per_day.data.min() < 0.0:
raise ValueError("There is at least one negative precipitation value")
if precipitation_in_mm_per_day.data.max() > 2000:
raise ValueError("There is a precipitation value/s > 2000 mm/day")

return precipitation_in_mm_per_day
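
The change above swaps `assert` statements for explicit `ValueError`s, in line with the commit "remove remaining uses of assert in production code". A likely motivation (an inference, not stated in the diff) is that `assert` statements are skipped entirely when Python runs with `-O`, so they cannot be relied on for data validation. A minimal sketch of the same pattern follows, using plain lists instead of `xarray.DataArray` so that it runs without third-party packages; `convert_precipitation_values` and `SECONDS_PER_DAY` are hypothetical names.

```python
SECONDS_PER_DAY = 60 * 60 * 24  # 86400


def convert_precipitation_values(precipitation_in_kg_per_m_squared_s):
    """Convert a list of values from [kg m-2 s-1] to [mm day-1]."""
    precipitation_in_mm_per_day = [
        value * SECONDS_PER_DAY for value in precipitation_in_kg_per_m_squared_s
    ]

    # Explicit checks still run under `python -O`, unlike assert statements.
    if min(precipitation_in_mm_per_day) < 0.0:
        raise ValueError("There is at least one negative precipitation value")
    if max(precipitation_in_mm_per_day) > 2000:
        raise ValueError("There is at least one precipitation value > 2000 mm/day")

    return precipitation_in_mm_per_day
```

Callers can then catch `ValueError` specifically, which is harder to do cleanly with `AssertionError`.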

def plot_zonal(data):

# I think it would be good to give more detail here about what the dimensions and
# content of the input array need to be, so the reader doesn't have to go through
# the code and infer them before using the function. Would it also be possible to give
# the input variable a more specific name?
def plot_zonally_averaged_precipitation(data):
"""
Plot zonally-averaged precipitation data and save to file.

@@ -53,29 +58,30 @@ def plot_zonal(data):
None

"""
zonal_pr = data["pr"].mean("lon", keep_attrs=True)
zonal_precipitation = data["precipitation"].mean("longitude", keep_attrs=True)

fig, ax = plt.subplots(nrows=4, ncols=1, figsize=(12, 8))
figure, axes = plt.subplots(nrows=4, ncols=1, figsize=(12, 8))

zonal_pr.sel(lat=[0]).plot.line(ax=ax[0], hue="lat")
zonal_pr.sel(lat=[-20, 20]).plot.line(ax=ax[1], hue="lat")
zonal_pr.sel(lat=[-45, 45]).plot.line(ax=ax[2], hue="lat")
zonal_pr.sel(lat=[-70, 70]).plot.line(ax=ax[3], hue="lat")
zonal_precipitation.sel(lat=[0]).plot.line(ax=axes[0], hue="latitude")
zonal_precipitation.sel(lat=[-20, 20]).plot.line(ax=axes[1], hue="latitude")
zonal_precipitation.sel(lat=[-45, 45]).plot.line(ax=axes[2], hue="latitude")
zonal_precipitation.sel(lat=[-70, 70]).plot.line(ax=axes[3], hue="latitude")

plt.tight_layout()
for axis in ax:
for axis in axes:
axis.set_ylim(0.0, 1.0e-4)
axis.grid()
plt.savefig("zonal.png", dpi=200) # Save figure to file

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(12, 5))
figure, axes = plt.subplots(nrows=1, ncols=1, figsize=(12, 5))

zonal_pr.T.plot()
zonal_precipitation.T.plot()

plt.savefig("zonal_map.png", dpi=200) # Save figure to file


def get_country_ann_avg(data, countries):
# could data be named more specifically and more detail be given in the docstring about
# what the dimension and contents of the array are?
def get_country_annual_average(data, countries):
"""
Calculate annual precipitation averages for countries and save to file.

@@ -92,44 +98,30 @@ def get_country_ann_avg(data, countries):
None

"""
data_avg = data["pr"].groupby("time.year").mean("time", keep_attrs=True)
data_avg = convert_pr_units(data_avg)
data_average = data["precipitation"].groupby("time.year").mean("time",
keep_attrs=True)
data_average = convert_precipitation_units(data_average)

land = regionmask.defined_regions.natural_earth_v5_0_0.countries_110.mask(data_avg)
# would it be possible to give land a more specific name?
land = (regionmask.
defined_regions.
natural_earth_v5_0_0
.countries_110.mask(data_average))

with open("data.txt", "w", encoding="utf-8") as datafile:
# could k and v be named more specifically?
for k, v in countries.items():
data_avg_mask = data_avg.where(land.cf == v)

# Debugging - plot countries to make sure mask works correctly
# fig, geo_axes = plt.subplots(nrows=1, ncols=1, figsize=(12,5),
# subplot_kw={'projection': ccrs.PlateCarree(central_longitude=180)})
# data_avg_mask.sel(year = 2010).plot.contourf(ax=geo_axes,
# extend='max',
# transform=ccrs.PlateCarree(),
# cbar_kwargs={'label': data_avg_mask.units},
# cmap=cmocean.cm.haline_r)
# geo_axes.add_feature(cfeature.COASTLINE, lw=0.5)
# gl = geo_axes.gridlines(crs=ccrs.PlateCarree(), draw_labels=True,
# linewidth=2, color='gray', alpha=0.5, linestyle='--')
# gl.top_labels = False
# gl.left_labels = True
# gl.xlocator = mticker.FixedLocator([-180, -90, 0, 90])
# gl.ylocator = mticker.FixedLocator([-66, -23, 0, 23, 66])
# gl.xformatter = LONGITUDE_FORMATTER
# gl.yformatter = LATITUDE_FORMATTER
# gl.xlabel_style = {'size': 15, 'color': 'gray'}
# gl.ylabel_style = {'size': 15, 'color': 'gray'}
# print("show %s" %k)
# plt.show()

for yr in data_avg_mask.year.values:
precip = data_avg_mask.sel(year=yr).mean().values
datafile.write(f"{k.ljust(25)} {yr} : {precip:2.3f} mm/day\n")
data_avg_mask = data_average.where(land.cf == v)

for year in data_avg_mask.year.values:
precipitation = data_avg_mask.sel(year=year).mean().values
datafile.write(f"{k.ljust(25)} {year} : {precipitation:2.3f} mm/day\n")
datafile.write("\n")
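
The inline comment above asks whether `k` and `v` could be named more specifically. One possible answer is sketched below. The `countries` dictionary has the same shape as the default in `main()`, while `annual_means` and the `io.StringIO` buffer are hypothetical stand-ins for the masked xarray data and the open `data.txt` file.

```python
import io

countries = {"United Kingdom": "GB"}  # same shape as the default in main()

# Hypothetical annual means in mm/day, keyed by country code and then year.
annual_means = {"GB": {2010: 3.1234, 2011: 2.9}}

output = io.StringIO()  # stands in for the open "data.txt" file
for country_name, country_code in countries.items():
    for year, precipitation in annual_means[country_code].items():
        output.write(f"{country_name.ljust(25)} {year} : {precipitation:2.3f} mm/day\n")
    output.write("\n")
```

With `country_name` and `country_code` in place of `k` and `v`, the loop header documents what the dictionary holds without a separate comment.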


def plot_enso(data):
# could data be named more specifically and more detail be given in the docstring about
# what the dimension and contents of the array are?
def plot_enso_hovmoller_diagram(data):
"""
Plot Hovmöller diagram of equatorial precipitation to visualise ENSO.

@@ -144,31 +136,32 @@ def plot_enso(data):

"""
enso = (
data["pr"]
data["precipitation"]
.sel(lat=slice(-1, 1))
.sel(lon=slice(120, 280))
.mean(dim="lat", keep_attrs=True)
.mean(dim="latitude", keep_attrs=True)
)

enso.plot()
plt.savefig("enso.png", dpi=200) # Save figure to file


def create_plot(clim, model, season, mask=None, gridlines=False, levels=None):
def create_precipitation_climatology_plot(climatology_data, model_name, season, mask=None, plot_gridlines=False, levels=None):
"""
Plot the precipitation climatology.

Parameters
----------
clim : xarray.DataArray
climatology_data : xarray.DataArray
Precipitation climatology data
model : str
model_name : str
Name of the climate model
season : str
Climatological season (one of DJF, MAM, JJA, SON)
mask : optional str
mask to apply to plot (one of "land" or "ocean")
gridlines : bool
plot_gridlines : bool

Select whether to plot gridlines
levels : list
Tick mark values for the colorbar
@@ -188,12 +181,12 @@ def create_plot(clim, model, season, mask=None, gridlines=False, levels=None):
subplot_kw={"projection": ccrs.PlateCarree(central_longitude=180)},
)

clim.sel(season=season).plot.contourf(
climatology_data.sel(season=season).plot.contourf(
ax=geo_axes,
levels=levels,
extend="max",
transform=ccrs.PlateCarree(),
cbar_kwargs={"label": clim.units},
cbar_kwargs={"label": climatology_data.units},
cmap=cmocean.cm.rain,
)

@@ -211,52 +204,52 @@ def create_plot(clim, model, season, mask=None, gridlines=False, levels=None):
alpha=0.75,
)

if gridlines:
gl = geo_axes.gridlines(
if plot_gridlines:
gridlines = geo_axes.gridlines(
crs=ccrs.PlateCarree(),
draw_labels=True,
linewidth=2,
color="gray",
alpha=0.5,
linestyle="--",
)
gl.top_labels = False
gl.left_labels = True
gridlines.top_labels = False
gridlines.left_labels = True
# gl.xlines = False
gl.xlocator = mticker.FixedLocator([-180, -90, 0, 90, 180])
gl.ylocator = mticker.FixedLocator(
gridlines.xlocator = mticker.FixedLocator([-180, -90, 0, 90, 180])
gridlines.ylocator = mticker.FixedLocator(
[-66, -23, 0, 23, 66]
) # Tropics & Polar Circles
gl.xformatter = LONGITUDE_FORMATTER
gl.yformatter = LATITUDE_FORMATTER
gl.xlabel_style = {"size": 15, "color": "gray"}
gl.ylabel_style = {"size": 15, "color": "gray"}
gridlines.xformatter = LONGITUDE_FORMATTER
gridlines.yformatter = LATITUDE_FORMATTER
gridlines.xlabel_style = {"size": 15, "color": "gray"}
gridlines.ylabel_style = {"size": 15, "color": "gray"}

title = f"{model} precipitation climatology ({season})"
title = f"{model_name} precipitation climatology ({season})"
plt.title(title)


def main(
pr_file,
season="DJF",
output_file="output.png",
gridlines=False,
mask=None,
cbar_levels=None,
countries=None,
precipitation_netcdf_file,
season="DJF",
output_file="output.png",
plot_gridlines=False,
mask=None,
cbar_levels=None,
countries=None,
):
"""
Run the program for producing precipitation plots.

Parameters
----------
pr_file : str
precipitation_netcdf_file : str
netCDF filename to read precipitation data from
season : optional str
Climatological season (one of DJF, MAM, JJA, SON)
output_file : optional str
filename to save main image to
gridlines : optional bool
plot_gridlines : optional bool
Select whether to plot gridlines
mask : optional str
mask to apply to plot (one of "land" or "ocean")
@@ -274,34 +267,34 @@ def main(
if countries is None:
countries = {"United Kingdom": "GB"}

dset = xr.open_dataset(pr_file)
input_data = xr.open_dataset(precipitation_netcdf_file)

plot_zonal(dset)
plot_enso(dset)
get_country_ann_avg(dset, countries)
plot_zonally_averaged_precipitation(input_data)
plot_enso_hovmoller_diagram(input_data)
get_country_annual_average(input_data, countries)

clim = dset["pr"].groupby("time.season").mean("time", keep_attrs=True)
climatology = input_data["precipitation"].groupby("time.season").mean("time", keep_attrs=True)

try:
input_units = clim.attrs["units"]
input_units = climatology.attrs["units"]
except KeyError as exc:
raise KeyError(
f"Precipitation variable in {precipitation_netcdf_file} must have a units attribute"
) from exc

if input_units == "kg m-2 s-1":
clim = convert_pr_units(clim)
climatology = convert_precipitation_units(climatology)
elif input_units == "mm/day":
pass
else:
raise ValueError("""Input units are not 'kg m-2 s-1' or 'mm/day'""")

create_plot(
clim,
dset.attrs["source_id"],
create_precipitation_climatology_plot(
climatology,
input_data.attrs["source_id"],
season,
mask=mask,
gridlines=gridlines,
plot_gridlines=plot_gridlines,
levels=cbar_levels,
)

@@ -316,8 +309,8 @@ def main(
config_name = "default_config"
else:
print(f"Using configuration in '{config_name}.json'.")
configfile = config_name + ".json"
with open(configfile, encoding="utf-8") as json_file:
config_file = config_name + ".json"
with open(config_file, encoding="utf-8") as json_file:
config = json.load(json_file)

output_filename = f"{config_name}_output.png"
@@ -329,6 +322,6 @@ def main(
season=config["season_to_plot"],
output_file=output_filename,
mask=config["mask_id"],
gridlines=config["gridlines_on"],
plot_gridlines=config["gridlines_on"],
countries=config["countries_to_record"],
)
7 changes: 7 additions & 0 deletions exercises/05_better_code/README.md
@@ -18,6 +18,13 @@ Is the intent clearer?\
Is the layout of the data written to file easier to understand?


## Naming

Look through the code for any function or variable names that could be improved or
clarified, and update them. Note that if you are using an IDE, you can use its
automatic renaming tool. Does this make the code easier to follow?
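
As a small illustration of this exercise, the sketch below applies the same kind of renaming as this pull request (`model` to `model_name`, `gridlines` to `plot_gridlines`) to a toy function. Both functions are hypothetical and are not part of the exercise code; they only show that behaviour is unchanged while the intent becomes easier to read.

```python
# Before: "gridlines" could be a flag, a style, or the gridline objects themselves.
def describe_plot_before(model, season, gridlines=False):
    title = f"{model} precipitation climatology ({season})"
    if gridlines:
        title += " with gridlines"
    return title


# After: the boolean reads as a yes/no question, and "model_name" says it is a string.
def describe_plot(model_name, season, plot_gridlines=False):
    title = f"{model_name} precipitation climatology ({season})"
    if plot_gridlines:
        title += " with gridlines"
    return title
```

Callers that pass the flag by keyword must be updated at the same time as the definition, which is why an IDE's rename tool is a safer way to make the change than search and replace.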


## Configuration settings

The original author of the code has helpfully put a list of the configurable