Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate CESM ponds (tr_pond_cesm) #733

Merged
merged 3 commits into from
Jul 31, 2022

Conversation

dabail10
Copy link
Contributor

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    Deprecate the tr_ponds_cesm option.
  • Developer(s):
    dabail10 (D. Bailey)
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    Did a full test suite compared against the weekly baselines.
    5945 measured results of 5945 total results
    5856 of 5945 tests PASSED
    0 of 5945 tests PENDING
    0 of 5945 tests MISSING data
    89 of 5945 tests FAILED

Of the 89 tests, the majority of the changed answers were alt05 related which had tr_pond_cesm = .true.

The remaining test failures were as follows. I'm not sure why.

FAIL cheyenne_intel_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_intel_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data
FAIL cheyenne_pgi_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_pgi_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data
FAIL cheyenne_pgi_smoke_gx1_144x2_gx1prod_long_run10year run -1 -1 -1
FAIL cheyenne_pgi_smoke_gx1_144x2_gx1prod_long_run10year test
FAIL cheyenne_gnu_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data

  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
      Removes all tests with tr_ponds_cesm = .true.
  • Does this PR create or have dependencies on Icepack or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:
    There is an option UNDEPRECATE_CESMPONDS to reverse this. It also requires adding tr_pond_cesm and restart_pond_cesm back to ice_in.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

elseif (trcr_depend(it) == 2+nt_apnd .and. &
tr_pond_cesm .or. tr_pond_topo) then
#else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting. I wonder whether the .and. or the .or. took precedence in the logic for this elseif statement.
i.e.
(trcr_depend(it)==2+nt_apnd .and. tr_pond_cesm) .or. tr_pond_topo
or
trcr_depend(it)==2+nt_apnd .and. (tr_pond_cesm .or. tr_pond_topo)
?
The latter is correct, but what was the code actually doing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/language-reference/expressions-and-assignment-statements/expressions/summary-of-operator-precedence.html the .and. would have precedence on the .or. so the code would do the first thing you listed:

(trcr_depend(it)==2+nt_apnd .and. tr_pond_cesm) .or. tr_pond_topo

i.e. the wrong thing if I understand correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This line is in the state_to_work routine, which is only used for upwind advection. Tracers may not work properly with upwind advection -- see #267 and #542. This faulty logic could be part of the reason why, but it probably doesn't fix the entire issue, which is that many tracers are not implemented at all for upwind.

@eclare108213
Copy link
Contributor

I approved this PR, but looking a little more closely at the test results noted above, I think it would be good to understand why the failures are happening, at least for the dynpicard / reprosum option. These changes shouldn't affect those tests, should they?

@phil-blain
Copy link
Member

Here are the failures quoted above, one per line (easier to read):

FAIL cheyenne_intel_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_intel_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data
FAIL cheyenne_pgi_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_pgi_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data
FAIL cheyenne_pgi_smoke_gx1_144x2_gx1prod_long_run10year run -1 -1 -1
FAIL cheyenne_pgi_smoke_gx1_144x2_gx1prod_long_run10year test
FAIL cheyenne_gnu_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data

In general I do not currently expect the VP solver as it stands on main to be BFB between different decompositions, even with reprosum. So these 3 failures are all expected:

FAIL cheyenne_intel_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_intel_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data
FAIL cheyenne_pgi_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_pgi_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data
FAIL cheyenne_gnu_smoke_gx3_8x1_cmplogrest_dynpicard_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_6x4_dynpicard_reprosum_run10day different-data

I do not expect the code in this PR to change anything in this result, I would expect these tests to also fail without these changes.

My upcoming PR which I mentioned last week at the telecon should fix those.

@eclare108213
Copy link
Contributor

Thanks @phil-blain! So the outstanding tests are gx1prod, which might just be timing out, @dabail10?

@dabail10
Copy link
Contributor Author

Good catch! I will fix this.

@eclare108213 eclare108213 self-assigned this Jul 25, 2022
@dabail10
Copy link
Contributor Author

The failure is more cryptic. Seems to be dying in icepack_mechred.F90. Trying to resubmit this test.

@dabail10
Copy link
Contributor Author

I resubmitted and got the same error. It is dying with the following output:

MPT: #11 0x00000000007d5f93 in icepack_mechred::ridge_shift (ntrcr=25,
MPT: dt=4.5672879188504911, ncat=0, hin_max=..., aicen=..., trcrn=...,
MPT: vicen=..., vsnon=..., aice0=4.9409825417387207e-316, trcr_depend=...,
MPT: trcr_base=..., n_trcr_strata=..., nt_strata=..., krdg_redist=16594084,
MPT: aksum=6.9531164811065294e-310, apartic=..., hrmin=..., hrmax=...,
MPT: hrexp=..., krdg=..., closing_net=6.953116481124711e-310,
MPT: opning=6.9531164811243157e-310, ardg1=6.9531164811140392e-310,
MPT: ardg2=6.9531164811144344e-310, virdg=6.9531164811148297e-310,
MPT: aopen=6.9531164811152249e-310, ardg1nn=..., ardg2nn=..., virdgnn=...,
MPT: nslyr=19083552, n_aero=19083564, msnow_mlt=6.9531164811183869e-310,
MPT: esnow_mlt=6.9531164811187822e-310, maero=..., miso=...,
MPT: mpond=6.9531164811136439e-310, aredistn=..., vredistn=...)
MPT: at /glade/work/dbailey/CICE_cesmponds/icepack/columnphysics/icepack_mechred.F90:1582
MPT: #12 0x00000000007cfe7a in icepack_mechred::ridge_ice (dt=3600, ndtd=1,
MPT: ncat=19083540, n_aero=19083564, nilyr=7, nslyr=1, ntrcr=17272192,
MPT: hin_max=..., rdg_conv=5.2161873830377285e-316,
MPT: rdg_shear=5.216720973935237e-316, aicen=..., trcrn=..., vicen=...,
MPT: vsnon=..., aice0=4.9409825417387207e-316, trcr_depend=..., trcr_base=...,
MPT: n_trcr_strata=..., nt_strata=..., krdg_partic=16594080,
MPT: krdg_redist=16594084, mu_rdg=8.198908722030818e-317, tr_brine=17272412,
MPT: dardg1dt=5.1494885208491602e-316, dardg2dt=5.1500221117466688e-316,
MPT: dvirdgdt=5.1505557026441773e-316, opening=5.1510892935416858e-316,
MPT: fpond=5.1868398836747584e-316, fresh=5.187373474572267e-316,
MPT: fhocn=5.1884406563672841e-316, faero_ocn=..., fiso_ocn=..., aparticn=...,
MPT: krdgn=..., aredistn=..., vredistn=..., dardg1ndt=..., dardg2ndt=...,
MPT: dvirdgndt=..., araftn=..., vraftn=..., closing_flag=3745908980,
MPT: closing=6.9531164811689793e-310)

So, it looks like underflow conditions. I will try rerunning in debug mode.

@dabail10
Copy link
Contributor Author

Ok. Are we happy with the results here? Should we merge this before the other two?

@eclare108213
Copy link
Contributor

I think we can merge and it's fine with me to do the CESM ponds deprecation PR(s) first, then the others. Does it make sense to do all of them in Icepack first, then update the Icepack version here and merge this PR, then pull the new Icepack into the other PR? OR does it make more sense to merge the Icepack CESM ponds PR first, then update Icepack here and merge, then merge the Icepack 0-layer and krdg PR, update Icepack again and finally merge the CICE 0-layer and krdg PR?

@apcraig
Copy link
Contributor

apcraig commented Jul 28, 2022

If everything is just about ready, we could merge the two PRs together first in Icepack, then in CICE (after updating the Icepack submodule in one of the CICE PRs). But, it might be better to keep the two distinct set of changes separate in case they don't play well together, in case we want to undo one of them, or if there might be a few days between readiness. So, if CESM ponds is ready, we should merge the icepack PR, update the CICE submodule in the CICE PR and merge the CICE PR. Then we would do the same with the 0layer/ridging PRs.

Is there any additional testing required for the CESM ponds PRs? Would you like me to test independently? Happy not to if others feel it's ready. I'll do a quick review now.

@apcraig
Copy link
Contributor

apcraig commented Jul 28, 2022

Looks like testing is fine. alt05 will change answers and the dynpicard is failing in the baseline. It's always good to look at the baseline results to check whether the failures are new. Have you tested with UNDEPRECATE_CESMPONDS? Not saying it's required, just curious.

Is there any other documentation of CESM ponds that needs to be removed from the user guide?

@apcraig apcraig merged commit 9be1c35 into CICE-Consortium:main Jul 31, 2022
dabail10 added a commit to ESCOMP/CICE that referenced this pull request Oct 4, 2022
* Deprecate CESM ponds

* Namelist changes for deprecating cesmponds

* Update documentation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants