Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FSD fixes for conservation #495

Open
wants to merge 32 commits into
base: main
Choose a base branch
from
Open

Conversation

dabail10
Copy link
Contributor

@dabail10 dabail10 commented Jul 29, 2024

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    This fixes the conservation issues in CICE with the FSD, but also for non-FSD cases!
  • Developer(s):
    @dabail10 , @lettie-roach , @cmbitz
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    https://github.com/CICE-Consortium/Test-Results/wiki/icepack_by_hash_forks#925bdea8760a8d833f0550b408226e80a8969378
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on CICE or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please document the changes in detail, including why the changes are made. This will become part of the PR commit log.
    This is a bug fix plus some rearranging for conservation in CICE. This impacts FSD cases as well as non-FSD cases.

There are two main aspects to the bug fix:

  1. Initial ice and snow volume values at the beginning of the lateral melt routine are now used for updating the snow and ice enthalpy as well other tracers for lateral melting. This is answer changing in all cases!
  2. The other fix is to move the computation of vi0new_lat (lateral growth from FSD) outside of the subroutine fsd_lateral_growth and a check to make sure the category vi0new (vin0new) does not exceed the total vi0new. This is a warning and it recomputes to ensure conservation. This is only answer changing in FSD cases.

The rest of the changes were to move the computation of rsiden and get rid of fside which is no longer needed.

There is an associated CICE tag to go with this.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial read-through of the code diffs, trying to understand what's happening here and noting a few things that stand out.
What exactly is the conservation bug fix? There are a lot of changes, so it's not clear.

configuration/driver/icedrv_arrays_column.F90 Outdated Show resolved Hide resolved
configuration/driver/icedrv_arrays_column.F90 Outdated Show resolved Hide resolved
configuration/driver/icedrv_arrays_column.F90 Outdated Show resolved Hide resolved
configuration/driver/icedrv_flux.F90 Show resolved Hide resolved
configuration/driver/icedrv_InitMod.F90 Outdated Show resolved Hide resolved
columnphysics/icepack_therm_itd.F90 Show resolved Hide resolved
columnphysics/icepack_therm_itd.F90 Show resolved Hide resolved
@@ -1251,37 +1172,37 @@ subroutine lateral_melt (dt, ncat, &
if (z_tracers) &
call lateral_melt_bgc(dt, &
ncat, nblyr, &
rside, vicen_init, & !echmod: use rsiden
rsiden(1), vicen_init, & !echmod: use rsiden !CMB this should be sent rsiden
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete comment

columnphysics/icepack_therm_itd.F90 Outdated Show resolved Hide resolved
columnphysics/icepack_therm_itd.F90 Show resolved Hide resolved
@eclare108213
Copy link
Contributor

GHActions/IcepackTesting failed, so I relaunched it.

@dabail10
Copy link
Contributor Author

One of the failed tests is a Picard convergence. I will do some offline testing.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some responses -

configuration/driver/icedrv_arrays_column.F90 Outdated Show resolved Hide resolved
columnphysics/icepack_therm_itd.F90 Show resolved Hide resolved
configuration/driver/icedrv_arrays_column.F90 Outdated Show resolved Hide resolved
@apcraig
Copy link
Contributor

apcraig commented Jul 30, 2024

This seems to change answers for all configurations. Is that expected?

There is also an "out of memory" error for one of the tests. The derecho test suite, https://github.com/CICE-Consortium/Test-Results/wiki/icepack_by_hash_forks#925bdea8760a8d833f0550b408226e80a8969378, seems to have the same errors thrown up by github actions.

Please add more information to the PR section "Please document the changes in detail, including why the changes are made. This will become part of the PR commit log" listing and explaining what things changed in greater detail. Also, you indicate a dependency on CICE, does this mean public Icepack interfaces changed and if so, please document that. Good to also explain why answers changed in a little greater detail. Thanks.

@dabail10
Copy link
Contributor Author

This test is failing in github actions. This same test is fine on derecho_intel.

testsuite.macos-latest/conda_macos_restart_col_1x1_swccsm3.macos-latest/logs/icepack.runlog.240730-182013


(picard_solver) picard_solver: Picard solver non-convergence
(icepack_warnings_setabort) T :file /Users/runner/icepack/columnphysics/icepack_therm_mushy.F90 :line 1399
(icepack_warnings_aborted) ... (picard_solver)
(icepack_warnings_aborted) ... (two_stage_solver_nosnow)
(icepack_warnings_aborted) ... (temperature_changes_salinity)
(temperature_changes_salinity)temperature_changes_salinity: Picard solver non-convergence (no snow)
(icepack_warnings_aborted) ... (thermo_vertical)
(icepack_warnings_aborted) ... (icepack_step_therm1)
(icepack_step_therm1) ice: Vertical thermo error, cat 1
(icepack_warnings_aborted) ... (icepack_step_therm1)
(icepack_warnings_aborted) ... (icepack_step_therm1)
(icepack_warnings_aborted) ... (icepack_step_therm1)

(icedrv_system_abort) ABORTED:
(icedrv_system_abort) called from /Users/runner/icepack/configuration/driver/icedrv_step.F90
(icedrv_system_abort) line number 420
(icedrv_system_abort) string = (step_therm1)

@apcraig
Copy link
Contributor

apcraig commented Jul 30, 2024

You could try running that test on derecho with other compilers (gnu, cray?) and also turn on the debug flags to see if it picks anything up.

Could this be a fluke that is related to #494?

Have you tried running in CICE and does the test suite run fine there?

Are there some modifications that could be impacting this test in particular?

@dabail10
Copy link
Contributor Author

I am able to reproduce it on my macos conda environment. There are NaN's in the mushy-layer solution. This case is using the old shortwave (ccsm3) with ktherm=2. I don't think we should be testing this.

@apcraig
Copy link
Contributor

apcraig commented Jul 30, 2024

I am able to reproduce it on my macos conda environment. There are NaN's in the mushy-layer solution. This case is using the old shortwave (ccsm3) with ktherm=2. I don't think we should be testing this.

Can you suggest changes to the swccsm3 test? Always good to review our test suite and continue to make improvements. Feel free to also suggest a couple new tests.

@dabail10
Copy link
Contributor Author

I am testing this out with ktherm=1. I think this might be better.

@apcraig apcraig added the Physics label Aug 8, 2024
@dabail10
Copy link
Contributor Author

What else is needed here? Can I change the swccsm3 to use ktherm=1?

@eclare108213
Copy link
Contributor

Can I change the swccsm3 to use ktherm=1?

Yes.

I'll look at my previous comments to see what can be marked as resolved.

@dabail10
Copy link
Contributor Author

Ok. I have cleaned up the driver code and only the changes related to rside / rsiden are now there.

@dabail10
Copy link
Contributor Author

Weird. It is emailing me and telling me gh-actions is failing.

@apcraig
Copy link
Contributor

apcraig commented Aug 22, 2024

The GH Actions failures are happening with some regularity these days. Restarting them seems to work. Not clear what's going on.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is very close, just a few clean-up items.

@@ -884,13 +884,13 @@ subroutine lateral_melt (dt, ncat, &
fresh, fsalt, &
fhocn, faero_ocn, &
fiso_ocn, &
rside, meltl, &
fside, wlat, &
rsiden, meltl, &
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix column alignment (picky, sorry, but might as well fix it now)

@@ -2276,14 +2210,15 @@ subroutine icepack_step_therm2 (dt, ncat, nltrcr, &
fresh, fsalt, &
fhocn, faero_ocn, &
fiso_ocn, &
rside, meltl, &
fside, wlat, &
rsiden, meltl, &
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignment

faero_ocn(k) = faero_ocn(k) + (vsnon_init(n) &
*(trcrn(nt_aero +4*(k-1),n) &
+ trcrn(nt_aero+1+4*(k-1),n)) &
+ vicen_init(n) &
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alignment could be fixed here too:

faero_ocn(k) = faero_ocn(k) &
             + (vsnon_init(n) * (trcrn(nt_aero  +4*(k-1),n)   &
                               + trcrn(nt_aero+1+4*(k-1),n))  &
             +  vicen_init(n) * (trcrn(nt_aero+2+4*(k-1),n)   &
                               + trcrn(nt_aero+3+4*(k-1),n))) &
             * rsiden(n) / dt


end if

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove blank line

use icedrv_calendar, only: yday
use icedrv_domain_size, only: ncat, nilyr, nslyr, n_aero, n_iso, nx
use icedrv_flux, only: frzmlt, sst, Tf, strocnxT, strocnyT, rside, fside, wlat, &
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like fside is no longer used at all. Are there any references to it in the docs?

@dabail10
Copy link
Contributor Author

Well, I think I fixed the fsdfixes PR with the latest BGC updates. This was a bit of a mess. Please let me know if you find stuff here.

@apcraig
Copy link
Contributor

apcraig commented Sep 20, 2024

Sorry about that @dabail10. Thanks for doing that. The main thing is that the BGC is merged now and we can move forward.

@apcraig
Copy link
Contributor

apcraig commented Sep 20, 2024

Can you rerun some tests, both technical and scientific. The Github Actions is failing with a build error. Lets just make sure things are still working. If you want some help from me, just ask. Thanks!

@dabail10
Copy link
Contributor Author

Found the one syntax error from the BGC merge. My tests are still waiting in the queue on derecho.

@dabail10
Copy link
Contributor Author

Here are the updated test results. The answer changes are due to some reworking of the FSD code and bug fixes which impact non FSD cases.

https://github.com/CICE-Consortium/Test-Results/wiki/icepack_by_hash_forks#b7270c6404e186df30923781dbdd04e9a3cbef6e

@apcraig
Copy link
Contributor

apcraig commented Sep 24, 2024

Can you create a PR for the CICE changes (minus the update to Icepack). Have we done a QC test with CICE yet? Given all answers change, I think we should also run a full test suite on all the compilers on Derecho for Icepack and CICE. I'm happy to do that whenever it's the appropriate time after further review.

@dabail10
Copy link
Contributor Author

We should do a QC test with CICE I would guess. I am just working on the CICE tag now.

@eclare108213
Copy link
Contributor

CICE PR: CICE-Consortium/CICE#975

Copy link
Contributor

@apcraig apcraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There need to be checks added somewhere (in icepack_step_therm1 or frzmlt_bottom_lat) that check for present of afsdn, floe_rad_c, and floe_binwidth if tr_fsd is true. Also, it looks like floe_rad_c and floe_binwidth are static fields defined at initialization in Icepack and then passed back and forth between the driver/CICE and Icepack. Maybe we should create some permanent space in fsd to store those variables and stop passing them back and forth. We want to avoid passing in static data over and over, that's not what the interfaces are for.

I think aicen should NOT be an optional argument in frzmlt_bottom_lat. That is not public and is called in one place in Icepack.

There are many places where the alignment could be improved.

Let me know if you need some help with any of these changes.

@dabail10
Copy link
Contributor Author

dabail10 commented Oct 2, 2024

aicen is only used when tr_fsd is .true. However, you are proposing that afsdn, floe_binwidth, floe_rad_c, and aicen are all brought in via a use statement instead?

@apcraig
Copy link
Contributor

apcraig commented Oct 2, 2024

aicen is only used when tr_fsd is .true. However, you are proposing that afsdn, floe_binwidth, floe_rad_c, and aicen are all brought in via a use statement instead?

I'm thinking aicen should be an argument but not optional. No reason to make it optional as far as I can tell. That argument is internal to icepack and it will always be passed whether fsd is on or not. Why make it optional if it'll always be passed and the public interfaces are unaffected?

In icepack_step_therm1, we could debate whether afsdn should be optional. I would lean slightly towards saying making it optional (for backwards compatibility) but then it has to be checked whether it's passed when tr_fsd is true.

With floe_rad_c and floe_binwidth, I'm wondering if we should change how those variables are implemented. Could they live in icepack_fsd such that when fsd is initialized, they are initialized in fsd as module data. Then we could get rid of those arguments in all the fsd calls. If the driver/CICE need to know the values of those variables, we could place them in icepack_parameters and add a query routine or we could simply make them in/out in the fsd init routine but still keep a copy in icepack_fsd for use internally. If those variables are static and identical for all gridcells, then we should store them in Icepack and stop passing them back and forth.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants