Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running a test suite with additional options (-s / --sets) breaks when tests are BFB-compared with other tests #569

Closed
phil-blain opened this issue Mar 5, 2021 · 6 comments · Fixed by #610
Assignees

Comments

@phil-blain
Copy link
Member

phil-blain commented Mar 5, 2021

I just ran into this issue again and realized I already knew about that but had not opened an issue to document it.

In the context of #560 I ran the decomp_suite in a debug configuration, i.e. I added -s debug to the cice.setup command line.

However, it seems our testing framework is unprepared for that use case if any test in the test suite have an entry under the BFB-compare column in the <test_suite>.ts definition file. For example, for the decomp suite:

# Test Grid PEs Sets BFB-compare
restart gx3 4x2x25x29x4 dslenderX2
decomp gx3 4x2x25x29x5
sleep 30
restart gx3 1x1x50x58x4 droundrobin,thread restart_gx3_4x2x25x29x4_dslenderX2
restart gx3 4x1x25x116x1 dslenderX1,thread restart_gx3_4x2x25x29x4_dslenderX2

The name of the test in the BFB-compare column is written to the ICE_BFBCOMP variable in cice.settings verbatim, i.e. it does not take into account additional options that were added on the command line. So the BFB compare part of the test fails with "missing-data" because they are not looking at the right place, i.e. the test on line 5 will look for restart_gx3_4x2x25x29x4_dslenderX2 but should look for restart_gx3_4x2x25x29x4_debug_dslenderX2.

So cice.setup should ideally add additional options from the command line before writing ICE_BFBCOMP in cice.settings for each test.

@phil-blain phil-blain changed the title Running a test suite with additional options (-s / --sets) breaks when test are BFB-compared with other tests Running a test suite with additional options (-s / --sets) breaks when tests are BFB-compared with other tests Mar 5, 2021
@phil-blain
Copy link
Member Author

Also (that might be harder, but still I'll mention it): when adding -s debug, obviously the runs take longer (because of -O0) and so some tests go beyond their wallclock. So it would be nice to have an option to increase wallclock by some factor, something like -s double_wallclock or similar, that multiply the wallclock by 2 for each test, so that the jobs do not run out of time.

@apcraig
Copy link
Contributor

apcraig commented Mar 5, 2021

It's going to be hard to support this feature. I think we'll just have to live with the fact that the bfb-compare feature will not be working. The easy alternative is to create a custom test suite file with the bfb baseline casename modified as needed.

With regard to time outs, we can already support this. Try -s debug,short or -s debug,medium. That will turn on debug and extend the wall time.

@phil-blain
Copy link
Member Author

phil-blain commented Mar 5, 2021

Yeah I understand it's kind of an edge case, I just wanted to open an issue to document it.

Reagrding short/medium: yes, I can use that, but I would have prefered a factor, so as not to "lie" to the scheduler (and so being scheduled later). So tests that are defined as 15 minutes become 30 minutes, those that are defined as 30 become 60, etc.

But I can live with that :)

@phil-blain
Copy link
Member Author

phil-blain commented Jun 15, 2021

This bit me again yesterday and I came up with this patch to cice.setup. It uses the same approach as is used to sort the sets a few lines above:

diff --git i/cice.setup w/cice.setup
index 8dc4600..cc5e6ff 100755
--- i/cice.setup
+++ w/cice.setup
@@ -644,11 +644,6 @@ EOF
       set bfbcomp = "$bfbcomp_tmp"
     endif
 
-    set fbfbcomp = ${spval}
-    if ($bfbcomp != ${spval}) then
-      set fbfbcomp = ${machcomp}_${bfbcomp}
-    endif
-
     #------------------------------------------------------------
     # Parse pesx with strict checking, limit pes for machine
 
@@ -768,6 +763,26 @@ EOF
       set testname_base = "${machcomp}_${test}_${grid}_${pesx}${soptions}.${testid}"
       set testname = "${tsdir}/${testname_base}"
       set case = ${testname}
+
+      # Get sets in bfbcomp, add sets from cice.setup argument (-s / --sets) and sort them
+      set bfbcomp_regex="\(.*_[0-9x]*\)_\(.*\)"
+      set bfbcomp_test_grid_pes=`echo ${bfbcomp} | sed "s/${bfbcomp_regex}/\1/"`
+      set bfbcomp_sets=`echo ${bfbcomp} | sed "s/${bfbcomp_regex}/\2/" | sed 's/_/,/g' `
+      set bfbcomp_sets="${bfbcomp_sets},${sets_base}"
+      set bfbcomp_soptions = ""
+      # Create sorted array and remove duplicates and "none"
+      set bfbcomp_setsarray = `echo ${bfbcomp_sets} | sed 's/,/ /g' | fmt -1 | sort -u`
+      if ("${bfbcomp_setsarray}" != "") then
+        foreach field (${bfbcomp_setsarray})
+          if (${field} != "none") then
+            set bfbcomp_soptions = ${bfbcomp_soptions}"_"${field}
+          endif
+        end
+      endif
+      set fbfbcomp = ${spval}
+      if ($bfbcomp != ${spval}) then
+        set fbfbcomp = ${machcomp}_${bfbcomp_test_grid_pes}${bfbcomp_soptions}
+      endif
     endif
 
     if (-d ${case}) then

it seems to do the trick: additional sets from the command line are correctly written to the test name appearing in ICE_BFBCOMP in cice.settings.

@apcraig
Copy link
Contributor

apcraig commented Jun 15, 2021

Thanks @phil-blain, this seems like a good solution. I can include it in my next PR unless you create a PR first.

@apcraig apcraig self-assigned this Jun 19, 2021
apcraig added a commit to apcraig/CICE that referenced this issue Jun 19, 2021
- Fix bugs in history/restart frequency associated with new calendar (CICE-Consortium#589)
- Define frequency in absolute terms relative to 0000-01-01-00000 and document (CICE-Consortium#589)
- Update set_nml.histall to include hourly output (CICE-Consortium#589)
- Update test scripts to cleanly abort if run fails where possible (CICE-Consortium#608)
- Update decomp test so it's rerunable, remove restart at start of run (CICE-Consortium#601)
- Add ability to do bfbcomp tests with additional options set on command line (CICE-Consortium#569)
- Update documentation of calendar frequency computation, calendar types, and closed boundaries (CICE-Consortium#541)
- Add optional doabort flag to abort_ice to control whether the method aborts.  This
  is useful for testing and code coverage statistics, although doabort=.false.
  will not call the actual abort method, but we can test the interfaces and
  rest of the code.
@apcraig
Copy link
Contributor

apcraig commented Jun 19, 2021

Just a quick note that this seems to work well. We do need to keep the original implementation around for "cases" but the fix works well for suites.

apcraig added a commit that referenced this issue Jul 2, 2021
* Fix history/restart frequency bugs and update scripts

- Fix bugs in history/restart frequency associated with new calendar (#589)
- Update set_nml.histall to include hourly output (#589)
- Update test scripts to cleanly abort if run fails where possible (#608)
- Update decomp test so it's rerunable, remove restart at start of run (#601)
- Add ability to do bfbcomp tests with additional options set on command line (#569)
- Update documentation of calendar frequency computation, calendar types, and closed boundaries (#541)
- Add optional doabort flag to abort_ice to control whether the method aborts.  This
  is useful for testing and code coverage statistics, although doabort=.false.
  will not call the actual abort method, but we can test the interfaces and
  rest of the code.
- Add histfreq_base and dumpfreq_base ('init' or 'zero') to specify
  reference data for history and restart output.  Defaults are
  'zero' and 'init' respectively for hist and dump.
  Setting histfreq_base to 'zero' allows for consistent output
  across multiple runs.  Setting dumpfreq_base to 'init' allows
  the standard testing which requires restarts be written,
  for example, 5 days after the start of the run.
- Remove extra abort calls in bcstchk and sumchk on runs that
  complete fine but don't pass checks.  These aborts should never
  have been there.
- Update documentation.

- Clean up some of the unit tests to better support regression testing

- modify initial/restart implementation
- restart namelist is deprecated, now computed internally
- modify initial/continue init checks and set restart and use_restart_time as needed
- create compute_relative_elapsed method in ice_calendar to improve code reuse
- update documentation with regard to initial/continue modes
- Set default use_restart_time to false
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants