Running a test suite with additional options (-s / --sets) breaks when tests are BFB-compared with other tests #569

phil-blain · 2021-03-05T14:26:11Z

I just ran into this issue again and realized I already knew about that but had not opened an issue to document it.

In the context of #560 I ran the decomp_suite in a debug configuration, i.e. I added -s debug to the cice.setup command line.

However, it seems our testing framework is unprepared for that use case if any test in the test suite have an entry under the BFB-compare column in the <test_suite>.ts definition file. For example, for the decomp suite:

CICE/configuration/scripts/tests/decomp_suite.ts

Lines 1 to 6 in 0ac4cd9

    
           # Test         Grid    PEs        Sets    BFB-compare 
        
           restart        gx3     4x2x25x29x4   dslenderX2 
        
           decomp         gx3     4x2x25x29x5 
        
           sleep 30 
        
           restart        gx3     1x1x50x58x4   droundrobin,thread restart_gx3_4x2x25x29x4_dslenderX2 
        
           restart        gx3     4x1x25x116x1  dslenderX1,thread  restart_gx3_4x2x25x29x4_dslenderX2

The name of the test in the BFB-compare column is written to the ICE_BFBCOMP variable in cice.settings verbatim, i.e. it does not take into account additional options that were added on the command line. So the BFB compare part of the test fails with "missing-data" because they are not looking at the right place, i.e. the test on line 5 will look for restart_gx3_4x2x25x29x4_dslenderX2 but should look for restart_gx3_4x2x25x29x4_debug_dslenderX2.

So cice.setup should ideally add additional options from the command line before writing ICE_BFBCOMP in cice.settings for each test.

The text was updated successfully, but these errors were encountered:

phil-blain · 2021-03-05T16:40:54Z

Also (that might be harder, but still I'll mention it): when adding -s debug, obviously the runs take longer (because of -O0) and so some tests go beyond their wallclock. So it would be nice to have an option to increase wallclock by some factor, something like -s double_wallclock or similar, that multiply the wallclock by 2 for each test, so that the jobs do not run out of time.

apcraig · 2021-03-05T16:47:32Z

It's going to be hard to support this feature. I think we'll just have to live with the fact that the bfb-compare feature will not be working. The easy alternative is to create a custom test suite file with the bfb baseline casename modified as needed.

With regard to time outs, we can already support this. Try -s debug,short or -s debug,medium. That will turn on debug and extend the wall time.

phil-blain · 2021-03-05T16:50:04Z

Yeah I understand it's kind of an edge case, I just wanted to open an issue to document it.

Reagrding short/medium: yes, I can use that, but I would have prefered a factor, so as not to "lie" to the scheduler (and so being scheduled later). So tests that are defined as 15 minutes become 30 minutes, those that are defined as 30 become 60, etc.

But I can live with that :)

phil-blain · 2021-06-15T15:02:03Z

This bit me again yesterday and I came up with this patch to cice.setup. It uses the same approach as is used to sort the sets a few lines above:

diff --git i/cice.setup w/cice.setup
index 8dc4600..cc5e6ff 100755
--- i/cice.setup
+++ w/cice.setup
@@ -644,11 +644,6 @@ EOF
       set bfbcomp = "$bfbcomp_tmp"
     endif
 
-    set fbfbcomp = ${spval}
-    if ($bfbcomp != ${spval}) then
-      set fbfbcomp = ${machcomp}_${bfbcomp}
-    endif
-
     #------------------------------------------------------------
     # Parse pesx with strict checking, limit pes for machine
 
@@ -768,6 +763,26 @@ EOF
       set testname_base = "${machcomp}_${test}_${grid}_${pesx}${soptions}.${testid}"
       set testname = "${tsdir}/${testname_base}"
       set case = ${testname}
+
+      # Get sets in bfbcomp, add sets from cice.setup argument (-s / --sets) and sort them
+      set bfbcomp_regex="\(.*_[0-9x]*\)_\(.*\)"
+      set bfbcomp_test_grid_pes=`echo ${bfbcomp} | sed "s/${bfbcomp_regex}/\1/"`
+      set bfbcomp_sets=`echo ${bfbcomp} | sed "s/${bfbcomp_regex}/\2/" | sed 's/_/,/g' `
+      set bfbcomp_sets="${bfbcomp_sets},${sets_base}"
+      set bfbcomp_soptions = ""
+      # Create sorted array and remove duplicates and "none"
+      set bfbcomp_setsarray = `echo ${bfbcomp_sets} | sed 's/,/ /g' | fmt -1 | sort -u`
+      if ("${bfbcomp_setsarray}" != "") then
+        foreach field (${bfbcomp_setsarray})
+          if (${field} != "none") then
+            set bfbcomp_soptions = ${bfbcomp_soptions}"_"${field}
+          endif
+        end
+      endif
+      set fbfbcomp = ${spval}
+      if ($bfbcomp != ${spval}) then
+        set fbfbcomp = ${machcomp}_${bfbcomp_test_grid_pes}${bfbcomp_soptions}
+      endif
     endif
 
     if (-d ${case}) then

it seems to do the trick: additional sets from the command line are correctly written to the test name appearing in ICE_BFBCOMP in cice.settings.

apcraig · 2021-06-15T16:04:30Z

Thanks @phil-blain, this seems like a good solution. I can include it in my next PR unless you create a PR first.

- Fix bugs in history/restart frequency associated with new calendar (CICE-Consortium#589) - Define frequency in absolute terms relative to 0000-01-01-00000 and document (CICE-Consortium#589) - Update set_nml.histall to include hourly output (CICE-Consortium#589) - Update test scripts to cleanly abort if run fails where possible (CICE-Consortium#608) - Update decomp test so it's rerunable, remove restart at start of run (CICE-Consortium#601) - Add ability to do bfbcomp tests with additional options set on command line (CICE-Consortium#569) - Update documentation of calendar frequency computation, calendar types, and closed boundaries (CICE-Consortium#541) - Add optional doabort flag to abort_ice to control whether the method aborts. This is useful for testing and code coverage statistics, although doabort=.false. will not call the actual abort method, but we can test the interfaces and rest of the code.

apcraig · 2021-06-19T18:06:15Z

Just a quick note that this seems to work well. We do need to keep the original implementation around for "cases" but the fix works well for suites.

* Fix history/restart frequency bugs and update scripts - Fix bugs in history/restart frequency associated with new calendar (#589) - Update set_nml.histall to include hourly output (#589) - Update test scripts to cleanly abort if run fails where possible (#608) - Update decomp test so it's rerunable, remove restart at start of run (#601) - Add ability to do bfbcomp tests with additional options set on command line (#569) - Update documentation of calendar frequency computation, calendar types, and closed boundaries (#541) - Add optional doabort flag to abort_ice to control whether the method aborts. This is useful for testing and code coverage statistics, although doabort=.false. will not call the actual abort method, but we can test the interfaces and rest of the code. - Add histfreq_base and dumpfreq_base ('init' or 'zero') to specify reference data for history and restart output. Defaults are 'zero' and 'init' respectively for hist and dump. Setting histfreq_base to 'zero' allows for consistent output across multiple runs. Setting dumpfreq_base to 'init' allows the standard testing which requires restarts be written, for example, 5 days after the start of the run. - Remove extra abort calls in bcstchk and sumchk on runs that complete fine but don't pass checks. These aborts should never have been there. - Update documentation. - Clean up some of the unit tests to better support regression testing - modify initial/restart implementation - restart namelist is deprecated, now computed internally - modify initial/continue init checks and set restart and use_restart_time as needed - create compute_relative_elapsed method in ice_calendar to improve code reuse - update documentation with regard to initial/continue modes - Set default use_restart_time to false

phil-blain changed the title ~~Running a test suite with additional options (-s / --sets) breaks when test are BFB-compared with other tests~~ Running a test suite with additional options (-s / --sets) breaks when tests are BFB-compared with other tests Mar 5, 2021

phil-blain added Scripts Testing labels Mar 5, 2021

apcraig self-assigned this Jun 19, 2021

apcraig mentioned this issue Jun 19, 2021

Fix history and restart frequency, new features to scripts #610

Merged

16 tasks

apcraig closed this as completed in #610 Jul 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Running a test suite with additional options (-s / --sets) breaks when tests are BFB-compared with other tests #569

Running a test suite with additional options (-s / --sets) breaks when tests are BFB-compared with other tests #569

phil-blain commented Mar 5, 2021 •

edited

Loading

phil-blain commented Mar 5, 2021

apcraig commented Mar 5, 2021

phil-blain commented Mar 5, 2021 •

edited

Loading

phil-blain commented Jun 15, 2021 •

edited

Loading

apcraig commented Jun 15, 2021

apcraig commented Jun 19, 2021

Running a test suite with additional options (-s / --sets) breaks when tests are BFB-compared with other tests #569

Running a test suite with additional options (-s / --sets) breaks when tests are BFB-compared with other tests #569

Comments

phil-blain commented Mar 5, 2021 • edited Loading

phil-blain commented Mar 5, 2021

apcraig commented Mar 5, 2021

phil-blain commented Mar 5, 2021 • edited Loading

phil-blain commented Jun 15, 2021 • edited Loading

apcraig commented Jun 15, 2021

apcraig commented Jun 19, 2021

phil-blain commented Mar 5, 2021 •

edited

Loading

phil-blain commented Mar 5, 2021 •

edited

Loading

phil-blain commented Jun 15, 2021 •

edited

Loading