ML correction on CPU and GPU #2909

Closed
wants to merge 46 commits into from

elynnwu
Contributor

@elynnwu elynnwu commented Jul 18, 2024

This PR adds the capability to run ML correction on GPU. The call into the Python code differs between CPU and GPU because of how pybind11 handles the array exchange. On CPU, we continue to rely on pybind11's integration with NumPy to pass the array pointer. On GPU, we pass the raw pointer and then rebuild it as a CuPy array on the Python side (GPU allows unmanaged memory access). Because xarray works with both NumPy and CuPy arrays, the rest of the code is unchanged, so the actual calls that perform the ML correction are identical on CPU and GPU.
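
The pointer hand-off described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the function names `wrap_host_pointer` and `wrap_device_pointer` are hypothetical, the field is assumed to be float64, and the host path uses `ctypes` only as a stand-in for what pybind11's NumPy integration does automatically. The device path uses CuPy's `UnownedMemory`, which wraps memory that CuPy did not allocate and will not free.

```python
import ctypes
from math import prod

ITEMSIZE = 8  # bytes per element; float64 is an assumption for this sketch


def nbytes_for(shape):
    """Size in bytes of a float64 array with the given shape."""
    return prod(shape) * ITEMSIZE


def wrap_host_pointer(ptr, shape):
    """Zero-copy flat view over host memory owned by the caller.

    Hypothetical stand-in for the CPU path, where pybind11's NumPy
    integration performs the equivalent wrapping.
    """
    return (ctypes.c_double * prod(shape)).from_address(ptr)


def wrap_device_pointer(ptr, shape):
    """Rebuild a CuPy array around a raw device pointer without copying.

    Hypothetical helper. Requires CuPy and a CUDA device; the import is
    lazy so this module still loads on CPU-only machines.
    """
    import cupy as cp

    # UnownedMemory marks the allocation as owned elsewhere (for example
    # by the C++ side), so CuPy never tries to free it.
    mem = cp.cuda.UnownedMemory(ptr, nbytes_for(shape), owner=None)
    memptr = cp.cuda.MemoryPointer(mem, 0)
    return cp.ndarray(shape, dtype=cp.float64, memptr=memptr)


if __name__ == "__main__":
    # Host-side demonstration: a write through the view reaches the
    # original buffer, which is the property the GPU path relies on too.
    field = (ctypes.c_double * 6)(0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
    view = wrap_host_pointer(ctypes.addressof(field), (2, 3))
    view[0] = 42.0
    print(field[0])
```

In both paths the Python side ends up with an array that aliases the model's memory, so in-place updates are visible to the caller without a copy back.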

We also introduce a few features in this PR:

  • A new option corrects only temperature, leaving specific humidity uncorrected.
  • The surface radiative flux correction now overwrites all direct and diffuse components of sfc_flux, as well as sfc_flux_sw_net and sfc_flux_sw_dn.

We have also started to focus on Perlmutter CPU and GPU as our main machines for ML corrective work. A shared Python environment is now maintained at: /global/common/software/m4492/fv3net-shared-py39

frodre and others added 30 commits February 26, 2024 14:48
This commit fixes issues with the implementation of precipitation
adjustment in ML when running on GPUs.

Additionally this commit turns on property checks
to ensure that ML cannot produce an unrealistic state.
Option to do ML correction on temperature only
@mahf708 mahf708 dismissed their stale review July 24, 2024 23:36

defer to ndk

@E3SM-Autotester
Collaborator

The base branch has been updated since the last successful testing.

  • last PASS base branch sha: 53ac170
  • current base branch sha : a1b89eb

The AutoTester will discard the last PASS, and re-test the PR from scratch

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5695
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 1842bb0
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5931
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 1842bb0
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: feature/scream-ml-gpu-only
  • SHA: 1842bb0
  • Mode: TEST_REPO

Pull Request Author: elynnwu

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5695
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 1842bb0
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5931
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 1842bb0
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

SCREAM_PullRequest_Autotester_Mappy # 5695 PASSED (last 100 lines of console output)

Finished XML for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 0.349779 seconds (PASS)
Starting SETUP for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 1 procs
Finished SETUP for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 2.561920 seconds (PASS)
Starting SHAREDLIB_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 1 procs
Finished SETUP for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 2.101166 seconds (PASS)
Finished SETUP for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 2.591681 seconds (PASS)
Finished SETUP for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 2.593907 seconds (PASS)
Starting SHAREDLIB_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 procs
Starting SHAREDLIB_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 with 1 procs
Starting SHAREDLIB_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 procs
Finished SETUP for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 2.558289 seconds (PASS)
Starting SHAREDLIB_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 1 procs
Finished SHAREDLIB_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics in 56.582903 seconds (PASS)
Starting MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics with 16 procs
Finished SHAREDLIB_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci in 56.177420 seconds (PASS)
Finished SHAREDLIB_BUILD for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 in 55.869504 seconds (PASS)
Starting MODEL_BUILD for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 with 16 procs
Starting MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci with 16 procs
Finished SHAREDLIB_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 56.210957 seconds (PASS)
Starting MODEL_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 16 procs
Finished SHAREDLIB_BUILD for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu in 75.617672 seconds (PASS)
Finished SHAREDLIB_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 73.318366 seconds (PASS)
Finished SHAREDLIB_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 96.942739 seconds (PASS)
Finished SHAREDLIB_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 184.300142 seconds (PASS)
Finished SHAREDLIB_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 208.104160 seconds (PASS)
Starting MODEL_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 16 procs
Finished MODEL_BUILD for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 in 201.625341 seconds (PASS)
Starting RUN for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 with 1 proc on interactive node and 64 procs on compute nodes
Finished MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics in 202.450126 seconds (PASS)
Starting RUN for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 in 1.017522 seconds (PEND). [COMPLETED 1 of 9]
Finished RUN for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics in 0.999329 seconds (PEND). [COMPLETED 2 of 9]
Starting MODEL_BUILD for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu with 16 procs
Starting MODEL_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 with 16 procs
Finished MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci in 205.385312 seconds (PASS)
Starting RUN for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci in 1.213220 seconds (PEND). [COMPLETED 3 of 9]
Starting MODEL_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 16 procs
Finished MODEL_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 208.080854 seconds (PASS)
Starting RUN for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 2.173403 seconds (PEND). [COMPLETED 4 of 9]
Starting MODEL_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 16 procs
Finished MODEL_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 419.165649 seconds (PASS)
Starting RUN for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 with 1 proc on interactive node and 64 procs on compute nodes
Finished MODEL_BUILD for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu in 419.900553 seconds (PASS)
Starting RUN for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu with 1 proc on interactive node and 16 procs on compute nodes
Finished RUN for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 0.925786 seconds (PEND). [COMPLETED 5 of 9]
Finished RUN for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu in 0.915043 seconds (PEND). [COMPLETED 6 of 9]
Finished MODEL_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 455.000823 seconds (PASS)
Starting RUN for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 6.227222 seconds (PEND). [COMPLETED 7 of 9]
Finished MODEL_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 562.472670 seconds (PASS)
Starting RUN for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 1.060522 seconds (PEND). [COMPLETED 8 of 9]
Finished MODEL_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 681.784847 seconds (PASS)
Starting RUN for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 1.820817 seconds (PEND). [COMPLETED 9 of 9]
Waiting for tests to finish
PASS ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240724_185031_91gfqh
PASS ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20240724_185031_91gfqh
PASS ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5.C.20240724_185031_91gfqh
PASS ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2.C.20240724_185031_91gfqh
PASS ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu.C.20240724_185031_91gfqh
PASS PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240724_185031_91gfqh
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20240724_185031_91gfqh
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20240724_185031_91gfqh
PASS SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20240724_185031_91gfqh
test-scheduler took 986.3452866077423 seconds'
+ [[ 0 != 0 ]]
+ set +x
$ ssh-agent -k
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 28958 killed;
[ssh-agent] Stopped.
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

# We're having issues with some test-launcher job hanging forever. So let's make sure we clean all pending test-launcher jobs

squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel

[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins16374636370419701946.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Finished: SUCCESS

SCREAM_PullRequest_Autotester_Weaver # 5931 FAILED (last 100 lines of console output)

CMake Error at /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/cmake/ctest_script.cmake:76 (message):
  Test had fails

===============================================================================
Testing '''10da1384cb5c1da00debce66959e5ca2dcab9b99''' for test '''full_debug'''

RUN: taskset -c 0-51 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/full_debug

Testing '''10da1384cb5c1da00debce66959e5ca2dcab9b99''' for test '''release'''

RUN: taskset -c 104-155 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/release

Testing '''10da1384cb5c1da00debce66959e5ca2dcab9b99''' for test '''full_sp_debug'''

RUN: taskset -c 52-103 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/full_sp_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/full_sp_debug -DBUILD_NAME_MOD=full_sp_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DSCREAM_DOUBLE_PRECISION=False -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_sp_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx/ctest-build/full_sp_debug
Build type full_debug failed at testing time. Here'''s a list of failed tests:
108:mam4_aci_standalone_baseline_cmp

Build type full_sp_debug failed at testing time. Here'''s a list of failed tests:
93:mam4_aci_standalone_baseline_cmp

Build type release failed at testing time. Here'''s a list of failed tests:
107:mam4_aci_standalone_baseline_cmp

Error(s) occurred during test phase
OVERALL STATUS: FAIL
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx
weaver failed'

+ [[ 1 == 0 ]]
+ [[ weaver == \m\a\p\p\y ]]
+ set +x
######################################################
FAILS DETECTED:
SCREAM STANDALONE TESTING FAILED!
Build type full_debug failed at testing time. Here's a list of failed tests:
108:mam4_aci_standalone_baseline_cmp

Build type full_sp_debug failed at testing time. Here's a list of failed tests:
93:mam4_aci_standalone_baseline_cmp

Build type release failed at testing time. Here's a list of failed tests:
107:mam4_aci_standalone_baseline_cmp

Error(s) occurred during test phase
OVERALL STATUS: FAIL
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5931/scream/components/eamxx
weaver failed
######################################################
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins6671772201402649710.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: lbertag@sandia.gov
Finished: FAILURE

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5712
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5946
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: feature/scream-ml-gpu-only
  • SHA: 40e0fd5
  • Mode: TEST_REPO

Pull Request Author: elynnwu

@E3SM-Autotester
Collaborator

NOTICE: The AutoTester has encountered an internal error (usually a Communications Timeout), testing will be restarted, previous tests may still be running but will be ignored by the AutoTester...

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5715
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5949
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: feature/scream-ml-gpu-only
  • SHA: 40e0fd5
  • Mode: TEST_REPO

Pull Request Author: elynnwu

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5715
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5949
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

@E3SM-Autotester
Collaborator

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS NOT BEEN REVIEWED YET!

@E3SM-Autotester
Collaborator

All Jobs Finished; status = PASSED, target_sha=6891ee3cb825adf849cab1238f2f6fc7bbc3217d, However Inspection must be performed before merge can occur...

@E3SM-Autotester
Collaborator

The base branch has been updated since the last successful testing.

  • last PASS base branch sha: 6891ee3
  • current base branch sha : 308996b

The AutoTester will discard the last PASS, and re-test the PR from scratch

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5720
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5954
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: feature/scream-ml-gpu-only
  • SHA: 40e0fd5
  • Mode: TEST_REPO

Pull Request Author: elynnwu

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5720
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5954
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

@E3SM-Autotester
Collaborator

All Jobs Finished; status = PASSED, target_sha=308996be7151e6b08edf5c9a3d2e7925a001a806, However Inspection must be performed before merge can occur...

@E3SM-Autotester
Collaborator

The base branch has been updated since the last successful testing.

  • last PASS base branch sha: 308996b
  • current base branch sha : 5e7b019
    The AutoTester will discard the last PASS, and re-test the PR from scratch

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5726
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5957
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: feature/scream-ml-gpu-only
  • SHA: 40e0fd5
  • Mode: TEST_REPO

Pull Request Author: elynnwu

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5726
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5957
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

@E3SM-Autotester
Collaborator

All Jobs Finished; status = PASSED, target_sha=5e7b019a3a529d817a464557e473fecb2a2b67d0, However Inspection must be performed before merge can occur...

@E3SM-Autotester
Collaborator

The base branch has been updated since the last successful testing.

  • last PASS base branch sha: 5e7b019
  • current base branch sha : 9d845ad
    The AutoTester will discard the last PASS, and re-test the PR from scratch

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5730
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5961
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: feature/scream-ml-gpu-only
  • SHA: 40e0fd5
  • Mode: TEST_REPO

Pull Request Author: elynnwu

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5730
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5961
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

@E3SM-Autotester
Collaborator

All Jobs Finished; status = PASSED, target_sha=9d845ad6c53611729bee876309f2c006c81c2493, However Inspection must be performed before merge can occur...

@E3SM-Autotester
Collaborator

The base branch has been updated since the last successful testing.

  • last PASS base branch sha: 9d845ad
  • current base branch sha : 0da0f0c
    The AutoTester will discard the last PASS, and re-test the PR from scratch

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5735
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5966
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Using Repos:

Repo: SCREAM (E3SM-Project/scream)
  • Branch: feature/scream-ml-gpu-only
  • SHA: 40e0fd5
  • Mode: TEST_REPO

Pull Request Author: elynnwu

@E3SM-Autotester
Collaborator

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 2 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.

Pull Request Auto Testing has FAILED

Build Information

Test Name: SCREAM_PullRequest_Autotester_Mappy

  • Build Num: 5735
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

Build Information

Test Name: SCREAM_PullRequest_Autotester_Weaver

  • Build Num: 5966
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS
PULLREQUESTNUM 2909
SCREAM_SOURCE_REPO https://github.com/E3SM-Project/scream
SCREAM_SOURCE_SHA 40e0fd5
SCREAM_TARGET_BRANCH master
SCREAM_TARGET_REPO https://github.com/E3SM-Project/scream
SCREAM_TARGET_SHA 17dda18
TEST_REPO_ALIAS SCREAM

SCREAM_PullRequest_Autotester_Mappy # 5735 FAILED (last 100 lines of console output)

Starting SHAREDLIB_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 1 procs
Finished SHAREDLIB_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics in 55.943090 seconds (PASS)
Starting MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics with 16 procs
Finished SHAREDLIB_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci in 55.701409 seconds (PASS)
Starting MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci with 16 procs
Finished SHAREDLIB_BUILD for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 in 55.639924 seconds (PASS)
Starting MODEL_BUILD for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 with 16 procs
Finished SHAREDLIB_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 55.474142 seconds (PASS)
Starting MODEL_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 16 procs
Finished SHAREDLIB_BUILD for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu in 74.673903 seconds (PASS)
Finished SHAREDLIB_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 72.924309 seconds (PASS)
Finished SHAREDLIB_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 95.392666 seconds (PASS)
Finished SHAREDLIB_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 191.320711 seconds (PASS)
Finished SHAREDLIB_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 211.607849 seconds (PASS)
Starting MODEL_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 with 16 procs
Finished MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci in 206.697871 seconds (PASS)
Finished MODEL_BUILD for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics in 207.708733 seconds (PASS)
Starting RUN for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci with 1 proc on interactive node and 64 procs on compute nodes
Starting RUN for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci in 1.035115 seconds (PEND). [COMPLETED 1 of 9]
Finished RUN for test SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics in 1.028527 seconds (PEND). [COMPLETED 2 of 9]
Starting MODEL_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 16 procs
Starting MODEL_BUILD for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu with 16 procs
Finished MODEL_BUILD for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 in 208.155439 seconds (PASS)
Starting RUN for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 in 0.959503 seconds (PEND). [COMPLETED 3 of 9]
Starting MODEL_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 16 procs
Finished MODEL_BUILD for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 210.114890 seconds (PASS)
Starting RUN for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 in 1.028451 seconds (PEND). [COMPLETED 4 of 9]
Starting MODEL_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 16 procs
Finished MODEL_BUILD for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 432.592982 seconds (PASS)
Starting RUN for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 in 1.201530 seconds (PEND). [COMPLETED 5 of 9]
Finished MODEL_BUILD for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu in 455.442014 seconds (PASS)
Starting RUN for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu with 1 proc on interactive node and 16 procs on compute nodes
Finished RUN for test ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu in 6.533588 seconds (PEND). [COMPLETED 6 of 9]
Finished MODEL_BUILD for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 498.774109 seconds (PASS)
Starting RUN for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 0.899684 seconds (PEND). [COMPLETED 7 of 9]
Finished MODEL_BUILD for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 597.970069 seconds (PASS)
Starting RUN for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 in 1.234004 seconds (PEND). [COMPLETED 8 of 9]
Finished MODEL_BUILD for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 707.318602 seconds (PASS)
Starting RUN for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 with 1 proc on interactive node and 64 procs on compute nodes
Finished RUN for test ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 in 0.950923 seconds (PEND). [COMPLETED 9 of 9]
Waiting for tests to finish
PASS ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERP_D_Lh4.ne4_ne4.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240730_153227_ldzse7
PASS ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERP_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-4.C.20240730_153227_ldzse7
PASS ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_D_Ln22.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-rad_frequency_2--scream-output-preset-5.C.20240730_153227_ldzse7
PASS ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_Ln9.ne4_ne4.F2000-SCREAMv1-AQP1.mappy_gnu.scream-output-preset-2.C.20240730_153227_ldzse7
PASS ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/ERS_P16_Ln22.ne30_ne30.F2010-SCREAMv1-DP-DYCOMSrf01.mappy_gnu.C.20240730_153227_ldzse7
PASS PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/PET_Ln9_P32x2.ne4pg2_ne4pg2.F2010-SCREAMv1.mappy_gnu.scream-output-preset-1.C.20240730_153227_ldzse7
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-aci.C.20240730_153227_ldzse7
PASS SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.mappy_gnu.scream-mam4xx-optics.C.20240730_153227_ldzse7
PASS SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3 RUN
    Case dir: /ascldap/users/e3sm-jenkins/acme/scratch/SMS_D_Ln9.ne4_ne4.F2010-SCREAMv1-noAero.mappy_gnu.scream-output-preset-3.C.20240730_153227_ldzse7
test-scheduler took 1013.2672662734985 seconds'
+ [[ 0 != 0 ]]
+ set +x
######################################################
FAILS DETECTED:
  SCREAM STANDALONE TESTING FAILED!
Build type debug_nopack_fpe failed at build time. Here's the build log:
Starting analysis on mappy with cmd: cd /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5735/scream/components/eamxx && module purge && module load sems-archive-env acme-env acme-cmake/3.26.3 acme-gcc/11.2.0 sems-archive-git/2.10.1 acme-openmpi/4.1.4 acme-netcdf/4.7.4/acme && export GATOR_INITIAL_MB=4000MB && export PATH=/ascldap/users/jgfouca/packages/valgrind-3.22.0/bin:$PATH && export OMP_PROC_BIND=spread && true &&  ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m mappy
RUN: cd /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5735/scream/components/eamxx && module purge && module load sems-archive-env acme-env acme-cmake/3.26.3 acme-gcc/11.2.0 sems-archive-git/2.10.1 acme-openmpi/4.1.4 acme-netcdf/4.7.4/acme && export GATOR_INITIAL_MB=4000MB && export PATH=/ascldap/users/jgfouca/packages/valgrind-3.22.0/bin:$PATH && export OMP_PROC_BIND=spread && true &&  ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m mappy
FROM: /home/e3sm-jenkins/jenkins-ws/workspace/SCREAM_PullRequest_Autotester_Mappy/5735/scream/components/eamxx
mappy failed
######################################################
Build step 'Execute shell' marked build as failure
$ ssh-agent -k
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 101566 killed;
[ssh-agent] Stopped.
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh

# We're having issues with some test-launcher job hanging forever. So let's make sure we clean all pending test-launcher jobs

squeue -o"%.7i %u %40j" | grep e3sm-jenkins | grep test-launcher | awk '{ print $1 }' | xargs -r scancel

[SCREAM_PullRequest_Autotester_Mappy] $ /bin/bash -le /tmp/jenkins15275537211056372.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: lbertag@sandia.gov
Finished: FAILURE

SCREAM_PullRequest_Autotester_Weaver # 5966 PASSED (last 100 lines of console output)

        Start 125: homme_shoc_cld_p3_rrtmgp_np1
125/141 Test #125: homme_shoc_cld_p3_rrtmgp_np1 ............................   Passed   11.24 sec
        Start 126: homme_shoc_cld_p3_rrtmgp_baseline_cmp
126/141 Test #126: homme_shoc_cld_p3_rrtmgp_baseline_cmp ...................   Passed    0.11 sec
        Start 127: homme_shoc_cld_p3_rrtmgp_pg2_np1
127/141 Test #127: homme_shoc_cld_p3_rrtmgp_pg2_np1 ........................   Passed   10.50 sec
        Start 128: homme_shoc_cld_p3_rrtmgp_pg2_baseline_cmp
128/141 Test #128: homme_shoc_cld_p3_rrtmgp_pg2_baseline_cmp ...............   Passed    0.08 sec
        Start 129: model_baseline
129/141 Test #129: model_baseline ..........................................   Passed   11.90 sec
        Start 130: model_initial
130/141 Test #130: model_initial ...........................................   Passed    5.91 sec
        Start 131: model_restart
131/141 Test #131: model_restart ...........................................   Passed    6.96 sec
        Start 132: restarted_vs_monolithic_check_np1
132/141 Test #132: restarted_vs_monolithic_check_np1 .......................   Passed    0.10 sec
        Start 133: homme_shoc_cld_spa_p3_rrtmgp_np1
133/141 Test #133: homme_shoc_cld_spa_p3_rrtmgp_np1 ........................   Passed   11.63 sec
        Start 134: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp
134/141 Test #134: homme_shoc_cld_spa_p3_rrtmgp_baseline_cmp ...............   Passed    0.12 sec
        Start 135: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1
135/141 Test #135: homme_shoc_cld_spa_p3_rrtmgp_128levels_np1 ..............   Passed    8.72 sec
        Start 136: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1
136/141 Test #136: homme_shoc_cld_spa_p3_rrtmgp_128levels_tend_check_np1 ...   Passed    1.41 sec
        Start 137: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp
137/141 Test #137: homme_shoc_cld_spa_p3_rrtmgp_128levels_baseline_cmp .....   Passed    0.61 sec
        Start 138: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1
138/141 Test #138: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_np1 .................   Passed   13.22 sec
        Start 139: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp
139/141 Test #139: homme_shoc_cld_spa_p3_rrtmgp_pg2_dp_baseline_cmp ........   Passed    0.09 sec
        Start 140: homme_shoc_cld_p3_mam_optics_rrtmgp_np1
140/141 Test #140: homme_shoc_cld_p3_mam_optics_rrtmgp_np1 .................   Passed   17.36 sec
        Start 141: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp
141/141 Test #141: homme_shoc_cld_p3_mam_optics_rrtmgp_baseline_cmp ........   Passed    0.20 sec

100% tests passed, 0 tests failed out of 141

Label Time Summary:
baseline_cmp = 82.97 sec*proc (17 tests)
baseline_gen = 208.61 sec*proc (19 tests)
bfbhash = 0.87 sec*proc (1 test)
check = 0.88 sec*proc (1 test)
cld = 38.33 sec*proc (6 tests)
cld_fraction = 1.15 sec*proc (1 test)
cxx baseline_cmp = 7.43 sec*proc (2 tests)
diagnostics = 49.83 sec*proc (23 tests)
driver = 68.38 sec*proc (12 tests)
dynamics = 6.90 sec*proc (3 tests)
fail = 31.24 sec*proc (5 tests)
io = 60.55 sec*proc (14 tests)
mam4_aci = 25.19 sec*proc (4 tests)
mam4_optics = 8.65 sec*proc (1 test)
nudging = 13.21 sec*proc (2 tests)
p3 = 87.21 sec*proc (10 tests)
p3_sk = 38.35 sec*proc (2 tests)
physics = 159.28 sec*proc (23 tests)
remap = 6.38 sec*proc (1 test)
rrtmgp = 54.31 sec*proc (11 tests)
shoc = 46.39 sec*proc (11 tests)
spa = 12.05 sec*proc (4 tests)
surface_coupling = 5.22 sec*proc (1 test)

Total Test time (real) = 668.88 sec

Testing '''118041cac39f73600bb030e38e6034d5dbf03092''' for test '''full_sp_debug'''

RUN: taskset -c 52-103 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/full_sp_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/full_sp_debug -DBUILD_NAME_MOD=full_sp_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DSCREAM_DOUBLE_PRECISION=False -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_sp_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/full_sp_debug

Testing '''118041cac39f73600bb030e38e6034d5dbf03092''' for test '''release'''

RUN: taskset -c 104-155 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/release/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/release -DBUILD_NAME_MOD=release -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Release -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/release" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/release

Testing '''118041cac39f73600bb030e38e6034d5dbf03092''' for test '''full_debug'''

RUN: taskset -c 0-51 sh -c '''SCREAM_BUILD_PARALLEL_LEVEL=52 CTEST_PARALLEL_LEVEL=1 ctest -V --output-on-failure --resource-spec-file /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/full_debug/ctest_resource_file.json -DNO_SUBMIT=True -DBUILD_WORK_DIR=/home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/full_debug -DBUILD_NAME_MOD=full_debug -S /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/cmake/ctest_script.cmake -DCTEST_SITE=weaver -DCMAKE_COMMAND="-C /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/cmake/machine-files/weaver.cmake -DNetCDF_Fortran_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-fortran/4.6.1/gcc/11.3.0/openmpi/4.1.6/5tv5psl -DNetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/netcdf-c/4.9.2/gcc/11.3.0/openmpi/4.1.6/pyuuqd3 -DPnetCDF_C_PATH=/projects/ppc64le-pwr9-rhel8/tpls/parallel-netcdf/1.12.3/gcc/11.3.0/openmpi/4.1.6/2s52shy -DCMAKE_BUILD_TYPE=Debug -DEKAT_DEFAULT_BFB=True -DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=True -DEKAT_DISABLE_TPL_WARNINGS='''''''''ON''''''''' -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_C_COMPILER=mpicc -DCMAKE_Fortran_COMPILER=mpifort -DSCREAM_DYNAMICS_DYCORE=HOMME -DSCREAM_TEST_MAX_TOTAL_THREADS=1 -DSCREAM_BASELINES_DIR=/home/projects/e3sm/scream/pr-autotester/master-baselines/weaver/full_debug" '''
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx/ctest-build/full_debug
OVERALL STATUS: PASS
Starting analysis on weaver with cmd: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
RUN: cd /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx && source /etc/profile.d/modules.sh && module purge && module load cmake/3.25.1 git/2.39.1 python/3.10.8 py-netcdf4/1.5.8 gcc/11.3.0 cuda/11.8.0 openmpi netcdf-c netcdf-fortran parallel-netcdf netlib-lapack && export HDF5_USE_FILE_LOCKING=FALSE && true && bsub -I -q rhel8 -n 4 -gpu num=4 ./scripts/test-all-scream --baseline-dir AUTO $compiler -p -c EKAT_DISABLE_TPL_WARNINGS=ON -m weaver
FROM: /home/e3sm-jenkins/weaver/workspace/SCREAM_PullRequest_Autotester_Weaver/5966/scream/components/eamxx
Completed analysis on weaver'

+ [[ 0 != 0 ]]
+ [[ 1 == 0 ]]
+ [[ weaver == \m\a\p\p\y ]]
+ set +x
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash -le

cd $WORKSPACE/${BUILD_ID}/

./scream/components/eamxx/scripts/jenkins/jenkins_cleanup.sh
[SCREAM_PullRequest_Autotester_Weaver] $ /bin/bash -le /tmp/jenkins664374710269722894.sh
POST BUILD TASK : SUCCESS
END OF POST BUILD TASK : 0
Sending e-mails to: lbertag@sandia.gov
Finished: SUCCESS

@@ -5,6 +5,9 @@ include (${EKAT_MACH_FILES_PATH}/kokkos/amd-zen3.cmake)
include (${EKAT_MACH_FILES_PATH}/kokkos/openmp.cmake)

set(CMAKE_CXX_FLAGS "-DTHRUST_IGNORE_CUB_VERSION_CHECK" CACHE STRING "" FORCE)
set(PYBIND11_PYTHON_VERSION 3.9 CACHE STRING "")
Contributor

@elynnwu or @frodre, I think there is consensus that we can merge this in as long as we put guard rails on the environment call. Can you add a compiler flag that turns these options "OFF" by default and "ON" only when the user specifies they want ML? I would be in favor of making the flag explicit in name, like FV3NET or CorrectiveML. You would want to add the flag to both the GPU and CPU config files.

@mahf708 have I characterized what needs to be done correctly?

Contributor

Yeah, something like this should be okay:

option (SCREAM_ENABLE_ML_CORRECTION "Whether to enable ML correction parametrization" OFF)
if (SCREAM_ENABLE_ML_CORRECTION)
  set(PYBIND11_PYTHON_VERSION 3.9 CACHE STRING "")
endif()

You can add whatever you want inside the guarded if–endif.

To activate SCREAM_ENABLE_ML_CORRECTION, you can add it to the SCREAM configs in a run script (with ./xmlchange SCREAM_CMAKE_OPTIONS="... SCREAM_ENABLE_ML_CORRECTION ON ...") or pass it directly to the cmake call (if you're building this at a lower level) with -DSCREAM_ENABLE_ML_CORRECTION=ON or the like.
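
For the run-script route, the "..." in the xmlchange example stands for whatever options the case already has, so those need to be preserved. A minimal sketch of one way to do that, assuming SCREAM_CMAKE_OPTIONS is a space-separated list of KEY VALUE pairs as the example above suggests. The helper name append_cmake_option is hypothetical (it is not part of SCREAM or CIME), and the commented usage assumes it is run from a CIME case directory where ./xmlquery and ./xmlchange are available:

```shell
# Hypothetical helper (not part of SCREAM or CIME): append "KEY VALUE" to a
# space-separated option string unless KEY is already present.
append_cmake_option() {
  current="$1"
  key="$2"
  value="$3"
  case " ${current} " in
    *" ${key} "*) printf '%s\n' "${current}" ;;                  # key already listed: leave untouched
    "  ")         printf '%s\n' "${key} ${value}" ;;             # empty list: start a new one
    *)            printf '%s\n' "${current} ${key} ${value}" ;;  # otherwise append at the end
  esac
}

# Stand-alone demonstration with a made-up starting value:
append_cmake_option "SCREAM_NP 4" SCREAM_ENABLE_ML_CORRECTION ON

# Assumed usage inside a case directory (not run here):
#   opts="$(./xmlquery --value SCREAM_CMAKE_OPTIONS)"
#   ./xmlchange SCREAM_CMAKE_OPTIONS="$(append_cmake_option "$opts" SCREAM_ENABLE_ML_CORRECTION ON)"
```

The helper leaves the list unchanged when the key is already present, so rerunning a run script does not add the option twice. It does not change an existing value; switching an already-listed option from OFF to ON would need separate handling.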

@AaronDonahue
Contributor

Closing this until we finalize the ML approach

6 participants