Table of Contents

About

...

CI System Information

Quick reference

Presubmit tests can be controlled by single-line annotations in the pull
request description. These annotations will be re-examined for each run.
Here is an example of their use:

# Build tests with other unsubmitted packages.
build-group=https://github.com/JCSDA-internal/oops/pull/2284
build-group=https://github.com/JCSDA-internal/saber/pull/651

# Disable the build-cache for tests.
jedi-ci-build-cache=skip

Each configuration setting must be on a single line, but order and
position do not matter.

# Enable tests for your draft PR (disabled by default).
run-ci-on-draft=true

# Select the compiler used by CI (defaults to random choice).
jedi-ci-test-select=gcc

# Select the jedi-bundle branch used for building. Using this option
# disables the build cache.
jedi-ci-bundle-branch=feature/my-bundle-change
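
As an illustration of the single-line rule, the following is a minimal sketch of how annotations could be extracted from a pull request description. It is not the CI system's actual parser: the function name parse_annotations, the KNOWN_KEYS set (collected from the annotations named on this page), and the matching pattern are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Annotation names taken from this page; the real CI system may accept others.
KNOWN_KEYS = {
    "build-group",
    "run-ci-on-draft",
    "jedi-ci-test-select",
    "jedi-ci-build-cache",
    "jedi-ci-bundle-branch",
    "jedi-ci-manifest-branch",
    "jedi-ci-next",
    "jedi-ci-debug",
}

# Assumed shape: the whole line is key=value with no embedded whitespace.
ANNOTATION_RE = re.compile(r"^([a-z0-9-]+)=(\S+)$")


def parse_annotations(description: str) -> dict[str, list[str]]:
    """Collect known key=value annotations, one per line, in any order."""
    found = defaultdict(list)
    for line in description.splitlines():
        match = ANNOTATION_RE.match(line.strip())
        if match and match.group(1) in KNOWN_KEYS:
            found[match.group(1)].append(match.group(2))
    return dict(found)


example = """\
Fix the widget.

build-group=https://github.com/JCSDA-internal/oops/pull/2284
jedi-ci-build-cache=skip
build-group=https://github.com/JCSDA-internal/saber/pull/651
"""
annotations = parse_annotations(example)
print(annotations["build-group"])
print(annotations["jedi-ci-build-cache"])
```

Values are kept in lists because build-group may appear more than once, while ordinary prose lines in the description are ignored.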

Specifying a Build Group

In the default configuration, the CI system builds candidate code against
the latest submitted version of each package in the jedi-bundle. A pull request
can be built against unsubmitted versions of specific packages by naming
each version with a build-group tag in the pull request description. Multiple
tags may be added as long as each tag is on its own line of the pull request
description.

build-group=https://github.com/JCSDA-internal/oops/pull/2284
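
A build-group value is a pull request URL, so it can be split into an organization, a repository, and a pull request number. The sketch below shows one way to do that; the helper name parse_build_group and the strict URL pattern are assumptions for illustration, not the CI system's own code.

```python
import re

# Assumed URL shape, based on the example above.
BUILD_GROUP_RE = re.compile(r"^https://github\.com/([^/]+)/([^/]+)/pull/(\d+)$")


def parse_build_group(url: str) -> tuple[str, str, int]:
    """Split a pull request URL into (organization, repository, PR number)."""
    match = BUILD_GROUP_RE.match(url.strip())
    if match is None:
        raise ValueError(f"not a pull request URL: {url!r}")
    org, repo, number = match.groups()
    return org, repo, int(number)


print(parse_build_group("https://github.com/JCSDA-internal/oops/pull/2284"))
```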

Selecting a Compiler

To save on cloud compute resources, the CI test environment selects one of
our three environments at random. If you want tests with a specific compiler,
you can set the annotation jedi-ci-test-select to gcc, intel, or gcc11.
Please do not use the special value all unless you have an especially
dangerous change known to affect all compilers or the CI environment.

gcc11: uses the GNU Compiler Collection (GCC) v11.4.0 and OpenMPI v5.0.5.
gcc: uses the GNU Compiler Collection (GCC) v13.3.0 and OpenMPI v5.0.5.
intel: uses Intel OneAPI v2024.2.1 with icx/icpx/ifort and OneAPI MPI v2021.13.
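
The selection rule can be sketched as follows. This is an illustration only: the function name select_environments is hypothetical, and treating the special value all as "run every environment" is an assumption inferred from the description above.

```python
import random

# Environment names from this page.
ENVIRONMENTS = ("gcc", "gcc11", "intel")


def select_environments(test_select=None, rng=random):
    """Return the list of environments to test for a given annotation value."""
    if test_select is None:
        # No annotation: pick one environment at random to save compute.
        return [rng.choice(ENVIRONMENTS)]
    if test_select == "all":
        # Assumed meaning of the special value: run every environment.
        return list(ENVIRONMENTS)
    if test_select in ENVIRONMENTS:
        return [test_select]
    raise ValueError(f"unknown jedi-ci-test-select value: {test_select!r}")


print(select_environments("intel"))
print(select_environments("all"))
```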

Build Cache

The CI system relies on a build cache to speed up the build process. Some
changes can cause build failures that arise from use of the cache. The CI
system has two controls to modify cache behavior.

The build cache can be disabled by adding the annotation
jedi-ci-build-cache=skip to the PR description.

If it is necessary to rebuild the entire cache to remove a bug in the cached
binaries, add the annotation jedi-ci-build-cache=rebuild to the PR
description.

CI Development and Debug Options:

USE THESE OPTIONS WITH CAUTION

jedi-ci-bundle-branch=branch-name: Unless otherwise specified, tests
are run from the default branch of the jedi-bundle repository. This
annotation overrides that branch. Its value must be a valid branch name
that is currently fetchable from the jedi-bundle repository. If this
annotation is explicitly set (even if set to the default branch), cache
reading and writing are disabled and any cache annotations are ignored.

jedi-ci-manifest-branch=branch-name: This tag overrides the default
branch name used for fetching the CI manifest. It is used when CI
config changes are needed to run the test. WARNING: a bad value here
can cause the test to silently fail to configure.

jedi-ci-next=true: This annotation uses the "next" tagged CI images.
This tag is primarily used by the infra team for testing spack-stack
releases or breaking changes. At most times, the "next" tag is assigned
to the current live build images.

jedi-ci-debug=true: This annotation induces a post-test delay of 60
minutes, during which the build environment is saved for inspection
and debug activities.
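
The interaction between jedi-ci-bundle-branch and the cache annotations can be summarized with a small sketch. The function name resolve_cache_mode and the returned labels are hypothetical, and the input is assumed to be a plain mapping of annotation names to values; only the precedence rule itself comes from this page.

```python
def resolve_cache_mode(annotations):
    """Return 'disabled', 'skip', 'rebuild', or 'default' for a set of annotations."""
    # An explicit bundle branch disables the cache and overrides cache annotations.
    if "jedi-ci-bundle-branch" in annotations:
        return "disabled"
    mode = annotations.get("jedi-ci-build-cache")
    if mode in ("skip", "rebuild"):
        return mode
    # Assumption for this sketch: any other value falls back to normal cache use.
    return "default"


print(resolve_cache_mode({"jedi-ci-build-cache": "rebuild"}))
print(resolve_cache_mode({
    "jedi-ci-bundle-branch": "feature/my-bundle-change",
    "jedi-ci-build-cache": "rebuild",
}))
```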

FAQ

Q: Why is this test running?

A: This test was run by the JEDI CI system whose code is hosted at
github.com/JCSDA-internal/CI.

Q: My draft pull request's tests are not running.

A: You must enable tests for draft PRs by adding the annotation
run-ci-on-draft=true in the pull request description.

Q: How can a test "pass with failures"?

A: Because the integration test is much larger than typical unit tests, a
small number of flaky test failures is allowed. Over time we will track
the repeatedly flaky tests and fix them. Please examine any failures
carefully to ensure that they were not caused by your change.

Q: Why can't I access the build log?

A: The AWS-hosted build logs require a login to the jcsda-usaf AWS
account. We also provide a public build log available to anyone with the
link, but this log file is not available until all tests are complete for
an environment.

Administrative Tasks (For JEDI Infra team)

Updating CI instance disk space

...

Add an additional EBS volume and mount it on the instance
Move the spack-stack build and source caches there and link to current locations
Turn swapfile off on root filesystem and enable on new volume
Created 500 GB EBS volume “Ubuntu 22.04 CI Intel”
Mounted on EC2 instance following https://docs.aws.amazon.com/ebs/latest/userguide/ebs-attaching-volume.html
Partitioned on EC2 instance, created ext4 filesystem and mounted via /etc/fstab entry following Linux standard practices - mounted as /mnt/addon ; see https://docs.aws.amazon.com/ebs/latest/userguide/ebs-using-volumes.html for one of many tutorials
Moved spack source and build caches to /mnt/addon/spack-stack/{build,source}-cache
Created 128GB swapfile /mnt/addon/swapfile and removed 64GB swapfile /swapfile (incl. fstab entries); this again is Linux boilerplate, see e.g. https://phoenixnap.com/kb/linux-swap-file

Troubleshooting / FAQ

Running GitHub Workflow Locally

To save cost and time, GitHub workflows can be run locally. A widely used tool that mimics the GitHub environment with minimal setup is act (https://github.com/nektos/act). It takes existing workflow YAML files and runs them locally using Docker.

macOS Setup

Using act on macOS differs from using it on Linux in several ways. Below are the prerequisites and caveats to be aware of:

Docker with Docker Desktop (installing through Homebrew is preferred).
GitHub personal token (since the majority of the repos are private, you'll have to generate a token for any cloning that is done in the workflow).
The platforms used by act are not one-to-one replacements for the GitHub images, so some errors will require local adjustments to the workflow. Do not commit these adjustments.
If a workflow errors out because a Docker container with that name already exists, you will have to delete the container manually, or update the script to do so (snippet provided below).

To set up your environment to make sure the right Docker sockets are used you should run the following command(s):

docker context use default

This ensures that the Docker socket(s) are linked correctly.

Install act with: brew install act

`act` command-line options

The most common command-line options you should remember are:

--container-architecture linux/amd64 - Needed on Apple M* processors, because most of the images that act uses are based on linux/amd64.
-W - The workflow directory/file. Provide the YAML file of the workflow you want to run.
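
The two options above can be combined into a single invocation. The sketch below only assembles the command line and prints it; the helper name build_act_command and the workflow path .github/workflows/ci.yaml are hypothetical, and the Apple Silicon check via the platform module is an assumption about how one might detect an M* machine.

```python
import platform
import shlex


def build_act_command(workflow, force_amd64=None):
    """Assemble an act argument list for the given workflow file."""
    if force_amd64 is None:
        # Apple M* machines report Darwin/arm64 through the platform module.
        force_amd64 = platform.system() == "Darwin" and platform.machine() == "arm64"
    argv = ["act", "-W", workflow]
    if force_amd64:
        argv += ["--container-architecture", "linux/amd64"]
    return argv


# Hypothetical workflow path, used only for illustration.
print(shlex.join(build_act_command(".github/workflows/ci.yaml", force_amd64=True)))
```

The resulting list can be passed to subprocess.run once Docker and act are installed.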
