Table of Contents

About

...

CI System Information

Quick reference


Presubmit tests can be controlled by single-line annotations in the pull
request description. These annotations will be re-examined for each run.
Here is an example of their use:

# Build tests with other unsubmitted packages.
build-group=https://github.com/JCSDA-internal/oops/pull/2284
build-group=https://github.com/JCSDA-internal/saber/pull/651

# Disable the build-cache for tests.
jedi-ci-build-cache=skip

Each configuration setting must be on a single line, but order and
position do not matter.

# Enable tests for your draft PR (disabled by default).
run-ci-on-draft=true

# Select the compiler used by CI (defaults to random choice).
jedi-ci-test-select=gcc

# Select the jedi-bundle branch used for building. Using this option
# disables the build cache.
jedi-ci-bundle-branch=feature/my-bundle-change
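To illustrate the single-line rule, here is a minimal sketch of how annotations could be pulled out of a pull request description. It is not the CI system's actual parser; the function name get_annotation and the sample description are hypothetical.

```shell
# Print every value of one annotation key found in a PR description.
# Hypothetical helper for illustration; not the CI system's own code.
get_annotation() {
  key="$1"
  description="$2"
  # Keep only lines that start with "key=" and strip that prefix.
  printf '%s\n' "$description" | sed -n "s/^[[:space:]]*${key}=//p"
}

# Sample description: annotations may appear anywhere, in any order.
description='Fix the widget.

build-group=https://github.com/JCSDA-internal/oops/pull/2284
jedi-ci-build-cache=skip
build-group=https://github.com/JCSDA-internal/saber/pull/651'

get_annotation build-group "$description"          # prints both pull request URLs
get_annotation jedi-ci-build-cache "$description"  # prints "skip"
```

Because each setting is matched line by line, an annotation split across two lines would not be found.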

Specifying a Build Group

In the default configuration the CI system will build candidate code against
the latest submitted version of each package of the jedi-bundle. A pull request
can be built against unsubmitted versions of specific packages by specifying
the version using a tag in the pull request description. Multiple tags may
be added as long as each tag is on its own line of the pull request
description.

build-group=https://github.com/JCSDA-internal/oops/pull/2284

Selecting a Compiler

To save on cloud compute resources, the CI test environment selects one of
our three environments at random. If you want tests with a specific compiler,
you can set the annotation jedi-ci-test-select to gcc, intel, or gcc11.
Please do not use the special value all unless you have an especially
dangerous change known to affect all compilers or the CI environment.

  • gcc11: uses the GNU Compiler Collection (GCC) v11.4.0 and OpenMPI v5.0.5.
  • gcc: uses the GNU Compiler Collection (GCC) v13.3.0 and OpenMPI v5.0.5.
  • intel: uses Intel OneAPI v2024.2.1 with icx/icpx/ifort and OneAPI MPI v2021.13.

Build Cache

The CI system relies on a build cache to speed up the build process. Some
changes are capable of causing build failures arising from the use of the
cache. The CI system has two controls to modify cache behavior.

The build cache can be disabled by adding the annotation
jedi-ci-build-cache=skip to the PR description.

If it is necessary to rebuild the entire cache to remove a bug in the cached
binaries, add the annotation jedi-ci-build-cache=rebuild to the PR
description.

CI Development and Debug Options:

USE THESE OPTIONS WITH CAUTION

  • jedi-ci-bundle-branch=branch-name: Unless otherwise specified, tests
    will be run from the default branch of the jedi-bundle repository.
    This annotation overrides that branch. Its value must be a valid
    branch name currently fetchable from the jedi-bundle repository. If
    this annotation is explicitly set (even if set to the default branch),
    cache reading and writing are disabled and any cache annotations will
    be ignored.
  • jedi-ci-manifest-branch=branch-name: This tag overrides the default
    branch name used for fetching the CI manifest. This is used when CI
    config changes are needed to run the test. WARNING: a bad value here
    can cause the test to silently fail to configure.
  • jedi-ci-next=true: This annotation will use the "next" tagged CI
    images. This tag will primarily be used by the infra team for testing
    spack-stack releases or breaking changes. Most of the time, the "next"
    tag will be assigned to the current live build images.
  • jedi-ci-debug=true: This annotation can be used to induce a post-test
    delay of 60 minutes during which the build environment will be saved
    for inspection and debug activities.

FAQ

Q: Why is this test running?

A: This test was run by the JEDI CI system whose code is hosted at
github.com/JCSDA-internal/CI.

Q: My draft pull request's tests are not running.

A: You must enable tests for draft PRs by adding the annotation
  run-ci-on-draft=true in the pull request description.

Q: How can a test "pass with failures"?

A: Because the integration test is much larger than typical unit tests, a
small number of flaky test failures is allowed. Over time we will track
the repeatedly flaky tests and fix them. Please examine any failures
carefully to ensure that they were not caused by your change.

Q: Why can't I access the build log?

A: The AWS-hosted build logs require a login to the jcsda-usaf AWS
account. We also provide a public build log available to anyone with the
link, but this log file is not available until all tests are complete for
an environment.

Administrative Tasks (For JEDI Infra team)

Updating CI instance disk space

...

  1. Add an additional EBS volume and mount it on the instance
  2. Move the spack-stack build and source caches there and link to current locations
  3. Turn swapfile off on root filesystem and enable on new volume
  4. Created 500 GB EBS volume “Ubuntu 22.04 CI Intel”
  5. Mounted on EC2 instance following https://docs.aws.amazon.com/ebs/latest/userguide/ebs-attaching-volume.html
  6. Partitioned on EC2 instance, created ext4 filesystem and mounted via /etc/fstab entry following Linux standard practices - mounted as /mnt/addon ; see https://docs.aws.amazon.com/ebs/latest/userguide/ebs-using-volumes.html for one of many tutorials
  7. Moved spack source and build caches to /mnt/addon/spack-stack/{build,source}-cache
  8. Created 128GB swapfile /mnt/addon/swapfile and removed 64GB swapfile /swapfile (incl. fstab entries); this again is Linux boilerplate, see e.g. https://phoenixnap.com/kb/linux-swap-file

Troubleshooting / FAQ


Running GitHub Workflow Locally

To save on costs and time, GitHub workflows can be run locally. There are several ways to do this, but a heavily used tool that mimics GitHub's environment with minimal setup is act (https://github.com/nektos/act). It takes existing workflow YAML files and runs them locally using Docker.


MacOS Setup

macOS differs from Linux in several ways when it comes to using act. Make sure the following are in place on your machine:

  • Docker with Docker Desktop (installing through Homebrew is preferred).
  • A GitHub personal access token. Most of the repos are private, so you will need a token for any cloning done in the workflow.

Also keep the following caveats in mind:

  • The platforms used by act are not one-to-one replacements for the GitHub images, so fixing errors may require local adjustments to the workflow. Do not commit these adjustments.
  • If a workflow errors out because a Docker container with the same name already exists, you will have to delete the container manually or update the script to do so (snippet provided below).
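A minimal sketch of removing a leftover container by name, only if one exists, is shown here. The function name remove_container_if_exists and the container name act-demo are hypothetical; docker container inspect and docker rm -f are standard Docker CLI commands.

```shell
# Remove a container by name if it exists; do nothing otherwise.
# Hypothetical helper for illustration.
remove_container_if_exists() {
  name="$1"
  # "docker container inspect" exits non-zero when the container is missing.
  if docker container inspect "$name" >/dev/null 2>&1; then
    docker rm -f "$name"
  fi
}

# Example usage (container name is a placeholder):
# remove_container_if_exists act-demo
```

Calling this before the step that creates the container lets a workflow be rerun without manual cleanup.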

To set up your environment to make sure the right Docker sockets are used you should run the following command(s):

Code Block
docker context use default

This makes sure the docker socket(s) are set up in a way that they are properly linked.

Install act with: brew install act

act command-line options

The most common command-line options you should remember are:

  • --container-architecture linux/amd64 - Needed on Apple M* processors, because most of the images that act uses are based on linux/amd64.
  • -W - The workflow directory or file. Provide the YAML file of the workflow you want to run.
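Putting the two options above together, an invocation might look like the sketch below. The wrapper function run_workflow and the workflow path .github/workflows/ci.yml are hypothetical; substitute the workflow file from your repository.

```shell
# Run one workflow file locally with act, forcing the amd64 architecture.
# Hypothetical wrapper for illustration.
run_workflow() {
  workflow_file="$1"
  act -W "$workflow_file" --container-architecture linux/amd64
}

# Example usage (path is a placeholder):
# run_workflow .github/workflows/ci.yml
```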