Continuous Integration
GitHub: https://github.com/JCSDA-internal/ci
Documentation: via multiple READMEs inside the GitHub repository
Presubmit tests can be controlled by single-line annotations in the pull
request description. These annotations will be re-examined for each run.
Here is an example of their use:
# Build tests with other unsubmitted packages.
build-group=https://github.com/JCSDA-internal/oops/pull/2284
build-group=https://github.com/JCSDA-internal/saber/pull/651
# Disable the build-cache for tests.
jedi-ci-build-cache=skip
Each configuration setting must be on a single line, but order and
position do not matter.
# Enable tests for your draft PR (disabled by default).
run-ci-on-draft=true
# Select the compiler used by CI (defaults to random choice).
jedi-ci-test-select=gcc
# Select the jedi-bundle branch used for building. Using this option
# disables the build cache.
jedi-ci-bundle-branch=feature/my-bundle-change
In the default configuration the CI system will build candidate code against
the latest submitted version of each package in the jedi-bundle. A pull request
can be built against unsubmitted versions of specific packages by specifying
the version using a tag in the pull request description. Multiple tags may
be added as long as each tag is on its own line of the pull request
description.
build-group=https://github.com/JCSDA-internal/oops/pull/2284
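As an illustration of the one-annotation-per-line rule, here is a minimal Python sketch of how such annotations could be collected from a pull request description. It is not the CI system's actual parser: the function name `parse_annotations`, the `KNOWN_KEYS` set (taken from the annotation names on this page), and the sample description are assumptions made for the example.

```python
from collections import defaultdict

# Annotation names mentioned on this page; the real CI may accept others.
KNOWN_KEYS = {
    "build-group",
    "jedi-ci-build-cache",
    "run-ci-on-draft",
    "jedi-ci-test-select",
    "jedi-ci-bundle-branch",
    "jedi-ci-manifest-branch",
    "jedi-ci-next",
    "jedi-ci-debug",
}


def parse_annotations(description):
    """Hypothetical helper: collect key=value annotations, one per line."""
    found = defaultdict(list)
    for line in description.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        # Ignore ordinary prose and unknown keys.
        if not sep or key.strip() not in KNOWN_KEYS:
            continue
        found[key.strip()].append(value.strip())
    return dict(found)


# Hypothetical pull request description used only for this example.
description = """\
Fixes a bug in the example package.
build-group=https://github.com/JCSDA-internal/oops/pull/2284
build-group=https://github.com/JCSDA-internal/saber/pull/651
jedi-ci-build-cache=skip
"""
annotations = parse_annotations(description)
print(annotations["build-group"])
```

Storing each key as a list keeps repeated tags such as build-group together, which matches the rule that multiple tags are allowed as long as each sits on its own line.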
To save on cloud compute resources the CI test environment selects one of
our three environments randomly. If you want tests with a specific compiler
you can set the annotation jedi-ci-test-select to either gcc, intel, or
gcc11. Please do not use the special value all unless you have an
especially dangerous change known to affect all compilers or the CI
environment.
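The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CI system's code: the function name `select_environments` and the choice to raise an error on an unrecognized value are hypothetical.

```python
import random

# The three environments named on this page.
ENVIRONMENTS = ("gcc", "intel", "gcc11")


def select_environments(test_select=None, rng=random):
    """Hypothetical sketch of how jedi-ci-test-select could be resolved."""
    if test_select is None:
        # Default: one environment chosen at random to save compute.
        return [rng.choice(ENVIRONMENTS)]
    if test_select == "all":
        # Special value: run every environment (use sparingly).
        return list(ENVIRONMENTS)
    if test_select in ENVIRONMENTS:
        return [test_select]
    # Assumption: unknown values are rejected; the real CI may differ.
    raise ValueError(f"unknown jedi-ci-test-select value: {test_select!r}")


print(select_environments("intel"))
```

Passing a seeded `random.Random` instance as `rng` makes the default choice reproducible when experimenting with the sketch.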
gcc11: uses the GNU Compiler Collection (GCC) v11.4.0 and OpenMPI v5.0.5.
gcc: uses the GNU Compiler Collection (GCC) v13.3.0 and OpenMPI v5.0.5.
intel: uses the Intel OneAPI v2024.2.1 with icx/icpx/ifort and OneAPI MPI v2021.13.
The CI system relies on a build cache to speed up the build process. Some
changes are capable of causing build failures arising from the use of the
cache. The CI system has two controls to modify cache behavior.
The build cache can be disabled by adding the annotation
jedi-ci-build-cache=skip to the PR description.
If it is necessary to rebuild the entire cache to remove a bug in the cached
binaries, add the annotation jedi-ci-build-cache=rebuild to the PR
description.
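The two cache controls, together with the earlier note that jedi-ci-bundle-branch disables the build cache, can be summarized in a small Python sketch. The function name `resolve_cache_mode`, the default label "use", and the precedence given to the bundle-branch annotation are assumptions made for illustration; the actual CI logic may differ.

```python
def resolve_cache_mode(annotations):
    """Hypothetical sketch: map PR annotations to a cache behavior."""
    # Per this page, selecting a bundle branch disables the build cache.
    # Assumption: this takes precedence over jedi-ci-build-cache.
    if "jedi-ci-bundle-branch" in annotations:
        return "skip"
    # "use" is an illustrative name for the default (cache enabled).
    mode = annotations.get("jedi-ci-build-cache", "use")
    if mode not in ("use", "skip", "rebuild"):
        raise ValueError(f"unknown jedi-ci-build-cache value: {mode!r}")
    return mode


print(resolve_cache_mode({"jedi-ci-build-cache": "rebuild"}))
```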
USE THESE OPTIONS WITH CAUTION
jedi-ci-bundle-branch=branch-name: Unless otherwise specified tests ... jedi-bundle repository.
jedi-ci-manifest-branch=branch-name: This tag overrides the default ...
jedi-ci-next=true: This annotation will use the "next" tagged CI ...
jedi-ci-debug=true: This annotation can be used to induce a post-test ...
Q: Why is this test running?
A: This test was run by the JEDI CI system whose code is hosted at
github.com/JCSDA-internal/CI.
Q: My draft pull request's tests are not running.
A: You must enable tests for draft PRs by adding the annotation
run-ci-on-draft=true in the pull request description.
Q: How can a test "pass with failures"?
A: Because the integration test is much larger than typical unit tests, a
small number of flaky test failures is allowed. Over time we will track
the repeatedly flaky tests and fix them. Please examine any failures
carefully to ensure that they were not caused by your change.
Q: Why can't I access the build log?
A: The AWS hosted build logs require a login to the jcsda-usaf AWS
account. We also provide a public build log available to anyone with the
link, but this log file is not available until all tests are complete for
an environment.
Use the following procedure to increase the disk space for the CI instances if they are running out of space.
/etc/fstab entry following Linux standard practices - mounted as /mnt/addon; see https://docs.aws.amazon.com/ebs/latest/userguide/ebs-using-volumes.html for one of many tutorials
/mnt/addon/spack-stack/{build,source}-cache
CDash is hosted on an AWS EC2 instance in our USAF account in the us-east-2 region. Members of the Infrastructure team can access this instance with SSH.
ssh -i your-key ubuntu@cdash.jcsda.org
HTTPS / Signing Authority: The CDash server uses an SSL connection with a Let's Encrypt certificate, which is renewed on the 10th of each month by a cron job on the instance. If the job fails or our CDash integration breaks.
Containerized Service Deployment: The CDash server is deployed on the instance via a docker compose deployment with three containers. During certificate updates the "cdash" container is temporarily brought down so that certbot can communicate with the signing authority. It is safe to stop and start the cdash container, although it will cause a temporary service outage while it is stopped. Stopping the MySQL container (without preserving the volume) will clear all data from our CDash server, including repository configurations, which will need to be added manually to re-enable test uploading.
cdash: kitware/cdash - The http endpoint server that responds to web requests.
cdash_worker: kitware/cdash-worker - A background and RPC worker for the service (not web accessible).
cdash_mysql: mysql/mysql-server - The MySQL database used by the service. Warning, do not stop this container.
Detailed debugging notes for the containerized deployment can be found in the CDash config code repository README file.
To save on costs and time, GitHub workflows can be run locally. One heavily used tool that mimics GitHub's environment locally with minimal setup is act (https://github.com/nektos/act). It takes existing workflow YAML files and runs them locally using Docker.
macOS has various differences compared to Linux when it comes to using act. Below are a few items that you need to make sure are installed on your machine:
homebrew).
act are not one to one replacements for the GitHub images being used, so any errors running will have to have adjustments that should not be committed
To set up your environment to make sure the right Docker sockets are used you should run the following command(s):
docker context use default
This makes sure the docker socket(s) are set up in a way that they are properly linked.
Install act with: brew install act
act command-line options
The most common command-line options you should remember are:
--container-architecture linux/amd64 - This is due to the Apple M* processors. Most of the images that act uses are based on linux/amd64.
-W - The workflow directory/file. You can provide the appropriate yaml file here to run.
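To show how the two options combine, here is a small Python sketch that assembles an act invocation as an argument list. The helper name `build_act_command` and the workflow path `.github/workflows/ci.yml` are hypothetical; only the `-W` and `--container-architecture` options come from the text above.

```python
import shlex


def build_act_command(workflow, apple_silicon=True):
    """Hypothetical helper: assemble an act command line as a list."""
    cmd = ["act", "-W", workflow]
    if apple_silicon:
        # Needed on Apple M* machines; act images target linux/amd64.
        cmd += ["--container-architecture", "linux/amd64"]
    return cmd


# Hypothetical workflow file path, used only for illustration.
cmd = build_act_command(".github/workflows/ci.yml")
print(shlex.join(cmd))
# prints: act -W .github/workflows/ci.yml --container-architecture linux/amd64
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the workflow, assuming act and Docker are installed and running.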