This guide covers administrative tasks for the R2D2 API server, including managing, updating, fixing, and monitoring the API, database, and AWS infrastructure. It’s meant for JEDI Infrastructure team members and JCSDA AWS admins who need to keep things running smoothly, debug issues, and ensure uptime. Besides the system overview, this document collects playbook entries and work logs that record the procedures and history of administrative work. If an entry is missing, please write it.

Quick Reference

Infrastructure Overview

R2D2’s HTTP API is hosted on AWS utilizing the following components. For more info see this Feb 7, 2025 slide deck.

  • Application Load Balancer - API Server routing, load balancing, and SSL termination. 
  • EC2 Compute Hosts - R2D2 API server execution environment.
  • AWS Systems Manager - shell access and administration.
  • AWS Relational Database Service - R2D2 database.


Common Tasks

Deploy Updated R2D2 API Server Binaries

WARNING: Manually stopping or upgrading R2D2 servers without first draining the instance from our load balancer can drop inbound or queued API calls!

The standard process for updating servers involves draining them from the load balancer one at a time, updating the service, and then re-enabling them. This approach ensures continuous network availability and prevents the loss of inbound or queued API calls. A Bash script automates this process while maintaining service stability. However, updates should be performed during low-traffic periods, as serving capacity is temporarily reduced during the update.

1. Build and publish binaries with the prod tag as described in "Build and publish API server binaries" later in this guide.

2. From a developer machine with admin AWS credentials (held by members of the JEDI-infra team), clone the r2d2 repository.

git clone https://github.com/JCSDA-internal/r2d2.git && cd r2d2

3. Run the following command and wait about 10 minutes for the full rollout to complete.

./server/scripts/update_prod_servers.sh --operation update 

Note: If the rollout fails for any reason, refer to the update_prod_servers.sh script, which contains the manual update process and extensive debugging information. This script serves as the central resource for detailed rollout procedures and documentation.
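
The drain, update, and re-enable cycle described in this section can be sketched as follows. This is not the contents of update_prod_servers.sh; it is a minimal illustration that assumes the API hosts are registered as instance-type targets in a single ALB target group. The target group ARN, the instance IDs, and the update command passed in are all hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the contents of update_prod_servers.sh.
# Assumes the API hosts are instance-type targets in one ALB target group.

REGION="us-east-2"

# Drain one instance, update it, then return it to service.
#   $1 = target group ARN (placeholder), $2 = EC2 instance ID,
#   $3 = command that updates the instance; it receives the instance ID.
rolling_update_one() {
  local tg_arn="$1" instance_id="$2" update_cmd="$3"

  # Stop routing new requests, then wait for the target to finish draining.
  aws elbv2 deregister-targets --region "$REGION" \
    --target-group-arn "$tg_arn" --targets "Id=$instance_id" || return 1
  aws elbv2 wait target-deregistered --region "$REGION" \
    --target-group-arn "$tg_arn" --targets "Id=$instance_id" || return 1

  # Update the service while the instance receives no traffic.
  "$update_cmd" "$instance_id" || return 1

  # Re-register and block until the load balancer health check passes.
  aws elbv2 register-targets --region "$REGION" \
    --target-group-arn "$tg_arn" --targets "Id=$instance_id" || return 1
  aws elbv2 wait target-in-service --region "$REGION" \
    --target-group-arn "$tg_arn" --targets "Id=$instance_id"
}

# Update instances strictly one at a time; stop at the first failure.
#   $1 = target group ARN, $2 = update command, remaining args = instance IDs.
rolling_update_all() {
  local tg_arn="$1" update_cmd="$2" id
  shift 2
  for id in "$@"; do
    if ! rolling_update_one "$tg_arn" "$id" "$update_cmd"; then
      echo "rollout stopped at $id" >&2
      return 1
    fi
  done
}
```

Stopping at the first failure keeps the remaining instances serving traffic. The failed instance stays out of the load balancer until someone investigates, which is the safer default when capacity is already reduced.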

Deploy CloudFormation Updates

  1. Copy the updated prod.yaml config to our infrastructure-as-code bucket and get the versioned URL for the file as shown here. 


    aws s3 cp server/cfn/prod.yaml s3://jcsda-usaf-iac-artifacts/r2d2/prod.yaml
    file_version=$(aws s3api list-object-versions \
                     --bucket jcsda-usaf-iac-artifacts \
                     --prefix "r2d2/prod.yaml"  \
                 | jq -r '.Versions[] | select(.IsLatest==true) | .VersionId')
    echo "https://jcsda-usaf-iac-artifacts.s3.us-east-2.amazonaws.com/r2d2/prod.yaml?versionId=${file_version}"
  2. Go to the CloudFormation stack "r2d2-api-prod" in the AWS console (or use quick-ref link above) and click "Update".
  3. In the update menu, select "Replace existing template" and enter the URL generated in the first step. Change any parameters and review deploy options.
    1. Review the change log and ensure that EC2 instances and RDS instances are not replaced.
  4. Monitor the CloudFormation rollout in console and ensure it executes to completion.
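
The replacement check in step 3 can also be done from the command line. The sketch below assumes a change set has already been created for the "r2d2-api-prod" stack; the change-set name in the usage comment is a placeholder. It filters the change list for EC2 or RDS instances that CloudFormation reports as replaced or conditionally replaced.

```shell
#!/usr/bin/env bash
# Sketch only: flags changes that would replace EC2 or RDS instances.

# Reads tab-separated rows of LogicalResourceId, ResourceType, Replacement
# on stdin and prints the rows where an instance is, or may be, replaced.
# Exit status is 1 when any row is flagged, 0 otherwise.
flag_instance_replacements() {
  awk -F'\t' '
    ($2 == "AWS::EC2::Instance" || $2 == "AWS::RDS::DBInstance") &&
    ($3 == "True" || $3 == "Conditional") {
      print "REPLACEMENT RISK: " $1 " (" $2 ", Replacement=" $3 ")"
      flagged = 1
    }
    END { exit (flagged ? 1 : 0) }
  '
}

# Usage (<change-set-name> is a placeholder for an existing change set):
#   aws cloudformation describe-change-set --region us-east-2 \
#       --stack-name r2d2-api-prod --change-set-name <change-set-name> \
#       --query 'Changes[].ResourceChange.[LogicalResourceId,ResourceType,Replacement]' \
#       --output text | flag_instance_replacements
```

A "Conditional" replacement is flagged as well, because CloudFormation decides at execution time whether the resource is actually replaced.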

Viewing Service and Error Logs

Logs for all API server replicas can be found under the "r2d2.*" log groups in AWS CloudWatch. CloudWatch log exports run on a half-hour schedule, so expect a delay between an event and its logs appearing in CloudWatch.

Each server replica writes logs to /var/log/gunicorn/, which can be viewed in real time if you have a shell on the replica.
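
For reading the CloudWatch copies without the console, something like the following may help. It is a sketch that assumes AWS CLI v2 (needed for `aws logs tail`) and that every relevant log group name starts with "r2d2"; the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: assumes AWS CLI v2 and log group names starting with "r2d2".

REGION="us-east-2"

# List log groups whose names start with "r2d2", one per line.
list_r2d2_log_groups() {
  aws logs describe-log-groups --region "$REGION" \
    --log-group-name-prefix "r2d2" \
    --query 'logGroups[].logGroupName' --output text | tr '\t' '\n'
}

# Print recent events for every r2d2 log group.
#   $1 = lookback window accepted by `aws logs tail --since` (default 1h).
show_recent_r2d2_logs() {
  local since="${1:-1h}" group
  list_r2d2_log_groups | while read -r group; do
    [ -n "$group" ] || continue
    echo "=== $group ==="
    aws logs tail "$group" --region "$REGION" --since "$since" </dev/null
  done
}
```

Because of the half-hour export schedule, a lookback window much shorter than an hour can legitimately come back empty.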

Build and publish API server binaries

1) Build the docker image

git clone -b feature/restapi https://github.com/jcsda-internal/r2d2.git
cd r2d2
docker build -f server/docker/Dockerfile.app \
     --platform=linux/amd64 \
     -t 747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod .

2) Get local credentials for our ECR repository.

aws ecr get-login-password --region us-east-2 \
| docker login --username AWS \
               --password-stdin 747101682576.dkr.ecr.us-east-2.amazonaws.com

3) Push the docker image

docker push 747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod
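
After the push, it can be worth confirming that the prod tag in ECR points at the image that was just built. The sketch below compares the digest Docker recorded locally with the one ECR reports. It assumes a single-platform image pushed with the classic `docker push` flow; the function names are illustrative and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch only: compare the local and ECR digests for the prod tag.

REGISTRY="747101682576.dkr.ecr.us-east-2.amazonaws.com"
REPO="r2d2-server"
TAG="prod"

# Digest that ECR currently associates with the tag.
remote_digest() {
  aws ecr describe-images --region us-east-2 \
    --repository-name "$REPO" --image-ids "imageTag=$TAG" \
    --query 'imageDetails[0].imageDigest' --output text
}

# Digest Docker recorded for this repository after the push.
local_digest() {
  docker image inspect "$REGISTRY/$REPO:$TAG" \
    --format '{{range .RepoDigests}}{{println .}}{{end}}' \
    | sed -n "s|^$REGISTRY/$REPO@||p" | head -n 1
}

# Succeeds only when both digests exist and match.
verify_push() {
  local pushed remote
  pushed=$(local_digest)
  remote=$(remote_digest)
  if [ -n "$pushed" ] && [ "$pushed" = "$remote" ]; then
    echo "OK: $TAG -> $remote"
  else
    echo "MISMATCH: local=${pushed:-none} ecr=${remote:-none}" >&2
    return 1
  fi
}
```

A mismatch usually means the push did not complete or someone else pushed the prod tag in the meantime.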

Update a Running API Server Instance

This is the process needed to push updated code to an API Server instance. Note that you can build the binary on the instance, but if you do, you should also push it to ECR to keep a persistent copy of our server code.

# If logged into the server, log out now and log back in to reset the shell environment

# From your developer machine log into the server and switch to the root user.
aws ssm start-session --target i-0a750b39c8e88d101 --region us-east-2

sudo su -

# Get local credentials for our ECR repository.
aws ecr get-login-password --region us-east-2 \
| docker login --username AWS \
               --password-stdin 747101682576.dkr.ecr.us-east-2.amazonaws.com

# Pull the latest Docker container
docker pull 747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod

# Stop the running service.
systemctl stop r2d2.service

# If necessary, edit service definition file: /etc/systemd/system/r2d2.service
# Do this only if docker arguments or environment variables have changed.
# If updating this file you will need to run `systemctl daemon-reload` to
# load in updates to the service.

# Restart the systemd service.
systemctl start r2d2.service

# Check the status of the service
docker ps 

# Sample output from the production server
CONTAINER ID   IMAGE                                                           COMMAND                  CREATED          STATUS          PORTS                               NAMES
cd8745ac7561   747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod   "run_r2d2_app --port…"   47 seconds ago   Up 46 seconds   0.0.0.0:80->80/tcp, :::80->80/tcp   r2d2-api-service

# Check the output logs
docker logs -f r2d2-api-service
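
The status check above can be turned into a small wait loop, sketched below. It assumes the container keeps the name "r2d2-api-service" shown in the sample output; the retry count and delay are arbitrary defaults, not values taken from the service definition.

```shell
#!/usr/bin/env bash
# Sketch only: poll until the API container reports as running.

#   $1 = container name, $2 = max attempts, $3 = seconds between attempts.
wait_for_container() {
  local name="${1:-r2d2-api-service}" attempts="${2:-12}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    # `docker ps --quiet` prints a container ID only when the filters match.
    if [ -n "$(docker ps --quiet --filter "name=$name" --filter "status=running")" ]; then
      echo "$name is running (attempt $i)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$name is not running; inspect with: docker logs $name" >&2
  return 1
}
```

Run it right after `systemctl start r2d2.service`. A non-zero exit status is the cue to read the container logs before returning the instance to the load balancer.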


Accessing the RDS SQL database

The RDS database cannot be accessed from the public internet; it must be accessed from one of our dev servers through an SSM shell session.

1) Log into the dev server (see quick reference)

2) Fetch the server password from the parameters in our CloudFormation stack (see link in quick reference)

3) From the dev server execute this command, using the password retrieved above

mysql -h r2d2-api-prod-rdsinstance-eja4eofrohvy.cg24vilqoa8w.us-east-2.rds.amazonaws.com -u admin -p
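
Step 2 can also be done with the AWS CLI instead of the console, as sketched below. The stack name "r2d2-api-prod" comes from this guide, but the parameter key holding the password is not documented here, so "DBPassword" in the usage comment is a hypothetical placeholder; list the keys first to find the real one.

```shell
#!/usr/bin/env bash
# Sketch only: read CloudFormation stack parameters from the CLI.

REGION="us-east-2"

# Print the parameter keys defined on a stack, one per line.
#   $1 = stack name.
list_stack_parameter_keys() {
  aws cloudformation describe-stacks --region "$REGION" --stack-name "$1" \
    --query 'Stacks[0].Parameters[].ParameterKey' --output text | tr '\t' '\n'
}

# Print the value of one stack parameter.
#   $1 = stack name, $2 = parameter key.
stack_parameter() {
  aws cloudformation describe-stacks --region "$REGION" --stack-name "$1" \
    --query "Stacks[0].Parameters[?ParameterKey=='$2'] | [0].ParameterValue" \
    --output text
}

# Usage ("DBPassword" is a hypothetical key name):
#   list_stack_parameter_keys r2d2-api-prod
#   stack_parameter r2d2-api-prod DBPassword
```

If the template declares the parameter with NoEcho, CloudFormation returns asterisks instead of the value, and the password has to come from wherever it was originally recorded.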


Load "dev" server data onto r2d2 database

This should not be done! This procedure was used to test the R2D2 database prior to release and is documented here for historical context and in case some of these steps have useful analogs for other work.

For the localhost server container, can we pre-load r2d2-data and the r2d2 mysql user so
that end users just download the container and execute it on port 8080? Basically execute
the setup database script from r2d2/scripts.


Update crontabs on all platforms (experiment scrubber)
 - Platforms include derecho, discover, hercules, orion, s4
  - Update venv used by crontabs to use r2d2-client and not r2d2



Create username and api keys for all users
 - We will have to massage the user database table since some humans have more than one username
 - We need an api key generator and a way to send these api keys to the user


Go to the dev server and dump the dev database into the IAC bucket
 - Access astromech-nonprod
     $ aws ssm start-session --target i-0e97feb060f886e43 --region us-east-2
     $ sudo su - ubuntu
 - Use password "local-server-root-password" (only works on dev server)
 - DATE=$(date -u "+%Y%m%dT%H%M%SZ")
 - mysqldump -h 127.0.0.1 -u root -p --databases r2d2 > r2d2-dev-backup-${DATE}.sql
 - aws s3 cp r2d2-dev-backup-20250205T171114Z.sql s3://jcsda-usaf-iac-artifacts/r2d2/


Log into the prod server and load the database with dev data
 - Login and load data
   - ssm login command (see quick ref)
   - sudo su - ubuntu
   - aws s3 cp  s3://jcsda-usaf-iac-artifacts/r2d2/r2d2-dev-backup-20250205T171114Z.sql ./r2d2-dev-backup.sql
 - Get the password from the CloudFormation console parameters
 - Log into server
   - mysql -h r2d2-api-prod-rdsinstance-eja4eofrohvy.cg24vilqoa8w.us-east-2.rds.amazonaws.com -u admin -p
 - Run these sql commands
      CREATE DATABASE r2d2;
      USE r2d2;
      source /home/ubuntu/r2d2-dev-backup.sql


