Comment: Added deploy binary instructions

...

Infrastructure Overview

...

  • Application Load Balancer - API Server routing, load balancing, and SSL termination. 
  • EC2 Compute Hosts - R2D2 API server execution environment.
  • AWS Systems Manager - shell access and administration.
  • AWS Relational Database Service - R2D2 database.


Common Tasks

Deploy Updated R2D2 API Server Binaries

WARNING: Manually downing or upgrading R2D2 servers without first draining the instance from our load balancer can drop inbound or queued API calls!

The standard process for updating servers involves draining them from the load balancer one at a time, updating the service, and then re-enabling them. This approach ensures continuous network availability and prevents the loss of inbound or queued API calls. A Bash script automates this process while maintaining service stability. However, updates should be performed during low-traffic periods, as serving capacity is temporarily reduced during the update.

1. Build and publish binaries with the prod tag as described later in this guide.

2. From a developer machine with admin AWS credentials (members of the JEDI-infra team), clone the r2d2 repository.

git clone https://github.com/JCSDA-internal/r2d2.git && cd r2d2

3. Run the following command and wait about 10 minutes for the full rollout to complete.

./server/scripts/update_prod_servers.sh --operation update 

Note: If the rollout fails for any reason, refer to the update_prod_servers.sh script, which contains the manual update process and extensive debugging information. This script serves as the central resource for detailed rollout procedures and documentation.
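
For orientation, the drain, update, and re-enable cycle described above can be sketched in Bash as follows. This is not the contents of update_prod_servers.sh. The function names, the DRY_RUN switch, the target group ARN argument, and the update_instance placeholder are all assumptions made for illustration; only the `aws elbv2` subcommands are real CLI calls.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the real update_prod_servers.sh.

# Run a command, or just print it when DRY_RUN=1 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Placeholder for the per-instance update (hypothetical); the real steps
# are documented in update_prod_servers.sh.
update_instance() {
  run echo "update service on $1"
}

# Drain one instance, update it, then put it back in rotation.
# $1 = target group ARN (assumed), $2 = EC2 instance ID
rolling_update_one() {
  local tg_arn="$1" instance_id="$2"
  run aws elbv2 deregister-targets --target-group-arn "$tg_arn" --targets "Id=$instance_id" || return 1
  run aws elbv2 wait target-deregistered --target-group-arn "$tg_arn" --targets "Id=$instance_id" || return 1
  update_instance "$instance_id" || return 1
  run aws elbv2 register-targets --target-group-arn "$tg_arn" --targets "Id=$instance_id" || return 1
  run aws elbv2 wait target-in-service --target-group-arn "$tg_arn" --targets "Id=$instance_id"
}

# Update instances one at a time so the remaining ones keep serving traffic.
# $1 = target group ARN, remaining arguments = instance IDs
rolling_update() {
  local tg_arn="$1"
  shift
  local id
  for id in "$@"; do
    rolling_update_one "$tg_arn" "$id" || return 1
  done
}
```

Setting DRY_RUN=1 before calling rolling_update prints the commands without touching AWS, which is a cheap way to review the order of operations before a real rollout.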

Deploy Cloud Formation Updates

  1. Copy the updated prod.yaml config to our infrastructure-as-code bucket and get the versioned URL for the file as shown here. 


    aws s3 cp server/cfn/prod.yaml s3://jcsda-usaf-iac-artifacts/r2d2/prod.yaml
    file_version=$(aws s3api list-object-versions \
                     --bucket jcsda-usaf-iac-artifacts \
                     --prefix "r2d2/prod.yaml"  \
                 | jq -r '.Versions[] | select(.IsLatest==true) | .VersionId')
    echo "https://jcsda-usaf-iac-artifacts.s3.us-east-2.amazonaws.com/r2d2/prod.yaml?versionId=${file_version}"
  2. Go to the CloudFormation stack "r2d2-api-prod" in the AWS console (or use quick-ref link above) and click "Update".
  3. In the update menu, select "Replace existing template" and enter the URL generated in the first step. Change any parameters and review deploy options.
    1. Review the change log and ensure that EC2 instances and RDS instances are not replaced.
  4. Monitor the CloudFormation rollout in console and ensure it executes to completion.
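
The URL construction in step 1 and the replacement review in step 3 can also be scripted. The sketch below is one possible approach, not an established part of this workflow: both function names are hypothetical, and list_replacements assumes a named change set was created for the update (the console's inline preview does not need one).

```shell
#!/usr/bin/env bash
# Illustrative helpers; the function names are hypothetical.

# Build the versioned S3 URL for a template object.
# $1 = bucket, $2 = region, $3 = object key, $4 = S3 version ID
versioned_template_url() {
  printf 'https://%s.s3.%s.amazonaws.com/%s?versionId=%s\n' "$1" "$2" "$3" "$4"
}

# List resources that a change set would replace. An empty result means
# no resource is marked for unconditional replacement.
# $1 = stack name, $2 = change set name (assumed to exist already)
list_replacements() {
  aws cloudformation describe-change-set \
    --stack-name "$1" --change-set-name "$2" \
    --query "Changes[?ResourceChange.Replacement=='True'].ResourceChange.[LogicalResourceId,ResourceType]" \
    --output text
}
```

For example, `versioned_template_url jcsda-usaf-iac-artifacts us-east-2 r2d2/prod.yaml "$file_version"` produces the same URL as the echo command in step 1.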

Viewing Service and Error Logs

Logs for all API server replicas can be found under the "r2d2.*" log groups in AWS CloudWatch. The CloudWatch log exports run on a half-hour schedule, so expect a delay between an event and being able to view its logs in CloudWatch.

Each server replica writes logs to /var/log/gunicorn/, and those can be viewed in real time if you have a shell into the replica.
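
As a rough illustration of both routes, the helpers below filter local log files for errors and follow a CloudWatch log group. The function names are hypothetical, the `[ERROR]` and `[CRITICAL]` markers assume gunicorn's default error-log format, and the exact log group names under the "r2d2.*" prefix are not given here, so the group name is passed in as an argument. `aws logs tail` requires AWS CLI v2.

```shell
#!/usr/bin/env bash
# Illustrative helpers; the function names are hypothetical.

# Print ERROR and CRITICAL lines from every file in a gunicorn log directory.
# Assumes gunicorn's default "[LEVEL]" markers in each log line.
# $1 = log directory (defaults to /var/log/gunicorn)
gunicorn_errors() {
  local log_dir="${1:-/var/log/gunicorn}"
  grep -h -E '\[(ERROR|CRITICAL)\]' "$log_dir"/* 2>/dev/null
}

# Follow a CloudWatch log group from a developer machine (AWS CLI v2).
# $1 = log group name
follow_cloudwatch() {
  aws logs tail "$1" --follow --since 1h
}
```

On a replica, running `gunicorn_errors` with no arguments scans /var/log/gunicorn/; because of the export delay, the local files are the better source for events from the last half hour.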

Build and publish API server binaries

...

This is the process needed to push updated code to an API Server instance. Note that you can build the binary on the instance, but if you do, you should also push it to ECR to keep a persistent copy of our server code.

Code Block
# If logged into the server, log out now and log back in to reset the shell environment.

# From your developer machine, log into the server and switch to the root user.
aws ssm start-session --target i-0a0612957e48959170a750b39c8e88d101 --region us-east-2

# This process needs to be completed as root.
sudo su -

# Get local credentials for our ECR repository.
aws ecr get-login-password --region us-east-2 \
| docker login --username AWS \
               --password-stdin 747101682576.dkr.ecr.us-east-2.amazonaws.com

# Pull the latest Docker container.
docker pull 747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod

# Stop the running service.
systemctl stop r2d2.service

# Remove the current container.
docker rm r2d2-api-service

# If necessary, edit the service definition file: /etc/systemd/system/r2d2.service
# Do this only if docker arguments or environment variables have changed.
# If updating this file you will need to run `systemctl daemon-reload` to
# load in updates to the service.

# Restart the systemd service.
systemctl start r2d2.service

# Check the status of the service.
docker ps

# Sample output from the production server
CONTAINER ID   IMAGE                                                           COMMAND                  CREATED          STATUS          PORTS                               NAMES
cd8745ac7561   747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod   "run_r2d2_app --port…"   47 seconds ago   Up 46 seconds   0.0.0.0:80->80/tcp, :::80->80/tcp   r2d2-api-service

# Check the output logs.
docker logs -f r2d2-api-service
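
The final status check can be reduced to a single pass/fail helper, which is convenient when repeating the update on several replicas. This is a minimal sketch: the function name is hypothetical, and it assumes the unit and container names used in the commands above (r2d2.service and r2d2-api-service).

```shell
#!/usr/bin/env bash
# Illustrative check; the function name is hypothetical.

# Succeeds only if the systemd unit is active and the container reports "Up".
r2d2_service_healthy() {
  local status
  systemctl is-active --quiet r2d2.service || return 1
  status="$(docker ps --filter name=r2d2-api-service --format '{{.Status}}')"
  case "$status" in
    Up*) return 0 ;;
    *)   return 1 ;;
  esac
}
```

Run as root on the replica, for example `r2d2_service_healthy && echo "replica OK"`. A non-zero exit status means either the unit is not active or no running container with that name was found.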


Accessing the RDS SQL database

...