Comment: Added deploy binary instructions

...

Infrastructure Overview

...

  • Application Load Balancer - API Server routing, load balancing, and SSL termination. 
  • EC2 Compute Hosts - R2D2 API server execution environment.
  • AWS Systems Manager - shell access and administration.
  • AWS Relational Database Service - R2D2 database.


Common Tasks

Deploy Updated R2D2 API Server Binaries

WARNING: Do not manually stop or upgrade R2D2 servers without first draining the instance from the load balancer! Doing so can drop inbound or queued API calls.

The standard process for updating servers involves draining them from the load balancer one at a time, updating the service, and then re-enabling them. This approach ensures continuous network availability and prevents the loss of inbound or queued API calls. A Bash script automates this process while maintaining service stability. However, updates should be performed during low-traffic periods, as serving capacity is temporarily reduced during the update.
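The drain, update, and re-enable cycle described above can be sketched roughly as follows. This is an illustrative outline only, not the contents of update_prod_servers.sh: the target group ARN, the instance IDs, and the update_instance helper are hypothetical placeholders, and the sketch defaults to a dry run that only prints the AWS CLI calls it would make.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; update_prod_servers.sh remains the source of truth.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, 0 = execute them

# Print a command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical placeholder for the per-host update (pull image, restart service).
update_instance() {
  echo "+ update r2d2.service on $1"
}

# Drain one instance, update it, and return it to rotation.
update_one() {
  local tg_arn="$1" id="$2"
  run aws elbv2 deregister-targets --target-group-arn "$tg_arn" --targets "Id=$id"
  run aws elbv2 wait target-deregistered --target-group-arn "$tg_arn" --targets "Id=$id"
  update_instance "$id"
  run aws elbv2 register-targets --target-group-arn "$tg_arn" --targets "Id=$id"
  run aws elbv2 wait target-in-service --target-group-arn "$tg_arn" --targets "Id=$id"
}

# Process instances strictly one at a time so the rest keep serving traffic.
rolling_update() {
  local tg_arn="$1"; shift
  local id
  for id in "$@"; do
    update_one "$tg_arn" "$id"
  done
}
```

Calling rolling_update with a target group ARN followed by one or more instance IDs prints the five steps for each instance in turn. Waiting for the target-in-service state before moving on is what keeps at most one replica out of rotation at any time.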

1. Build and publish binaries with the prod tag as described later in this guide.

2. From a developer machine with admin AWS credentials (available to members of the JEDI-infra team), clone the r2d2 repository.

git clone https://github.com/JCSDA-internal/r2d2.git && cd r2d2

3. Run the following command and wait about 10 minutes for the full rollout to complete.

./server/scripts/update_prod_servers.sh --operation update 

Note: If the rollout fails for any reason, refer to the update_prod_servers.sh script, which contains the manual update process and extensive debugging information. This script serves as the central resource for detailed rollout procedures and documentation.

Deploy Cloud Formation Updates

...

Logs for all API server replicas can be found under the "r2d2.*" log groups in AWS CloudWatch [direct link]. The CloudWatch log exports run on a half-hour schedule, so expect a delay between an error event and being able to view its logs in CloudWatch.

Each server replica writes logs to /var/log/gunicorn/, and those can be viewed in real time if you have a shell into the replica.
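As a rough illustration of the two log locations above, the helpers below list the matching CloudWatch log groups and follow a replica's local Gunicorn logs. The "r2d2." prefix and the /var/log/gunicorn/ path come from this page; the us-east-2 region is assumed from the commands elsewhere in this guide, and the function names are hypothetical. The `aws logs tail` subcommand requires AWS CLI v2.

```shell
#!/usr/bin/env bash
set -euo pipefail

REGION="${REGION:-us-east-2}"   # assumption: same region as the rest of this guide

# List CloudWatch log groups whose names start with "r2d2.".
list_r2d2_log_groups() {
  aws logs describe-log-groups \
    --region "$REGION" \
    --log-group-name-prefix "r2d2." \
    --query 'logGroups[].logGroupName' \
    --output text
}

# Show the last hour of one log group (AWS CLI v2 only). The most recent
# events may be missing because exports run on a half-hour schedule.
tail_r2d2_log_group() {
  aws logs tail "$1" --region "$REGION" --since 1h
}

# On a replica: follow the local Gunicorn logs in real time.
tail_local_gunicorn_logs() {
  local dir="${1:-/var/log/gunicorn}"
  tail -F "$dir"/*
}
```

When an error has only just occurred, the local Gunicorn logs on the replica are the quicker place to look, since they are not subject to the export delay.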

Build and publish API server binaries

...

Code Block
# If logged into the server, log out now and log back in to reset the shell environment

# From your developer machine log into the server and switch to the root user.
aws ssm start-session --target i-0a750b39c8e88d101 --region us-east-2

# Enable the bash shell


# This process needs to be completed as root
sudo su -

# Get local credentials for our ECR repository.
aws ecr get-login-password --region us-east-2 \
| docker login --username AWS \
               --password-stdin 747101682576.dkr.ecr.us-east-2.amazonaws.com

# Pull the latest Docker container
docker pull 747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod

# Stop the running service.
systemctl stop r2d2.service

# If necessary, edit service definition file: /etc/systemd/system/r2d2.service
# Do this only if docker arguments or environment variables have changed.
# If updating this file you will need to run `systemctl daemon-reload` to
# load in updates to the service.

# Restart the systemd service.
systemctl start r2d2.service

# Check the status of the service
docker ps 

# Sample output from the production server:
# CONTAINER ID   IMAGE                                                           COMMAND                  CREATED          STATUS          PORTS                               NAMES
# cd8745ac7561   747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod   "run_r2d2_app --port…"   47 seconds ago   Up 46 seconds   0.0.0.0:80->80/tcp, :::80->80/tcp   r2d2-api-service

# Check the output logs
docker logs -f r2d2-api-service
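
The manual steps above rely on a visual check of the `docker ps` output. A scripted version of that check might look like the sketch below, which polls until a container named r2d2-api-service (the name shown in the sample output) is running. The function name, retry count, and delay are hypothetical choices, not part of the existing tooling.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Poll `docker ps` until the named container reports as running.
# Returns 0 on success, 1 if it never appears within the retry budget.
wait_for_container() {
  local name="${1:-r2d2-api-service}" retries="${2:-10}" delay="${3:-3}"
  local i running
  for ((i = 1; i <= retries; i++)); do
    # Treat a failing docker call the same as "not running yet".
    running="$(docker ps --filter "name=$name" --filter "status=running" --format '{{.Names}}' || true)"
    if grep -qx "$name" <<<"$running"; then
      echo "$name is running"
      return 0
    fi
    sleep "$delay"
  done
  echo "$name did not start after $retries checks" >&2
  return 1
}
```

Run as root on the replica after `systemctl start r2d2.service`, a call such as `wait_for_container r2d2-api-service 10 3` gives a pass/fail exit status that a wrapper script can act on.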

...