...
- AWS Resources for R2D2 Prod
- CloudFormation config: //r2d2/server/cfn/prod.yaml
- CloudFormation stack: arn:aws:cloudformation:us-east-2:747101682576:stack/r2d2-api-prod/c15cb090-d2dc-11ef-a026-06c957b621e9
- API Server EC2 Instances: ec2 filtered search
- Load Balancer: arn:aws:elasticloadbalancing:us-east-2:747101682576:loadbalancer/app/r2d2-prod-load-balancer/f526daf391077f70
- SSL certificate (for r2d2-api.jcsda.org): arn:aws:acm:us-east-2:747101682576:certificate/be95f8b1-4162-4615-9ed5-29ed897b527b
- Database
- Admin commands
- Login to a server instance:
aws ssm start-session --target i-0a0612957e48959170a750b39c8e88d101 --region us-east-2
Infrastructure Overview
...
- Application Load Balancer - API Server routing, load balancing, and SSL termination.
- EC2 Compute Hosts - R2D2 API server execution environment.
- AWS Systems Manager - shell access and administration.
- AWS Relational Database Service - R2D2 database.
Common Tasks
Deploy Updated R2D2 API Server Binaries
WARNING: Do not manually stop or upgrade R2D2 servers without first draining the instance from our load balancer! Doing so can drop inbound or queued API calls.
The standard process for updating servers involves draining them from the load balancer one at a time, updating the service, and then re-enabling them. This approach ensures continuous network availability and prevents the loss of inbound or queued API calls. A Bash script automates this process while maintaining service stability. However, updates should be performed during low-traffic periods, as serving capacity is temporarily reduced during the update.
1. Build and publish binaries with the prod tag as described later in this guide.
2. From a developer machine with admin AWS credentials (members of the JEDI-infra team), clone the r2d2 repository.
git clone https://github.com/JCSDA-internal/r2d2.git && cd r2d2
3. Run the following command and wait about 10 minutes for the full rollout to complete.
./server/scripts/update_prod_servers.sh --operation update
Note: If the rollout fails for any reason, refer to the update_prod_servers.sh script, which contains the manual update process and extensive debugging information. This script serves as the central resource for detailed rollout procedures and documentation.
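For orientation, the drain, update, and re-enable cycle described above might look like the following for a set of instances. This is a sketch only and not the contents of update_prod_servers.sh: it assumes the instances are registered by instance ID in an ALB target group, the target group ARN passed in is a placeholder you would have to look up, and `update_instance` is a hypothetical hook standing in for whatever actually updates the service on the host.

```shell
#!/usr/bin/env bash
# Sketch: roll instances through drain -> update -> re-enable, one at a time.
set -euo pipefail

REGION="us-east-2"

# Hypothetical hook: replace with the real per-host update step.
update_instance() {
  echo "updating $1"
}

roll_instance() {
  local target_group_arn="$1" instance_id="$2"

  # Stop routing new requests to the instance, then wait for draining to finish.
  aws elbv2 deregister-targets --region "$REGION" \
    --target-group-arn "$target_group_arn" --targets "Id=${instance_id}"
  aws elbv2 wait target-deregistered --region "$REGION" \
    --target-group-arn "$target_group_arn" --targets "Id=${instance_id}"

  update_instance "$instance_id"

  # Re-register the instance and wait until its health checks pass.
  aws elbv2 register-targets --region "$REGION" \
    --target-group-arn "$target_group_arn" --targets "Id=${instance_id}"
  aws elbv2 wait target-in-service --region "$REGION" \
    --target-group-arn "$target_group_arn" --targets "Id=${instance_id}"
}

# Process instances sequentially so the remaining ones keep serving traffic.
roll_all() {
  local target_group_arn="$1"
  shift
  local instance_id
  for instance_id in "$@"; do
    roll_instance "$target_group_arn" "$instance_id"
  done
}
```

Calling `roll_all <target-group-arn> <instance-id> <instance-id>` would then process the instances in order. Because `set -e` is active, a failed waiter stops the loop before the next instance is drained, which limits how much capacity is lost if something goes wrong.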
Deploy CloudFormation Updates
- Copy the updated prod.yaml config to our infrastructure-as-code bucket and get the versioned URL for the file as shown here.
aws s3 cp server/cfn/prod.yaml s3://jcsda-usaf-iac-artifacts/r2d2/prod.yaml
file_version=$(aws s3api list-object-versions \
--bucket jcsda-usaf-iac-artifacts \
--prefix "r2d2/prod.yaml" \
| jq -r '.Versions[] | select(.IsLatest==true) | .VersionId')
echo "https://jcsda-usaf-iac-artifacts.s3.us-east-2.amazonaws.com/r2d2/prod.yaml?versionId=${file_version}"
- Go to the CloudFormation stack "r2d2-api-prod" in the AWS console (or use the quick-ref link above) and click "Update".
- In the update menu, select "Replace existing template" and enter the URL generated in the first step. Change any parameters and review deploy options.
- Review the change log and ensure that EC2 instances and RDS instances are not replaced.
- Monitor the CloudFormation rollout in the console and ensure it runs to completion.
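The replacement check in the steps above can also be previewed from the command line before touching the console. The sketch below is one possible approach, not an established part of this runbook: it creates a change set (which does not modify the stack until someone executes it) and lists any resources flagged for replacement. The function names and the change set name argument are made up for illustration, and `--parameters` and `--capabilities` are omitted, so the real stack may need them added (for example `UsePreviousValue=true` per parameter).

```shell
#!/usr/bin/env bash
# Sketch: preview a CloudFormation update and flag resource replacements.
set -euo pipefail

STACK_NAME="r2d2-api-prod"
BUCKET="jcsda-usaf-iac-artifacts"
KEY="r2d2/prod.yaml"
REGION="us-east-2"

# Build the versioned S3 URL for a template object.
versioned_template_url() {
  local bucket="$1" key="$2" region="$3" version_id="$4"
  printf 'https://%s.s3.%s.amazonaws.com/%s?versionId=%s\n' \
    "$bucket" "$region" "$key" "$version_id"
}

# Read describe-change-set JSON on stdin and print "LogicalId ResourceType"
# for every resource whose Replacement is "True" or "Conditional".
replaced_resources() {
  jq -r '.Changes[].ResourceChange
         | select(.Replacement == "True" or .Replacement == "Conditional")
         | "\(.LogicalResourceId) \(.ResourceType)"'
}

# Create a change set from the latest uploaded template and list replacements.
# Assumption: parameters and capabilities are not needed or are added by you.
preview_update() {
  local change_set="$1" version_id url
  version_id="$(aws s3api head-object --bucket "$BUCKET" --key "$KEY" \
    --query VersionId --output text)"
  url="$(versioned_template_url "$BUCKET" "$KEY" "$REGION" "$version_id")"
  aws cloudformation create-change-set --region "$REGION" \
    --stack-name "$STACK_NAME" --change-set-name "$change_set" \
    --template-url "$url" >/dev/null
  aws cloudformation wait change-set-create-complete --region "$REGION" \
    --stack-name "$STACK_NAME" --change-set-name "$change_set"
  aws cloudformation describe-change-set --region "$REGION" \
    --stack-name "$STACK_NAME" --change-set-name "$change_set" --output json \
    | replaced_resources
}
```

Running `preview_update some-change-set-name` with empty output would suggest that nothing is slated for replacement. Any AWS::EC2::Instance or AWS::RDS::DBInstance line in the output is a reason to stop and review before executing the change set.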
...
Logs for all API server replicas can be found under the "r2d2.*" log groups in AWS CloudWatch [direct link]. The CloudWatch log exports run on a half-hour schedule, so expect a delay between an error event occurring and its logs becoming viewable in CloudWatch.
- Direct link to CloudWatch log groups.
- Filtered query in LogsExplorer.
Each server replica writes logs to /var/log/gunicorn/, and those can be viewed in real time if you have a shell on the replica.
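To read the same logs from a terminal instead of the console, something like the sketch below may help. It assumes AWS CLI v2 (which provides `aws logs tail`) and that the log group names start with the literal prefix "r2d2"; the exact group names are not listed in this guide, so the prefix is a guess, and the function names are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list the r2d2 CloudWatch log groups and print their recent events.
set -euo pipefail

REGION="us-east-2"

# Print one matching log group name per line.
list_log_groups() {
  local prefix="${1:-r2d2}"
  aws logs describe-log-groups --region "$REGION" \
    --log-group-name-prefix "$prefix" \
    --query 'logGroups[].logGroupName' --output text | tr '\t' '\n'
}

# Print the last hour of events for every matching log group.
recent_logs() {
  local prefix="${1:-r2d2}" group
  list_log_groups "$prefix" | while read -r group; do
    [ -n "$group" ] || continue
    echo "== ${group} =="
    # stdin is redirected so the CLI cannot consume the list being read.
    aws logs tail "$group" --region "$REGION" --since 1h </dev/null
  done
}
```

Because of the half-hour export schedule, `recent_logs` will lag behind what is on the host. On a replica itself, following the files under /var/log/gunicorn/ with `tail -F` is the quicker option; the file names there are not documented here.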
Build and publish API server binaries
...
# If logged into the server, log out now and log back in to reset the shell environment

# From your developer machine log into the server and switch to the root user.
aws ssm start-session --target i-0a0612957e48959170a750b39c8e88d101 --region us-east-2
sudo su -

# Get local credentials for our ECR repository.
aws ecr get-login-password --region us-east-2 \
  | docker login --username AWS \
    --password-stdin 747101682576.dkr.ecr.us-east-2.amazonaws.com

# Pull the latest Docker container
docker pull 747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod

# Stop the running service.
systemctl stop r2d2.service

# If necessary, edit service definition file: /etc/systemd/system/r2d2.service
# Do this only if docker arguments or environment variables have changed.
# If updating this file you will need to run `systemctl daemon-reload` to
# load in updates to the service.

# Restart the systemd service.
systemctl start r2d2.service

# Check the status of the service
docker ps

# Sample output from the production server
# CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# cd8745ac7561 747101682576.dkr.ecr.us-east-2.amazonaws.com/r2d2-server:prod "run_r2d2_app --port…" 47 seconds ago Up 46 seconds 0.0.0.0:80->80/tcp, :::80->80/tcp r2d2-api-service

# Check the output logs
docker logs -f r2d2-api-service
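After restarting the service, it can be convenient to block until the container is actually up rather than checking `docker ps` by eye. The following is a minimal sketch, assuming the container name r2d2-api-service shown in the sample output above; the function name, retry count, and delay are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Sketch: poll until a container reports Running=true, or give up.
set -euo pipefail

wait_for_container() {
  local name="${1:-r2d2-api-service}" attempts="${2:-10}" delay="${3:-3}"
  local i state
  for ((i = 1; i <= attempts; i++)); do
    # docker inspect fails while the container does not exist yet; treat that
    # the same as "not running" and keep polling.
    state="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null || true)"
    if [ "$state" = "true" ]; then
      echo "${name} is running"
      return 0
    fi
    sleep "$delay"
  done
  echo "${name} did not start after ${attempts} attempts" >&2
  return 1
}
```

For example, `systemctl start r2d2.service && wait_for_container` would return non-zero if the container never comes up. A running container only shows that the process started, so checking `docker logs` for startup errors is still worthwhile.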
...