https://experiments.jcsda.org/

https://skylab.jcsda.org/

About

Our web applications, experiments.jcsda.org and skylab.jcsda.org, run on AWS EC2 instances in us-east-1. Each is a Voila application serving Python notebooks, and each runs on its own instance. The NGINX error log on each instance is located at /var/log/nginx/error.log. The r2d2, solo, and diag-plots repositories are required and are located inside the home directory.
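
As a rough sketch of how that log might be inspected after connecting via ssh, the helper below prints the most recent error-level entries. The function name recent_errors and the default of 20 lines are illustrative assumptions, not part of the deployment. It also assumes the standard NGINX error-log format, in which the severity appears in brackets (for example [error]). Depending on file permissions, reading the log may require sudo.

```shell
# Hypothetical helper: show the last N error-level lines from an NGINX error log.
# Usage: recent_errors [LOGFILE] [N]
recent_errors() {
    local log="${1:-/var/log/nginx/error.log}"   # default path taken from this page
    local count="${2:-20}"                       # assumed default, adjust as needed
    # -F matches the literal string "[error]" instead of a bracket expression
    grep -F '[error]' "$log" | tail -n "$count"
}
```

Calling recent_errors with no arguments reads the default log path. Passing a different file and count, such as recent_errors /var/log/nginx/error.log 50, widens the window.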

Procedures

Reboot EC2 Instance

  1. Log into the AWS Console in us-east-1 using the jcsda-noaa account.
  2. Navigate to the EC2 Instance dashboard.
  3. Click the checkbox next to "skylab.jcsda.org" or "experiments.jcsda.org" depending on which site is down. 
  4. Click the drop-down arrow next to "Instance state" in the top bar and select "Reboot instance". The reboot may take some time. Once the instance state shows running again, verify that the website is reachable and try to connect to the instance via ssh as the ubuntu user. If these checks fail, you will need to stop and start the instance. Doing so changes the public IP address, so the DNS record must then be updated with the new IP. Proceed to step 5.
  5. Under "Instance state", click "Stop instance" to shut it down. If this takes longer than a few minutes, use "Force stop instance".
  6. Start the instance again by clicking "Instance state" and then "Start instance". Once it is running again, try to connect to the instance via ssh as the ubuntu user.
  7. Copy the new "Public IPv4 address" and navigate to the Route 53 dashboard. Click "Hosted zones", then "jcsda.org". Scroll down to experiments.jcsda.org or skylab.jcsda.org and click "Edit record". Replace the old IP with the new one and click "Save".
  8. Try loading the web page. If it does not load, try a different browser. If it still does not load, you will need to restart the services, so proceed to step 9.
  9. Log back into the instance via ssh as the ubuntu user.
  10. Restart the services by running the following:

    sudo systemctl restart nginx.service
    sudo systemctl daemon-reload
    sudo systemctl restart voila.service
  11. Verify the website is back up and running.
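
The verification steps in this procedure (checking that the DNS record points at the new IP and that the site responds) can be sketched from any machine with a shell. The helper names resolved_ip and check_site are hypothetical, the 10-second timeout is an arbitrary choice, and the sketch assumes curl and a glibc getent are available.

```shell
# Hypothetical helpers for verifying a site after a reboot or DNS change.

# Print the first IPv4 address the local resolver returns for a hostname.
resolved_ip() {
    getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

# Report whether a URL answers with a 2xx or 3xx HTTP status.
check_site() {
    local url="$1"
    local code
    # -s silent, -o discard the body, -w print only the HTTP status code.
    # curl prints 000 when no HTTP response was received at all.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")
    case "$code" in
        2??|3??) echo "UP $url (HTTP $code)"; return 0 ;;
        *)       echo "DOWN $url (HTTP $code)"; return 1 ;;
    esac
}
```

For example, resolved_ip skylab.jcsda.org should print the address copied from the EC2 console once the Route 53 change has propagated, and check_site https://skylab.jcsda.org/ reports whether the page answers. Cached DNS answers can lag behind the record change, so a mismatch shortly after the edit does not necessarily mean the update failed.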

Updating the certificate

The certificate should be automatically updated and managed via CertBot and Let's Encrypt. If you need to manually update the certificate you can follow this procedure.
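
Before renewing anything by hand, it can help to confirm when the certificate currently being served expires. The sketch below is one way to do that with the openssl command-line tool. The function name cert_expiry is hypothetical, and the sketch assumes the site serves HTTPS on port 443.

```shell
# Hypothetical helper: print the expiry date of the certificate a host serves.
# Usage: cert_expiry HOSTNAME
cert_expiry() {
    local host="$1"
    # s_client fetches the served certificate; stdin is closed so it exits
    # right after the handshake. x509 -enddate prints the notAfter field.
    openssl s_client -connect "$host:443" -servername "$host" </dev/null 2>/dev/null \
        | openssl x509 -noout -enddate
}
```

For example, cert_expiry experiments.jcsda.org prints a line beginning with notAfter=. If Certbot is installed in the usual way on the instance (an assumption, since the setup is not described here), sudo certbot certificates lists the certificates it manages and sudo certbot renew --dry-run exercises the renewal path without replacing the live certificate.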

Launch new instance running Voila and NGINX

TODO, see https://docs.google.com/document/d/1vqK-qAxBGt_I9j6VG1SSLtPhQp89PzV4Y32yuWLWbUg/edit

Troubleshooting

Web page fails to load

Issue: The web pages fail to load or time out.

Explanation: Each time the website is hit, a new Python kernel is opened and stays open indefinitely. Once the instance's memory fills up with these kernels, the page goes down.

Solution: Reboot the EC2 Instance (see procedure above)
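
To confirm that leftover kernels are the cause before rebooting, the following sketch counts running kernels and reports available memory. The helper names are hypothetical. It assumes a Linux host and that the Voila kernels are started through ipykernel_launcher, the usual entry point for Jupyter Python kernels. If the kernels on these instances are launched differently, the pattern needs adjusting.

```shell
# Hypothetical helpers for checking kernel buildup on the instance.

# Count processes whose command line mentions ipykernel_launcher.
count_kernels() {
    # The [i] keeps grep from matching its own command line.
    ps -eo args= | grep -c '[i]pykernel_launcher'
}

# Print available memory in MiB, read from /proc/meminfo by default.
mem_available_mb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "$meminfo"
}
```

A large kernel count together with little available memory is consistent with the explanation above. A low count suggests looking elsewhere, for example in the NGINX error log.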
