Alces Flight is a third-party company that provides a free, user-friendly interface to help you create an HPC-like cluster on AWS, complete with a login node, N compute nodes, and a job scheduler.  The instructions below give basic information on how to create an AWS cluster using Alces Flight.  As you gain experience, there are many customization options available to configure your cluster as you wish.

Go to AWS Marketplace and search for Alces Flight

Select the community edition (free)

Select "continue to subscribe" in the upper right

The first time you do this it will ask you to enter some information (name, email...) to subscribe.  Go ahead and do so.

If you have subscribed successfully, selecting the "continue to subscribe" button will bring you to a page telling you that you are already subscribed

Select the "continue to configuration" button on the top right

On the configuration page select

CloudFormation

Personal HPC cluster

and your favorite region (leave the software version at the default value)

Select "continue to launch" on the upper right

Under "Choose Action" select
Launch CloudFormation

Then hit Launch on the lower right

Now you have left AWS Marketplace/Alces Flight and gone straight to the AWS CloudFormation pages

You land on the CloudFormation "Create Stack/Select Template" page. As I understand it, this is the main thing that Alces Flight does for you - it gives you a default template (stored on AWS S3) so you don't have to design one yourself.  This default template should already be selected, so just select Next in the lower right.

As you will see soon, the AWS CloudFormation Dashboard is a lot like the EC2 Dashboard but the items are called Stacks instead of Instances. So, this page asks you to give your Stack (i.e. your instance) a name. Give it a unique name - anything you like
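One caveat on "anything you like": as far as I know, CloudFormation only accepts stack names made of letters, digits and hyphens, starting with a letter and at most 128 characters long. The sketch below checks a candidate name against that rule before you type it in. The function name is_valid_stack_name is made up for illustration, and the pattern is my reading of the naming rule rather than something taken from the Alces Flight template.

```python
import re

# Assumed rule: first character is a letter, the rest are letters,
# digits or hyphens, and the whole name is at most 128 characters.
STACK_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,127}")


def is_valid_stack_name(name: str) -> bool:
    """Return True if the name fits the assumed CloudFormation rule."""
    return STACK_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-hpc-cluster", "my cluster", "1cluster"]:
        print(candidate, is_valid_stack_name(candidate))
```

Names with spaces or underscores are the usual culprits, so a hyphenated name such as my-hpc-cluster is a safe choice.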

Enter your AWS login as the cluster administrator user name

Select the ssh key pair you want.  If you don't see the one you normally use, go back and make sure you selected your usual region on the configuration page.

Enter this for the access network address:
0.0.0.0/0
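For context, 0.0.0.0/0 is CIDR notation for every IPv4 address, so with this value the cluster accepts connections from anywhere. If you would rather limit access to your own machine, a narrower range should work too, assuming the field accepts any CIDR range. The sketch below uses Python's standard ipaddress module to show what a given range covers. The helper names are made up for illustration, and 203.0.113.25 and 198.51.100.7 are placeholder addresses from the ranges reserved for documentation, not real workstations.

```python
import ipaddress


def describe_network(cidr: str) -> str:
    """Summarize how many addresses a CIDR range covers."""
    net = ipaddress.ip_network(cidr, strict=True)
    return f"{net} covers {net.num_addresses} address(es)"


def single_host_cidr(ip: str) -> str:
    """Return the /32 range that matches exactly one IPv4 address."""
    return f"{ipaddress.IPv4Address(ip)}/32"


def is_allowed(ip: str, cidr: str) -> bool:
    """Check whether an address falls inside the given range."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    # The value used in these instructions: open to everyone.
    print(describe_network("0.0.0.0/0"))
    # A range that only matches one (placeholder) workstation address.
    my_range = single_host_cidr("203.0.113.25")
    print(describe_network(my_range))
    print(is_allowed("198.51.100.7", my_range))
```

If you do restrict the range, remember that you will be locked out if your own IP address changes.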

Scroll down and enter the desired number of compute nodes under 

initial/maximum compute nodes

You can just go with the defaults for the rest if you wish (many of them blank).  Or, here are a few things you might consider selecting (optional)

Under "Preload Software"  select "development"

You can enable the autoscaling policy if you wish and specify the initial and maximum number of nodes

You can specify the instance types for the login node and the compute nodes 

Be careful at the very bottom where it says explicit availability zone. You may be tempted to enter a region name like us-west-2 (I did!).  But what it really wants is an availability zone within that region (e.g. us-west-2a).  It's best to just leave this blank.
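To make the region versus availability zone distinction concrete, here is a small sketch that tells the two apart by name. It assumes the common naming pattern in which a zone is a region name followed by a single lowercase letter. The patterns and the function names classify and region_of are mine, for illustration only, and less common zone types with longer names will come back as "unknown".

```python
import re

# Assumed pattern: a region looks like "us-west-2"; an availability
# zone is a region name followed by one lowercase letter ("us-west-2a").
REGION_RE = re.compile(r"[a-z]{2}(?:-[a-z]+)+-\d")
ZONE_RE = re.compile(r"([a-z]{2}(?:-[a-z]+)+-\d)([a-z])")


def classify(name: str) -> str:
    """Return 'availability zone', 'region' or 'unknown' for a name."""
    if ZONE_RE.fullmatch(name):
        return "availability zone"
    if REGION_RE.fullmatch(name):
        return "region"
    return "unknown"


def region_of(zone: str) -> str:
    """Return the region part of an availability zone name."""
    match = ZONE_RE.fullmatch(zone)
    if match is None:
        raise ValueError(f"not an availability zone name: {zone!r}")
    return match.group(1)


if __name__ == "__main__":
    for name in ["us-west-2", "us-west-2a"]:
        print(name, "->", classify(name))
    print(region_of("us-west-2a"))
```

If you do fill in the field, the zone should belong to the region you picked on the configuration page.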

Then select Next on the bottom right

On the next page you can just leave everything as the default and select Next again

That brings you to a Review page

Near the bottom, select

I acknowledge that AWS CloudFormation might create IAM resources

Then in the lower right hit Create

This brings you to the main CloudFormation page. It takes a while to initialize but if you wait a few minutes you will see your new stack listed in much the same way as the EC2 instances are listed.  Be patient.  After it appears on the list, it still takes a few minutes to initialize - more time than a single EC2 instance.

Eventually, if all goes right it will say CREATE_COMPLETE

If it says ROLLBACK_COMPLETE then something went wrong.  Select the stack and go to the Overview tab.  There you can scroll down and find a link that says "view failure event details".  This should allow you to figure out what went wrong.  Before you try again from the Alces Flight page at AWS Marketplace, be sure to delete the failed stack by selecting Actions-delete stack.
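The status values above boil down to a simple decision, sketched below. Only CREATE_COMPLETE and ROLLBACK_COMPLETE come from these instructions. Treating anything that ends in _IN_PROGRESS as "keep waiting" is my assumption based on how CloudFormation names its statuses, and next_step is a made-up helper, not part of any AWS tool.

```python
def next_step(status: str) -> str:
    """Map a stack status string to the action described in this guide."""
    status = status.strip().upper()
    if status == "CREATE_COMPLETE":
        return "ready: look up the public IP address and ssh in"
    if status == "ROLLBACK_COMPLETE":
        return "failed: view the failure event details, delete the stack, try again"
    if status.endswith("_IN_PROGRESS"):
        # Assumption: in-progress statuses just need more time.
        return "wait: the stack is still being created or rolled back"
    return "unknown: check the stack's events in the console"


if __name__ == "__main__":
    for status in ["CREATE_IN_PROGRESS", "CREATE_COMPLETE", "ROLLBACK_COMPLETE"]:
        print(status, "->", next_step(status))
```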

If your stack creation succeeded, you can get a public IP address by going to the Outputs tab.  Then you can ssh into it just like you would ssh into an EC2 instance.  However, your username here is the cluster administrator user name you entered earlier (your AWS user name), not ec2-user or ubuntu as in an EC2 instance.  So, you'd type, e.g.

ssh -i "my-pem-file.pem" <my-aws-username>@<ip-address>
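If you connect often, it can help to assemble that command from its three pieces so the quoting is always right. The sketch below does this with Python's standard shlex module. The function build_ssh_command and the sample values (alice, 203.0.113.25) are placeholders of mine, so substitute your own key file, user name and the IP address shown for your stack.

```python
import shlex


def build_ssh_command(key_file: str, username: str, ip_address: str) -> str:
    """Build the ssh command line for logging in to the cluster."""
    if not (key_file and username and ip_address):
        raise ValueError("key file, user name and IP address are all required")
    # shlex.join quotes any argument that contains spaces or shell characters.
    return shlex.join(["ssh", "-i", key_file, f"{username}@{ip_address}"])


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_ssh_command("my-pem-file.pem", "alice", "203.0.113.25"))
```

The printed line can be pasted straight into a terminal.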


When you're done with the cluster, select it from the menu and then Actions-delete stack
