Alces Flight is a third-party company that provides a free, user-friendly interface to help you create an HPC-like cluster on AWS, complete with a login node, N compute nodes, and a job scheduler. The instructions below give basic information on how to create an AWS cluster using Alces Flight. As you gain experience, there are many customization options available to configure your cluster as you wish.
Go to AWS Marketplace and search for Alces Flight
Select the community edition (free)
Select "continue to subscribe" in the upper right
The first time you do this it will ask you to enter some information (name, email...) to subscribe. Go ahead and do so.
If you have subscribed successfully, selecting the "continue to subscribe" button will bring you to a page telling you that you are already subscribed
Select the "continue to configuration" button on the top right
On the configuration page select
CloudFormation
Personal HPC cluster
and your favorite region (leave the software version at the default value)
Select "continue to launch" on the upper right
Under "Choose Action" select
Launch CloudFormation
Then hit Launch on the lower right
Now you have left AWS Marketplace/Alces Flight and gone straight to the AWS CloudFormation pages
You land on the CloudFormation "Create Stack/Select Template" page. As I understand it, this is the main thing that Alces Flight does for you - it gives you a default template (stored on AWS S3) so you don't have to design one yourself. This default template should already be selected, so just select Next in the lower right.
As you will see soon, the AWS CloudFormation Dashboard is a lot like the EC2 Dashboard, but the items are called Stacks instead of Instances. So, this page asks you to give your Stack (i.e. instance) a name. Give it a unique name - anything you like
Enter your AWS login as the cluster administrator user name
Select the ssh key pair you want. If you don't see the one you normally use, go back and make sure you selected your usual region on the configuration page.
Enter this for the access network address:
0.0.0.0/0
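A note on that value: 0.0.0.0/0 is CIDR notation for every IPv4 address, so as I understand it this setting lets any machine on the internet try to connect to your cluster. If you would rather limit access to the machine you are sitting at, a sketch like the following prints your current public address as a single-host /32 range that you could enter instead. The function name my_cidr is made up for illustration, and the sketch assumes curl is installed and that checkip.amazonaws.com (an AWS-hosted service that echoes back your public IP) is reachable.

```shell
# Hypothetical helper: print this machine's public IPv4 address as a /32 CIDR range.
# Assumes curl is installed and https://checkip.amazonaws.com is reachable.
my_cidr() {
    ip=$(curl -s https://checkip.amazonaws.com) || return 1
    # Print nothing if the lookup came back empty
    [ -n "$ip" ] || return 1
    printf '%s/32\n' "$ip"
}
```

Run my_cidr after defining it and paste the result into the access network address field. Keep in mind that your public address can change (e.g. on home or mobile networks), in which case you would lose access until the setting is updated.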
Scroll down and enter the desired number of compute nodes under
initial/maximum compute nodes
You can just go with the defaults for the rest if you wish (many of them blank). Or, here are a few things you might consider selecting (optional)
Under "Preload Software" select "development"
You can enable the autoscaling policy if you wish and specify the initial and maximum number of nodes
You can specify the instance types for the login node and the compute nodes
Be careful at the very bottom where it says explicit availability zone. You may be tempted to enter something like us-west-2 (I did!). But that is a region name; what it really wants is an availability zone within the region (e.g. us-west-2a). It's best to just leave this blank.
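If you do want to fill in that field, you can look up the valid names for your region from the command line. This is only a sketch: it assumes you have the AWS CLI installed and configured with your credentials, and the function name list_zones is made up for illustration.

```shell
# Hypothetical helper: list the availability zone names in a region.
# Assumes the AWS CLI is installed and configured with your credentials.
list_zones() {
    aws ec2 describe-availability-zones \
        --region "$1" \
        --query 'AvailabilityZones[].ZoneName' \
        --output text
}
```

For example, list_zones us-west-2 should print names of the form us-west-2a, any one of which is the kind of value the field expects.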
Then select Next on the bottom right
On the next page you can just leave everything as the default and select Next again
That brings you to a Review page
Near the bottom, select
I acknowledge that AWS CloudFormation might create IAM resources
Then in the lower right hit Create
This brings you to the main CloudFormation page. It takes a while to initialize but if you wait a few minutes you will see your new stack listed in much the same way as the EC2 instances are listed. Be patient. After it appears on the list, it still takes a few minutes to initialize - more time than a single EC2 instance.
Eventually, if all goes right it will say CREATE_COMPLETE
If it says ROLLBACK_COMPLETE then something went wrong. Select the stack and go to the Overview tab. There you can scroll down and find a link that says view failure event details. This should allow you to figure out what went wrong and then try again from the Alces flight page at AWS Marketplace. But first be sure to delete your stack by selecting Actions-delete stack.
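If you would rather not keep refreshing the console, the same information is available from the AWS CLI. The two functions below are a sketch, not the only way to do it: the names stack_status and stack_failures are made up for illustration, and they assume the AWS CLI is installed, configured with your credentials, and set to the same default region you launched the stack in.

```shell
# Hypothetical helpers; both assume the AWS CLI is installed and configured.

# Print the current status of a stack, e.g. CREATE_COMPLETE or ROLLBACK_COMPLETE.
stack_status() {
    aws cloudformation describe-stacks \
        --stack-name "$1" \
        --query 'Stacks[0].StackStatus' \
        --output text
}

# Print the resource name and the reason given for every event that failed to create.
stack_failures() {
    aws cloudformation describe-stack-events \
        --stack-name "$1" \
        --query "StackEvents[?ResourceStatus=='CREATE_FAILED'].[LogicalResourceId,ResourceStatusReason]" \
        --output text
}
```

Call them with the stack name you chose earlier, e.g. stack_status my-cluster (my-cluster is a placeholder). If the status is ROLLBACK_COMPLETE, stack_failures my-cluster should show the same failure reasons you would find in the console.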
If your stack creation succeeded, you can get a public IP address by going to the Outputs tab. Then you can ssh into it just like you would ssh into an EC2 instance. However, your username here is the cluster administrator user name you entered earlier (your AWS user name, if you followed the steps above), not ec2-user or ubuntu as in an EC2 instance. So, you'd type, e.g.
ssh -i "my-pem-file.pem" <my-aws-username>@<ip-address>
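The stack's outputs can also be read from the command line instead of the console. A minimal sketch, assuming the AWS CLI is installed and configured for the region you launched in; the function name stack_outputs is made up for illustration.

```shell
# Hypothetical helper: print each output of a stack as "key<TAB>value".
# Assumes the AWS CLI is installed and configured with your credentials.
stack_outputs() {
    aws cloudformation describe-stacks \
        --stack-name "$1" \
        --query 'Stacks[0].Outputs[].[OutputKey,OutputValue]' \
        --output text
}
```

Run stack_outputs with your stack name and look for the entry holding the public IP address. If ssh then refuses your key because its permissions are too open, running chmod 400 on the .pem file usually fixes that.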
When you're done with the cluster, select it from the menu and then Actions-delete stack
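Deleting the stack can be done from the command line as well. Again this is just a sketch under the same assumptions as before (AWS CLI installed and configured for the right region); the function name delete_stack is made up for illustration.

```shell
# Hypothetical helper: delete a stack and block until the deletion has finished.
# Assumes the AWS CLI is installed and configured with your credentials.
delete_stack() {
    aws cloudformation delete-stack --stack-name "$1" &&
        aws cloudformation wait stack-delete-complete --stack-name "$1"
}
```

Call it as delete_stack followed by your stack name. The wait step only returns once CloudFormation reports the deletion as complete, so you know the cluster is really gone.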