Over the last year, working with and on AWS has become an almost daily task in my life as a consultant and trainer. From a trainer's perspective there is not much to say about it, because we use a very limited number of services (EC2, mostly) and only a few trainers use the platform to deliver our trainings. When the number of services you use is small and the number of people managing the platform is small as well, it is not very hard to keep things simple. As a consultant, on the other hand, you support customers that use dozens of services and have tens or hundreds of EC2 instances, and there things quickly become tricky and unmanageable if you do not care about simplicity right from the beginning.
AWS accounts
To get access to AWS you need an AWS account, nothing special about that. When you have one account this is a no-brainer: everything you configure and need to manage is under that account. But more and more companies have not just one account but several. Maybe there is one account for testing and one account for production. Maybe there even is an account for every department or, even worse, one account per application. Who is going to control and manage that? To avoid having to clean up the mess after you have been on AWS for some time, you should work with AWS Organizations right from the beginning. Without AWS Organizations your accounts will become unmanageable, and it will be a nightmare if you want to put some centralized control in place.
IAM users, groups, roles and policies
Identity and Access Management (IAM) is a crucial part of AWS. This is probably where you need to think most _before_ you start using AWS. Do you want to create an IAM user for each of your employees? Or do you want to make use of identity federation? This is a decision you should be clear about right from the beginning. If you have hundreds or thousands of people working on AWS, it is probably not a good idea to create an IAM user for all of them. You also want to think about multi-factor authentication.
A lot of services in AWS need to assume an IAM role. When you use the AWS console to configure your services, there is often an option to let AWS create the required IAM roles in the background for you. Do you really want that? Let some time go by and you have so many IAM roles that it is nearly impossible to know which ones are really required and for what purpose. You definitely need to limit permissions to a specific group that manages all the IAM stuff, especially roles and policies. When you have multiple accounts, make sure that you have the same roles and policies in every account and that you have naming conventions. If you don’t do this, well, again there will be chaos.
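To make the "same roles in every account, with naming conventions" point a bit more concrete, here is a minimal sketch of such a check. It assumes you have already collected the role names per account (for example with the IAM `list_roles` call). The naming pattern and the account labels are purely hypothetical, so adapt them to your own convention.

```python
import re

# Hypothetical convention: <team>-<purpose>-role, all lowercase,
# e.g. "dba-backup-role". Replace with whatever your company agreed on.
ROLE_NAME_PATTERN = re.compile(r"^[a-z0-9]+-[a-z0-9]+-role$")


def audit_roles(roles_by_account):
    """Compare IAM role names across accounts.

    roles_by_account maps an account label to a set of role names.
    Returns a tuple (missing, badly_named):
      missing      maps an account to the roles it lacks but another account has
      badly_named  is the sorted list of role names that break the convention
    """
    all_roles = set().union(*roles_by_account.values())
    missing = {
        account: sorted(all_roles - roles)
        for account, roles in roles_by_account.items()
        if all_roles - roles
    }
    badly_named = sorted(r for r in all_roles if not ROLE_NAME_PATTERN.match(r))
    return missing, badly_named


if __name__ == "__main__":
    missing, badly_named = audit_roles({
        "test": {"dba-backup-role", "MyTempRole"},
        "prod": {"dba-backup-role", "ops-patch-role"},
    })
    print("missing per account:", missing)
    print("names breaking the convention:", badly_named)
```

Running something like this regularly shows drift between accounts early, before nobody remembers why a role exists.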
Tags and automation
Right from the beginning: Think about tags. Without tags you will be lost. At some point you will need automation, and you can forget about automation if you do not tag your services and resources. I would even go further and enforce your most important tags, using exactly the same tags in all of your AWS accounts. Otherwise, again, there will be chaos.
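As an illustration of what enforcing your most important tags could look like, the following sketch reports resources that lack mandatory tags. The tag names in `REQUIRED_TAGS` are hypothetical examples. The input uses the `Key`/`Value` list shape that many AWS APIs return for tags, but how you collect the resources is left open here.

```python
# Hypothetical set of mandatory tags; pick the ones that matter to you.
REQUIRED_TAGS = {"Owner", "Environment", "CostCenter"}


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the sorted list of required tags a resource does not carry.

    resource_tags is a list of {"Key": ..., "Value": ...} dicts.
    A tag with an empty value counts as missing.
    """
    present = {t["Key"] for t in resource_tags if t.get("Value", "").strip()}
    return sorted(set(required) - present)


def untagged_report(resources):
    """Map each non-compliant resource id to the tags it is missing."""
    report = {}
    for resource_id, tags in resources.items():
        gaps = missing_tags(tags)
        if gaps:
            report[resource_id] = gaps
    return report


if __name__ == "__main__":
    resources = {
        "i-0abc": [
            {"Key": "Owner", "Value": "dba"},
            {"Key": "Environment", "Value": "prod"},
            {"Key": "CostCenter", "Value": "42"},
        ],
        "vol-0def": [{"Key": "Owner", "Value": ""}],
    }
    print(untagged_report(resources))
```

A report like this is only detection. To really enforce tags across accounts you would combine it with organization-wide policies.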
Right from the beginning: Think about automation. Automation is key because you will quickly have more services and resources to manage than you would have on premises. Deploying new applications on AWS by using the AWS services is easy. The more people are allowed to do that, the quicker you will have tens, hundreds or even thousands of services to manage. How do you want to manage that without automation? Do you want to patch hundreds of EC2 instances by hand? Do you want to stop and start EC2 instances by hand, because instances that are running but not required all the time increase costs?
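The stop/start question is a good example of where tags and automation meet. The sketch below only contains the decision logic: given a list of instances, it returns the ones that could be stopped outside office hours. The `Schedule` tag, its value `office-hours`, and the hours themselves are assumptions made for this example. The actual stopping (for example via the EC2 `stop_instances` call) and the scheduling of the script are left out.

```python
from datetime import datetime, time

# Hypothetical office hours: Monday to Friday, 07:00 to 19:00.
OFFICE_START = time(7, 0)
OFFICE_END = time(19, 0)


def is_office_hours(now):
    """True on weekdays between OFFICE_START (inclusive) and OFFICE_END (exclusive)."""
    return now.weekday() < 5 and OFFICE_START <= now.time() < OFFICE_END


def instances_to_stop(instances, now):
    """Return ids of running instances tagged Schedule=office-hours
    when 'now' is outside office hours.

    Each instance is a dict with "InstanceId", "State" and "Tags" (a plain dict).
    """
    if is_office_hours(now):
        return []
    return [
        inst["InstanceId"]
        for inst in instances
        if inst["State"] == "running"
        and inst["Tags"].get("Schedule") == "office-hours"
    ]


if __name__ == "__main__":
    fleet = [
        {"InstanceId": "i-001", "State": "running", "Tags": {"Schedule": "office-hours"}},
        {"InstanceId": "i-002", "State": "running", "Tags": {"Schedule": "always-on"}},
        {"InstanceId": "i-003", "State": "stopped", "Tags": {"Schedule": "office-hours"}},
    ]
    print(instances_to_stop(fleet, datetime.now()))
```

Note that without the tag there is nothing the script could base its decision on, which is exactly why tagging has to come first.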
Backups and restores
Where do you put your backups when you are on AWS? In the same availability zone? In the same region but in another availability zone? In another region? Do you want to push the backups to another cloud provider?
All of that is possible, but where you have to push the backups depends on the classification of your data. Storing backups in another region is the safest option when you want to stick with AWS completely, but it comes at a cost, because transferring data out of a region is charged. Cross-region backups also come with a time component: How fast do you need your backups back in case an AWS region goes down completely, and which region will you use for the disaster environment? Do you have a complete copy of your data and services in another region? That will increase costs again.
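To get a feeling for the time and cost component, a rough back-of-the-envelope calculation already helps. The sketch below is only that: the throughput and the price per GB are placeholders you have to replace with your own measurements and the current price list, they are not actual AWS figures.

```python
def restore_estimate(size_gb, throughput_mbit_s, price_per_gb):
    """Rough estimate for copying a backup between regions.

    size_gb            size of the backup in (decimal) gigabytes
    throughput_mbit_s  sustained throughput you expect, in megabits per second
    price_per_gb       transfer price per gigabyte (placeholder, check the price list)

    Returns a tuple (hours, cost).
    """
    size_megabits = size_gb * 8 * 1000  # 1 GB = 8000 megabits
    hours = size_megabits / throughput_mbit_s / 3600
    cost = size_gb * price_per_gb
    return hours, cost


if __name__ == "__main__":
    # Placeholder numbers: 900 GB at a sustained 1000 Mbit/s and 0.02 per GB.
    hours, cost = restore_estimate(900, 1000, 0.02)
    print(f"about {hours:.1f} hours and {cost:.2f} in transfer charges")
```

If the resulting number of hours is more than your business can tolerate, that is an argument for keeping a copy of the data in the disaster region up front, with the additional costs this brings.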
Cleanup, cleanup, cleanup
You definitely need procedures for cleaning up your AWS accounts. How many EBS snapshots do you currently have, and do you know if all of them are required? How many VPCs, subnets, internet gateways and routing tables do you have? Do you need all of them? How many key pairs exist? How many Lambda functions, how many IAM roles and policies exist, how many S3 buckets do you have and for what purpose? If you do not clean up regularly, again, there will be chaos.
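As one small building block for such a procedure, here is a sketch that picks cleanup candidates out of a list of EBS snapshots. The input uses the field names of the EC2 `describe_snapshots` response (`SnapshotId`, `StartTime`, `Tags`). The 90-day limit and the `Keep` tag are assumptions for this example, and the function only reports candidates, it does not delete anything.

```python
from datetime import datetime, timedelta, timezone


def stale_snapshots(snapshots, now, max_age_days=90, keep_tag="Keep"):
    """Return ids of snapshots older than max_age_days that are not
    protected by a tag keep_tag=true.

    'snapshots' is a list of dicts with "SnapshotId", "StartTime"
    (timezone-aware datetime) and optionally "Tags" (Key/Value list).
    """
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for snap in snapshots:
        tags = {t["Key"]: t["Value"] for t in snap.get("Tags", [])}
        if snap["StartTime"] < cutoff and tags.get(keep_tag) != "true":
            stale.append(snap["SnapshotId"])
    return stale


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    snapshots = [
        {"SnapshotId": "snap-old", "StartTime": now - timedelta(days=200)},
        {"SnapshotId": "snap-kept", "StartTime": now - timedelta(days=200),
         "Tags": [{"Key": "Keep", "Value": "true"}]},
        {"SnapshotId": "snap-new", "StartTime": now - timedelta(days=3)},
    ]
    print(stale_snapshots(snapshots, now))
```

The same pattern (list, filter by age and tags, review, then delete) applies to most of the other resource types mentioned above.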
Test, test, test…but wait, that will cost something
Using services in a public cloud does not mean you do not need to test. But testing will cost you money as well. Every service you use will be charged, no matter if you use it for testing purposes or for production. If you test in one account and then want to apply the same setup in another account or another region, do not expect everything to just work. You will need to test again, because maybe the other account is missing some IAM roles or policies, or some services are not available at all in the other region.
It is still about engineering
Using cloud services is easy, everything is managed by the cloud provider. So much for the marketing side. It is true that you save time and money because you are using services and do not need to take care of setting up and managing them. But it is also true that all the stuff you need to put around these services requires a lot of engineering. Setting up your business, or parts of your business, on AWS or any other cloud provider requires a lot of brain work. You do not need to spend the time and money at the beginning, because getting started is easy. You will probably spend a lot of time and money once you have been in a public cloud for some months or even years and start to realize that managing all that cloud stuff can be really hard.