Autoscale your architecture
FlexStack stands alone with its unique architecture autoscaling that fine-tunes your infrastructure for cost-savings – automatically and without downtime. Gone are the days of over-provisioning and spreadsheet planning. Rip it and ship it.
No scale, hyperscale
Use a request-based architecture when your web traffic is low and scale up to a load-balanced architecture when you've got serious traffic.
Public IPs cost money
FlexStack environments are designed to save you money. We know exactly when public IP pricing costs more than using a NAT gateway with private subnets.
Zero-config spot placement
We use spot placement in development and staging environments to save money. In production, we use on-demand placement for reliability. No configuration required.
CPU architecture selection
ARM processors are cheaper than x86 processors. Use "flex" configuration if your application is multi-platform to automate the selection of the cheapest processor.
Automatic service discovery
Your FlexStack services are automatically registered with Route 53 and Service Connect for easy service discovery and communication, removing the cost and latency of network load balancers. Use the ECS metadata endpoint to discover services in the same availability zone to reduce inter-zone data transfer costs.
Autoscale your infrastructure
Adding autoscaling to your services is a great way to save money and ensure your services are always available. FlexStack takes the guess work out of setting up autoscaling for your services, so you can focus on building great products.
Horizontal scaling
Automatically increase or decrease the number of running tasks in your service based on CPU, memory, and request thresholds. Use horizontal scaling to handle changes in traffic and ensure your services are always available.
Scale out
If utilization is trending above or below some percentage a scaling event will occur. An alarm associated with triggering a scale-out event evaluates metrics for 3 minutes. If the average value is greater than this % for all three samples, a scale-out event occurs.
Scale in
The scale-in alarm is more conservative. It evaluates average CPU utilization for a cooldown period before triggering a scale-in event. This prevents the service from scaling between task counts too quickly, which can negatively impact availability. The value for the scale-in alarm is 5% less than the threshold value to avoid too much fluctuation in service capacity.
Request scaling
Request scaling allows you to scale your service based on the number of requests it receives. This can be useful if you have a service that is network-bound and you want to ensure that it can handle a large number of requests.