Scaling Groups
Overview
Scaling Groups automatically add or remove instances based on real-time CPU and memory metrics. You define a template (plan, image, networking) and one or more scaling policies; the system then creates and destroys instances as load changes, within the limits you set.
Key characteristics:
- Horizontal scaling -- Automatically creates and destroys instances to match demand.
- Metric-driven -- Scales based on average CPU or memory usage across the group.
- Template-based -- Every instance in a group uses the same plan, image, and cloud-init configuration.
- Optional Load Balancer integration -- New instances can automatically register as backend targets on a load balancer, and deregister when removed.
- Configurable limits -- Set minimum and maximum instance counts to control costs and availability.
Creating a Scaling Group
- Navigate to Scaling Groups in the sidebar (under Compute)
- Click Create Scaling Group
- Fill in the configuration:
Configuration Fields
| Field | Description |
|---|---|
| Name | A descriptive name for the group |
| Location | The data center / hypervisor group to deploy in (must have autoscaling enabled) |
| Plan Type | Select a plan category |
| Plan | The instance plan (CPU, RAM, Storage) for all instances in the group |
| Image | The OS or custom image to deploy. Choose from General (base images) or My Images (your custom snapshots) |
| Cloud-Init | Optional YAML configuration applied to every instance at boot |
| SSH Keys | Optional SSH keys injected into instances |
| User Scripts | Optional scripts to run on instance creation |
| VPC + Subnet | Optional VPC networking. Instances are deployed into the selected subnet |
| Load Balancer | Optional. Select a load balancer and backend to auto-register instances as targets |
| Min Instances | Minimum number of instances to maintain (default: 1). The group will always keep at least this many running |
| Max Instances | Maximum number of instances allowed (default: 5). Scaling will never exceed this |
- Click Create
If Min Instances is greater than 0, that many instances are deployed as soon as the group is created.
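As a rough illustration, the template and limits above can be modelled as a small data structure. This is a sketch only: the class name `ScalingGroupConfig`, its field names, and the example values are hypothetical and not part of the product's API. The rule that the maximum must not be below the minimum is an assumption; only the defaults (1 and 5) come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingGroupConfig:
    """Hypothetical model of the creation form; not the platform's API."""
    name: str
    location: str
    plan: str
    image: str
    cloud_init: Optional[str] = None  # optional YAML applied at boot
    min_instances: int = 1            # documented default
    max_instances: int = 5            # documented default

    def validate(self) -> None:
        # Assumed sanity checks; the real form may enforce different rules.
        if self.min_instances < 0:
            raise ValueError("min_instances must be >= 0")
        if self.max_instances < self.min_instances:
            raise ValueError("max_instances must be >= min_instances")

    def initial_instances(self) -> int:
        # The group deploys its minimum as soon as it is created.
        return self.min_instances


cfg = ScalingGroupConfig(name="web", location="example-location",
                         plan="example-plan", image="example-image")
cfg.validate()
print(cfg.initial_instances())
```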
Managing a Scaling Group
Click Manage (cog icon) on any scaling group to access its management page.
Overview
The overview card displays the group's configuration: location, plan, image, VPC/subnet, load balancer, current instance count against the min/max limits, and cloud-init configuration. From this card you can edit:
- Plan -- Click "Change" to select a new plan type and plan. New instances will use the updated plan.
- Min/Max Instances -- Click "Edit" to adjust limits. If the new minimum exceeds the current count, instances are created immediately. If the new maximum is below the current count, excess instances are removed.
- Cloud-Init -- Click "Edit" to modify the cloud-init YAML in a code editor. Changes apply to new instances only.
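The effect of changing Min/Max Instances can be sketched as a small reconciliation function. The name `reconcile` is illustrative; the sketch assumes the system adds or removes exactly enough instances to bring the count back inside the limits.

```python
def reconcile(current: int, min_instances: int, max_instances: int) -> int:
    """Return how many instances to add (positive) or remove (negative)."""
    if current < min_instances:
        return min_instances - current   # create instances up to the minimum
    if current > max_instances:
        return max_instances - current   # remove the excess above the maximum
    return 0                             # already within limits


print(reconcile(current=2, min_instances=3, max_instances=5))  # 1
print(reconcile(current=6, min_instances=1, max_instances=4))  # -2
```

The same calculation would apply whenever the count falls below the minimum for another reason, such as resuming a paused group.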
Pause / Resume
- Pause stops all automatic scaling evaluations. No instances are created or destroyed while paused. Existing instances continue running.
- Resume reactivates scaling. If the current count is below minimum, instances are created immediately.
Destroy
Destroying a scaling group removes all its instances, policies, and activity history. This action is permanent.
Scaling Policies
Policies define the rules for when and how the group scales. A group can have multiple policies (e.g., one for CPU and one for memory).
Creating a Policy
From the scaling group management page, click Add Policy and configure:
| Field | Description |
|---|---|
| Metric | CPU or Memory -- the metric to monitor |
| Scale Up Threshold | Percentage (1-100). When the group average exceeds this, instances are added |
| Scale Down Threshold | Percentage (0-99). When the group average drops below this, instances are removed |
| Scale Up Step | Number of instances to add per scale-up event (default: 1) |
| Scale Down Step | Number of instances to remove per scale-down event (default: 1) |
| Scale Up Cooldown | Seconds to wait after a scale-up before allowing another (default: 300) |
| Scale Down Cooldown | Seconds to wait after a scale-down before allowing another (default: 600) |
| Evaluation Interval | How often the metric is checked, in seconds (default: 30) |
| Evaluation Window | How far back to average the metric, in seconds (default: 120) |
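For reference, the policy fields and their documented defaults and ranges can be written down as a data structure. This is a sketch: the class `ScalingPolicy` and its attribute names are hypothetical, and the lowercase metric strings are an assumed encoding of the CPU/Memory choice.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical model of a policy; field names are illustrative."""
    metric: str                     # "cpu" or "memory" (assumed encoding)
    scale_up_threshold: float       # percent, 1-100
    scale_down_threshold: float     # percent, 0-99
    scale_up_step: int = 1
    scale_down_step: int = 1
    scale_up_cooldown: int = 300    # seconds
    scale_down_cooldown: int = 600  # seconds
    evaluation_interval: int = 30   # seconds
    evaluation_window: int = 120    # seconds

    def validate(self) -> None:
        # Only the ranges stated in the table are checked here.
        if self.metric not in ("cpu", "memory"):
            raise ValueError("metric must be 'cpu' or 'memory'")
        if not 1 <= self.scale_up_threshold <= 100:
            raise ValueError("scale_up_threshold must be within 1-100")
        if not 0 <= self.scale_down_threshold <= 99:
            raise ValueError("scale_down_threshold must be within 0-99")


policy = ScalingPolicy(metric="cpu", scale_up_threshold=80, scale_down_threshold=20)
policy.validate()
print(policy.scale_up_cooldown, policy.scale_down_cooldown)  # 300 600
```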
How Evaluation Works
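- The system checks active scaling groups every 30 seconds, so a policy is evaluated at most that often regardless of its evaluation interval
- For each policy whose evaluation interval has elapsed, it averages the metric across all running instances in the group over the policy's evaluation window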
- If the average exceeds the scale up threshold and the cooldown has elapsed, instances are added (up to the max)
- If the average drops below the scale down threshold and the cooldown has elapsed, instances are removed (down to the min)
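The steps above can be sketched as a single decision function. This is not the platform's implementation: the name `decide` and its parameters are illustrative, and clamping a step to the remaining headroom is an assumption about how "up to the max" and "down to the min" are applied.

```python
from statistics import mean
from typing import Sequence


def decide(samples: Sequence[float], current: int,
           min_instances: int, max_instances: int,
           up_threshold: float, down_threshold: float,
           up_step: int = 1, down_step: int = 1,
           up_cooldown_elapsed: bool = True,
           down_cooldown_elapsed: bool = True) -> int:
    """Return the change in instance count for one evaluation pass.

    `samples` holds one metric reading (percent) per running instance.
    """
    if not samples:
        return 0  # nothing to average, so take no action
    average = mean(samples)
    if average > up_threshold and up_cooldown_elapsed:
        # Add up_step instances, but never go past the maximum.
        return max(0, min(up_step, max_instances - current))
    if average < down_threshold and down_cooldown_elapsed:
        # Remove down_step instances, but never go below the minimum.
        return -max(0, min(down_step, current - min_instances))
    return 0


# Group average is 85%, above a 75% threshold: add one instance.
print(decide([90, 80], current=2, min_instances=1, max_instances=5,
             up_threshold=75, down_threshold=25))  # 1
```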
Cooldown Periods
Cooldowns prevent rapid oscillation. After a scale-up event, no further scale-up can happen until the cooldown expires. Scale-down has its own independent cooldown. This gives new instances time to absorb load before the system decides to scale further.
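A cooldown check amounts to comparing the time since the last event of that kind with the configured cooldown. The sketch below assumes timestamps in seconds and treats "no previous event" as an elapsed cooldown; the function name and both assumptions are illustrative.

```python
from typing import Optional


def cooldown_elapsed(last_event_at: Optional[float],
                     cooldown_seconds: int, now: float) -> bool:
    """True if enough time has passed since the last event of this kind."""
    if last_event_at is None:
        return True  # assumed: no earlier event means nothing to wait for
    return now - last_event_at >= cooldown_seconds


# Scale-up happened at t=1000 with a 300-second cooldown.
print(cooldown_elapsed(1000, 300, now=1200))  # False: only 200 s have passed
print(cooldown_elapsed(1000, 300, now=1300))  # True: the cooldown has expired
# Scale-down tracks its own last event, so it is unaffected by the scale-up.
print(cooldown_elapsed(None, 600, now=1200))  # True
```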
Load Balancer Integration
When a load balancer and backend are selected:
- Scale up: New instances are automatically registered as targets in the specified backend. Traffic begins routing to them as soon as they pass health checks.
- Scale down: Instances are deregistered from the backend before being destroyed, allowing in-flight requests to complete.
The result is a fully automatic pipeline: the load balancer distributes incoming traffic across the scaling group's instances, and the group grows and shrinks with demand.
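The ordering of registration and deregistration can be illustrated with an in-memory stand-in. The `Backend` class and both functions are hypothetical and only model the sequence of steps; health checks and connection draining are not simulated.

```python
from typing import List, Tuple


class Backend:
    """Hypothetical in-memory stand-in for a load balancer backend."""

    def __init__(self) -> None:
        self.targets = set()

    def register(self, instance_id: str) -> None:
        self.targets.add(instance_id)

    def deregister(self, instance_id: str) -> None:
        self.targets.discard(instance_id)


def add_instance(backend: Backend, instances: List[str],
                 instance_id: str) -> None:
    # Scale up: create the instance, then register it as a target.
    instances.append(instance_id)
    backend.register(instance_id)


def remove_instance(backend: Backend, instances: List[str],
                    instance_id: str) -> List[Tuple[str, str]]:
    # Scale down: deregister first so no new traffic arrives, then destroy.
    events = []
    backend.deregister(instance_id)
    events.append(("deregister", instance_id))
    instances.remove(instance_id)
    events.append(("destroy", instance_id))
    return events


backend, instances = Backend(), []
add_instance(backend, instances, "i-1")
add_instance(backend, instances, "i-2")
print(remove_instance(backend, instances, "i-2"))
```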
Activity Log
The management page shows a real-time activity log of all scaling events:
| Column | Description |
|---|---|
| Time | When the event occurred |
| Action | Scale Up (green), Scale Down (yellow), or Error (red) |
| Metric | The metric reading that triggered the action (e.g., CPU 85%) |
| Instance | The instance created or destroyed (clickable link) |
| Message | Human-readable description of what happened |
Activity logs are retained for 30 days and then automatically cleaned up.
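The 30-day retention rule is equivalent to dropping every entry older than a cutoff. A minimal sketch, assuming each log entry is a dictionary with a timezone-aware `time` field (the entry shape and function name are hypothetical):

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention period stated above


def prune(entries, now):
    """Keep only entries whose timestamp is within the retention period."""
    cutoff = now - RETENTION
    return [entry for entry in entries if entry["time"] >= cutoff]


now = datetime(2024, 3, 31, tzinfo=timezone.utc)
log = [
    {"time": datetime(2024, 3, 30, tzinfo=timezone.utc), "action": "Scale Up"},
    {"time": datetime(2024, 2, 15, tzinfo=timezone.utc), "action": "Scale Down"},
]
print([entry["action"] for entry in prune(log, now)])  # ['Scale Up']
```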
Instances
The instances section shows all instances currently managed by the scaling group, with status and creation time. From here you can:
- Manage (cog icon) -- Navigate to the instance's full management page
- Destroy (trash icon) -- Remove a specific instance from the group
If you manually destroy an instance and the count drops below the minimum, the system will automatically create a replacement.
Billing
Instances in a scaling group are billed identically to regular Cloud Service instances -- hourly charges based on the selected plan. There is no additional charge for the scaling group itself.
Costs scale linearly with instance count: if your group scales from 2 to 5 instances, you pay for 5 instances for as long as they run. When the group scales back down, billing stops for the removed instances.
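As a worked example of the arithmetic, the cost of a group is the total number of instance-hours multiplied by the plan's hourly rate. The rate of 0.02 per hour below is a made-up figure, and the sketch ignores how partial hours are rounded, which is not specified here.

```python
from decimal import Decimal


def group_cost(periods, hourly_rate):
    """Sum instance-hours over (instance_count, hours) periods, times the rate."""
    instance_hours = sum(count * hours for count, hours in periods)
    return instance_hours * hourly_rate


rate = Decimal("0.02")  # hypothetical hourly price for the selected plan
# 2 instances for 10 hours, 5 instances for 3 hours, then 2 instances for 11 hours
periods = [(2, 10), (5, 3), (2, 11)]
print(group_cost(periods, rate))  # 57 instance-hours * 0.02 = 1.14
```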
Troubleshooting
Instances not being created
- Verify the scaling group status is Active (not Paused)
- Check that your account has sufficient credit balance
- Verify the location has available hypervisors with enough resources
- If using a private VPC subnet, ensure a NAT gateway is attached
Scale up not triggering
- Confirm at least one scaling policy is enabled
- Check that the group average, not just a single instance, exceeds the scale-up threshold
- Check whether the scale-up cooldown has elapsed since the last scale-up
- Check that the group is not already at Max Instances
- Review the activity log for error entries
Instances created but not accessible
- New instances take a minute or two to boot and apply cloud-init
- If using a load balancer, check that health checks are passing
- Verify the VPC subnet has available IPs
"No hypervisors available"
All hypervisors in the location are at capacity. Contact your provider about adding more compute resources.