Scaling Groups

Overview

Scaling Groups automatically add or remove instances based on real-time CPU and memory metrics. Define a template (plan, image, networking) and one or more scaling policies; the system then creates and destroys instances within the limits you set.

Key characteristics:

  • Horizontal scaling -- Automatically creates and destroys instances to match demand.
  • Metric-driven -- Scales based on average CPU or memory usage across the group.
  • Template-based -- Every instance in a group uses the same plan, image, and cloud-init configuration.
  • Optional Load Balancer integration -- New instances can automatically register as backend targets on a load balancer, and deregister when removed.
  • Configurable limits -- Set minimum and maximum instance counts to control costs and availability.

Creating a Scaling Group

  1. Navigate to Scaling Groups in the sidebar (under Compute)
  2. Click Create Scaling Group
  3. Fill in the configuration:

Configuration Fields

| Field | Description |
| --- | --- |
| Name | A descriptive name for the group |
| Location | The data center / hypervisor group to deploy in (must have autoscaling enabled) |
| Plan Type | Select a plan category |
| Plan | The instance plan (CPU, RAM, Storage) for all instances in the group |
| Image | The OS or custom image to deploy. Choose from General (base images) or My Images (your custom snapshots) |
| Cloud-Init | Optional YAML configuration applied to every instance at boot |
| SSH Keys | Optional SSH keys injected into instances |
| User Scripts | Optional scripts to run on instance creation |
| VPC + Subnet | Optional VPC networking. Instances are deployed into the selected subnet |
| Load Balancer | Optional. Select a load balancer and backend to auto-register instances as targets |
| Min Instances | Minimum number of instances to maintain (default: 1). The group will always keep at least this many running |
| Max Instances | Maximum number of instances allowed (default: 5). Scaling will never exceed this |

  4. Click Create

If min instances is greater than 0, the initial instances are deployed immediately.
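
The fields above can be pictured as a single configuration record. The following is a minimal sketch only: `ScalingGroupConfig`, its attribute names, and the validation rules are hypothetical and chosen for illustration; they are not the platform's API. The defaults for the instance limits are the ones documented in the table.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScalingGroupConfig:
    """Hypothetical record of the form fields; not a real API object."""
    name: str
    location: str                      # must have autoscaling enabled
    plan: str
    image: str
    cloud_init: Optional[str] = None   # optional YAML applied at boot
    ssh_keys: List[str] = field(default_factory=list)
    load_balancer: Optional[str] = None
    min_instances: int = 1             # documented default
    max_instances: int = 5             # documented default

    def validate(self) -> None:
        # Assumed sanity rules; the platform may enforce others.
        if self.min_instances < 0:
            raise ValueError("min_instances must be >= 0")
        if self.max_instances < self.min_instances:
            raise ValueError("max_instances must be >= min_instances")

    def initial_instances(self) -> int:
        # Assumption: the group starts with exactly min_instances.
        return self.min_instances
```

Checking that the minimum does not exceed the maximum before submitting the form avoids a group that can never satisfy both limits.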


Managing a Scaling Group

Click Manage (cog icon) on any scaling group to access its management page.

Overview

The overview card displays the group's configuration: location, plan, image, VPC/subnet, load balancer, current instance count vs limits, and cloud-init config. You can edit:

  • Plan -- Click "Change" to select a new plan type and plan. New instances will use the updated plan.
  • Min/Max Instances -- Click "Edit" to adjust limits. If the new minimum exceeds the current count, instances are created immediately. If the new maximum is below the current count, excess instances are removed.
  • Cloud-Init -- Click "Edit" to modify the cloud-init YAML in a code editor. Changes apply to new instances only.
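
The effect of changing Min/Max Instances can be illustrated with a small helper. This is a sketch under the assumption that the group simply moves the current count back inside the new limits; `reconcile_count` is a hypothetical name, not a platform function.

```python
def reconcile_count(current: int, min_instances: int, max_instances: int) -> int:
    """Return how many instances to add (positive) or remove (negative)."""
    if current < min_instances:
        # New minimum exceeds the current count: create the difference.
        return min_instances - current
    if current > max_instances:
        # New maximum is below the current count: remove the excess.
        return max_instances - current
    # Already within limits: nothing to do.
    return 0


# Raising the minimum to 3 with 2 running instances asks for one more.
delta = reconcile_count(current=2, min_instances=3, max_instances=5)
```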

Pause / Resume

  • Pause stops all automatic scaling evaluations. No instances are created or destroyed while paused. Existing instances continue running.
  • Resume reactivates scaling. If the current count is below minimum, instances are created immediately.

Destroy

Destroying a scaling group removes all its instances, policies, and activity history. This action is permanent.


Scaling Policies

Policies define the rules for when and how the group scales. A group can have multiple policies (e.g., one for CPU and one for memory).

Creating a Policy

From the scaling group management page, click Add Policy and configure:

| Field | Description |
| --- | --- |
| Metric | CPU or Memory -- the metric to monitor |
| Scale Up Threshold | Percentage (1-100). When the group average exceeds this, instances are added |
| Scale Down Threshold | Percentage (0-99). When the group average drops below this, instances are removed |
| Scale Up Step | Number of instances to add per scale-up event (default: 1) |
| Scale Down Step | Number of instances to remove per scale-down event (default: 1) |
| Scale Up Cooldown | Seconds to wait after a scale-up before allowing another (default: 300) |
| Scale Down Cooldown | Seconds to wait after a scale-down before allowing another (default: 600) |
| Evaluation Interval | How often the metric is checked, in seconds (default: 30) |
| Evaluation Window | How far back to average the metric, in seconds (default: 120) |
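
As an illustration, the policy fields and their documented ranges and defaults could be modelled as follows. `ScalingPolicy` and its attribute names are hypothetical. The check that the scale-down threshold sits below the scale-up threshold is an assumption added here to avoid a policy that triggers in both directions; it is not a documented rule.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical model of one policy; defaults follow the table above."""
    metric: str                      # "cpu" or "memory"
    scale_up_threshold: float        # percent, 1-100
    scale_down_threshold: float      # percent, 0-99
    scale_up_step: int = 1
    scale_down_step: int = 1
    scale_up_cooldown: int = 300     # seconds
    scale_down_cooldown: int = 600   # seconds
    evaluation_interval: int = 30    # seconds
    evaluation_window: int = 120     # seconds

    def validate(self) -> None:
        if self.metric not in ("cpu", "memory"):
            raise ValueError("metric must be 'cpu' or 'memory'")
        if not 1 <= self.scale_up_threshold <= 100:
            raise ValueError("scale_up_threshold must be in 1-100")
        if not 0 <= self.scale_down_threshold <= 99:
            raise ValueError("scale_down_threshold must be in 0-99")
        # Assumption (not from the docs): keep a gap between the thresholds.
        if self.scale_down_threshold >= self.scale_up_threshold:
            raise ValueError("scale_down_threshold must be below scale_up_threshold")


policy = ScalingPolicy(metric="cpu", scale_up_threshold=75, scale_down_threshold=25)
policy.validate()
```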

How Evaluation Works

  1. The system checks active scaling groups every 30 seconds
  2. For each policy whose evaluation interval has elapsed, it collects the average metric across all running instances in the group
  3. If the average exceeds the scale up threshold and the cooldown has elapsed, instances are added (up to the max)
  4. If the average drops below the scale down threshold and the cooldown has elapsed, instances are removed (down to the min)
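
The steps above can be sketched as a pure decision function. This is an illustration, not the platform's implementation: `group_average` and `decide` are hypothetical names, the cooldown state is passed in as booleans, and the clamping to min/max follows the "up to the max" and "down to the min" wording.

```python
from statistics import mean
from typing import Sequence


def group_average(samples: Sequence[float]) -> float:
    """Average one metric (percent) across all running instances."""
    return mean(samples)


def decide(avg: float, current: int, *, up_threshold: float, down_threshold: float,
           up_step: int = 1, down_step: int = 1,
           min_instances: int = 1, max_instances: int = 5,
           up_cooldown_elapsed: bool = True,
           down_cooldown_elapsed: bool = True) -> int:
    """Return the instance delta: positive adds, negative removes, 0 is a no-op."""
    if avg > up_threshold and up_cooldown_elapsed:
        target = min(current + up_step, max_instances)    # never exceed the max
        return max(0, target - current)
    if avg < down_threshold and down_cooldown_elapsed:
        target = max(current - down_step, min_instances)  # never drop below the min
        return min(0, target - current)
    return 0


avg = group_average([80.0, 90.0])
delta = decide(avg, current=2, up_threshold=75, down_threshold=25)
```

With two instances averaging 85% CPU and a scale-up threshold of 75%, the sketch asks for one more instance; the same reading with five instances already running returns 0 because the maximum has been reached.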

Cooldown Periods

Cooldowns prevent rapid oscillation. After a scale-up event, no further scale-up can happen until the cooldown expires. Scale-down has its own independent cooldown. This gives new instances time to absorb load before the system decides to scale further.
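
The independence of the two cooldowns can be shown with a small tracker. This is a sketch with hypothetical names (`CooldownTracker`, `record`, `elapsed`); timestamps are plain seconds supplied by the caller so the behaviour is easy to follow.

```python
from typing import Dict, Optional


class CooldownTracker:
    """Tracks scale-up and scale-down cooldowns separately (illustrative only)."""

    def __init__(self, up_cooldown: int = 300, down_cooldown: int = 600) -> None:
        self.cooldowns: Dict[str, int] = {"up": up_cooldown, "down": down_cooldown}
        self.last_event: Dict[str, Optional[float]] = {"up": None, "down": None}

    def record(self, direction: str, now: float) -> None:
        # Remember when the last event in this direction happened.
        self.last_event[direction] = now

    def elapsed(self, direction: str, now: float) -> bool:
        # A direction with no previous event is always allowed.
        last = self.last_event[direction]
        return last is None or now - last >= self.cooldowns[direction]


tracker = CooldownTracker()
tracker.record("up", now=100)
blocked = not tracker.elapsed("up", now=399)   # still inside the 300 s cooldown
```

Because each direction keeps its own timestamp, a recent scale-up in this sketch does not delay a scale-down, and vice versa.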


Load Balancer Integration

When a load balancer and backend are selected:

  • Scale up: New instances are automatically registered as targets in the specified backend. Traffic begins routing to them as soon as they pass health checks.
  • Scale down: Instances are deregistered from the backend before being destroyed, allowing in-flight requests to complete.

This creates a fully automatic scaling pipeline: traffic arrives at the load balancer, which distributes it across the scaling group's instances, which grow and shrink based on demand.
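
The ordering described above (register after creation, deregister before destruction) can be sketched with an in-memory stand-in for a backend. `Backend`, `scale_up`, and `scale_down` are hypothetical names for illustration; health checks and connection draining are not modelled.

```python
from typing import Callable, List, Set


class Backend:
    """In-memory stand-in for a load balancer backend (illustrative only)."""

    def __init__(self) -> None:
        self.targets: Set[str] = set()

    def register(self, instance_id: str) -> None:
        self.targets.add(instance_id)

    def deregister(self, instance_id: str) -> None:
        self.targets.discard(instance_id)


def scale_up(instance_id: str, backend: Backend, log: List[str]) -> None:
    # The instance exists first, then becomes a target.
    log.append(f"created {instance_id}")
    backend.register(instance_id)
    log.append(f"registered {instance_id}")


def scale_down(instance_id: str, backend: Backend,
               destroy: Callable[[str], None], log: List[str]) -> None:
    # Deregister first so no new traffic is routed to the instance.
    backend.deregister(instance_id)
    log.append(f"deregistered {instance_id}")
    destroy(instance_id)
    log.append(f"destroyed {instance_id}")


backend, log, destroyed = Backend(), [], []
scale_up("i-1", backend, log)
scale_down("i-1", backend, destroyed.append, log)
```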


Activity Log

The management page shows a real-time activity log of all scaling events:

| Column | Description |
| --- | --- |
| Time | When the event occurred |
| Action | Scale Up (green), Scale Down (yellow), or Error (red) |
| Metric | The metric reading that triggered the action (e.g., CPU 85%) |
| Instance | The instance created or destroyed (clickable link) |
| Message | Human-readable description of what happened |

Activity logs are retained for 30 days and then automatically cleaned up.
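
If you export activity entries for your own records, the 30-day retention can be mirrored with a simple filter. This is a sketch only: the entry shape (a dict with a `time` key) and the `prune` helper are assumptions, not the platform's log format.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

RETENTION = timedelta(days=30)  # documented retention period


def prune(entries: List[Dict], now: datetime) -> List[Dict]:
    """Keep only entries whose timestamp falls within the retention period."""
    cutoff = now - RETENTION
    return [entry for entry in entries if entry["time"] >= cutoff]


now = datetime(2024, 3, 31, tzinfo=timezone.utc)
entries = [
    {"time": now - timedelta(days=31), "action": "Scale Up"},    # too old
    {"time": now - timedelta(days=2), "action": "Scale Down"},   # kept
]
kept = prune(entries, now)
```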


Instances

The instances section shows all instances currently managed by the scaling group, with status and creation time. From here you can:

  • Manage (cog icon) -- Navigate to the instance's full management page
  • Destroy (trash icon) -- Remove a specific instance from the group
Note: If you manually destroy an instance and the count drops below the minimum, the system will automatically create a replacement.


Billing

Instances in a scaling group are billed identically to regular Cloud Service instances -- hourly charges based on the selected plan. There is no additional charge for the scaling group itself.

Costs scale linearly with the number of running instances: if your group scales from 2 to 5 instances, you pay for 5 instances for as long as they run. When it scales back down, billing stops for the removed instances.
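
The arithmetic can be illustrated with a short calculation. The hourly rate below is a made-up figure, `group_cost` is a hypothetical helper, and the sketch assumes charges are simply rate times instance-hours; actual rounding and billing granularity may differ.

```python
from decimal import Decimal
from typing import Iterable, Tuple


def group_cost(hourly_rate: Decimal, periods: Iterable[Tuple[int, int]]) -> Decimal:
    """Sum the cost over (instance_count, hours) periods at a flat hourly rate."""
    instance_hours = sum(count * hours for count, hours in periods)
    return hourly_rate * instance_hours


rate = Decimal("0.02")  # hypothetical price per instance-hour
# 2 instances for 20 hours, then 5 instances for 4 hours: 60 instance-hours in total.
cost = group_cost(rate, [(2, 20), (5, 4)])
```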


Troubleshooting

Instances not being created

  • Verify the scaling group status is Active (not Paused)
  • Check that your account has sufficient credit balance
  • Verify the location has available hypervisors with enough resources
  • If using a private VPC subnet, ensure a NAT gateway is attached

Scale up not triggering

  • Confirm at least one scaling policy is enabled
  • Check that the average metric exceeds the scale-up threshold
  • The cooldown period may not have elapsed since the last scale-up
  • Review the activity log for error entries

Instances created but not accessible

  • New instances take a minute or two to boot and apply cloud-init
  • If using a load balancer, check that health checks are passing
  • Verify the VPC subnet has available IPs

"No hypervisors available"

All hypervisors in the location are at capacity. Contact your provider about adding more compute resources.