fix(roles/exoscale_vm): rewrite on top of v2 HTTP API #253

Open
bhatti-lf wants to merge 1 commit into main from fix/exoscale_vm-v2-api

@bhatti-lf
Contributor

The role has been broken for a while: Exoscale deprecated the CloudStack v1 API (changelog), and the ngine_io.cloudstack.* modules the role used now return HTTP 403 'This API is deprecated.' On top of that, exo made breaking CLI changes (--private-instance was replaced by --public-ip none|inet4), and --check runs crashed with json.decoder.JSONDecodeError because the list task was skipped while the create/delete when: conditions still tried to from_json its empty output.
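
The --check crash is the usual empty-input case for JSON parsing: a skipped task leaves no output, and an empty string is not valid JSON. A minimal Python illustration of the failure and of a guarded parse follows. `parse_list_output` is a hypothetical helper for illustration, not code from this PR.

```python
import json


def parse_list_output(stdout):
    """Parse the JSON output of a list task, treating empty output as an empty list.

    Hypothetical helper: in check mode the list task is skipped, so its
    output is empty (or missing) and must not be handed to the JSON parser.
    """
    if not stdout or not stdout.strip():
        return []
    return json.loads(stdout)


# An empty string is not valid JSON, which is what the old conditions ran into.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    print(f"unguarded parse fails: {exc}")

# The guarded variant falls back to an empty list instead of raising.
print(parse_list_output(""))
print(parse_list_output('[{"name": "vm01"}]'))
```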

This rewrites the whole thing on top of the v2 HTTP API, via a new small linuxfabrik.lfops.exoscale_api module that signs each request with EXO2-HMAC-SHA256 and polls returned operation objects until they leave pending. Signing was cross-checked byte-for-byte against Exoscale's reference requests-exoscale-auth.ExoscaleV2Auth. The exo binary and python3-cs are no longer required on the control node.
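
For reviewers unfamiliar with the scheme, here is a minimal sketch of what signing and polling could look like. It is not the module from this PR. The message layout (method and path, body, concatenated query values, an empty signed-headers line, expiry timestamp) and the 10-minute validity window are assumptions based on my reading of the reference implementation and should be checked against Exoscale's documentation. The names `sign_v2`, `wait_for_operation` and `fetch` are hypothetical.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlparse


def sign_v2(api_key, api_secret, method, url, body=b"", expires=None):
    """Return an Authorization header value for one request (assumed layout)."""
    if expires is None:
        expires = int(time.time()) + 600  # assumed 10-minute validity window
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    # Simplification: sign only single-valued query arguments, sorted by name.
    signed_names = sorted(n for n, values in params.items() if len(values) == 1)
    message = b"\n".join([
        f"{method} {parsed.path}".encode(),
        body or b"",
        "".join(params[n][0] for n in signed_names).encode(),
        b"",  # no signed headers
        str(expires).encode(),
    ])
    digest = hmac.new(api_secret.encode(), message, hashlib.sha256).digest()
    header = f"EXO2-HMAC-SHA256 credential={api_key}"
    if signed_names:
        header += ",signed-query-args=" + ";".join(signed_names)
    header += f",expires={expires}"
    header += ",signature=" + base64.standard_b64encode(digest).decode()
    return header


def wait_for_operation(fetch, operation_id, interval=2.0, timeout=300.0,
                       sleep=time.sleep):
    """Poll until the operation's state is no longer "pending".

    `fetch` is a hypothetical callable returning the operation as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        operation = fetch(operation_id)
        if operation.get("state") != "pending":
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation {operation_id} still pending")
        sleep(interval)
```

Passing `expires` and `sleep` as parameters keeps both functions deterministic under test, which is also what makes a byte-for-byte comparison against a reference signer practical.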

A few things the original role never did, now also wired up:

  • exoscale_vm__state actually reacts post-create. started / stopped / restarted / absent map to :start / :stop / :reboot and the auto-start flag on the create body.
  • Private networks reconcile properly on every run. Change fixed_ip and the role calls :update-ip instead of needing destroy + recreate; remove a network entry and it gets detached.
  • Changes to service_offering and disk_size on existing VMs trigger :scale and :resize-disk. The role stops the VM first if needed; the existing power-state step then starts it back up.
  • --diff shows the method, path and JSON body for every mutating call.
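
The private-network reconciliation described above amounts to a diff between what is attached and what the inventory asks for. A minimal sketch of that comparison, assuming both sides can be reduced to a mapping of network name to fixed IP; `plan_network_actions` and the data shape are hypothetical and not taken from the role:

```python
def plan_network_actions(current, desired):
    """Compare attached private networks with the desired ones.

    Both arguments map a network name to its fixed IP (or None).
    Returns a list of (action, network, fixed_ip) tuples.
    """
    actions = []
    for name, fixed_ip in desired.items():
        if name not in current:
            # Network is wanted but not attached yet.
            actions.append(("attach", name, fixed_ip))
        elif fixed_ip is not None and current[name] != fixed_ip:
            # Already attached, only the fixed IP differs.
            actions.append(("update-ip", name, fixed_ip))
    for name in current:
        if name not in desired:
            # Attached but no longer listed in the inventory.
            actions.append(("detach", name, None))
    return actions


current = {"backend": "10.0.0.10", "legacy": None}
desired = {"backend": "10.0.0.20", "storage": None}
for action in plan_network_actions(current, desired):
    print(action)
```

When `current` and `desired` match, the plan is empty, which corresponds to a re-run reporting 0 changed.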

There's a new meta/argument_specs.yml; legacy exoscale_vm__account is declared but ignored at runtime so existing inventories don't blow up at role entry.

A couple of design choices that might warrant a look during review:

  • Custom module instead of filter plugin + ansible.builtin.uri. Keeps the role YAML declarative and the signing logic in one place. Easy to flip if plugins/modules/ shouldn't grow further.
  • module_defaults on the outer block to carry the credentials, instead of repeating api_key / api_secret / zone on every task. No other lfops role seems to use that pattern. Credentials can be inlined per task if that's deliberate.

A few API limits to keep in mind, all documented in the README:

  • :scale only allows within-family changes (standard.tiny to standard.large, not standard to memory).
  • :resize-disk can only grow.
  • --check only previews top-level mutations cleanly. Nested cascades (rules after the SG is just-created, attach after the instance is just-created, :start after a :stop-for-scale) have no resource to act on yet, so they silently skip.
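
The first two limits are cheap to check before any API call is made. A minimal sketch, assuming the offering family is the part of the name before the first dot (as in standard.tiny) and that disk sizes are integers in GB; the function names are hypothetical and not part of the role:

```python
def offering_family(offering):
    """Return the family part of an offering name such as "standard.tiny"."""
    return offering.split(".", 1)[0]


def check_scale(current, target):
    """Return True if a scale is needed; raise if it would cross families."""
    if offering_family(current) != offering_family(target):
        raise ValueError(
            f"cannot scale from {current} to {target}: different families"
        )
    return current != target


def check_resize(current_gb, target_gb):
    """Return True if a resize is needed; raise if the disk would shrink."""
    if target_gb < current_gb:
        raise ValueError(
            f"cannot shrink disk from {current_gb} GB to {target_gb} GB"
        )
    return target_gb > current_gb
```

Failing early with a clear message is one option; letting the API reject the request and surfacing its error is another, and may be preferable if the rules change on the provider side.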

Tested:

  • fresh deploy with SG rules and a fixed-IP private network
  • re-run reports 0 changed
  • bump disk_size: stops, resizes, restarts
  • bump service_offering within family: stops, scales, restarts
  • change fixed_ip: :update-ip fires
  • remove a network from inventory: :detach fires
  • state: 'absent' removes VM and per-VM SG
  • --check --diff previews bodies without hitting the API

old cloudstack APIs have been deprecated