Injecting fault Into Your Platform
This page describes how fault injects its resources into the platforms it supports.
Google Cloud Platform
fault may run on Google Cloud Platform by hooking into a Cloud Run service.
When initializing, fault creates a new revision of the service and injects a sidecar container into it. The container runs the fault CLI.

The sidecar also exposes a port between 50000 and 55000, which becomes the public port of the service. Traffic is therefore sent to the fault container first, which reroutes it to `127.0.0.1:<service port>`, where `<service port>` is the original port exposed by the Cloud Run service.
On rollback, a new revision is created with the previous specification of the service.
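For illustration only, the injected revision ends up looking roughly like the Knative-style manifest below. The service name, container names, image references, and the exact port are placeholders rather than values fault actually generates.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-app                             # hypothetical Cloud Run service
spec:
  template:
    spec:
      containers:
        # Sidecar injected by fault: it owns the public port of the
        # revision (a random value between 50000 and 55000) and proxies
        # requests to the application on 127.0.0.1:<service port>.
        - name: fault-proxy
          image: example.com/fault:latest  # placeholder image reference
          ports:
            - containerPort: 52000         # example value in the 50000-55000 range
        # Original application container, now only reached through the
        # sidecar on localhost.
        - name: my-app
          image: example.com/my-app:latest # placeholder image reference
```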
```mermaid
sequenceDiagram
    autonumber
    fault (local)->>CloudRun Service: Fetch
    fault (local)->>CloudRun Service: Add fault's container as a sidecar, expose a random port between 50000 and 55000 as the public port of the service
    CloudRun Service->>fault CloudRun Container: Starts container and sets traffic shaping on new revision
    loop fault proxy
        fault CloudRun Container->>CloudRun Application Container: Route traffic via fault on `127.0.0.1:<service port>`
        loop fault injection
            fault CloudRun Container->>fault CloudRun Container: Apply faults
        end
    end
```
fault uses the default GCP authentication mechanism to connect to the project.
The role bound to that user needs at least the following permissions:
- run.services.get
- run.services.list
- run.services.update
The `roles/run.developer` role should be enough.
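If you would rather not grant `roles/run.developer`, a narrower custom role limited to those three permissions could look like the sketch below (a role definition file usable with `gcloud iam roles create`; the title and description are made up):

```yaml
# Illustrative custom role covering only the permissions fault needs.
title: fault Cloud Run injector
description: Minimal permissions for fault to update Cloud Run services
stage: GA
includedPermissions:
  - run.services.get
  - run.services.list
  - run.services.update
```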
Kubernetes
fault may run on Kubernetes by creating the following resources:
- a job (CronJobs are not supported yet)
- a service
- a dedicated service account
- a config map that holds the environment variables used to configure the proxy
```mermaid
sequenceDiagram
    autonumber
    fault (local)->>Service Account: Create
    fault (local)->>Config Map: Create with fault's proxy environment variables
    fault (local)->>Target Service: Fetch target service's selectors and ports
    fault (local)->>Target Service: Replace target service selectors to match new fault's pod
    fault (local)->>fault Service: Create new service with target service's selectors and ports but listening on port 3180
    fault (local)->>Job: Create to manage fault's pod, with proxy sending traffic to new service's address
    Job->>fault Pod: Schedule fault's pod with config map attached
    fault Pod->>Service Account: Uses
    fault Pod->>Config Map: Loads
    Target Service->>fault Pod: Matches
    loop fault proxy
        fault (local)->>Target Service: Starts scenario
        Target Service->>fault Pod: Route traffic via fault
        loop fault injection
            fault Pod->>fault Pod: Apply faults
        end
        fault Pod->>fault Service: Forwards
        fault Service->>Target Pods: Forwards traffic to final endpoints
        Target Pods->>fault (local): Sends response back after faults applied
    end
```
Note: Once a scenario completes, fault rolls back the resources to their original state.
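To make the selector swap more tangible, here is a rough sketch of what the two Services could look like while a scenario runs. Every name, label, and port below is illustrative except port 3180, which comes from the flow above.

```yaml
# Target Service: fault rewrites its selector so incoming traffic now lands
# on fault's pod instead of the application pods (labels are placeholders).
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: fault-proxy        # hypothetical label carried by fault's pod
  ports:
    - port: 80              # example: the service's original port
      targetPort: 8080      # example: the port fault's proxy listens on
---
# Service created by fault: keeps the application's original selector and
# ports but is exposed on port 3180, so the proxy can still reach the
# real pods once the target Service no longer selects them.
apiVersion: v1
kind: Service
metadata:
  name: my-app-fault        # hypothetical name
spec:
  selector:
    app: my-app             # example: the application's original selector
  ports:
    - port: 3180
      targetPort: 8080      # example: the application's container port
```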
fault uses the default Kubernetes authentication mechanism to connect to the cluster: `~/.kube/config`, the `KUBECONFIG` environment variable, and so on.

That user needs at least the following role:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: fault
rules:
  # ServiceAccounts (create/delete/get)
  - apiGroups: [""]
    resources:
      - serviceaccounts
    verbs:
      - create
      - delete
      - get
  # ConfigMaps (create/delete/get)
  - apiGroups: [""]
    resources:
      - configmaps
    verbs:
      - create
      - delete
      - get
  # Services (get/patch/delete)
  - apiGroups: [""]
    resources:
      - services
    verbs:
      - get
      - patch
      - delete
  # Jobs (create/delete/get/list)
  - apiGroups:
      - batch
    resources:
      - jobs
    verbs:
      - create
      - delete
      - get
      - list
  # Pods (get/list/watch)
  - apiGroups: [""]
    resources:
      - pods
    verbs:
      - get
      - list
      - watch
```
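The Role above only defines permissions; it still has to be bound to whatever identity runs the fault CLI against the cluster. A minimal sketch, assuming a hypothetical user and the default namespace:

```yaml
# Binds the `fault` Role to the identity running fault. The subject and
# namespace are placeholders; adjust them to your own setup.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: fault
  namespace: default          # hypothetical target namespace
subjects:
  - kind: User
    name: jane@example.com    # placeholder: whoever runs the fault CLI
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: fault
  apiGroup: rbac.authorization.k8s.io
```

If you run fault from inside the cluster instead, bind the Role to a dedicated ServiceAccount by swapping the subject kind accordingly.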