action AZ:EKS

fail_az

Simulates the loss of an AZ in an AWS Region for EKS clusters with managed nodegroups

Type: action
Module: azchaosaws.eks.actions
Name: fail_az
Return: mapping

This function simulates the loss of an Availability Zone (AZ) in an AWS Region for EKS clusters with managed node groups. All node groups within the tagged clusters are affected, as are the Auto Scaling groups (ASGs) that are part of those managed node groups, so make sure your target clusters are tagged. What the action does depends on the failure type:

  • network: uses a blackhole network ACL that denies all traffic.
  • instance: force-stops normal instances, stops persistent spot instances, and cancels spot requests and terminates one-time spot instances.

Usage

JSON

{
  "name": "fail_az",
  "type": "action",
  "provider": {
    "type": "python",
    "module": "azchaosaws.eks.actions",
    "func": "fail_az",
    "arguments": {
      "az": "",
      "dry_run": true
    }
  }
}

YAML

name: fail_az
provider:
  arguments:
    az: ""
    dry_run: true
  func: fail_az
  module: azchaosaws.eks.actions
  type: python
type: action
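
Both declarations above leave az empty. As a minimal sketch, the same activity can be assembled in Python and serialized for inclusion in an experiment. The helper name build_fail_az_activity and the AZ value "us-east-1a" are placeholders for illustration, not part of the extension.

```python
import json
from typing import Any, Dict


def build_fail_az_activity(
    az: str,
    dry_run: bool = True,
    failure_type: str = "network",
) -> Dict[str, Any]:
    """Build a fail_az activity declaration (hypothetical helper)."""
    return {
        "name": "fail_az",
        "type": "action",
        "provider": {
            "type": "python",
            "module": "azchaosaws.eks.actions",
            "func": "fail_az",
            "arguments": {
                "az": az,
                "dry_run": dry_run,
                "failure_type": failure_type,
            },
        },
    }


# "us-east-1a" is a placeholder AZ; tags are omitted so the default applies.
activity = build_fail_az_activity("us-east-1a", failure_type="instance")
print(json.dumps(activity, indent=2))
```

Keeping dry_run set to true while drafting an experiment means only read-only operations run.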

Arguments

| Name | Type | Default | Required | Title | Description |
|---|---|---|---|---|---|
| az | string | | Yes | Availability Zone | AZ to target |
| tags | mapping | [{"Name": ""}] | No | Tags | Match only resources with these tags |
| failure_type | string | network | No | Failure Type | Type of failure to apply: network, instance |
| state_path | string | /tmp/fail_eks_az.json | No | Local Path to Operation | Path to a local file that holds the information of this operation, used by the recover_az action. Unless you need to run this action multiple times in the same experiment, you can ignore this field. |
| dry_run | boolean | false | No | Dry Run | Only perform a dry run |

Required:

  • az (str): The Availability Zone to target.
  • dry_run (bool): Whether to perform a dry run. When set to True, only read-only operations run and no changes are made to resources. (Accepted values: True | False)

Optional:

  • failure_type (str): The failure type to simulate. (Accepted values: “network” | “instance”) (Default: “network”)
  • tags (List[Dict[str, str]]): A list of key/value pairs used to identify the cluster(s). (Default: [{"Name": ""}])
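
The accepted values listed above can be checked before an experiment is run. The following is a rough sketch of such a pre-flight check; check_fail_az_arguments is a hypothetical helper, not something the extension provides, and it only encodes the constraints stated in this section.

```python
from typing import Any, Dict, List

# Accepted values as documented for failure_type.
ACCEPTED_FAILURE_TYPES = ("network", "instance")


def check_fail_az_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a fail_az arguments mapping."""
    problems: List[str] = []

    az = arguments.get("az")
    if not isinstance(az, str) or not az:
        problems.append("az must be a non-empty string")

    if "dry_run" in arguments and not isinstance(arguments["dry_run"], bool):
        problems.append("dry_run must be a boolean")

    failure_type = arguments.get("failure_type", "network")
    if failure_type not in ACCEPTED_FAILURE_TYPES:
        problems.append("failure_type must be 'network' or 'instance'")

    tags = arguments.get("tags", [{"Name": ""}])
    if not isinstance(tags, list) or not all(isinstance(t, dict) for t in tags):
        problems.append("tags must be a list of key/value mappings")

    return problems


# "us-east-1a" is a placeholder AZ name.
print(check_fail_az_arguments({"az": "us-east-1a", "dry_run": True}))
print(check_fail_az_arguments({"az": "", "failure_type": "disk"}))
```

An empty list means the mapping satisfies the documented constraints. It says nothing about whether the AZ exists or whether any cluster carries the given tags.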

Return structure

{
  "AvailabilityZone": str,
  "DryRun": bool,
  "Clusters": [
    {
      "ClusterName": str,
      "NodeGroups": [
        {
          "NodeGroupName": str,
          "AutoScalingGroups": [
            {
              "AutoScalingGroupName": str,
              "Before": {
                "SubnetIds": List[str],
                "AZRebalance": bool,
                "MinSize": int,
                "MaxSize": int,
                "DesiredCapacity": int
              },
              "After": {
                "SubnetIds": List[str],
                "AZRebalance": bool,
                "MinSize": int,
                "MaxSize": int,
                "DesiredCapacity": int
              }
            }
          ],
          "Subnets": [
            {
              "SubnetId": str,
              "VpcId": str,
              "Before": {
                "NetworkAclId": str,
                "NetworkAclAssociationId": str
              },
              "After": {
                "NetworkAclId": str,
                "NetworkAclAssociationId": str
              }
            },
            ...
          ],
          "Instances": [
            {
              "InstanceId": str,
              "Before": {
                "State": 'pending'|'running'
              },
              "After": {
                "State": 'stopping'|'stopped'
              }
            },
            ...
          ]
        },
        ...
      ]
    }
  ]
}
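
To illustrate how the returned mapping nests, here is a small sketch that walks it and counts the affected resources per node group. The function summarize_fail_az_result and every name and ID in sample_result are invented for the example; only the key names come from the structure above.

```python
from typing import Any, Dict, List


def summarize_fail_az_result(result: Dict[str, Any]) -> List[str]:
    """Return one summary line per node group in a fail_az result mapping."""
    lines: List[str] = []
    for cluster in result.get("Clusters", []):
        for nodegroup in cluster.get("NodeGroups", []):
            asgs = nodegroup.get("AutoScalingGroups", [])
            subnets = nodegroup.get("Subnets", [])
            instances = nodegroup.get("Instances", [])
            lines.append(
                "{}/{}: {} ASG(s), {} subnet(s), {} instance(s)".format(
                    cluster["ClusterName"],
                    nodegroup["NodeGroupName"],
                    len(asgs),
                    len(subnets),
                    len(instances),
                )
            )
    return lines


# Made-up sample shaped like the return structure (network failure type).
sample_result = {
    "AvailabilityZone": "us-east-1a",
    "DryRun": True,
    "Clusters": [
        {
            "ClusterName": "demo-cluster",
            "NodeGroups": [
                {
                    "NodeGroupName": "demo-nodegroup",
                    "AutoScalingGroups": [{"AutoScalingGroupName": "demo-asg"}],
                    "Subnets": [{"SubnetId": "subnet-demo", "VpcId": "vpc-demo"}],
                    "Instances": [],
                }
            ],
        }
    ],
}

for line in summarize_fail_az_result(sample_result):
    print(line)
```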

Signature

def fail_az(
    az: str = None,
    dry_run: bool = None,
    failure_type: str = "network",
    tags: List[Dict[str, str]] = [{"Name": ""}],
    state_path: str = "fail_az.{}.json".format(__package__.split(".", 1)[1]),
    configuration: Configuration = None,
) -> Dict[str, Any]:
    pass
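
The state_path default in the signature is computed from the package name. As a worked example, assuming __package__ resolves to "azchaosaws.eks" (inferred from the module path azchaosaws.eks.actions rather than stated here), the expression evaluates as follows:

```python
# Assumed value of __package__ inside azchaosaws.eks.actions.
package_name = "azchaosaws.eks"

# split(".", 1) splits once, giving ["azchaosaws", "eks"]; index 1 is "eks".
suffix = package_name.split(".", 1)[1]

default_state_path = "fail_az.{}.json".format(suffix)
print(default_state_path)  # fail_az.eks.json
```

Under that assumption the result is a relative file name, which differs from the /tmp/fail_eks_az.json default listed under Arguments. Passing state_path explicitly to both fail_az and recover_az avoids relying on either value.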