Streamlining Monitoring: How to Receive AWS Health Check Alerts on Slack


Monitoring health checks in Slack and getting a notification when something goes wrong seems simple and straightforward. Until it isn't.

That's mainly because Route 53 is a global service whose health check metrics are published to CloudWatch only in the us-east-1 region. This becomes a problem if the rest of your AWS services live in another region: you need to set up part of the infrastructure in us-east-1 as well.

As of today, we can't create an alarm on a Route 53 health check in the region we actually want (in my case, the EU). So the idea is this:

  1. Configure a health check in Route 53 [Global]
  2. Attach an alarm to it [US region]
  3. Attach an SNS topic to this alarm [US region]
  4. Create a subscription and attach a Lambda to it. This Lambda can be in any region.
  5. Write the logic to parse the SNS message (whether the state is healthy or unhealthy) and send a notification based on it.
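
Step 5 is where most of the logic lives. As a rough sketch of what the Lambda receives: SNS wraps the CloudWatch alarm notification as a JSON string inside Records[0].Sns.Message. The helper name parseAlarmRecord and the sample values below are made up for illustration; the field names (AlarmName, NewStateValue, NewStateReason) are the ones CloudWatch puts in alarm notifications.

```javascript
// Hypothetical helper: pull the interesting bits out of one SNS record.
function parseAlarmRecord(record) {
    const topicName = record.Sns.TopicArn.split(':').pop();
    // The alarm payload arrives as a JSON string, not as an object.
    const alarm = JSON.parse(record.Sns.Message);
    return {
        topicName,
        alarmName: alarm.AlarmName,
        state: alarm.NewStateValue, // "OK", "ALARM" or "INSUFFICIENT_DATA"
        reason: alarm.NewStateReason,
    };
}

// Made-up sample event, trimmed to the fields used above.
const sampleEvent = {
    Records: [{
        Sns: {
            TopicArn: 'arn:aws:sns:us-east-1:123456789012:MyGenericTopic',
            Message: JSON.stringify({
                AlarmName: 'HealthCheckAlarmFoo',
                NewStateValue: 'ALARM',
                NewStateReason: 'Threshold Crossed (sample text)',
            }),
        },
    }],
};

console.log(parseAlarmRecord(sampleEvent.Records[0]));
```

Keeping the parsing in its own function makes it easy to check against a saved sample event before wiring anything up in AWS.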

We'll use AWS CloudFormation in this post, but you can follow along and do it manually as well.

Prepare Configuration:

If your samconfig.toml file has no configuration for the us-east-1 region yet, add one with the relevant details. This file helps streamline the deployment process.

[us-east-1]
[us-east-1.deploy]
[us-east-1.deploy.parameters]
stack_name = "your_stack_name"
s3_bucket = "your_s3_bucket"
s3_prefix = "whateveryouwant"
region = "us-east-1"
capabilities = "CAPABILITY_IAM CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM"
image_repositories = []

Set Up SNS Topic:

Create an SNS topic and its subscription. You can use a placeholder for the endpoint for now; we'll update it later once we write our Lambda.

AWSTemplateFormatVersion: "2010-09-09"

Resources:
  MySNSTopic:
    Type: AWS::SNS::Topic
    Properties:
      DisplayName: "MyGenericTopic"
      TopicName: "MyGenericTopic"

  MyLambdaSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Protocol: lambda
      TopicArn: !Ref MySNSTopic
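      Endpoint: "arn:aws:lambda:REGION:ACCOUNT_ID:function:MyLambdaFunction"

# Exported so the US stack can reference the topic via FooSNS.Outputs.MySNSTopicArn
Outputs:
  MySNSTopicArn:
    Description: "ARN of the SNS topic"
    Value: !Ref MySNSTopic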

Create Health Checks and Alarms:

Craft health checks and associated alarms within a template specific to the US region (e.g., template.us.yaml).

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: US Stack


Resources:
  FooSNS:
    Type: AWS::Serverless::Application
    Properties:
      Location: {{path to your SNS resource}}

  HealthCheckFoo:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        FailureThreshold: 1
        FullyQualifiedDomainName: www.example.com
        Port: 443
        RequestInterval: 30
        ResourcePath: /health
        Type: HTTPS

  HealthCheckAlarmFoo:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: HealthCheckAlarmFoo
      ComparisonOperator: LessThanThreshold
      EvaluationPeriods: 1
      MetricName: HealthCheckStatus
      Namespace: AWS/Route53
      Period: 60
      Statistic: Minimum
      Threshold: 1
      Dimensions:
        - Name: "HealthCheckId"
          Value: !Ref HealthCheckFoo
      AlarmDescription: "Foo Health check failed"
      AlarmActions:
        - !GetAtt FooSNS.Outputs.MySNSTopicArn
      OKActions:
        - !GetAtt FooSNS.Outputs.MySNSTopicArn

Quick note: how is HealthCheckAlarmFoo attached to HealthCheckFoo?

Through these lines:

      Dimensions:
        - Name: "HealthCheckId"
          Value: !Ref HealthCheckFoo

If you use a different type of health check, for example one that is based on a CloudWatch alarm (Type: CLOUDWATCH_METRIC), you should use AlarmIdentifier in the health check config instead. If you try to use the AlarmIdentifier property on a basic endpoint health check (the kind we're creating here), you will get a 400 error.
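
To see why the alarm in the template uses Statistic: Minimum with Threshold: 1, recall that Route 53 reports HealthCheckStatus as 1 when the endpoint is healthy and 0 when it is not. The snippet below is only a toy model of that comparison (evaluateAlarm is a hypothetical name, and real CloudWatch evaluation has more rules than this), but it shows how a single unhealthy datapoint in a period flips the alarm.

```javascript
// Toy model of the alarm rule: Minimum over the period, LessThanThreshold 1.
// Not how CloudWatch is implemented; just the arithmetic behind the settings.
function evaluateAlarm(datapoints, threshold = 1) {
    if (datapoints.length === 0) {
        return 'INSUFFICIENT_DATA';
    }
    const minimum = Math.min(...datapoints);
    return minimum < threshold ? 'ALARM' : 'OK';
}

console.log(evaluateAlarm([1, 1])); // every report in the period was healthy
console.log(evaluateAlarm([1, 0])); // one unhealthy report in the period
```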

Now let’s create the Lambda. In your template.yaml, add the configuration for the Lambda:

  AWSLambdas:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: {{path to your lambda config}}
      Parameters: { }

Now let’s configure the Lambda itself:

AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: Lambda functions

Resources:
  SendSlackNotificationFunction:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - x86_64
      Handler: pathToYourLambdaFunction/index.handler
      Runtime: nodejs18.x # change this if you use another version
      CodeUri: [path to the codebase for lambda]
      FunctionName: yourLambdaFunctionNameToSendSlackNotification
      Policies:
        - AWSLambdaBasicExecutionRole

  # This is important: without it, the SNS topic in the US region can't invoke this function
  AllowSNSInvoke:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !GetAtt SendSlackNotificationFunction.Arn
      Principal: 'sns.amazonaws.com'

Outputs:
  SendSlackNotificationFunctionArn:
    Description: "Lambda Function ARN for Slack Notifications"
    Value: !GetAtt SendSlackNotificationFunction.Arn

And then, let’s write the Lambda code itself:

const https = require('https');

const SLACK_WEBHOOK_URL = 'URL FOR SLACK WEBHOOK';

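function getMessageFromTopic(topicName, alarmState) {
    // Your logic to build the Slack message text goes here.
    // Placeholder default so the function returns something usable:
    return `${topicName} is now in state ${alarmState}`;
}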

function handler(event) {

    // Extract the topic name (useful when multiple SNS topics point to the same Lambda)
    const topicArn = event.Records[0].Sns.TopicArn;
    const topicName = topicArn.split(':').pop();
    const alarmMessage = JSON.parse(event.Records[0].Sns.Message);
    const alarmState = alarmMessage.NewStateValue;

    const message = getMessageFromTopic(topicName, alarmState);
    const postData = JSON.stringify({ text: message });

    const options = {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            // Byte length, not character count, so non-ASCII text is sent in full
            'Content-Length': Buffer.byteLength(postData)
        }
    };

    return new Promise((resolve, reject) => {
        const req = https.request(SLACK_WEBHOOK_URL, options, (res) => {
            if (res.statusCode === 200) {
                resolve({ statusCode: 200, body: 'Message sent to Slack' });
            } else {
                reject(new Error(`Request to slack returned an error ${res.statusCode}, ${res.statusMessage}`));
            }
        });

        req.on('error', (e) => {
            reject(new Error(e.message));
        });

        req.write(postData);
        req.end();
    });
};

module.exports = {
    handler
};
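
getMessageFromTopic above is where you decide what the Slack message says. One possible shape, purely as a sketch: map each alarm state to a short text. The function name buildSlackText, the emoji shortcodes, and the wording are all placeholders of mine; only the three state values come from CloudWatch.

```javascript
// Hypothetical message builder; adapt the names and wording to your setup.
const STATE_TEXT = {
    ALARM: ':red_circle: is UNHEALTHY',
    OK: ':large_green_circle: is healthy again',
    INSUFFICIENT_DATA: ':warning: has no recent health data',
};

function buildSlackText(topicName, alarmState) {
    // Fall back to the raw state so unexpected values are still reported.
    const suffix = STATE_TEXT[alarmState] || `changed state to ${alarmState}`;
    return `${topicName} ${suffix}`;
}

console.log(buildSlackText('MyGenericTopic', 'ALARM'));
```

Because the alarm has both AlarmActions and OKActions pointing at the topic, the same function ends up handling the "down" and the "recovered" messages.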

Finally, to deploy, you can use these commands:

sam build --template template.us.yaml --use-container
sam deploy --config-file samconfig.toml --config-env us-east-1 --no-confirm-changeset --no-fail-on-empty-changeset

So, there you go! Handling health check notifications across AWS regions doesn't have to be daunting. With SNS and Lambda, you're equipped to stay in the loop when issues arise.

Pro tip: You can also take some basic recovery steps from the same Lambda, for example restarting the server behind the endpoint.