Channel: OpsGenie Blog

How to create a free status page using OpsGenie

Every service provider wants their services to be available 24x7x365. But outages and planned maintenance are inevitable for online software services. Dealing with outages and communicating with users during them are as important as the availability of the services provided. To keep users informed, many service providers use web-based “status pages” that contain up-to-date information about the health of the services, incidents, and what the provider is doing to resolve the issues.

OpsGenie is an incident management system for Dev & Ops teams. Customers use OpsGenie to consolidate the alerts generated by their monitoring systems and route them to the right people using on-call schedules and escalations. Because OpsGenie is an essential tool during outages and holds vital information about the incidents, our customers have been asking whether we can create “status pages” programmatically based on the alerts generated in OpsGenie.

In response to this request, we’ve taken up the challenge of providing a solution for managing status pages for OpsGenie customers.

How does it work?

As illustrated, the solution makes use of the OpsGenie webhook integration and the AWS API Gateway, Lambda, and S3 services (though these components can be replaced with others):

  1. Alerts are generated by monitoring tools, or by users through OpsGenie’s web interface, email, or other integration(s).
  2. Any OpsGenie alert can be treated as an incident, which in turn generates a status page for the impacted service.
  3. OpsGenie webhook integration is used to trigger execution of the code in AWS Lambda via AWS API Gateway.
  4. Lambda code calculates the service state based on alert severity and then generates the status page.
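
Step 4 can be sketched roughly as follows. This is a minimal illustration, not the code from the reference implementation: the function names, the severity ranking, the "operational" default state, and the HTML layout are all assumptions made for this example, and the upload of the generated page to S3 is left out.

```python
import html

# Hypothetical ranking: the post only names the "major" and "critical" severities;
# "operational" is an assumed label for a service with no open incidents.
SEVERITY_RANK = {"operational": 0, "major": 1, "critical": 2}


def service_state(severities):
    """Return the worst known severity among open incidents, or 'operational'."""
    known = [s for s in severities if s in SEVERITY_RANK]
    return max(known, key=SEVERITY_RANK.get, default="operational")


def render_status_page(states):
    """Render a bare-bones HTML status page from a {service: state} mapping."""
    rows = "\n".join(
        f"<li>{html.escape(name)}: {html.escape(state)}</li>"
        for name, state in sorted(states.items())
    )
    return (
        "<html><body><h1>Service status</h1>\n"
        f"<ul>\n{rows}\n</ul></body></html>"
    )


if __name__ == "__main__":
    # Example: two open incidents on "api", none on "website".
    states = {
        "api": service_state(["major", "critical"]),
        "website": service_state([]),
    }
    print(render_status_page(states))
```

In a Lambda deployment, a handler would call functions like these with the alerts received through the webhook and then write the resulting HTML to wherever the status page is hosted.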

This approach gives OpsGenie customers an easy way to update status pages using an existing alert, or by creating a new alert. The logic is straightforward:

  • If an alert has the “statuspage” tag, it triggers the execution of the code in Lambda via the webhook integration.
  • The name of the impacted service and the severity of the incident (major or critical) should also be provided as tags.
  • When the alert is updated in OpsGenie, the status page is updated as well to reflect the latest state of the service.
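
The tag rules above could be checked with something like the sketch below. It is only an illustration: the function name is hypothetical, and the rule that the first tag which is neither “statuspage” nor a severity is the service name is an assumption made for this example, not necessarily how the reference implementation identifies the service.

```python
TRIGGER_TAG = "statuspage"
SEVERITY_TAGS = {"major", "critical"}  # the two severities named in the post


def parse_alert_tags(tags):
    """Return (service, severity) for a status page alert, or None otherwise.

    Assumption for this sketch: the first tag that is neither the trigger tag
    nor a severity tag is taken to be the name of the impacted service.
    """
    normalized = [t.strip().lower() for t in tags]
    if TRIGGER_TAG not in normalized:
        return None  # alert is not meant to update the status page
    severity = next((t for t in normalized if t in SEVERITY_TAGS), None)
    service = next(
        (t for t in normalized if t != TRIGGER_TAG and t not in SEVERITY_TAGS),
        None,
    )
    if service is None or severity is None:
        return None  # incomplete tagging; nothing to publish
    return service, severity


if __name__ == "__main__":
    print(parse_alert_tags(["statuspage", "api", "critical"]))
    print(parse_alert_tags(["api", "critical"]))
```

Returning None for alerts without the trigger tag, or with incomplete tagging, keeps unrelated alerts from changing the page.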

The solution is meant to serve as a reference implementation. It is ready to be used with OpsGenie as is, but it’s also possible to replace various components within the solution. For example, an application server of your choice can be used instead of AWS Lambda, and the status pages can be hosted on an internal web server instead of S3.

The source code of the Lambda functions and the web application is available now on GitHub. As always, we’re available to answer any further questions you may have regarding the new status page solution.

Please CLICK HERE for the reference implementation document to generate status pages from OpsGenie alerts.

