Intermittent Network Connectivity Issues
Incident Report for Duo
Postmortem

Connectivity issues to AWS us-west-2 region

Incident Report - 2022-08-11

Summary

On August 11, 2022, at around 07:30 EST, Duo's Engineering Team was alerted by reports from a small subset of customers that users were unable to authenticate and Duo administrators were unable to log in to the Duo Admin Panel. The root cause is still under investigation.

The issue resolved at around 08:28 EST.

Deployments Impacted

  • DUO9, DUO17, DUO22, DUO39, DUO40, DUO45, DUO49, DUO50, DUO52, DUO55, DUO56, DUO58, DUO61, DUO62, DUO63, DUO64, DUO65, DUO71

Timeline of Events EST

2022-08-11 07:30 Duo Site Reliability Engineering (SRE) receives a report that multiple customers are not able to connect to Duo, affecting both user authentication and administrator logins to the Admin Panel. SRE begins triage.

2022-08-11 07:36 After assessing our internal monitoring system, SRE finds that we have no immediate proof of a failure on Duo’s side. SRE finds external evidence of a potential outage related to our cloud provider, using Cisco’s Thousandeyes.

2022-08-11 07:44 Status updated to Investigating.

2022-08-11 08:02 Support request and direct communications are established with AWS to investigate the cause of the outage.

2022-08-11 08:30 Our SRE team continues investigating and watching our external monitoring systems and finds no issues with Duo’s service.

2022-08-11 08:45 Using additional information provided by customers, Duo’s team narrows the issue to those who connect to a particular ISP, indicating that there was a potential WAN outage.

2022-08-11 08:50 Status updated to Monitoring.

2022-08-11 09:04 AWS confirms that they do not see any issues on their end.

2022-08-11 09:28 Status updated to Resolved.

Details

Duo’s internal monitoring found no indications of failures on the Duo side. The most likely cause was an issue in the connection between our customers and our infrastructure hosted on AWS. Our end-to-end testing did not see any failures, indicating that this was an isolated incident tied to a specific ISP routing failure between our customer’s infrastructure and Duo’s infrastructure. Cisco’s Thousandeyes found a 32-minute period of network connectivity issues with our cloud provider.

The incident resolved itself, with no action from Duo.

Duo is working with its hosting provider and customers to identify the root cause of this issue. We will also improve our end-to-end monitoring to provide more insights beyond the Duo perimeter with Thousandeyes.

Note: You can find your Duo deployment’s ID and sign up for updates via the StatusPage by following the instructions in this Duo Knowledge Base article.

Posted Aug 12, 2022 - 10:09 EDT

Resolved
After monitoring for any further reports or issues, we can confirm that this incident has been resolved.
Posted Aug 11, 2022 - 09:28 EDT
Monitoring
The network connectivity issues have subsided and we are monitoring for any more issues.
Posted Aug 11, 2022 - 08:56 EDT
Update
We are continuing to investigate this issue.
Posted Aug 11, 2022 - 08:26 EDT
Investigating
We’re currently seeing service degradation in some deployments due to network connectivity issues, and are actively working on a resolution.
Posted Aug 11, 2022 - 08:26 EDT