Authentication issues on DUO39
Incident Report for Duo Security
From 18:52 to 19:07 UTC on March 17, 2017, the DUO39 deployment experienced increased authentication latency that caused authentication failures for some customer applications protected by the Duo service.
As a best practice, Duo frequently deploys application code and operating system (OS)-level enhancements so that our customers can take advantage of the newest features and are protected against the latest vulnerabilities. Duo’s engineering team has implemented processes that allow these changes to be made gradually and automatically, without impact to end users or service availability. These processes are exercised regularly as part of Duo’s standard release process.
In this case, a deficiency in the automation responsible for rolling out a specific set of OS-level software patches caused some authentication requests to be queued for multiple minutes. This led to slow or, in some cases, failed authentications for some customers.
Duo’s monitoring systems detected an increase in authentication latency at 18:52 UTC and alerted the relevant engineering staff. By that point the automation in question had already completed its intended operations, and the queued authentication requests were processed during the subsequent 15 minutes.
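As an illustration of the kind of latency detection described above, the following is a minimal sketch of a sliding-window check that flags when the 95th-percentile latency of recent requests crosses a threshold. It does not represent Duo's actual monitoring; the class name `LatencyMonitor`, the window size, the 2000 ms threshold, and the minimum sample count are all hypothetical values chosen for the example.

```python
from collections import deque
from statistics import quantiles


class LatencyMonitor:
    """Hypothetical sliding-window monitor (not Duo's implementation).

    Keeps the most recent request latencies and reports whether their
    95th percentile exceeds a configured threshold.
    """

    def __init__(self, window_size=100, p95_threshold_ms=2000.0, min_samples=20):
        # deque with maxlen drops the oldest sample once the window is full
        self.samples = deque(maxlen=window_size)
        self.p95_threshold_ms = p95_threshold_ms
        self.min_samples = min_samples

    def record(self, latency_ms):
        """Add one observed request latency, in milliseconds."""
        self.samples.append(latency_ms)

    def p95(self):
        """Return the 95th-percentile latency, or None with too few samples."""
        if len(self.samples) < self.min_samples:
            return None
        # quantiles(n=20) yields 19 cut points; the last one is the 95th percentile
        return quantiles(self.samples, n=20)[-1]

    def should_alert(self):
        """True when enough samples exist and p95 is above the threshold."""
        p = self.p95()
        return p is not None and p > self.p95_threshold_ms


if __name__ == "__main__":
    monitor = LatencyMonitor()
    for _ in range(50):
        monitor.record(100.0)      # normal traffic (assumed value)
    print(monitor.should_alert())
    for _ in range(50):
        monitor.record(60_000.0)   # requests queued for about a minute (assumed value)
    print(monitor.should_alert())
```

A percentile over a bounded window is used here rather than a mean so that a burst of queued requests shows up quickly without a single outlier triggering an alert; a production system would typically also account for time-based windows and alert deduplication.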
Duo’s engineering team has already corrected the specific bug that caused this issue and will implement additional test cases and controls around operational processes such as these, so that we can continue to improve the service without customer-facing impact.