From approximately 20:45 to 21:31 UTC (4:45 to 5:31 Eastern Time) on October 20, and from 14:08 to 15:03 UTC (10:08 to 11:03 Eastern Time) on October 21, a significant number of users on Duo Security’s DUO1 deployment were unable to authenticate.
Several servers failed to process authentications. This led to intermittent cascading failures in which DUO1 servers processing authentications became overloaded and intermittently timed out. Our Operations Team was able to manually clear the backlogs causing this cascading failure and fully restore service between 21:20 and 21:31 UTC on October 20, and between 14:56 and 15:03 UTC on October 21.
Because the recovery was implemented in stages, the total outage time depended on which applications were protected and which authentication factors were used.
We traced the outage to contention between user onboarding processes and backend cleanup procedures, and we are implementing architectural changes that will prevent future cascading failures and the associated end-user impact. As a reminder, you can subscribe to receive status updates related to your specific Duo deployment at https://status.duosecurity.com/.