On March 27, 2024, at around 05:18 EDT, Duo Engineering was notified by Customer Support that customers using a Trusted Endpoints with Certificates integration together with a Trusted Endpoints Windows Duo Desktop integration were encountering blocked authentications, with an erroneous message indicating that the user was using a “personal device”. The root cause was identified as a bug in the code that determines the trust status of the device.
The issue was mitigated on March 28, 2024, by rolling back the impacted users to the previous stable release. The permanent fix landed and was released to customers on April 1, 2024.
2024-03-27 05:18 Duo Engineering is informed by Duo Customer Support that customers are reporting that Trusted Endpoints devices are failing to complete authentications.
2024-03-27 06:00 Duo Engineering begins investigation.
2024-03-27 11:30 Duo Engineering is called to triage.
2024-03-28 08:19 Escalation channel is spun up and follow-the-sun on-call engineers are notified.
2024-03-28 08:40 Rollback initiated for impacted deployments.
2024-03-28 14:12 Status page updated to Monitoring.
2024-03-28 15:06 Rollbacks are completed.
2024-03-28 15:17 Duo Engineering identified the root cause and steps to reproduce.
2024-03-28 17:19 Status page updated to Resolved.
2024-03-29 09:00 Duo Engineering prepares tasks to release the stable fix onto the current deployment.
2024-04-01 01:30 The release including the fix begins to roll out to customers.
In the previous release cycle, Duo Engineering made updates to improve the speed and reliability of our authentications. As part of this work, we modified the code that verifies the trustworthiness of the device during authentication. This update introduced a bug: users whose Trusted Endpoints configuration combined an Active Certificate integration with a Windows Duo Desktop integration encountered blocked authentications during sign-in, because we could not determine the trust status of their devices. We have since fixed the bug and improved our monitoring so that similar issues are caught and addressed sooner.
How did Duo resolve the incident?
As a short-term solution, Duo rolled back the affected customers’ regions to the previous stable d-release. In the meantime, we triaged the issue to find the exact error being raised and developed a long-term fix to be applied to all releases.
What is Duo doing to prevent this in the future?
Duo has added more detail to the logging performed during trust collection, so that errors raised during endpoint collection can be diagnosed quickly and engineers are alerted to them. Previously, these logs lacked the detail needed to quickly diagnose the issue. Duo is also taking preventative measures by adding tests that cover this particular use case, to keep this part of the code base from introducing similar bugs, and by adding monitoring and alerting for these types of issues.
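As a rough illustration of what more detailed logging plus a regression test for the combined configuration could look like, here is a minimal Python sketch. It is not Duo's actual code, and the report does not describe the bug at this level of detail. Every name in it (`evaluate_trust`, `TrustResult`, the check names, the device IDs) is hypothetical. Treating the device as trusted when any configured check passes is also an assumption made for the example.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical logger name; not taken from the incident report.
logger = logging.getLogger("trust_collection")


@dataclass
class TrustResult:
    trusted: bool
    source: str  # name of the check that established trust, or "none"


def evaluate_trust(device_id: str,
                   checks: Dict[str, Callable[[str], bool]]) -> TrustResult:
    """Run each configured trust check in order.

    Assumption for this sketch: a device is trusted if any check passes,
    and an error in one check must not prevent the others from running.
    """
    for name, check in checks.items():
        try:
            if check(device_id):
                return TrustResult(True, name)
        except Exception:
            # Record which check failed and for which device, with the
            # traceback, so the failure can be diagnosed from logs alone.
            logger.exception("trust check failed: check=%s device=%s",
                             name, device_id)
    return TrustResult(False, "none")


# Hypothetical stand-ins for the two integrations named in the report.
def broken_certificate_check(device_id: str) -> bool:
    raise RuntimeError("certificate lookup failed")


def duo_desktop_check(device_id: str) -> bool:
    return device_id == "managed-laptop"


# Regression-style test for the combined configuration: a failing
# certificate check should not block a device that Duo Desktop trusts.
combined = {
    "certificate": broken_certificate_check,
    "duo_desktop": duo_desktop_check,
}
assert evaluate_trust("managed-laptop", combined) == TrustResult(True, "duo_desktop")
assert evaluate_trust("unknown-device", combined) == TrustResult(False, "none")
```

The two assertions at the end play the role of the added tests: they pin down the behavior for a configuration that uses both integrations at once, which is the case that was affected in this incident.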