Until recently, the onus has been on every team that owns stateful hosts to review their deployment scripts and identify circular dependencies. In practice, however, many dependencies aren’t identified until an incident occurs, which can delay recovery. The obvious route would be to block access to github.com from the machines to validate that the system can deploy without it. But these hosts are stateful and serve customer traffic even during rolling deploys, drains, or restarts. Blocking github.com entirely would impact their ability to handle production requests. This is where we starte
Our new circular dependency detection process is live after a six-month rollout. Now, if a team accidentally adds a problematic dependency, or if an existing binary tool we use takes on a new dependency, the tooling will detect the problem and flag it to the team. The net result is a more stable GitHub and a shorter mean time to recovery during incidents, because these circular dependencies have been removed. Are there ways for circular dependencies to still trip things up? You bet, and we’ll look to improve the tool as we discover them.