Improving platform resilience at Cloudflare through automation
2024-10-09
Edge, Engineering, Serverless, Developer Platform, Developers, Agile Developer Services, Go, Reliability, Speed & Reliability
We realized that we needed a way to automatically heal our platform from an operations perspective, so we designed and built a workflow orchestration platform to provide these self-healing capabilities across our global network. We explore how this has helped us reduce the impact of operational issues on our customers, and the rich variety of similar problems it has empowered us to solve.