Keepalives considered harmful
2020-03-19
You’d think keepalives would always be helpful, but it turns out reality isn’t always what you expect it to be. It really helps if you read “Why does one NGINX worker take all the load?” first....
2018-08-24
Here at Cloudflare we use Prometheus to collect operational metrics. We run it on hundreds of servers and ingest millions of metrics per second to get insight into our network and provide the best possible service to our customers....
2018-05-13
How an innocent OS upgrade triggered a cascade of issues and forced us into tracing Linux networking internals....
2018-03-05
How Cloudflare was able to save hundreds of gigabits of network bandwidth and terabytes of storage from Kafka....
2016-12-14
We use Salt to manage our ever-growing global fleet of machines. Salt is great for managing configurations and being the source of truth. We use it for remote command execution and for network automation tasks....
2016-12-07
The following blog post describes a debugging adventure on Cloudflare's Mesos-based cluster. This internal cluster is primarily used to process log file information so that Cloudflare customers have analytics, and for our systems that detect and respond to attacks....