MORE POSTS
February 08, 2024 2:00 PM
connect() - why are you so slow?
This is our story of what we learned about the connect() implementation for TCP in Linux: its strong and weak points, how connect() latency changes under pressure, and how to open a connection so that the syscall latency is deterministic and time-bound...
December 06, 2023 2:00 PM
How we used OpenBMC to support AI inference on GPUs around the world
This is what Cloudflare has been able to do so far with OpenBMC with respect to our GPU-equipped servers...
November 17, 2023 2:00 PM
How to execute an object file: part 4, AArch64 edition
The initial posts are dedicated to the x86 architecture. Since then, the fleet of our working machines has expanded to include a large and growing number of ARM CPUs. This time we’ll repeat this exercise for the aarch64 architecture....
October 06, 2023 1:05 PM
Virtual networking 101: bridging the gap to understanding TAP
Tap devices were historically used for VPN clients. Using them for virtual machines essentially reverses their original purpose - from traffic sinks to traffic sources. In this article, I explore the intricacies of tap devices, covering topics like offloads, segmentation, and m...
June 26, 2023 1:00 PM
Lost in transit: debugging dropped packets from negative header lengths
In this post, we'll provide some insight into the process of investigating networking issues and how to begin debugging issues in the kernel using pwru and kprobe tracepoints...
June 19, 2023 1:00 PM
Every request, every microsecond: scalable machine learning at Cloudflare
We'll describe the technical strategies that have enabled us to expand the number of machine learning features and models, all while substantially reducing the processing time for each HTTP request on our network...
May 26, 2023 1:00 PM
How Oxy uses hooks for maximum extensibility
Let's take a look from the perspective of an Oxy application developer, and then we can discuss the implementation of the framework and some of the interesting design decisions we made...
May 25, 2023 3:31 PM
Unbounded memory usage by TCP for receive buffers, and how we fixed it
We are constantly monitoring and optimizing the performance and resource utilization of our systems. Recently, we noticed that some of our TCP sessions were allocating more memory than expected. This blog post describes in detail the root cause of the problem and shows the test r...
May 18, 2023 1:00 PM
Building Cloudflare on Cloudflare
Cloudflare was originally built as native services, but we’re building more and more of it on Cloudflare itself. This post describes how and why we’re doing this....
April 19, 2023 1:00 PM
DDR4 memory organization and how it affects memory bandwidth
In this blog post, we will study the concepts of memory rank and organization, and how they affect memory bandwidth, by reviewing some benchmark test results...
March 20, 2023 1:00 PM
The quantum state of a TCP port
If I navigate to https://blog.cloudflare.com/, my browser will connect to a remote TCP address from the local IP address assigned to my machine, and a randomly chosen local TCP port. What happens if I then decide to head to another site?...
March 03, 2023 2:00 PM
How Cloudflare runs Prometheus at scale
Here at Cloudflare we run over 900 instances of Prometheus with a total of around 4.9 billion time series. Operating such a large Prometheus deployment doesn’t come without challenges. In this blog post we’ll cover some of the issues we hit and how we solved them...
January 16, 2023 1:46 PM
A debugging story: corrupt packets in AF_XDP; a kernel bug or user error?
A race condition in the virtual Ethernet driver of the Linux kernel led to occasional corruption of packet contents, which resulted in unwanted packet drops by one of our DDoS mitigation systems. This blog post describes the thought process and technique we used to debug this complex...
November 28, 2022 2:57 PM
The Linux Kernel Key Retention Service and why you should use it in your next application
Many leaks happen because of software bugs and security vulnerabilities. In this post we will learn how the Linux kernel can help protect cryptographic keys from a whole class of potential security vulnerabilities: memory access violations....
November 16, 2022 2:00 PM
The Cloudflare API now uses OpenAPI schemas
Cloudflare now has OpenAPI schemas available for the API. Users can use these schemas with any open-source OpenAPI tooling....