Isaac Rehg

Making Workers AI faster and more efficient: Performance optimization with KV cache compression and speculative decoding

2024-09-26

Birthday Week, Product News, Cloudflare Workers, Developers, Developer Platform, LLM

Using a new generation of data center accelerator hardware and optimization techniques such as KV cache compression and speculative decoding, we’ve made large language model (LLM)...

Isaac Rehg
Jesse Kipp

Meta Llama 3 available on Cloudflare Workers AI

2024-04-18

Llama, Developers, Developer Platform, Workers AI, Cloudflare Workers, Product News

We are thrilled to give developers around the world the ability to build AI applications with Meta Llama 3 using Workers AI. We are proud to be a launch partner with Meta for their newest 8B Llama 3 model...

Michelle Chen
Davina Zamanzadeh
Isaac Rehg
Nikhil Kothari

Workers AI Update: Hello, Mistral 7B!

2023-11-21

Workers AI, Cloudflare Workers, Developers, Developer Platform

Today we’re excited to announce that we’ve added Mistral-7B-v0.1-instruct to Workers AI...

Jesse Kipp
Isaac Rehg