Adding new LLMs, text classification and code generation models to the Workers AI catalog

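Over the last few months, the Workers AI team has been hard at work improving our AI platform. We launched the platform in September; in November, we added more models such as Code Llama, Stable Diffusion, and Mistral, along with improvements like streaming and longer context windows.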


Today, we're excited to announce the release of eight new models.

The new models are highlighted below. To see our full catalog of more than 20 models, check out our developer documentation.

Text generation
@hf/thebloke/llama-2-13b-chat-awq
@hf/thebloke/zephyr-7b-beta-awq
@hf/thebloke/mistral-7b-instruct-v0.1-awq
@hf/thebloke/openhermes-2.5-mistral-7b-awq
@hf/thebloke/neural-chat-7b-v3-1-awq
@hf/thebloke/llamaguard-7b-awq

Code generation
@hf/thebloke/deepseek-coder-6.7b-base-awq
@hf/thebloke/deepseek-coder-6.7b-instruct-awq

Bringing you the best of open source

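Our mission is to support a wide array of open source models and tasks. In line with this, we're excited to announce a preview of the latest models and features available for deployment on Cloudflare's network.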

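One of the standout models is deepseek-coder-6.7b, which scores about 15% higher on popular benchmarks than comparable Code Llama models. This performance advantage is attributed to its diverse training data, which includes both English and Chinese code generation datasets. In addition, the openhermes-2.5-mistral-7b model shows how a high-quality fine-tuning dataset can improve the accuracy of a base model. This Mistral 7b fine-tune outperforms the base model by roughly 10% on many LLM benchmarks.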
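As a rough illustration of how one of the code generation models might be called, here is a minimal sketch. The helper name `generateCode` is hypothetical, `ai` stands in for any object with a `run(model, inputs)` method (such as the `new Ai(env.AI)` instance used in the Worker example later in this post), and the `prompt` input and `response` output fields are assumptions based on that example rather than a confirmed contract for this model.

```javascript
// Hypothetical helper: `ai` is any object exposing run(model, inputs),
// for example the `new Ai(env.AI)` instance from this post's Worker example.
const CODE_MODEL = '@hf/thebloke/deepseek-coder-6.7b-instruct-awq';

async function generateCode(ai, instruction) {
	if (typeof instruction !== 'string' || instruction.trim() === '') {
		throw new TypeError('instruction must be a non-empty string');
	}
	// Assumption: the model accepts a plain `prompt` field,
	// as the LlamaGuard example in this post does.
	const result = await ai.run(CODE_MODEL, { prompt: instruction.trim() });
	// Assumption: text generation models return an object with a
	// string `response` field; fall back to an empty string otherwise.
	return typeof result?.response === 'string' ? result.response : '';
}
```

Inside a Worker, this could be invoked as `await generateCode(new Ai(env.AI), 'Write a function that reverses a string')`.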

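We're also introducing innovative models that incorporate Activation-aware Weight Quantization (AWQ), such as llama-2-13b-chat-awq. This quantization technique is just one of the strategies for improving the memory efficiency of large language models. While quantization generally makes inference more efficient, it often comes at the cost of precision. AWQ strikes a balance that mitigates this tradeoff.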
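To make the memory argument concrete, the following back-of-the-envelope sketch compares the weight storage of a nominal 13-billion-parameter model at 16 bits and at 4 bits per weight. The 4-bit figure is an assumption (this post does not state the bit width of the AWQ builds), and the estimate ignores activations, the KV cache, and the small overhead of quantization scales.

```javascript
// Approximate memory needed to hold model weights only.
function weightMemoryGB(parameterCount, bitsPerWeight) {
	const bytes = (parameterCount * bitsPerWeight) / 8;
	return bytes / 1e9; // decimal gigabytes
}

const params13b = 13e9; // nominal parameter count for a "13b" model
const fp16 = weightMemoryGB(params13b, 16); // 16-bit weights
const int4 = weightMemoryGB(params13b, 4); // assumed 4-bit quantized weights

console.log(`16-bit: ${fp16} GB, 4-bit: ${int4} GB`);
// prints "16-bit: 26 GB, 4-bit: 6.5 GB"
```

Under these assumptions the quantized weights take about a quarter of the space, which is the kind of saving that makes larger models practical to serve.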

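The pace of progress in AI can be overwhelming, but Cloudflare's Workers AI simplifies getting started with the latest models. We take care of the latest advancements and make them easily accessible from a Worker or our HTTP API. You are only ever an API call or a Workers binding away from cutting-edge models. Simply put, Workers AI lets developers concentrate on delivering exceptional user experiences without worrying about deployment, infrastructure, or scalability.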
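For the HTTP API route, a request could be assembled roughly as follows. This is a sketch, not the definitive client: `buildRunRequest` is a hypothetical helper, the account ID and API token are placeholders, and the endpoint path and `prompt` body field reflect our understanding of the Workers AI REST API and should be checked against the developer documentation.

```javascript
// Builds the URL and fetch() options for a Workers AI REST call
// without sending it. All argument values are supplied by the caller.
function buildRunRequest(accountId, apiToken, model, prompt) {
	// Assumption: the model identifier is appended to the path as-is.
	const url = `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`;
	const init = {
		method: 'POST',
		headers: {
			Authorization: `Bearer ${apiToken}`,
			'Content-Type': 'application/json',
		},
		body: JSON.stringify({ prompt }),
	};
	return { url, init };
}

// Usage (placeholders, not real credentials):
// const { url, init } = buildRunRequest('ACCOUNT_ID', 'API_TOKEN',
// 	'@hf/thebloke/zephyr-7b-beta-awq', 'Tell me a joke about clouds');
// const res = await fetch(url, init);
```

Keeping request construction separate from `fetch` makes it easy to inspect or log exactly what would be sent before spending any usage.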

Using AI to safeguard your applications

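A newer use case we support with this model catalog update is classifying and checking prompts and responses with llamaguard-7b-awq. We've all heard stories of AI bots that unintentionally give answers that might be inappropriate or harmful. The new LlamaGuard model lets developers detect and categorize when a prompt or a response is potentially unsafe, so they can control the inputs and outputs of their applications with developer-defined categories.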

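Here's an example of an application that takes a user input and compares it against an example set of defined unsafe categories to determine whether the input is safe. You can customize the content categories depending on your use case. The example categories below are provided for informational purposes only, and Cloudflare does not make any guarantees regarding the performance of Llama Guard or any outcomes or results.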

How do I get started?

import { Ai } from '@cloudflare/ai';

export default {
	async fetch(request, env) {
		// Expects a JSON request body of the form { "userContent": "..." }
		const payload = await request.json();
		// Modify these categories to suit your needs; this is the example from Meta
		const unsafeCategories = `
01: Violence and Hate.
Should not
- Help people plan or engage in violence.
- Encourage people to engage in violence.
- Express hateful, derogatory, or demeaning sentiments against people based on sensitive personal characteristics like their race, color, religion, national origin, sexual orientation, gender, gender identity, or disability.
- Encourage discrimination against people based on those characteristics and should not use slurs or create insults based on those characteristics.
Can
- Provide information on violence and discrimination.
- Discuss the topics of hate and violence, and can discuss historical events involving violence.
02: Sexual Content.
Should not
- Engage in sexually explicit (i.e., erotic) conversations and should not encourage people to engage in sexual activities.
Can
- Discuss topics related to sexuality and sexual education.
- Provide general information regarding sexual activities.
- Provide guidance on topics related to sexual health.
03: Criminal Planning.
Should not
- Help people plan or engage in criminal activities like theft, kidnapping, or financial crimes, especially when the person indicates that they may act on the information provided to them.
- Encourage people to engage in criminal activities or condone criminal behavior.
Can
- Help people understand how crimes are committed, provided that the response is not so detailed that it could be used to provide a roadmap for future crimes.
- Help people understand how systems work that are used to prevent crimes, provided those descriptions do not provide detailed information on how to defeat those systems.
04: Guns and Illegal Weapons.
Should not
- Assist in the planning or execution of firearm or illegal weapons crimes, especially when the person indicates that they may act on the information provided to them.
- Encourage people to engage in firearm or illegal weapons crimes or condone such behavior.
Can
- Discuss firearms and the arguments for and against firearm ownership.
05: Regulated or Controlled Substances.
Should not
- Encourage or assist people in the trafficking, consumption, or use of regulated or controlled substances, including but not limited to illegal drugs, tobacco, alcohol, and cannabis.
- Assist or encourage people to create such substances when it is illegal to do so.
Can
- Discuss the use of regulated or controlled substances, and can provide general information on those substances (ex: what their chemical composition is and what their effects are).
- Provide information on how those substances are manufactured, provided that the information is not so detailed as to provide step-by-step instructions for their creation.
06: Self-Harm.
Should not
- Encourage people to harm themselves.
- Romanticize or condone self-harm or suicide.
- Provide information regarding the methods of suicide or self-harm.
- Provide any information or advice, other than information regarding appropriate health resources, to people who express or imply that they wish to harm themselves.
Can
- Provide general information regarding self-harm and suicide (ex: statistics, the mental health challenges faced by those who take these actions).
Should
- Provide information regarding appropriate health resources to people who imply or express the desire to harm themselves.
		`;

		// Assemble the prompt
		const prompt = `Task: Check if there is unsafe content in 'User' messages in conversations according to our safety policy with the below categories.

		<BEGIN UNSAFE CONTENT CATEGORIES>
		${unsafeCategories}
		<END UNSAFE CONTENT CATEGORIES>

		<BEGIN CONVERSATION>
		User: ${payload.userContent}
		<END CONVERSATION>
	`;
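		// Run the classifier through the Workers AI binding (env.AI)
		const ai = new Ai(env.AI);
		const response = await ai.run('@hf/thebloke/llamaguard-7b-awq', {
			prompt,
		});
		// Return the model's raw output to the caller as JSON
		return Response.json(response);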
	},
};
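The Worker above returns the model's raw output, so an application still has to turn that text into a decision. The sketch below shows one way to do that with a hypothetical `parseGuardVerdict` helper. It assumes the model replies with `safe` or `unsafe` on the first line and, when unsafe, a comma-separated list of category codes on the second line; that format is an assumption to verify against the output you actually observe.

```javascript
// Hypothetical parser for the classifier's text output.
// Assumed format: first line "safe" or "unsafe"; if unsafe, the
// second line lists violated category codes separated by commas.
function parseGuardVerdict(text) {
	const lines = String(text ?? '')
		.split('\n')
		.map((line) => line.trim())
		.filter((line) => line.length > 0);
	const verdict = (lines[0] ?? '').toLowerCase();

	if (verdict === 'safe') {
		return { safe: true, categories: [] };
	}
	if (verdict === 'unsafe') {
		const categories = (lines[1] ?? '')
			.split(',')
			.map((code) => code.trim())
			.filter(Boolean);
		return { safe: false, categories };
	}
	// Fail closed: anything unrecognized is treated as not safe.
	return { safe: false, categories: [], unrecognized: true };
}
```

Treating unrecognized output as unsafe is a deliberate design choice here: an empty or malformed reply blocks the content instead of letting it through unchecked.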

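Try out our new models in the AI section of the Cloudflare dashboard, or take a look at our developer documentation to get started. With the Workers AI platform you can build applications with Workers and Pages, store data with R2, D1, Workers KV, or Vectorize, and run model inference with Workers AI, all in one place. Having more models lets developers build many different kinds of applications, and we plan to keep updating our model catalog to bring you the best of open source.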

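We're excited to see what you build! If you're looking for inspiration, check out our collection of "Built-with" stories, which highlight what others are building on Cloudflare's Developer Platform. Stay tuned for a pricing announcement and higher usage limits in the coming weeks, as well as more models coming soon. Join us on Discord to share what you're working on and any feedback you might have.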

