
{
"id": "conv_20251109_113433_663",
"title": "在请求月之暗面的kimi-k2模型时是如何控制模型思考还是不思考的是通过请求时加入thinkin...",
"created_at": "2025-11-09T11:34:33.663230",
"updated_at": "2025-11-09T12:05:51.659274",
"messages": [
{
"role": "user",
"content": "在请求月之暗面的kimi-k2模型时是如何控制模型思考还是不思考的是通过请求时加入thinking参数还是通过修改模型的名称为kimi-k2-thinking帮我查一下",
"timestamp": "2025-11-09T11:35:10.641137"
},
{
"role": "assistant",
"content": "我来帮你查一下月之暗面Kimi-k2模型控制思考模式的机制。让我搜索相关的技术文档和使用说明。",
"timestamp": "2025-11-09T11:35:13.140118"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:35:13.147109",
"tool_calls": [
{
"id": "web_search:0",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"月之暗面 kimi k2 模型 thinking 参数 思考模式 控制方法\", \"max_results\": 10}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: 月之暗面 kimi k2 模型 thinking 参数 思考模式 控制方法\\n📅 搜索时间: 2025-11-09T11:35:18.395158\\n🎯 过滤条件: Topic: general\\n\\n📝 AI摘要:\\nKimi K2 Thinking is a powerful open-source model with advanced reasoning and tool use capabilities. It can execute up to 300 consecutive tool calls without human intervention. It's designed to perform complex tasks through long-term planning and reasoning.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. 使用kimi-k2-thinking思考模型 - Moonshot AI 开放平台- 月之暗面\\n 🔗 https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\\n 📄 > `kimi-k2-thinking` 模型是月之暗面提供的具有通用 Agentic 能力和推理能力的思考模型,它擅长深度推理,并可通过多步工具调用,帮助解决各类难题。\\n\\n基本示例` 来判断是否存在字段,如果存在,则使用 `getattr(obj, \\\"reasoning_content\\\")` 获取字段值\\n 如果你使用其他框架或自行通过 HTTP 接口对接,可以直接获取与 `content` ...\\n\\n2. 开源新纪录月之暗面Kimi K2实测超越GPT-5和Claude4.5,完全免费\\n 🔗 https://www.huxiu.com/article/4802189.html\\n 📄 对于大多数研究和企业应用这一条款相当于一个轻量级的署名要求同时保留了标准MIT协议的自由度。\\n\\n这使得K2 Thinking成为目前可用的最宽松授权的前沿级模型之一。\\n\\n新的基准测试领跑者\\n\\nKimi K2 Thinking是一个基于万亿参数的混合专家MoE模型每次推理激活320亿参数。\\n\\n它将长程推理与结构化工具使用相结合能够在无需人工干预的情况下执行200-300次连续的工具调用。...\\n\\n3. 详解Kimi K2 Thinking谢谢DeepSeek我先去干掉GPT-5了 - 36氪\\n 🔗 https://m.36kr.com/p/3543851834322816\\n 📄 Raschka的分析指出了一个关键事实Kimi K2 Thinking对DeepSeek核心架构的“继承”是显而易见的包括MoE机制、MLA多头潜在注意力等设计。这是在一个已被验证的基座上根据自身的目标进行了针对性的调整和优化。例如减少注意力头和激活参数量旨在降低推理成本而增加专家数量和词汇表则是为了增强模型的知识容量和表达能力。这种“站在巨人肩膀上”的做法是开源精神最直接的体...\\n\\n4. 离GPT-5最近的一次中国1万亿参数开源巨兽突然爆火 - 智源社区\\n 🔗 https://hub.baai.ac.cn/view/50171\\n 📄 # 离GPT-5最近的一次中国1万亿参数开源巨兽突然爆火\\n\\nMultimodalAIDL\\n\\n新智元 2025-11-08 14:20 分享\\n\\n### \\n\\n编辑倾倾 桃子\\n\\n##### 【新智元导读】Kimi K2 Thinking重磅开源1万亿「思考Agent模型」在推理、智能体基准上干翻GPT-5。关键还能连调300次工具直出3D模拟。\\n\\n昨天月之暗面发布全新模型Kimi K2 Th...\\n\\n5. Kimi 迄今能力最强开源思考模型月之暗面Kimi K2 Thinking 发布\\n 🔗 https://www.ithome.com/0/895/484.htm\\n 📄 据介绍,该模型是月之暗面基于“模型即 Agent”理念训练的新一代 Thinking Agent原生掌握“边思考边使用工具”的能力。在人类最后的考试Humanity's Last Exam、自主网络浏览能力BrowseComp、复杂信息收集推理SEAL-0等多项基准测试中表现达到 SOTA 水平,并在 Agentic 搜索、Agentic 编程、写作和综合推理能力等方面取得全面提升...\\n\\n6. 国产之光Kimi K2 Thinking 原创 - CSDN博客\\n 🔗 https://blog.csdn.net/twelveai/article/details/154575275\\n 📄 本文介绍Kimi K2-0905端侧推理SDK的零代码部署方案帮助开发者解决边缘AI项目中的高延迟问题。内容涵盖SDK安装、模型加载与推理调用、批处理优化等延迟优化技巧以及性能监控方法。通过本地模型加载、缓存策略等优化手段可实现42%的延迟降低。该SDK提供零代码接口简化部署流程开发者可快速验证边缘设备上的推理效果。\\n\\nKimi-k2-thinking模型发布\\n\\n架构师刘7的博客\\n\\n11...\\n\\n7. 刚刚月之暗面发布Kimi K2 Thinking模型藏师傅首测教你用Kimi 编程 ...\\n 🔗 https://m.aitntnews.com/newDetail.html?newId=19581\\n 📄 `uv tool install --python 3.13 kimi-cli`\\n\\n安装后在终端打开你的项目目录输入 kimi 就可以启动 Kimi CLI 了。\\n\\n首次启动后会让你选登录方式选第一个然后输入刚才从 KFC 获取的 API Key 就行。\\n\\n进去之后按 Tab 键就可以启用思考,这时候用的就是 K2-Thinking 模型了Claude Code 启用思考也是一样的操作。\\n\\n当然...\\n\\n8. Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友 - 量子位\\n 🔗 https://www.qbitai.com/2025/11/349957.html\\n 📄 在人类最后的考试HLE、测试自主网络浏览能力的BrowseComp和复杂信息收集推理基准测试SEAL-0等评测基准中Kimi K2 Thinking都刷新了SOTA超越GPT-5、Claude Sonnet 4.5Thinking等闭源模型。\\n\\nKimi K2 Thinking的代码和权重均遵循最为宽松的MIT协议。新模型也已第一时间上线kimi.com和最新版Kimi手机应用即刻...\\n\\n9. Kimi K2 Thinking模型发布并开源该模型哪些信息值得关注 - 知乎\\n 🔗 https://www.zhihu.com/question/1969558404759544488\\n 📄 HLE 和IMO (确切说是这两天谷歌新出的IMO answerbench都刷到了SOTA而且是SOTA不是“国产SOTA ”。值得一提的是IMO这个新bench在kimi出来之前sota居然是Deepseek而不是北美模型\\n\\n不知道heavy模式会不会开放给用户订阅如果开放的话正好可以看看国内愿意每个月花1000左右买sota的用户到底有多少。\\n\\n这个公司估值只有openai的0.5...\\n\\n10. 
Kimi K2 快速开始 - Moonshot AI 开放平台\\n 🔗 https://platform.moonshot.cn/docs/guide/kimi-k2-quickstart\\n 📄 🚀 最新发布 kimi-k2-0905-preview 模型,上下文窗口扩展至 256K代码能力更强快来体验吧\\n\\n文档\\n\\n入门指南\\n\\nKimi K2 快速开始\\n\\n# Kimi K2 快速开始\\n\\n## \\n\\nKimi K2 是月之暗面推出的新一代具备自主行动 (agentic) 能力的旗舰模型。采用 1T 总参数量、32B 激活参数的 MoE 架构设计。该模型在代码编程 (AI coding) ...\", \"filters\": {\"topic\": \"general\"}, \"query\": \"月之暗面 kimi k2 模型 thinking 参数 思考模式 控制方法\", \"results\": [{\"index\": 1, \"title\": \"使用kimi-k2-thinking思考模型 - Moonshot AI 开放平台- 月之暗面\", \"url\": \"https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"> `kimi-k2-thinking` 模型是月之暗面提供的具有通用 Agentic 能力和推理能力的思考模型,它擅长深度推理,并可通过多步工具调用,帮助解决各类难题。\\n\\n基本示例` 来判断是否存在字段,如果存在,则使用 `getattr(obj, \\\"reasoning_content\\\")` 获取字段值\\n 如果你使用其他框架或自行通过 HTTP 接口对接,可以直接获取与 `content` 字段同级的 `reasoning_content` 字段\\n 在流式输出(`stream=True`)的场合,`reasoning_content` 字段一定会先于 `content` 字段出现,你可以在业务代码中通过判断是否出现 `content` 字段来识别思考内容(或称推理过程)是否结束\\n `reasoning_content` 中包含的 Tokens 也受 `max_tokens` 参数控制,`reasoning_content` 的 Tokens 数加上 `content` 的 Tokens 数应小于等于 `max_tokens` [...] 多步工具调用,模型会根据实际情况选择把必要的思考内容送到模型进行推理。\\n 设置 `max_tokens>=16000` 以避免无法输出完整的 `reasoning_content` 和 `content`。\\n 设置 `temperature=1.0`,以获得最佳性能。\\n 使用流式输出(`stream=True``kimi-k2-thinking` 模型的输出内容包含了 `reasoning_content`,相比普通模型其输出内容更多,启用流式输出能获得更好的用户体验,同时一定程度避免网络超时问题。 [...] Kimi K2 快速开始\\n 使用 kimi-k2-thinking 思考模型\\n 开始使用 Kimi API\\n 从 OpenAI 迁移到 Kimi API\\n 使用 API 调试工具\\n 使用 Kimi API 进行多轮对话\\n 使用 Vision 视觉模型\\n 自动选择 Kimi 模型\\n 自动断线重连\\n 使用 Stream 流式输出\\n 使用 Tool Calls\\n 使用联网搜索 Tool\\n 使用 JSON Mode\\n 使用 Partial Mode\\n 使用 Kimi API 进行文件问答\\n 使用开发工作台调试模型\\n 在 ClaudeCode/Cline/RooCode 中使用 kimi-k2 模型\\n 在开发工作台中配置 ModelScope MCP 服务器\\n 在 Kimi API 中使用官方工具\\n 使用 Kimi CLI 调用 Kimi 大模型\", \"score\": 0.8325776, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"开源新纪录月之暗面Kimi K2实测超越GPT-5和Claude4.5,完全免费\", \"url\": \"https://www.huxiu.com/article/4802189.html\", \"content\": \"对于大多数研究和企业应用这一条款相当于一个轻量级的署名要求同时保留了标准MIT协议的自由度。\\n\\n这使得K2 Thinking成为目前可用的最宽松授权的前沿级模型之一。\\n\\n新的基准测试领跑者\\n\\nKimi K2 Thinking是一个基于万亿参数的混合专家MoE模型每次推理激活320亿参数。\\n\\n它将长程推理与结构化工具使用相结合能够在无需人工干预的情况下执行200-300次连续的工具调用。\\n\\n性能数据一览\\n\\n根据月之暗面公布的测试结果K2 Thinking取得了\\n\\n 44.9%-Humanitys Last ExamHLE达到业界最先进水平\\n 60.2%-BrowseComp智能体网络搜索和推理测试\\n 71.3%-SWE-Bench Verified和83.1%-LiveCodeBench v6关键编程评估\\n 56.3%-Seal-0真实世界信息检索基准\\n \\n\\n在这些任务中K2 Thinking持续超越GPT-5的相应得分并超过了几周前MiniMax AI发布的前开源领跑者MiniMax-M2。\\n\\n开源模型超越专有系统 [...] M2和K2 Thinking的快速接连发布展示了开源研究追赶前沿系统的速度有多快。MiniMax-M2证明了开源模型可以以一小部分计算成本接近GPT-5级别的智能体能力。月之暗面现在将这一前沿推进得更远将开源权重从平衡推向了彻底领先。\\n\\n两个模型都依赖稀疏激活来提高效率但K2 Thinking更高的激活数320亿vs 100亿活跃参数在各个领域产生了更强的推理保真度。其测试时缩放——扩展\\\"思维tokens\\\"和工具调用轮次——提供了可测量的性能提升无需重新训练这是MiniMax-M2中尚未观察到的特性。\\n\\n技术展望\\n\\n月之暗面报告称K2 Thinking支持原生INT4推理和256k token上下文性能下降微乎其微。其架构集成了量化、并行轨迹聚合“重度模式”以及针对推理任务调优的混合专家路由。\\n\\n在实践中这些优化使K2 Thinking能够维持复杂的规划循环——代码编译-测试-修复、搜索-分析-总结——跨越数百次工具调用。这种能力支撑了它在BrowseComp和SWE-Bench上的优异表现而推理连续性在这些任务中至关重要。\\n\\n对AI生态系统的巨大影响 [...] 
K2 Thinking的核心能力在于其显式推理轨迹。模型输出一个辅助字段reasoning\\\\_content在每个最终响应之前揭示中间逻辑。这种透明性在长时间多轮任务和多步骤工具调用中保持了连贯性。\\n\\n月之暗面发布的参考实现演示了模型如何自主执行\\\"每日新闻报告\\\"工作流:调用日期和网络搜索工具、分析检索内容、生成结构化输出——同时保持内部推理状态。\\n\\n这种端到端的自主性使模型能够在数百个步骤中进行规划、搜索、执行和综合证据反映了正在崛起的\\\"智能体AI\\\"系统类别,这些系统以最少的监督运行。\\n\\n效率与访问成本\\n\\n尽管规模达到万亿参数K2 Thinking的运行成本保持适中。月之暗面列出的使用价格为\\n\\n $0.15/100万tokens缓存命中\\n $0.60/100万tokens缓存未命中\\n $2.50/100万tokens输出\\n \\n\\n这些价格甚至优于MiniMax-M2的$0.30输入/$1.20输出定价——比GPT-5$1.25输入/$10输出低了一个数量级。\\n\\n对比背景开源权重加速\", \"score\": 0.7732339, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"详解Kimi K2 Thinking谢谢DeepSeek我先去干掉GPT-5了 - 36氪\", \"url\": \"https://m.36kr.com/p/3543851834322816\", \"content\": \"Raschka的分析指出了一个关键事实Kimi K2 Thinking对DeepSeek核心架构的“继承”是显而易见的包括MoE机制、MLA多头潜在注意力等设计。这是在一个已被验证的基座上根据自身的目标进行了针对性的调整和优化。例如减少注意力头和激活参数量旨在降低推理成本而增加专家数量和词汇表则是为了增强模型的知识容量和表达能力。这种“站在巨人肩膀上”的做法是开源精神最直接的体现。\\n\\n除了对DeepSeek架构的继承Kimi K2 Thinking的成果也离不开对整个开源生态成果的广泛“化用”。从底层用于加速注意力计算的FlashAttention到K2技术报告中提到的、为解决训练不稳定性而改进的MuonClip优化器再到各种数据处理和后训练方法论都整合了开源社区的集体智慧。\\n\\n如果说架构和开源技术决定了模型的骨架那么让其血肉丰满的则是月之暗面自身的工程实现能力。这主要体现在三个方面 [...] Kimi K2 Thinking是基于 Kimi K2 模型训练而来的专注于提升Agentic能力和推理能力。这是一个拥有1万亿总参数的混合专家模型Mixture-of-Experts, MoE每次推理激活约320亿参数支持256k的上下文窗口并采用了原生INT4量化技术。设计思路是在保持巨大模型规模的同时想办法控制计算成本和训练成本。据CNBC援引知情人士的报道该模型的训练成本仅为460万美元。作为对比DeepSeek披露的 V3训练成本租赁价正式训练阶段是560 万美元R1为29.4万美元。这里主要考虑的是GPU预训练费用不包括研发、基础设施等投资。\\n\\nKimi K2 Thinking的一个核心特性是Agent能力官方宣称它能够连续执行200-300次工具调用来解决复杂问题。Grok-4等闭源阵营广泛采用RL提升工具使用与长程规划但在开源模型中看到如此的实现还是第一次。它表明开源社区正在快速跟上智能体技术的前沿同时也对模型托管服务提出了更高的要求。 [...] 1、训练稳定性在长达15.5万亿tokens的预训练过程中Kimi K2 Thinking实现了“零loss spike损失尖峰”。这意味着训练过程极其稳定无需因模型崩溃而进行成本高昂的回滚。这对于万亿参数规模的模型来说是一项重要的工程成就。\\n\\n2、原生量化推理Kimi K2 Thinking支持原生INT4量化推理据称能在极小的性能损失下将推理速度提升约2倍并显著降低部署所需的GPU显存。这是将大参数模型从实验室推向广泛应用的关键。\\n\\n3、长程任务执行模型能够稳定执行200-300轮工具调用这不仅考验了模型的推理能力也检验了其系统鲁棒性。在长达数百步的交互中模型必须能处理各种异常背后需要一套复杂的工程机制。\", \"score\": 0.76826453, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"离GPT-5最近的一次中国1万亿参数开源巨兽突然爆火 - 智源社区\", \"url\": \"https://hub.baai.ac.cn/view/50171\", \"content\": \"# 离GPT-5最近的一次中国1万亿参数开源巨兽突然爆火\\n\\nMultimodalAIDL\\n\\n新智元 2025-11-08 14:20 分享\\n\\n### \\n\\n编辑倾倾 桃子\\n\\n##### 【新智元导读】Kimi K2 Thinking重磅开源1万亿「思考Agent模型」在推理、智能体基准上干翻GPT-5。关键还能连调300次工具直出3D模拟。\\n\\n昨天月之暗面发布全新模型Kimi K2 Thinking一上线就挤爆了服务器。\\n\\n思考是它的核心卖点自称是开源的「思考Agent模型」。\\n\\n它同样采用了MoE架构总参数约1万亿每次激活约320亿上下文256K token。\\n\\n在各大基准测试中Kimi K2 Thinking性能表现亮眼。\\n\\n尤其是在BrowseComp、HLE测试中实力完全碾压GPT-5、Claude Sonnet 4.5。\\n\\n在Tau2 Bench Telecom基准测试中K2 Thinking位列第一。\\n\\n最关键的是在无人干预情况下K2 Thinking可连续调用200-300次工具。 [...] 这意味着K2 Thinking的测试成绩就是它在真实环境下的表现而非理想化打分。\\n\\n它目前已经在kimi.com上线并开源API和模型权重开发者可以直接试用。\\n\\n从实验室到真实场景这个模型的「思考能力」明显超过了现有的其他模型。\\n\\n智能体编码一流300次工具调用\\n\\n这一次月之暗面没再让模型停留在论文里。\\n\\nK2 Thinking不是展示品而是一台真正能被人用起来的智能体。\\n\\n发布当天团队同步上线了 kimi.com聊天模式、开放了API接口还在Hugging Face公布了完整权重。\\n\\n开发者不需要等待内测邀请也不用注册繁琐流程任何人都能直接使用。\\n\\nK2 Thinking的从训练开始到优化再到上线周期不到半年。\\n\\n在这个动辄以年为单位更新的大模型时代这个速度意味着它已具备完整的工程化能力。\\n\\n打开kimi.com就能直接体验到K2 Thinking的思考过程。\\n\\n与一般聊天模型不同它在生成答案前会清晰地展示自己的推理链。\\n\\n此外研究人员特别提到K2 Thinking在软件和编码任务上进步显著。 [...] 
国外研究者Nathan Lambert 称它为:「开源模型距闭源前沿最近的一次。」\\n\\n这句话在技术圈广为流传人们也开始重新审视这款模型。\\n\\n不只是聊天工具K2 Thinking更像是一个会自己推理、自己动手的智能体。\\n\\n一款真正会思考的模型\\n\\nKimi K2 Thinking没有强调算力更大而是强调更会「思考」。\\n\\n这些配置让它在处理长文本、复杂任务时能维持更稳定的推理过程。\\n\\n苹果大牛Awni Hannun测试后惊叹道\\n\\n1万亿参数只用2台M3 Ultra芯片的Mac电脑即可流畅运行而且int4压缩后性能几乎无损。\\n\\n通过mlx-lm并行技术它生成了大约3500个token速度每秒15个token。\\n\\n但真正让人关注的是它的「思考能力」。\\n\\n如前所述K2 Thinking可以在一次任务中连续执行200到300次工具调用全程无需人工干预。\\n\\n有网友实测「工具调用」立即制作出如下的数学和物理讲解动画。\\n\\n不同于其他模型的胡编乱造它在面对复杂问题时会自己拆解步骤、搜索信息、调用外部工具、再整合结果。\", \"score\": 0.7485182, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"Kimi 迄今能力最强开源思考模型月之暗面Kimi K2 Thinking 发布\", \"url\": \"https://www.ithome.com/0/895/484.htm\", \"content\": \"据介绍,该模型是月之暗面基于“模型即 Agent”理念训练的新一代 Thinking Agent原生掌握“边思考边使用工具”的能力。在人类最后的考试Humanity's Last Exam、自主网络浏览能力BrowseComp、复杂信息收集推理SEAL-0等多项基准测试中表现达到 SOTA 水平,并在 Agentic 搜索、Agentic 编程、写作和综合推理能力等方面取得全面提升。\\n\\n该模型无需人类干预即可自主实现高达 300 轮的工具调用和持续稳定的多轮思考能力,从而帮助用户解决更复杂的问题。\\n\\nIT之家附 Hugging Face、ModelScope 部署链接如下:\\n\\n Hugging Face\\n ModelScope\\n\\n“人类最后的考试”是一项涵盖 100 多个专业领域的终极封闭式学术测试。在允许使用工具 —— 搜索、Python、网络浏览工具的同等情况下Kimi K2 Thinking 在这项基准评测中取得了 44.9% 的 SOTA 成绩。 [...] 在官方提供的另一个例子中Kimi K2 Thinking 经过两轮搜索和思考先根据股票回购的已知信息找到了这家制造快艇的公司然后在美国证券交易委员会SEC的官网上找到了股票回购公告信息得出了准确的答案\\n\\nKimi K2 Thinking 模型的编码能力也得到了增强,在多语言软件工程基准 SWE-Multilingual、SWE-bench 验证集和 Terminal 终端使用等基准测试中的表现有了进一步提升。\\n\\n月之暗面表示Kimi K2 Thinking 的通用基础能力也得到了升级: [...] 官方提供的示例中Kimi K2 Thinking 经过 5 轮搜索和推理,结合每轮搜索到的新信息,层层深入,最终推理出了答案:\\n\\n根据介绍在复杂搜索和浏览场景中Kimi K2 Thinking 模型也表现出色。BrowseComp 是由 OpenAI 发布的一个专门评估 AI Agent 网络浏览能力的基准测试,这项测试的初衷是衡量 AI Agent 在信息过载环境中展现出的坚持性与创造力,即能否像人类研究员一样“刨根问底”。在这项极具挑战的任务上,人类平均只能达到 29.2% 的成绩。Kimi K2 Thinking 在这项基准测试中展现出极强的钻研能力,以 60.2% 的成绩成为新的 SOTA 模型。\\n\\n在长程规划和自主搜索能力的驱动下Kimi K2 Thinking 可借助多达上百轮的“思考 → 搜索 → 浏览网页 → 思考 → 编程”动态循环,持续地提出并完善假设、验证证据、进行推理,并构建出逻辑一致的答案。这种边主动搜索边持续思考的能力,使 Kimi K2 Thinking 能够将模糊且开放式的问题分解为清晰、可执行的子任务。\", \"score\": 0.72814995, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"国产之光Kimi K2 Thinking 原创 - CSDN博客\", \"url\": \"https://blog.csdn.net/twelveai/article/details/154575275\", \"content\": \"本文介绍Kimi K2-0905端侧推理SDK的零代码部署方案帮助开发者解决边缘AI项目中的高延迟问题。内容涵盖SDK安装、模型加载与推理调用、批处理优化等延迟优化技巧以及性能监控方法。通过本地模型加载、缓存策略等优化手段可实现42%的延迟降低。该SDK提供零代码接口简化部署流程开发者可快速验证边缘设备上的推理效果。\\n\\nKimi-k2-thinking模型发布\\n\\n架构师刘7的博客\\n\\n11-06\\n\\n641\\n\\n月之暗面发布Kimi-k2-thinking模型强化推理与Agent任务能力 摘要MoonshotAI正式推出Kimi-k2-thinking模型该256K长文本模型专为深度推理和多轮工具调用优化。基于万亿参数MoE架构具备MuonClip优化器、大规模Agentic数据合成等创新技术在代码生成、数学推理等场景表现优异。提供标准版和Turbo高速版两种计费方案支持API接入和多框架兼容。此次发布延续了月之暗面从长文本到智能体的技术演进路径以低成本高性价比对标行业顶级推理模型推动AI智能体应用 [...] 目前Kimi K2 Thinking 已经在 Kimi 官网上线直接通过聊天模式就能使用但月之暗面也表示为了确保快速、轻量级体验Kimi 官网的聊天模式下仅选择性使用部分工具,并减少工具调用轮次。因此,在 Kimi 官网上聊天体验可能无法复现基准测试成绩。\\n\\n完整智能体模式 将在近期更新,以展示 K2 Thinking 的全部能力。你也可以通过 Kimi K2 Thinking API 接入使用。\\n\\n作为 AI 编程博主,比较关注 Kimi K2 Thinking 模型的编程能力,先看下官方关于编程能力的介绍:\\n\\nK2 Thinking 在编程与软件开发任务上展现出显著的进步。\\n\\nSWE-Bench Verified71.3%\\n\\n这些成绩表明K2 Thinking 在不同编程语言与智能体框架agent scaffolds 间具备极强的泛化能力。\\n\\n在 HTML、React 及依赖大量组件的前端开发任务中,模型表现尤为突出,能够将抽象的想法转化为完整可运行、响应迅速的产品。 [...] 
近日互联网上一则来自Kimi员工的推文引发了AI领域的广泛关注。该员工在社交平台X上发文称“我嗅到了一个伟大模型的到来”这一神秘的言论迅速在评论区引发热议人们纷纷猜测究竟是什么样的模型即将问世。经过深入挖掘大家才发现原来是Kimi团队悄然发布了全新的K2思考模型。与以往不同的是这次K2模型的发布异常低调官方并未进行大规模的宣传推广仅仅是在官方API文档中默默地更新了模型名称。\\n此次\\n\\n大模型学习必藏Kimi K2 Thinking开源发布性能直逼GPT-5\\n\\nstar\\\\_nwe的博客\\n\\n11-07\\n\\n589\\n\\n月之暗面发布开源大模型Kimi K2 Thinking多项基准测试性能超越GPT-5等闭源模型。该模型擅长多轮工具调用和持续思考具备自主网络浏览等先进能力。通过INT4量化技术实现万亿参数低成本运行已采用宽松MIT许可证开源。这一突破或标志着国产大模型发展的新拐点。\\n\\ndify智能体-Kimi-K2+Mermaid 神器,一键生成系统架构图.yml\\n\\n07-27\", \"score\": 0.7208996, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"刚刚月之暗面发布Kimi K2 Thinking模型藏师傅首测教你用Kimi 编程 ...\", \"url\": \"https://m.aitntnews.com/newDetail.html?newId=19581\", \"content\": \"`uv tool install --python 3.13 kimi-cli`\\n\\n安装后在终端打开你的项目目录输入 kimi 就可以启动 Kimi CLI 了。\\n\\n首次启动后会让你选登录方式选第一个然后输入刚才从 KFC 获取的 API Key 就行。\\n\\n进去之后按 Tab 键就可以启用思考,这时候用的就是 K2-Thinking 模型了Claude Code 启用思考也是一样的操作。\\n\\n当然这个依旧可以在 Cluade Code 里面使用如果嫌配置环境变量麻烦的话用我的《ai-claude-start》项目就行创建的时候填写下面内容。\\n\\n`URL: kimi-k2-thinkingAPI Key: 你刚才在 Kimi For Coding 获取的 API`\\n\\n好了接下来就是模型测试了。\\n\\n我们这次的测试逻辑改一下尽量少的测试模型单次生成的能力多测试实际使用中频繁迭代修改以及发现问题的能力。\\n\\n首先测试连续编辑和生成的能力随着要求越来越多越来越复杂 K2-Thinking 会不会出现崩溃和冲突修改的问题。 [...] 约到后面需求是越来越难的,比如拖拽功能以及最后的移动端适配大改版,中间还伴随着增加功能导致的 UI 变化,但是 K2-Thinking 全部都是一次搞定。\\n\\n第三次的时候由于增加功能导致的 UI 问题,也在第四次修复了,有点顶。\\n\\n我也用其他模型试了一下这个测试。Claude 4.5 可以搞定,但是 Codex 就不行了。\\n\\n下面是 Codex CLI 第七轮修改的样子,非常的凄凉,这产品几乎不可用,让他整废了。\\n\\n然后是考验 Kimi CLI 搜索能力的一个测试,让他搜索 Linear 这个产品的首页设计风格然后写一个网页复刻这个产品网页的设计风格看一下有多像。\\n\\n重点是要他输出他的思考和决策过程看一下思考带来的优势。\\n\\n模仿 Linear 的动效设计风格制作英文网页\\n\\n要求\\n\\n1.先搜索了解 Linear (linear.app) 的设计特点:\\n\\n 深色主题为主\\n 流畅的过渡动画\\n 磨砂玻璃效果glassmorphism\\n 微妙的渐变和光效\\n 简洁的排版\\n 调研 Linear 使用的前端和动效库方案,并且参考\\n\\n2.创建一个产品功能展示页面,包含: [...] Hero 区域:大标题 + 副标题 + CTA 按钮\\n 功能卡片3-4个鼠标悬停有光效跟随\\n 实现滚动视差\\n 背景网格动画\\n\\n3.技术要求:\\n\\n TailwindCSS\\n 颜色方案参考 Linear\\n 动画要流畅60fps\\n\\n4.核心特效:\\n\\n 鼠标移动时背景有微妙的渐变跟随\\n 卡片悬停时有发光边框\\n 滚动时元素渐显和位移\\n 按钮悬停有磁吸效果\\n\\n请先简单描述你搜索到的 Linear 设计特点,然后实现,我要看到你的完整思考和决策过程。\\n\\n来看一下过程他真的在调用搜索工具搜索一些 Linear 风格的设计文章,然后总结了 Linear 的设计特点拓展了原来的提示词和要求。\\n\\n这里面最难的是那个磁吸按钮以及卡片上鼠标 Hover 的光效,也是之前 Linear 的设计核心,他都搞定了。\\n\\n我也用这个提示词在 Claude Code 里面让 Claude 4.5 尝试了,磁吸按钮动效以及鼠标 Hover 的渐变效果 Claude 4.5 都没有搞定。\", \"score\": 0.7156829, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友 - 量子位\", \"url\": \"https://www.qbitai.com/2025/11/349957.html\", \"content\": \"在人类最后的考试HLE、测试自主网络浏览能力的BrowseComp和复杂信息收集推理基准测试SEAL-0等评测基准中Kimi K2 Thinking都刷新了SOTA超越GPT-5、Claude Sonnet 4.5Thinking等闭源模型。\\n\\nKimi K2 Thinking的代码和权重均遵循最为宽松的MIT协议。新模型也已第一时间上线kimi.com和最新版Kimi手机应用即刻就能实测体验。API可通过Kimi开放平台访问。\\n\\n# 技术细节\\n\\n官方提到K2 Thinking是月之暗面在Test-Time Scaling测试时扩展领域的最新进展通过同时扩展思考Token和工具调用轮次模型实现了更强的Agent和推理性能。\\n\\n智能体、推理能力全面提升\\n\\n体现在测试上在人类最后的考试HLE允许使用工具——搜索、Python、网络浏览工具的同等情况下Kimi K2 Thinking取得了44.9%的SOTA成绩。\\n\\n官方还放出了一个通过23次推理和工具调用K2 Thinking成功解决博士级别数学问题的示例。 [...] # Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友再次缩小开源闭源差距\\n\\n鱼羊 2025-11-07 09:42:41 来源:量子位\\n\\n> 鱼羊 发自 凹非寺\\n>\\n> 量子位 | 公众号 QbitAI\\n\\nKimi K2 Thinking现已发布并开源\\n\\n主打一个“模型即Agent”不仅是Kimi“迄今能力最强的开源思考模型”还掌握边思考边使用工具的能力——\\n\\n无需人工干预即可执行200-300次连续工具调用。\\n\\n作为今年最受关注的开源模型系列之一Kimi K2的Thinking版本一上线就成为热议对象再次缩小了开源模型与闭源模型的差距。\\n\\n 1TB参数激活参数32B采用INT4而非FP8。\\n 256K上下文窗口。\\n 更多专家更少head更多思考。\\n\\n△与DeepSeek R1的架构对比图源x@rasbt [...] 官方的解释是思考模型会产生极长的解码长度常规的量化手段往往会导致模型性能大幅下降。为了克服这一挑战他们在后训练阶段采用了量化感知训练QAT并对MoE组件应用了INT4纯权重weight-only量化。\\n\\n这使得Kimi K2 Thinking能够在复杂推理和Agentic任务中支持原生的INT4推理并将生成速度提升了约2倍。\\n\\n以及INT4对推理硬件的兼容性更强对国产加速计算芯片更友好。\\n\\np.s. 
Blackwell之前的英伟达GPU不支持FP4。\\n\\n# 上手实测\\n\\n更多测试实例可以在官方技术博客中查看我们也第一时间简单测试了一波仅开启长思考模式未联网。\\n\\n经典题\\n\\n> 一根7米长的甘蔗如何通过1×2米的门\\n\\n思考了将近5分钟Kimi给出的回答是\\n\\n思考的时间是有点长但Kimi K2 Thinking成功绕过了这道题目中的陷阱想到了门的长宽其实并不会限制甘蔗的通过。\\n\\n编程方面我们测试的题目是\\n\\n> 编写一个Python程序让一个小球在旋转的六边形内弹跳小球运动遵循物理规律\", \"score\": 0.70944256, \"published_date\": \"\"}, {\"index\": 9, \"title\": \"Kimi K2 Thinking模型发布并开源该模型哪些信息值得关注 - 知乎\", \"url\": \"https://www.zhihu.com/question/1969558404759544488\", \"content\": \"HLE 和IMO (确切说是这两天谷歌新出的IMO answerbench都刷到了SOTA而且是SOTA不是“国产SOTA ”。值得一提的是IMO这个新bench在kimi出来之前sota居然是Deepseek而不是北美模型\\n\\n不知道heavy模式会不会开放给用户订阅如果开放的话正好可以看看国内愿意每个月花1000左右买sota的用户到底有多少。\\n\\n这个公司估值只有openai的0.5%anthropic和grok的2%,朱啸虎还觉得有“泡沫”……\\n\\n对了QAT和DSA是互相独立可以叠加的可以降低成本的优化这波又把模型的cost降低了5倍左右。\\n\\n编辑于 2025-11-07 10:08\\n\\n赞同 13129 条评论\\n\\n分享\\n\\n收藏喜欢\\n\\nImage 7: 刘少伟\\n\\n刘少伟\\n\\nImage 8\\n\\n互联网行业 从业人员\\n\\n 关注\\n\\n谢邀 @知乎科技\\n\\n利益相关月之暗面infra团队工程师一枚深度参与了这次thinking模型int4量化方面的工作。 [...] Token控制K2T 其实在一些推理CoT中明显有进行控制的痕迹如#4 题就大量使用中文简写内容几乎不是给人阅读的。但奈何整体思考效率依然不高很多复杂问题偏向暴力解验算次数也偏多导致最终Token开销显著高于Grok 4是目前第二高。 [...] 指令遵循K2T 的指令遵循能力略差于Grok 4例如在简单指令#20 桌游模拟,复杂指令#40 代码推导得分均落后于Grok 4最低分与K2 相当。但也有一些惊喜,如在#30 日记整理题中题目要求了输出字数其他模型通常只预估一个概数很少能严格遵守而K2T 在思维链中严格的逐字计算了总字数,最终字符与要求只差 6 个字。如果提高字数要求K2T 这套方法自然是行不通的会导致Token超长此处仅用来说明其遵守能力。\\n 编程基本功在之前K2 的测试中提到K2 虽然编程基本功不足但具备在多轮环境中稳定改善的能力。K2T 在编程方面并没有相应的强化表现仍与K2 相当。或许是Kimi团队认为在Agent工作模式下知错能改的重要性大于不犯错。\\n 特殊字符在80%以上的输出文本中都混入了NBSP特殊字符而这在K2 基础模型中并不存在。合理怀疑是Kimi在后训练阶段故意加入用作某种标记。此外K2T 也有小概率输出英文,这属于国模的常见病症,不做讨论。\", \"score\": 0.6501347, \"published_date\": \"\"}, {\"index\": 10, \"title\": \"Kimi K2 快速开始 - Moonshot AI 开放平台\", \"url\": \"https://platform.moonshot.cn/docs/guide/kimi-k2-quickstart\", \"content\": \"🚀 最新发布 kimi-k2-0905-preview 模型,上下文窗口扩展至 256K代码能力更强快来体验吧\\n\\n文档\\n\\n入门指南\\n\\nKimi K2 快速开始\\n\\n# Kimi K2 快速开始\\n\\n## \\n\\nKimi K2 是月之暗面推出的新一代具备自主行动 (agentic) 能力的旗舰模型。采用 1T 总参数量、32B 激活参数的 MoE 架构设计。该模型在代码编程 (AI coding) 和智能体 (Agent) 搭建两大领域表现卓越。技术报告 (opens in a new tab)\\n\\n### \\n\\n 国内领先Kimi K2 是目前国内表现最优秀的 Coding 模型之一\\n 全栈支持:从前端到后端,从代码生成到开发运维、调试优化,全方位覆盖真实编程场景\\n 效率倍增:官方提供联网搜索等十余款开箱即用工具,配合精准的 tool call 调用能力,显著提升编程效率\\n\\n###\", \"score\": 0.6461259, \"published_date\": \"\"}], \"total_results\": 10}",
"timestamp": "2025-11-09T11:35:18.404611",
"tool_call_id": "web_search:0",
"name": "web_search"
},
{
"role": "assistant",
"content": "让我查看更详细的技术文档,特别是关于模型调用的具体参数说明。",
"timestamp": "2025-11-09T11:35:21.433885"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:35:21.440638",
"tool_calls": [
{
"id": "extract_webpage:1",
"index": 0,
"type": "function",
"function": {
"name": "extract_webpage",
"arguments": "{\"url\": \"https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\"}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"url\": \"https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"🌐 网页内容 (14645 字符):\\n📍 URL: https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\\n==================================================\\n使用kimi-k2-thinking思考模型 - Moonshot AI 开放平台 - Kimi 大模型 API 服务\\n\\n===============\\n\\n🎉 最新发布 kimi-k2-thinking 模型,支持多步工具调用与思考,擅长解决更复杂的问题,快来体验吧!\\n\\n[![Image 1: Logo](https://platform.moonshot.cn/lightmode.svg)](https://platform.moonshot.cn/)[联系销售](https://platform.moonshot.cn/contact-sales)[Blog](https://platform.moonshot.cn/blog)[文档](https://platform.moonshot.cn/docs/overview)[开发工作台](https://platform.moonshot.cn/playground)[用户中心](https://platform.moonshot.cn/console)\\n\\n⌘K\\n\\n⌘K\\n\\n* [欢迎](https://platform.moonshot.cn/docs/overview)\\n* [使用手册](https://platform.moonshot.cn/docs/introduction)\\n* API 接口说明\\n\\n * [Chat](https://platform.moonshot.cn/docs/api/chat)\\n * [Tool Use](https://platform.moonshot.cn/docs/api/tool-use)\\n * [Partial Mode](https://platform.moonshot.cn/docs/api/partial)\\n * [文件接口](https://platform.moonshot.cn/docs/api/files)\\n * [计算 Token](https://platform.moonshot.cn/docs/api/estimate)\\n * [查询余额](https://platform.moonshot.cn/docs/api/balance)\\n\\n* 产品定价\\n\\n * [模型推理定价](https://platform.moonshot.cn/docs/pricing/chat)\\n * [联网搜索定价](https://platform.moonshot.cn/docs/pricing/tools)\\n * [充值与限速](https://platform.moonshot.cn/docs/pricing/limits)\\n * [常见问题](https://platform.moonshot.cn/docs/pricing/faq)\\n\\n* 入门指南\\n\\n * [Kimi K2 快速开始](https://platform.moonshot.cn/docs/guide/kimi-k2-quickstart)\\n * [使用 kimi-k2-thinking 思考模型](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model)\\n * [开始使用 Kimi API](https://platform.moonshot.cn/docs/guide/start-using-kimi-api)\\n * [从 OpenAI 迁移到 Kimi API](https://platform.moonshot.cn/docs/guide/migrating-from-openai-to-kimi)\\n * [使用 API 调试工具](https://platform.moonshot.cn/docs/guide/use-moonpalace)\\n * [使用 Kimi API 进行多轮对话](https://platform.moonshot.cn/docs/guide/engage-in-multi-turn-conversations-using-kimi-api)\\n * [使用 Vision 视觉模型](https://platform.moonshot.cn/docs/guide/use-kimi-vision-model)\\n * [自动选择 Kimi 模型](https://platform.moonshot.cn/docs/guide/choose-an-appropriate-kimi-model)\\n * [自动断线重连](https://platform.moonshot.cn/docs/guide/auto-reconnect)\\n * [使用 Stream 流式输出](https://platform.moonshot.cn/docs/guide/utilize-the-streaming-output-feature-of-kimi-api)\\n * [使用 Tool Calls](https://platform.moonshot.cn/docs/guide/use-kimi-api-to-complete-tool-calls)\\n * [使用联网搜索 Tool](https://platform.moonshot.cn/docs/guide/use-web-search)\\n * [使用 JSON Mode](https://platform.moonshot.cn/docs/guide/use-json-mode-feature-of-kimi-api)\\n * [使用 Partial Mode](https://platform.moonshot.cn/docs/guide/use-partial-mode-feature-of-kimi-api)\\n * [使用 Kimi API 进行文件问答](https://platform.moonshot.cn/docs/guide/use-kimi-api-for-file-based-qa)\\n * [使用开发工作台调试模型](https://platform.moonshot.cn/docs/guide/use-playground-to-debug-the-model)\\n * [在 ClaudeCode/Cline/RooCode 中使用 kimi-k2 模型](https://platform.moonshot.cn/docs/guide/agent-support)\\n * [在开发工作台中配置 ModelScope MCP 服务器](https://platform.moonshot.cn/docs/guide/configure-the-modelscope-mcp-server)\\n * [在 Kimi API 中使用官方工具](https://platform.moonshot.cn/docs/guide/use-official-tools)\\n * [使用 Kimi CLI 调用 Kimi 大模型](https://platform.moonshot.cn/docs/guide/kimi-cli-support)\\n * [Prompt 最佳实践](https://platform.moonshot.cn/docs/guide/prompt-best-practice)\\n * 
[组织管理最佳实践](https://platform.moonshot.cn/docs/guide/org-best-practice)\\n * [常见问题及解决方案](https://platform.moonshot.cn/docs/guide/faq)\\n\\n* 条款与协议\\n\\n * [平台服务协议](https://platform.moonshot.cn/docs/agreement/modeluse)\\n * [用户服务协议](https://platform.moonshot.cn/docs/agreement/userservice)\\n * [用户隐私协议](https://platform.moonshot.cn/docs/agreement/userprivacy)\\n * [充值协议](https://platform.moonshot.cn/docs/agreement/payment)\\n\\n* [Moonshot ↗ (opens in a new tab)](https://www.moonshot.cn/)\\n* [Changelog ↗ (opens in a new tab)](https://platform.moonshot.cn/blog/posts/changelog)\\n* [联系客服](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model) \\n* [开发者交流群](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model) \\n* [官方公众号](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model) \\n* [Global | platform.moonshot.ai↗ (opens in a new tab)](https://platform.moonshot.ai/)\\n\\n目录\\n\\n* [基本示例](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%9F%BA%E6%9C%AC%E7%A4%BA%E4%BE%8B)\\n* [输出思考内容](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E8%BE%93%E5%87%BA%E6%80%9D%E8%80%83%E5%86%85%E5%AE%B9)\\n* [多步工具调用](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%A4%9A%E6%AD%A5%E5%B7%A5%E5%85%B7%E8%B0%83%E7%94%A8)\\n* [使用须知](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E4%BD%BF%E7%94%A8%E9%A1%BB%E7%9F%A5)\\n* [完整示例](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%AE%8C%E6%95%B4%E7%A4%BA%E4%BE%8B)\\n* [常见问题](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)\\n* [Q1: 为什么需要保留 reasoning_content](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#q1-%E4%B8%BA%E4%BB%80%E4%B9%88%E9%9C%80%E8%A6%81%E4%BF%9D%E7%95%99-reasoning_content)\\n* [Q2: reasoning_content 会消耗额外的 token 吗?](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#q2-reasoning_content-%E4%BC%9A%E6%B6%88%E8%80%97%E9%A2%9D%E5%A4%96%E7%9A%84-token-%E5%90%97)\\n\\n回到顶部\\n\\n文档\\n\\n入门指南\\n\\n使用 kimi-k2-thinking 思考模型\\n\\n使用kimi-k2-thinking思考模型\\n======================\\n\\n> `kimi-k2-thinking` 模型是月之暗面提供的具有通用 Agentic 能力和推理能力的思考模型,它擅长深度推理,并可通过多步工具调用,帮助解决各类难题。\\n\\n基本示例[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%9F%BA%E6%9C%AC%E7%A4%BA%E4%BE%8B)\\n---------------------------------------------------------------------------------------------------------------\\n\\n你可以简单地通过更换 `model` 来使用它:\\n\\ncurl python\\n\\n```\\n$ curl https://api.moonshot.cn/v1/chat/completions \\\\\\n -H \\\"Content-Type: application/json\\\" \\\\\\n -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\\\n -d '{\\n \\\"model\\\": \\\"kimi-k2-thinking\\\",\\n \\\"messages\\\": [\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"你好\\\"}\\n ],\\n \\\"temperature\\\": 1.0\\n \\n }'\\n```\\n\\n输出思考内容[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E8%BE%93%E5%87%BA%E6%80%9D%E8%80%83%E5%86%85%E5%AE%B9)\\n-----------------------------------------------------------------------------------------------------------------------------------\\n\\n注意到在使用 `kimi-k2-thinking` 模型时,我们的 API 响应中使用了 `reasoning_content` 字段作为模型思考内容的载体,对于 `reasoning_content` 字段:\\n\\n* openai SDK 中的 `ChoiceDelta` 和 `ChatCompletionMessage` 类型并不提供 `reasoning_content` 字段,因此无法直接通过 `.reasoning_content` 的方式访问该字段,仅支持通过 `hasattr(obj, \\\"reasoning_content\\\")` 来判断是否存在字段,如果存在,则使用 `getattr(obj, \\\"reasoning_content\\\")` 获取字段值\\n* 如果你使用其他框架或自行通过 HTTP 
接口对接,可以直接获取与 `content` 字段同级的 `reasoning_content` 字段\\n* 在流式输出(`stream=True`)的场合,`reasoning_content` 字段一定会先于 `content` 字段出现,你可以在业务代码中通过判断是否出现 `content` 字段来识别思考内容(或称推理过程)是否结束\\n* `reasoning_content` 中包含的 Tokens 也受 `max_tokens` 参数控制,`reasoning_content` 的 Tokens 数加上 `content` 的 Tokens 数应小于等于 `max_tokens`\\n\\n多步工具调用[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%A4%9A%E6%AD%A5%E5%B7%A5%E5%85%B7%E8%B0%83%E7%94%A8)\\n-----------------------------------------------------------------------------------------------------------------------------------\\n\\n使用 `kimi-k2-thinking` 支持通过深度地推理进行多次工具调用,进而完成非常复杂的任务。\\n\\n### 使用须知[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E4%BD%BF%E7%94%A8%E9%A1%BB%E7%9F%A5)\\n\\n为确保最佳效果请务必按以下方式调用 kimi-k2-thinking 模型:\\n\\n* 输入应当包括上下文中所有的思考内容(reasoning_content字段),模型会根据实际情况选择把必要的思考内容送到模型进行推理。\\n* 设置 `max_tokens>=16000` 以避免无法输出完整的 `reasoning_content` 和 `content`。\\n* **设置 `temperature=1.0`,以获得最佳性能。**\\n* 使用流式输出(`stream=True``kimi-k2-thinking` 模型的输出内容包含了 `reasoning_content`,相比普通模型其输出内容更多,启用流式输出能获得更好的用户体验,同时一定程度避免网络超时问题。\\n\\n### 完整示例[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%AE%8C%E6%95%B4%E7%A4%BA%E4%BE%8B)\\n\\n下面的示例展示了一个\\\"今日新闻报告生成\\\"的场景,模型会依次调用 `date`(获取日期)和 `web_search`(搜索今日新闻)等官方工具,并在这个过程中展现深度思考过程:\\n\\n```\\nimport os\\nimport json\\nimport httpx\\nimport openai\\n \\n \\nclass FormulaChatClient:\\n def __init__(self, base_url: str, api_key: str):\\n \\\"\\\"\\\"初始化 Formula 客户端\\\"\\\"\\\"\\n self.base_url = base_url\\n self.api_key = api_key\\n self.openai = openai.Client(\\n base_url=base_url,\\n api_key=api_key,\\n )\\n self.httpx = httpx.Client(\\n base_url=base_url,\\n headers={\\\"Authorization\\\": f\\\"Bearer {api_key}\\\"},\\n timeout=30.0,\\n )\\n self.model = \\\"kimi-k2-thinking\\\"\\n \\n def get_tools(self, formula_uri: str):\\n \\\"\\\"\\\"从 Formula API 获取工具定义\\\"\\\"\\\"\\n response = self.httpx.get(f\\\"/formulas/{formula_uri}/tools\\\")\\n response.raise_for_status() # 检查 HTTP 状态码\\n \\n try:\\n return response.json().get(\\\"tools\\\", [])\\n except json.JSONDecodeError as e:\\n print(f\\\"错误: 无法解析响应为 JSON (状态码: {response.status_code})\\\")\\n print(f\\\"响应内容: {response.text[:500]}\\\")\\n raise\\n \\n def call_tool(self, formula_uri: str, function: str, args: dict):\\n \\\"\\\"\\\"调用官方工具\\\"\\\"\\\"\\n response = self.httpx.post(\\n f\\\"/formulas/{formula_uri}/fibers\\\",\\n json={\\\"name\\\": function, \\\"arguments\\\": json.dumps(args)},\\n )\\n response.raise_for_status() # 检查 HTTP 状态码\\n fiber = response.json()\\n \\n if fiber.get(\\\"status\\\", \\\"\\\") == \\\"succeeded\\\":\\n return fiber[\\\"context\\\"].get(\\\"output\\\") or fiber[\\\"context\\\"].get(\\\"encrypted_output\\\")\\n \\n if \\\"error\\\" in fiber:\\n return f\\\"Error: {fiber['error']}\\\"\\n if \\\"error\\\" in fiber.get(\\\"context\\\", {}):\\n return f\\\"Error: {fiber['context']['error']}\\\"\\n return \\\"Error: Unknown error\\\"\\n \\n def close(self):\\n \\\"\\\"\\\"关闭客户端连接\\\"\\\"\\\"\\n self.httpx.close()\\n \\n \\n# 初始化客户端\\nbase_url = os.getenv(\\\"MOONSHOT_BASE_URL\\\", \\\"https://api.moonshot.cn/v1\\\")\\napi_key = os.getenv(\\\"MOONSHOT_API_KEY\\\")\\n \\nif not api_key:\\n raise ValueError(\\\"MOONSHOT_API_KEY 环境变量未设置,请先设置 API 密钥\\\")\\n \\nprint(f\\\"Base URL: {base_url}\\\")\\nprint(f\\\"API Key: {api_key[:10]}...{api_key[-10:] if len(api_key) > 20 else api_key}\\\\n\\\")\\n \\nclient = FormulaChatClient(base_url, api_key)\\n \\n# 定义要使用的官方工具 Formula URI\\nformula_uris = 
[\\n \\\"moonshot/date:latest\\\",\\n \\\"moonshot/web-search:latest\\\"\\n]\\n \\n# 加载所有工具定义并建立映射\\nprint(\\\"正在加载官方工具...\\\")\\nall_tools = []\\ntool_to_uri = {} # function.name -> formula_uri 的映射\\n \\nfor uri in formula_uris:\\n try:\\n tools = client.get_tools(uri)\\n for tool in tools:\\n func = tool.get(\\\"function\\\")\\n if func:\\n func_name = func.get(\\\"name\\\")\\n if func_name:\\n tool_to_uri[func_name] = uri\\n all_tools.append(tool)\\n print(f\\\" 已加载工具: {func_name} from {uri}\\\")\\n except Exception as e:\\n print(f\\\" 警告: 加载工具 {uri} 失败: {e}\\\")\\n continue\\n \\nprint(f\\\"总共加载 {len(all_tools)} 个工具\\\\n\\\")\\n \\nif not all_tools:\\n raise ValueError(\\\"未能加载任何工具,请检查 API 密钥和网络连接\\\")\\n \\n# 初始化消息列表\\nmessages = [\\n {\\n \\\"role\\\": \\\"system\\\",\\n \\\"content\\\": \\\"你是 Kimi一个专业的新闻分析师。你擅长收集、分析和整理信息生成高质量的新闻报告。\\\",\\n },\\n]\\n \\n# 用户请求生成今日新闻报告\\nuser_request = \\\"请帮我生成一份今日新闻报告,包含重要的科技、经济和社会新闻。\\\"\\nmessages.append({\\n \\\"role\\\": \\\"user\\\",\\n \\\"content\\\": user_request\\n})\\n \\nprint(f\\\"用户请求: {user_request}\\\\n\\\")\\n \\n# 开始多轮对话循环\\nmax_iterations = 10 # 防止无限循环\\nfor iteration in range(max_iterations):\\n # 调用模型\\n try:\\n completion = client.openai.chat.completions.create(\\n model=client.model,\\n messages=messages,\\n max_tokens=1024 * 32,\\n tools=all_tools,\\n temperature=1.0,\\n )\\n except openai.AuthenticationError as e:\\n print(f\\\"认证错误: {e}\\\")\\n print(\\\"请检查 API key 是否正确,以及 API key 是否有权限访问该端点\\\")\\n raise\\n except Exception as e:\\n print(f\\\"调用模型时发生错误: {e}\\\")\\n raise\\n \\n # 获取响应\\n message = completion.choices[0].message\\n \\n # 打印思考过程\\n if hasattr(message, \\\"reasoning_content\\\"):\\n print(f\\\"=============第 {iteration + 1} 轮思考开始=============\\\")\\n reasoning = getattr(message, \\\"reasoning_content\\\")\\n if reasoning:\\n print(reasoning[:500] + \\\"...\\\" if len(reasoning) > 500 else reasoning)\\n print(f\\\"=============第 {iteration + 1} 轮思考结束=============\\\\n\\\")\\n \\n # 添加 assistant 消息到上下文(保留 reasoning_content\\n messages.append(message)\\n \\n # 如果模型没有调用工具,说明对话结束\\n if not message.tool_calls:\\n print(\\\"=============最终回答=============\\\")\\n print(message.content)\\n break\\n \\n # 处理工具调用\\n print(f\\\"模型决定调用 {len(message.tool_calls)} 个工具:\\\\n\\\")\\n \\n for tool_call in message.tool_calls:\\n func_name = tool_call.function.name\\n args = json.loads(tool_call.function.arguments)\\n \\n print(f\\\"调用工具: {func_name}\\\")\\n print(f\\\"参数: {json.dumps(args, ensure_ascii=False, indent=2)}\\\")\\n \\n # 获取对应的 formula_uri\\n formula_uri = tool_to_uri.get(func_name)\\n if not formula_uri:\\n print(f\\\"错误: 找不到工具 {func_name} 对应的 Formula URI\\\")\\n continue\\n \\n # 调用工具\\n result = client.call_tool(formula_uri, func_name, args)\\n \\n # 打印结果(截断过长内容)\\n if len(str(result)) > 200:\\n print(f\\\"工具结果: {str(result)[:200]}...\\\\n\\\")\\n else:\\n print(f\\\"工具结果: {result}\\\\n\\\")\\n \\n # 添加工具结果到消息列表\\n tool_message = {\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": func_name,\\n \\\"content\\\": result\\n }\\n messages.append(tool_message)\\n \\nprint(\\\"\\\\n对话完成\\\")\\n \\n# 清理资源\\nclient.close()\\n```\\n\\n整个过程展现了 `kimi-k2-thinking` 模型如何通过深度思考来规划和执行复杂的多步骤任务,每个步骤都有完整的推理过程(`reasoning_content`),并且思考内容会保留在上下文中以确保工具调用的准确性。\\n\\n常见问题[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)\\n---------------------------------------------------------------------------------------------------------------\\n\\n### 
Q1: 为什么需要保留 reasoning_content[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#q1-%E4%B8%BA%E4%BB%80%E4%B9%88%E9%9C%80%E8%A6%81%E4%BF%9D%E7%95%99-reasoning_content)\\n\\nA: 保留 reasoning_content 可以确保模型在多轮对话中保持推理的连贯性,特别是在工具调用过程中。服务器会自动处理这些字段,用户无需手动管理。\\n\\n### Q2: reasoning_content 会消耗额外的 token 吗?[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#q2-reasoning_content-%E4%BC%9A%E6%B6%88%E8%80%97%E9%A2%9D%E5%A4%96%E7%9A%84-token-%E5%90%97)\\n\\nA: 是的reasoning_content 会计入输入/输出 token 消耗。具体计费方式请参考 MoonshotAI 的定价文档。\\n\\nLast updated on 2025年11月6日\\n\\n[Kimi K2 快速开始](https://platform.moonshot.cn/docs/guide/kimi-k2-quickstart \\\"Kimi K2 快速开始\\\")[开始使用 Kimi API](https://platform.moonshot.cn/docs/guide/start-using-kimi-api \\\"开始使用 Kimi API\\\")\\n==================================================\"}",
"timestamp": "2025-11-09T11:35:23.249841",
"tool_call_id": "extract_webpage:1",
"name": "extract_webpage"
},
{
"role": "assistant",
"content": "让我再搜索一下是否有通过参数控制思考模式的相关信息。",
"timestamp": "2025-11-09T11:35:25.860543"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:35:25.866058",
"tool_calls": [
{
"id": "web_search:2",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"kimi k2 thinking 参数 control thinking parameter api moonshot\", \"max_results\": 8}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: kimi k2 thinking 参数 control thinking parameter api moonshot\\n📅 搜索时间: 2025-11-09T11:35:29.518123\\n🎯 过滤条件: Topic: general\\n\\n📝 AI摘要:\\nKimi K2 Thinking is Moonshot AI's advanced model with 32 billion activated parameters, supporting complex reasoning and tool use. It uses a Mixture-of-Experts architecture and has a 256k context length. The model is optimized for long-horizon reasoning and dynamic tool invocation.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. Kimi K2 Thinking - API, Providers, Stats - OpenRouter\\n 🔗 https://openrouter.ai/moonshotai/kimi-k2-thinking\\n 📄 Kimi K2 Thinking is Moonshot AIs most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) arc...\\n\\n2. Moonshot AI Open Platform - Kimi Large Language Model API Service\\n 🔗 https://platform.moonshot.ai/\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nWelcome\\n\\n# Welcome to Kimi API Docs\\n\\n# Kimi K2 Model\\n\\nA MoE architecture base model with powe...\\n\\n3. moonshotai/Kimi-K2-Thinking\\n 🔗 https://huggingface.co/moonshotai/Kimi-K2-Thinking\\n 📄 Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. I...\\n\\n4. Kimi K2 is the large language model series developed by Moonshot ...\\n 🔗 https://github.com/MoonshotAI/Kimi-K2\\n 📄 Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptiona...\\n\\n5. moonshotai/Kimi-K2-Instruct - Hugging Face\\n 🔗 https://huggingface.co/moonshotai/Kimi-K2-Instruct\\n 📄 Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptiona...\\n\\n6. Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友\\n 🔗 https://zhuanlan.zhihu.com/p/1970062429741385218\\n 📄 作为今年最受关注的开源模型系列之一Kimi K2的Thinking版本一上线就成为热议对象再次缩小了开源模型与闭源模型的差距。\\n\\nImage 6\\n\\n更多技术细节速览在此\\n\\n 1TB参数激活参数32B采用INT4而非FP8。\\n 256K上下文窗口。\\n 更多专家更少head更多思考。\\n\\nImage 7\\n\\n△与DeepSeek R1的架构对比图源x@rasbt\\n\\n在人类最后...\\n\\n7. Kimi K2: Open Agentic Intelligence\\n 🔗 https://moonshotai.github.io/Kimi-K2/\\n 📄 The Kimi Platform offers an OpenAI/Anthropic compatible interface, allowing for easy adaptation of your existing applications to Kimi K2. We encourage developers to explore our tool calling API for bu...\\n\\n8. Moonshot AI\\n 🔗 https://www.moonshot.ai/\\n 📄 Image 5Kimi Kimi is driven by Moonshot's proprietary LLM. It supports online search, deep thinking, multimodal reasoning, and long-form text conversations. Chat with KimiKimi Open Platform Kimi Open P...\", \"filters\": {\"topic\": \"general\"}, \"query\": \"kimi k2 thinking 参数 control thinking parameter api moonshot\", \"results\": [{\"index\": 1, \"title\": \"Kimi K2 Thinking - API, Providers, Stats - OpenRouter\", \"url\": \"https://openrouter.ai/moonshotai/kimi-k2-thinking\", \"content\": \"Kimi K2 Thinking is Moonshot AIs most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. 
Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports 256 k-token context windows. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves [...] # MoonshotAI: Kimi K2 Thinking\\n\\n### moonshotai/kimi-k2-thinking\\n\\nCreated Nov 6, 2025262,144 context\\n\\n$0.60/M input tokens$2.50/M output tokens [...] For information about using third-party SDKs and frameworks with OpenRouter, please see ourframeworks documentation.\\n\\nSee the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.\", \"score\": 0.85538095, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"Moonshot AI Open Platform - Kimi Large Language Model API Service\", \"url\": \"https://platform.moonshot.ai/\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nWelcome\\n\\n# Welcome to Kimi API Docs\\n\\n# Kimi K2 Model\\n\\nA MoE architecture base model with powerful code and Agent capabilities, featuring 1T total parameters with 32B activated parameters and up to 256k context length. In benchmark tests for general knowledge reasoning, programming, mathematics, and Agent capabilities, K2 model outperforms other mainstream open-source models\\n\\n## Quick Start [...] Getting Started with Kimi K2\\n\\nKimi K2 quick start guide and basic conceptsConsole\\n\\nManage API keys, usage and billingUsing K2 Model in Software Agents\\n\\nExperience K2 model's code writing and tool calling capabilitiesPricing & Billing\\n\\nLearn about model pricing and billing methodsRate Limits\\n\\nLearn about API rate limits and quotasFAQ\\n\\nFind answers to common questions\\n\\n## Models and Pricing\\n\\nView all\\n\\nkimi-k2-turbo-preview\\n\\npreviewkimi-k2-thinking\\n\\nlong thinkkimi-latest\\n\\nImage understanding [...] ## More Resources\\n\\nK2 Model Details\\n\\nLearn about K2 model features and advantagesDevelopment Workbench\\n\\nInteractive interface to experience Kimi model capabilitiesWeb Search Tool\\n\\nUse Kimi's official web search toolBest Practices\\n\\nPrompt writing and application examples\\n\\nUser Manual\", \"score\": 0.75442725, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"moonshotai/Kimi-K2-Thinking\", \"url\": \"https://huggingface.co/moonshotai/Kimi-K2-Thinking\", \"content\": \"Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. It sets a new state-of-the-art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by dramatically scaling multi-step reasoning depth and maintaining stable tool-use across 200300 sequential calls. At the same time, K2 Thinking is a native INT4 quantization model with 256k context [...] 
| | |\\n --- |\\n| Architecture | Mixture-of-Experts (MoE) |\\n| Total Parameters | 1T |\\n| Activated Parameters | 32B |\\n| Number of Layers (Dense layer included) | 61 |\\n| Number of Dense Layers | 1 |\\n| Attention Hidden Dimension | 7168 |\\n| MoE Hidden Dimension (per Expert) | 2048 |\\n| Number of Attention Heads | 64 |\\n| Number of Experts | 384 |\\n| Selected Experts per Token | 8 |\\n| Number of Shared Experts | 1 |\\n| Vocabulary Size | 160K |\\n| Context Length | 256K |\\n| Attention Mechanism | MLA | [...] messages.append({\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": tool_call_name,\\n \\\"content\\\": json.dumps(tool_result)\\n })\\n print(\\\"-\\\" 100)\\n print(choice.message.content)\", \"score\": 0.69870013, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"Kimi K2 is the large language model series developed by Moonshot ...\", \"url\": \"https://github.com/MoonshotAI/Kimi-K2\", \"content\": \"Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.\\n\\n### Key Features [...] |\\n| |\\n| Architecture | Mixture-of-Experts (MoE) |\\n| Total Parameters | 1T |\\n| Activated Parameters | 32B |\\n| Number of Layers (Dense layer included) | 61 |\\n| Number of Dense Layers | 1 |\\n| Attention Hidden Dimension | 7168 |\\n| MoE Hidden Dimension (per Expert) | 2048 |\\n| Number of Attention Heads | 64 |\\n| Number of Experts | 384 |\\n| Selected Experts per Token | 8 |\\n| Number of Shared Experts | 1 |\\n| Vocabulary Size | 160K |\\n| Context Length | 128K |\\n| Attention Mechanism | MLA | [...] Kimi-K2-Base: The foundation model, a strong start for researchers and builders who want full control for fine-tuning and custom solutions.\\n Kimi-K2-Instruct: The post-trained model best for drop-in, general-purpose chat and agentic experiences. It is a reflex-grade model without long thinking.\\n\\n## 2. Model Summary\", \"score\": 0.6929958, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"moonshotai/Kimi-K2-Instruct - Hugging Face\", \"url\": \"https://huggingface.co/moonshotai/Kimi-K2-Instruct\", \"content\": \"Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.\\n\\n### Key Features [...] | | |\\n --- |\\n| Architecture | Mixture-of-Experts (MoE) |\\n| Total Parameters | 1T |\\n| Activated Parameters | 32B |\\n| Number of Layers (Dense layer included) | 61 |\\n| Number of Dense Layers | 1 |\\n| Attention Hidden Dimension | 7168 |\\n| MoE Hidden Dimension (per Expert) | 2048 |\\n| Number of Attention Heads | 64 |\\n| Number of Experts | 384 |\\n| Selected Experts per Token | 8 |\\n| Number of Shared Experts | 1 |\\n| Vocabulary Size | 160K |\\n| Context Length | 128K |\\n| Attention Mechanism | MLA | [...] 
> You can access Kimi K2's API on , we provide OpenAI/Anthropic-compatible API for you.\\n>\\n> The Anthropic-compatible API maps temperature by `real_temperature = request_temperature 0.6` for better compatible with existing applications.\\n\\nOur model checkpoints are stored in the block-fp8 format, you can find it on Huggingface.\\n\\nCurrently, Kimi-K2 is recommended to run on the following inference engines:\\n\\n vLLM\\n SGLang\\n KTransformers\\n TensorRT-LLM\", \"score\": 0.6773251, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友\", \"url\": \"https://zhuanlan.zhihu.com/p/1970062429741385218\", \"content\": \"作为今年最受关注的开源模型系列之一Kimi K2的Thinking版本一上线就成为热议对象再次缩小了开源模型与闭源模型的差距。\\n\\nImage 6\\n\\n更多技术细节速览在此\\n\\n 1TB参数激活参数32B采用INT4而非FP8。\\n 256K上下文窗口。\\n 更多专家更少head更多思考。\\n\\nImage 7\\n\\n△与DeepSeek R1的架构对比图源x@rasbt\\n\\n在人类最后的考试HLE、测试自主网络浏览能力的BrowseComp和复杂信息收集推理基准测试SEAL-0等评测基准中Kimi K2 Thinking都刷新了SOTA超越GPT-5、Claude Sonnet 4.5Thinking等闭源模型。\\n\\nImage 8\\n\\nKimi K2 Thinking的代码和权重均遵循最为宽松的MIT协议。新模型也已第一时间上线\\n\\n技术细节\\n\\n官方提到K2 Thinking是月之暗面在Test-Time Scaling测试时扩展领域的最新进展通过同时扩展思考Token和工具调用轮次模型实现了更强的Agent和推理性能。\\n\\n智能体、推理能力全面提升\", \"score\": 0.6439794, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"Kimi K2: Open Agentic Intelligence\", \"url\": \"https://moonshotai.github.io/Kimi-K2/\", \"content\": \"The Kimi Platform offers an OpenAI/Anthropic compatible interface, allowing for easy adaptation of your existing applications to Kimi K2. We encourage developers to explore our tool calling API for building agent applications. For detailed information, visit platform.moonshot.ai.\\n\\n#### Serve Kimi K2 on your own [...] The Kimi Platform offers an OpenAI/Anthropic compatible interface, allowing for easy adaptation of your existing applications to Kimi K2. We encourage developers to explore our tool calling API for building agent applications. For detailed information, visit platform.moonshot.ai.\\n#### Serve Kimi K2 on your own [...] Kimi K2 is our latest Mixture-of-Experts model with 32 billion activated parameters and 1 trillion total parameters. It achieves state-of-the-art performance in frontier knowledge, math, and coding among non-thinking models. But it goes further — meticulously optimized for agentic tasks, Kimi K2 does not just answer; it acts.\\n\\nAnd now, it is within your reach. Today, we are open-sourcing:\", \"score\": 0.63234437, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"Moonshot AI\", \"url\": \"https://www.moonshot.ai/\", \"content\": \"Image 5Kimi Kimi is driven by Moonshot's proprietary LLM. It supports online search, deep thinking, multimodal reasoning, and long-form text conversations. Chat with KimiKimi Open Platform Kimi Open Platform supports flexible API calls, so you can easily integrate advanced features and give your applications a competitive edge. Experience Now\\n\\nLatest Research\", \"score\": 0.5973971, \"published_date\": \"\"}], \"total_results\": 8}",
"timestamp": "2025-11-09T11:35:29.523127",
"tool_call_id": "web_search:2",
"name": "web_search"
},
{
"role": "assistant",
"content": "让我查看GitHub上的官方文档了解更详细的API信息。",
"timestamp": "2025-11-09T11:35:33.478812"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:35:33.486337",
"tool_calls": [
{
"id": "extract_webpage:3",
"index": 0,
"type": "function",
"function": {
"name": "extract_webpage",
"arguments": "{\"url\": \"https://github.com/MoonshotAI/Kimi-K2\"}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"url\": \"https://github.com/MoonshotAI/Kimi-K2\", \"content\": \"🌐 网页内容 (17374 字符):\\n📍 URL: https://github.com/MoonshotAI/Kimi-K2\\n==================================================\\n[Skip to content](#start-of-content) \\n\\n\\n\\n## Navigation Menu\\n\\n[Sign in](/login?return_to=https%3A%2F%2Fgithub.com%2FMoonshotAI%2FKimi-K2) \\n\\nAppearance settings\\n\\n# Search code, repositories, users, issues, pull requests...\\n\\n[Search syntax tips](https://docs.github.com/search-github/github-code-search/understanding-github-code-search-syntax)\\n\\n[Sign in](/login?return_to=https%3A%2F%2Fgithub.com%2FMoonshotAI%2FKimi-K2)\\n\\n [Sign up](/signup?ref_cta=Sign+up&ref_loc=header+logged+out&ref_page=%2F%3Cuser-name%3E%2F%3Crepo-name%3E&source=header-repo&source_repo=MoonshotAI%2FKimi-K2) \\n\\nAppearance settings\\n\\nYou signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert\\n\\n{{ message }}\\n\\n[MoonshotAI](/MoonshotAI) / **[Kimi-K2](/MoonshotAI/Kimi-K2)** Public\\n\\n* [Notifications](/login?return_to=%2FMoonshotAI%2FKimi-K2) You must be signed in to change notification settings\\n* [Fork 553](/login?return_to=%2FMoonshotAI%2FKimi-K2)\\n* [Star 8.4k](/login?return_to=%2FMoonshotAI%2FKimi-K2)\\n\\nKimi K2 is the large language model series developed by Moonshot AI team\\n\\n### License\\n\\n[View license](/MoonshotAI/Kimi-K2/blob/main/LICENSE)\\n\\n[8.4k stars](/MoonshotAI/Kimi-K2/stargazers) [553 forks](/MoonshotAI/Kimi-K2/forks) [Branches](/MoonshotAI/Kimi-K2/branches) [Tags](/MoonshotAI/Kimi-K2/tags) [Activity](/MoonshotAI/Kimi-K2/activity)\\n\\n[Star](/login?return_to=%2FMoonshotAI%2FKimi-K2)\\n\\n[Notifications](/login?return_to=%2FMoonshotAI%2FKimi-K2) You must be signed in to change notification settings\\n\\n# MoonshotAI/Kimi-K2\\n\\n[Branches](/MoonshotAI/Kimi-K2/branches)[Tags](/MoonshotAI/Kimi-K2/tags)\\n\\nOpen more actions menu\\n\\n## Folders and files\\n\\n| Name | Name | Last commit message | Last commit date |\\n| --- | --- | --- | --- |\\n| Latest commit History[23 Commits](/MoonshotAI/Kimi-K2/commits/main/) |\\n| [docs](/MoonshotAI/Kimi-K2/tree/main/docs \\\"docs\\\") | [docs](/MoonshotAI/Kimi-K2/tree/main/docs \\\"docs\\\") | | |\\n| [figures](/MoonshotAI/Kimi-K2/tree/main/figures \\\"figures\\\") | [figures](/MoonshotAI/Kimi-K2/tree/main/figures \\\"figures\\\") | | |\\n| [LICENSE](/MoonshotAI/Kimi-K2/blob/main/LICENSE \\\"LICENSE\\\") | [LICENSE](/MoonshotAI/Kimi-K2/blob/main/LICENSE \\\"LICENSE\\\") | | |\\n| [README.md](/MoonshotAI/Kimi-K2/blob/main/README.md \\\"README.md\\\") | [README.md](/MoonshotAI/Kimi-K2/blob/main/README.md \\\"README.md\\\") | | |\\n| [tech\\\\_report.pdf](/MoonshotAI/Kimi-K2/blob/main/tech_report.pdf \\\"tech_report.pdf\\\") | [tech\\\\_report.pdf](/MoonshotAI/Kimi-K2/blob/main/tech_report.pdf \\\"tech_report.pdf\\\") | | |\\n| |\\n\\n## Repository files navigation\\n\\n---\\n\\n**📰  [Tech Blog](https://moonshotai.github.io/Kimi-K2/)**     |     **📄  [Full Report](https://www.arxiv.org/abs/2507.20534)**\\n\\n## 1. Model Introduction\\n\\nKimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. 
Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.\\n\\n### Key Features\\n\\n* Large-Scale Training: Pre-trained a 1T parameter MoE model on 15.5T tokens with zero training instability.\\n* MuonClip Optimizer: We apply the Muon optimizer to an unprecedented scale, and develop novel optimization techniques to resolve instabilities while scaling up.\\n* Agentic Intelligence: Specifically designed for tool use, reasoning, and autonomous problem-solving.\\n\\n### Model Variants\\n\\n* **Kimi-K2-Base**: The foundation model, a strong start for researchers and builders who want full control for fine-tuning and custom solutions.\\n* **Kimi-K2-Instruct**: The post-trained model best for drop-in, general-purpose chat and agentic experiences. It is a reflex-grade model without long thinking.\\n\\n## 2. Model Summary\\n\\n|\\n| |\\n| **Architecture** | Mixture-of-Experts (MoE) |\\n| **Total Parameters** | 1T |\\n| **Activated Parameters** | 32B |\\n| **Number of Layers** (Dense layer included) | 61 |\\n| **Number of Dense Layers** | 1 |\\n| **Attention Hidden Dimension** | 7168 |\\n| **MoE Hidden Dimension** (per Expert) | 2048 |\\n| **Number of Attention Heads** | 64 |\\n| **Number of Experts** | 384 |\\n| **Selected Experts per Token** | 8 |\\n| **Number of Shared Experts** | 1 |\\n| **Vocabulary Size** | 160K |\\n| **Context Length** | 128K |\\n| **Attention Mechanism** | MLA |\\n| **Activation Function** | SwiGLU |\\n\\n## 3. Evaluation Results\\n\\n#### Instruction model evaluation results\\n\\n| Benchmark | Metric | Kimi K2 Instruct | DeepSeek-V3-0324 | Qwen3-235B-A22B (non-thinking) | Claude Sonnet 4 (w/o extended thinking) | Claude Opus 4 (w/o extended thinking) | GPT-4.1 | Gemini 2.5 Flash Preview (05-20) |\\n| --- | --- | --- | --- | --- | --- | --- | --- | --- |\\n| **Coding Tasks** |\\n| LiveCodeBench v6 (Aug 24 - May 25) | Pass@1 | **53.7** | 46.9 | 37.0 | 48.5 | 47.4 | 44.7 | 44.7 |\\n| OJBench | Pass@1 | **27.1** | 24.0 | 11.3 | 15.3 | 19.6 | 19.5 | 19.5 |\\n| MultiPL-E | Pass@1 | **85.7** | 83.1 | 78.2 | 88.6 | **89.6** | 86.7 | 85.6 |\\n| SWE-bench Verified (Agentless Coding) | Single Patch w/o Test (Acc) | **51.8** | 36.6 | 39.4 | 50.2 | **53.0** | 40.8 | 32.6 |\\n| SWE-bench Verified (Agentic Coding) | Single Attempt (Acc) | **65.8** | 38.8 | 34.4 | **72.7**\\\\* | 72.5\\\\* | 54.6 | — |\\n| Multiple Attempts (Acc) | **71.6** | — | — | **80.2** | 79.4\\\\* | — | — |\\n| SWE-bench Multilingual (Agentic Coding) | Single Attempt (Acc) | **47.3** | 25.8 | 20.9 | **51.0** | — | 31.5 | — |\\n| TerminalBench | Inhouse Framework (Acc) | **30.0** | — | — | 35.5 | **43.2** | 8.3 | — |\\n| Terminus (Acc) | **25.0** | 16.3 | 6.6 | — | — | **30.3** | 16.8 |\\n| Aider-Polyglot | Acc | 60.0 | 55.1 | **61.8** | 56.4 | **70.7** | 52.4 | 44.0 |\\n| **Tool Use Tasks** |\\n| Tau2 retail | Avg@4 | **70.6** | 69.1 | 57.0 | 75.0 | **81.8** | 74.8 | 64.3 |\\n| Tau2 airline | Avg@4 | **56.5** | 39.0 | 26.5 | 55.5 | **60.0** | 54.5 | 42.5 |\\n| Tau2 telecom | Avg@4 | **65.8** | 32.5 | 22.1 | 45.2 | 57.0 | 38.6 | 16.9 |\\n| AceBench | Acc | **76.5** | 72.7 | 70.5 | 76.2 | 75.6 | **80.1** | 74.5 |\\n| **Math & STEM Tasks** |\\n| AIME 2024 | Avg@64 | **69.6** | 59.4\\\\* | 40.1\\\\* | 43.4 | 48.2 | 46.5 | 61.3 |\\n| AIME 2025 | Avg@64 | **49.5** | 46.7 | 24.7\\\\* | 33.1\\\\* | 33.9\\\\* | 37.0 | 46.6 |\\n| MATH-500 | Acc | **97.4** | 94.0\\\\* | 91.2\\\\* | 94.0 | 94.4 | 
92.4 | 95.4 |\\n| HMMT 2025 | Avg@32 | **38.8** | 27.5 | 11.9 | 15.9 | 15.9 | 19.4 | 34.7 |\\n| CNMO 2024 | Avg@16 | 74.3 | **74.7** | 48.6 | 60.4 | 57.6 | 56.6 | **75.0** |\\n| PolyMath-en | Avg@4 | **65.1** | 59.5 | 51.9 | 52.8 | 49.8 | 54.0 | 49.9 |\\n| ZebraLogic | Acc | **89.0** | 84.0 | 37.7\\\\* | 73.7 | 59.3 | 58.5 | 57.9 |\\n| AutoLogi | Acc | **89.5** | 88.9 | 83.3 | **89.8** | 86.1 | 88.2 | 84.1 |\\n| GPQA-Diamond | Avg@8 | **75.1** | 68.4\\\\* | 62.9\\\\* | 70.0\\\\* | 74.9\\\\* | 66.3 | 68.2 |\\n| SuperGPQA | Acc | **57.2** | 53.7 | 50.2 | 55.7 | 56.5 | 50.8 | 49.6 |\\n| Humanity's Last Exam (Text Only) | - | 4.7 | 5.2 | **5.7** | 5.8 | **7.1** | 3.7 | 5.6 |\\n| **General Tasks** |\\n| MMLU | EM | **89.5** | 89.4 | 87.0 | 91.5 | **92.9** | 90.4 | 90.1 |\\n| MMLU-Redux | EM | **92.7** | 90.5 | 89.2 | 93.6 | **94.2** | 92.4 | 90.6 |\\n| MMLU-Pro | EM | 81.1 | **81.2**\\\\* | 77.3 | 83.7 | **86.6** | 81.8 | 79.4 |\\n| IFEval | Prompt Strict | **89.8** | 81.1 | 83.2\\\\* | 87.6 | 87.4 | 88.0 | 84.3 |\\n| Multi-Challenge | Acc | **54.1** | 31.4 | 34.0 | 46.8 | 49.0 | 36.4 | 39.5 |\\n| SimpleQA | Correct | **31.0** | 27.7 | 13.2 | 15.9 | 22.8 | **42.3** | 23.3 |\\n| Livebench | Pass@1 | **76.4** | 72.4 | 67.6 | 74.8 | 74.6 | 69.8 | 67.8 |\\n\\n • Bold denotes global SOTA, and underlined denotes open-source SOTA. \\n • Data points marked with \\\\* are directly from the model's tech report or blog. \\n • All metrics, except for SWE-bench Verified (Agentless), are evaluated with an 8k output token length. SWE-bench Verified (Agentless) is limited to a 16k output token length. \\n • Kimi K2 achieves 65.8% pass@1 on the SWE-bench Verified tests with bash/editor tools (single-attempt patches, no test-time compute). It also achieves a 47.3% pass@1 on the SWE-bench Multilingual tests under the same conditions. Additionally, we report results on SWE-bench Verified tests (71.6%) that leverage parallel test-time compute by sampling multiple sequences and selecting the single best via an internal scoring model. \\n • To ensure the stability of the evaluation, we employed avg@k on the AIME, HMMT, CNMO, PolyMath-en, GPQA-Diamond, EvalPlus, Tau2. \\n • Some data points have been omitted due to prohibitively expensive evaluation costs. \\n\\n---\\n\\n#### Base model evaluation results\\n\\n| Benchmark | Metric | Shot | Kimi K2 Base | Deepseek-V3-Base | Qwen2.5-72B | Llama 4 Maverick |\\n| --- | --- | --- | --- | --- | --- | --- |\\n| **General Tasks** |\\n| MMLU | EM | 5-shot | **87.8** | 87.1 | 86.1 | 84.9 |\\n| MMLU-pro | EM | 5-shot | **69.2** | 60.6 | 62.8 | 63.5 |\\n| MMLU-redux-2.0 | EM | 5-shot | **90.2** | 89.5 | 87.8 | 88.2 |\\n| SimpleQA | Correct | 5-shot | **35.3** | 26.5 | 10.3 | 23.7 |\\n| TriviaQA | EM | 5-shot | **85.1** | 84.1 | 76.0 | 79.3 |\\n| GPQA-Diamond | Avg@8 | 5-shot | 48.1 | **50.5** | 40.8 | 49.4 |\\n| SuperGPQA | EM | 5-shot | **44.7** | 39.2 | 34.2 | 38.8 |\\n| **Coding Tasks** |\\n| LiveCodeBench v6 | Pass@1 | 1-shot | **26.3** | 22.9 | 21.1 | 25.1 |\\n| EvalPlus | Pass@1 | - | **80.3** | 65.6 | 66.0 | 65.5 |\\n| **Mathematics Tasks** |\\n| MATH | EM | 4-shot | **70.2** | 60.1 | 61.0 | 63.0 |\\n| GSM8k | EM | 8-shot | **92.1** | 91.7 | 90.4 | 86.3 |\\n| **Chinese Tasks** |\\n| C-Eval | EM | 5-shot | **92.5** | 90.0 | 90.9 | 80.9 |\\n| CSimpleQA | Correct | 5-shot | **77.6** | 72.1 | 50.5 | 53.5 |\\n\\n • We only evaluate open-source pretrained models in this work. 
We report results for Qwen2.5-72B because the base checkpoint for Qwen3-235B-A22B was not open-sourced at the time of our study. \\n • All models are evaluated using the same evaluation protocol. \\n\\n## 4. Deployment\\n\\nNote\\n\\nYou can access Kimi K2's API on <https://platform.moonshot.ai> , we provide OpenAI/Anthropic-compatible API for you.\\n\\nThe Anthropic-compatible API maps temperature by `real_temperature = request_temperature * 0.6` for better compatiblity with existing applications.\\n\\nOur model checkpoints are stored in block-fp8 format, you can find it on [Huggingface](https://huggingface.co/moonshotai/Kimi-K2-Instruct).\\n\\nCurrently, it is recommended to run Kimi-K2 on the following inference engines:\\n\\n* vLLM\\n* SGLang\\n* KTransformers\\n* TensorRT-LLM\\n\\nDeployment examples for vLLM and SGLang can be found in the [Model Deployment Guide](/MoonshotAI/Kimi-K2/blob/main/docs/deploy_guidance.md).\\n\\n---\\n\\n## 5. Model Usage\\n\\n### Chat Completion\\n\\nOnce the local inference service is set up, you can interact with it through the chat endpoint:\\n\\n```\\ndef simple_chat client OpenAI model_name str messages = \\\"role\\\" \\\"system\\\" \\\"content\\\"\\\"You are Kimi, an AI assistant created by Moonshot AI.\\\" \\\"role\\\" \\\"user\\\" \\\"content\\\" \\\"type\\\" \\\"text\\\" \\\"text\\\"\\\"Please give a brief self-introduction.\\\" response = client chat completions create model = model_name messages = messages stream = False temperature =0.6 max_tokens = 256 print response choices 0 message content\\n```\\n\\nNote\\n\\nThe recommended temperature for Kimi-K2-Instruct is `temperature = 0.6`. If no special instructions are required, the system prompt above is a good default.\\n\\n---\\n\\n### Tool Calling\\n\\nKimi-K2-Instruct has strong tool-calling capabilities. To enable them, you need to pass the list of available tools in each request, then the model will autonomously decide when and how to invoke them.\\n\\nThe following example demonstrates calling a weather tool end-to-end:\\n\\n```\\n# Your tool implementation def get_weather city str -> dict return \\\"weather\\\" \\\"Sunny\\\" # Tool schema definition tools = \\\"type\\\" \\\"function\\\" \\\"function\\\" \\\"name\\\" \\\"get_weather\\\" \\\"description\\\"\\\"Retrieve current weather information. Call this when the user asks about the weather.\\\" \\\"parameters\\\" \\\"type\\\" \\\"object\\\" \\\"required\\\" \\\"city\\\" \\\"properties\\\" \\\"city\\\" \\\"type\\\" \\\"string\\\" \\\"description\\\" \\\"Name of the city\\\" # Map tool names to their implementations tool_map = \\\"get_weather\\\" get_weather def tool_call_with_client client OpenAI model_name str messages = \\\"role\\\" \\\"system\\\" \\\"content\\\"\\\"You are Kimi, an AI assistant created by Moonshot AI.\\\" \\\"role\\\" \\\"user\\\" \\\"content\\\"\\\"What's the weather like in Beijing today? 
Use the tool to check.\\\" finish_reason = None while finish_reason is None or finish_reason == \\\"tool_calls\\\" completion = client chat completions create model = model_name messages = messages temperature =0.6 tools = tools # tool list defined above tool_choice = \\\"auto\\\" choice = completion choices 0 finish_reason = choice finish_reason if finish_reason == \\\"tool_calls\\\" messages append choice message for tool_call in choice message tool_calls tool_call_name = tool_call function name tool_call_arguments = json loads tool_call function arguments tool_function = tool_map tool_call_name tool_result = tool_function** tool_call_arguments print\\\"tool_result:\\\" tool_result messages append \\\"role\\\" \\\"tool\\\" \\\"tool_call_id\\\" tool_call id \\\"name\\\" tool_call_name \\\"content\\\" json dumps tool_result print\\\"-\\\"* 100 print choice message content\\n```\\n\\nThe `tool_call_with_client` function implements the pipeline from user query to tool execution. This pipeline requires the inference engine to support Kimi-K2s native tool-parsing logic. For streaming output and manual tool-parsing, see the [Tool Calling Guide](/MoonshotAI/Kimi-K2/blob/main/docs/tool_call_guidance.md).\\n\\n---\\n\\n## 6. License\\n\\nBoth the code and the model weights are released under the [Modified MIT License](/MoonshotAI/Kimi-K2/blob/main/LICENSE).\\n\\n---\\n\\n## 7. Citation\\n\\n```\\n@misc{kimiteam2025kimik2openagentic, title={Kimi K2: Open Agentic Intelligence}, author={Kimi Team and Yifan Bai and Yiping Bao and Guanduo Chen and Jiahao Chen and Ningxin Chen and Ruijue Chen and Yanru Chen and Yuankun Chen and Yutian Chen and Zhuofu Chen and Jialei Cui and Hao Ding and Mengnan Dong and Angang Du and Chenzhuang Du and Dikang Du and Yulun Du and Yu Fan and Yichen Feng and Kelin Fu and Bofei Gao and Hongcheng Gao and Peizhong Gao and Tong Gao and Xinran Gu and Longyu Guan and Haiqing Guo and Jianhang Guo and Hao Hu and Xiaoru Hao and Tianhong He and Weiran He and Wenyang He and Chao Hong and Yangyang Hu and Zhenxing Hu and Weixiao Huang and Zhiqi Huang and Zihao Huang and Tao Jiang and Zhejun Jiang and Xinyi Jin and Yongsheng Kang and Guokun Lai and Cheng Li and Fang Li and Haoyang Li and Ming Li and Wentao Li and Yanhao Li and Yiwei Li and Zhaowei Li and Zheming Li and Hongzhan Lin and Xiaohan Lin and Zongyu Lin and Chengyin Liu and Chenyu Liu and Hongzhang Liu and Jingyuan Liu and Junqi Liu and Liang Liu and Shaowei Liu and T. Y. Liu and Tianwei Liu and Weizhou Liu and Yangyang Liu and Yibo Liu and Yiping Liu and Yue Liu and Zhengying Liu and Enzhe Lu and Lijun Lu and Shengling Ma and Xinyu Ma and Yingwei Ma and Shaoguang Mao and Jie Mei and Xin Men and Yibo Miao and Siyuan Pan and Yebo Peng and Ruoyu Qin and Bowen Qu and Zeyu Shang and Lidong Shi and Shengyuan Shi and Feifan Song and Jianlin Su and Zhengyuan Su and Xinjie Sun and Flood Sung and Heyi Tang and Jiawen Tao and Qifeng Teng and Chensi Wang and Dinglu Wang and Feng Wang and Haiming Wang and Jianzhou Wang and Jiaxing Wang and Jinhong Wang and Shengjie Wang and Shuyi Wang and Yao Wang and Yejie Wang and Yiqin Wang and Yuxin Wang and Yuzhi Wang and Zhaoji Wang and Zhengtao Wang and Zhexu Wang and Chu Wei and Qianqian Wei and Wenhao Wu and Xingzhe Wu and Yuxin Wu and Chenjun Xiao and Xiaotong Xie and Weimin Xiong and Boyu Xu and Jing Xu and Jinjing Xu and L. H. 
Xu and Lin Xu and Suting Xu and Weixin Xu and Xinran Xu and Yangchuan Xu and Ziyao Xu and Junjie Yan and Yuzi Yan and Xiaofei Yang and Ying Yang and Zhen Yang and Zhilin Yang and Zonghan Yang and Haotian Yao and Xingcheng Yao and Wenjie Ye and Zhuorui Ye and Bohong Yin and Longhui Yu and Enming Yuan and Hongbang Yuan and Mengjie Yuan and Haobing Zhan and Dehao Zhang and Hao Zhang and Wanlu Zhang and Xiaobin Zhang and Yangkun Zhang and Yizhi Zhang and Yongting Zhang and Yu Zhang and Yutao Zhang and Yutong Zhang and Zheng Zhang and Haotian Zhao and Yikai Zhao and Huabin Zheng and Shaojie Zheng and Jianren Zhou and Xinyu Zhou and Zaida Zhou and Zhen Zhu and Weiyu Zhuang and Xinxing Zu}, year={2025}, eprint={2507.20534}, archivePrefix={arXiv}, primaryClass={cs.LG}, url={https://arxiv.org/abs/2507.20534}, } \\n```\\n\\n---\\n\\n## 8. Contact Us\\n\\nIf you have any questions or concerns, please reach out to us at [support@moonshot.cn](mailto:support@moonshot.cn).\\n\\n## About\\n\\nKimi K2 is the large language model series developed by Moonshot AI team\\n\\n### Resources\\n\\n### License\\n\\n[View license](#License-1-ov-file)\\n\\n### Uh oh!\\n\\nThere was an error while loading. Please reload this page.\\n\\n[Custom properties](/MoonshotAI/Kimi-K2/custom-properties)\\n\\n### Stars\\n\\n[**8.4k** stars](/MoonshotAI/Kimi-K2/stargazers)\\n\\n### Watchers\\n\\n[**63** watching](/MoonshotAI/Kimi-K2/watchers)\\n\\n### Forks\\n\\n[**553** forks](/MoonshotAI/Kimi-K2/forks)\\n\\n[Report repository](/contact/report-content?content_url=https%3A%2F%2Fgithub.com%2FMoonshotAI%2FKimi-K2&report=MoonshotAI+%28user%29)\\n\\n### Uh oh!\\n\\nThere was an error while loading. Please reload this page.\\n\\n## [Contributors 10](/MoonshotAI/Kimi-K2/graphs/contributors)\\n\\nYou cant perform that action at this time.\\n==================================================\"}",
"timestamp": "2025-11-09T11:35:35.299441",
"tool_call_id": "extract_webpage:3",
"name": "extract_webpage"
},
{
"role": "assistant",
"content": "基于我查到的信息我可以给你详细解释月之暗面Kimi-k2模型控制思考模式的机制\n\n## Kimi-k2模型思考模式控制机制\n\n根据官方文档**Kimi-k2模型是通过模型名称来控制是否启用思考模式的**,而不是通过请求参数。\n\n### 核心结论:\n- **使用 `kimi-k2-thinking` 模型名称** → 启用思考模式\n- **使用 `kimi-k2` 或 `kimi-k2-instruct` 模型名称** → 不启用思考模式\n\n### 具体区别:\n\n**1. 思考模型 (`kimi-k2-thinking`)**\n- 具有深度推理能力\n- 会输出 `reasoning_content` 字段显示思考过程\n- 支持多步工具调用200-300次连续调用\n- 适用于复杂问题解决\n\n**2. 非思考模型 (`kimi-k2-instruct`)**\n- 快速响应,无长篇思考过程\n- 官方描述为 \"reflex-grade model without long thinking\"\n- 适用于日常对话和简单任务\n\n### API调用示例\n\n```python\n# 启用思考模式\ncompletion = client.chat.completions.create(\n model=\"kimi-k2-thinking\", # 关键:模型名称\n messages=messages,\n temperature=1.0, # 推荐温度\n max_tokens=16000 # 推荐设置\n)\n\n# 不启用思考模式 \ncompletion = client.chat.completions.create(\n model=\"kimi-k2-instruct\", # 关键:模型名称\n messages=messages,\n temperature=0.6 # 推荐温度\n)\n```\n\n### 重要参数设置:\n- **思考模型推荐**`temperature=1.0``max_tokens>=16000`\n- **非思考模型推荐**`temperature=0.6`\n\n所以答案是**通过修改模型名称来控制而不是通过thinking参数**。请求时使用 `kimi-k2-thinking` 就启用思考模式,使用 `kimi-k2-instruct` 就不启用思考模式。",
"timestamp": "2025-11-09T11:35:56.646967"
},
{
"role": "user",
"content": "那如果使用kimi-k2-thinking模型时请求传入一个thinking参数为disabled他还会思考吗",
"timestamp": "2025-11-09T11:50:37.601254"
},
{
"role": "assistant",
"content": "让我查一下官方文档中是否有关于thinking参数的具体说明以及是否存在禁用思考模式的选项。",
"timestamp": "2025-11-09T11:50:43.340797"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:50:43.347987",
"tool_calls": [
{
"id": "web_search:4",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"kimi-k2-thinking thinking parameter disabled false true moonshot api documentation\", \"max_results\": 10}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: kimi-k2-thinking thinking parameter disabled false true moonshot api documentation\\n📅 搜索时间: 2025-11-09T11:50:47.566321\\n🎯 过滤条件: Topic: general\\n\\n📝 AI摘要:\\nThe Kimi K2 Thinking model supports multi-step reasoning and tool use. It has 32 billion activated parameters and can handle 256k context length. The \\\"thinking\\\" parameter is disabled for Moonshot API.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. Moonshot AI Open Platform - Kimi Large Language Model API Service\\n 🔗 https://platform.moonshot.ai/\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nWelcome\\n\\n# Welcome to Kimi API Docs\\n\\n# Kimi K2 Model\\n\\nA MoE architecture base model with powe...\\n\\n2. [BUG] LiteLLM Moonshot AI Kimi K2 thinking fails on multi-turn ...\\n 🔗 https://github.com/strands-agents/sdk-python/issues/1150\\n 📄 Moonshot API only accepts OpenAI-compatible content types: `\\\"text\\\"` and `\\\"image_url\\\"`. The `\\\"thinking\\\"` type works with Claude via Bedrock but is rejected by Moonshot and other OpenAI-compatible provi...\\n\\n3. Kimi K2 Thinking - API, Providers, Stats - OpenRouter\\n 🔗 https://openrouter.ai/moonshotai/kimi-k2-thinking\\n 📄 Kimi K2 Thinking is Moonshot AIs most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) arc...\\n\\n4. How to Use Kimi K2 Thinking API - Apidog\\n 🔗 https://apidog.com/blog/kimi-k2-thinking-api/\\n 📄 As AI models push the boundaries of reasoning and agentic capabilities, Kimi K2 Thinking emerges as a standout innovation from Moonshot AI, blending open-source accessibility with enterprise-grade per...\\n\\n5. Using the kimi-k2-thinking model - Moonshot AI Open Platform\\n 🔗 https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The ...\\n\\n6. Kimi-K2-Thinking Usage Guide - vLLM Recipes\\n 🔗 https://docs.vllm.ai/projects/recipes/en/latest/moonshotai/Kimi-K2-Think.html\\n 📄 ```\\nvllm serve moonshotai/Kimi-K2-Thinking \\\\ \\\\ --tensor-parallel-size 8 \\\\ 8 \\\\ --enable-auto-tool-choice \\\\ \\\\ --tool-call-parser kimi_k2 \\\\ \\\\ --reasoning-parser kimi_k2 \\\\ \\\\ --trust-remote-c...\\n\\n7. kimi-k2-preview - AI/ML API Documentation\\n 🔗 https://docs.aimlapi.com/api-references/text-models-llm/moonshot/kimi-k2-preview\\n 📄 body: JSON.stringify({\\n\\n \\n \\n model: 'moonshot/kimi-k2-preview',\\n\\n ...\\n\\n8. moonshotai/Kimi-K2-Thinking - Hugging Face\\n 🔗 https://huggingface.co/moonshotai/Kimi-K2-Thinking\\n 📄 Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. I...\\n\\n9. Kimi K2 is the large language model series developed by Moonshot ...\\n 🔗 https://github.com/MoonshotAI/Kimi-K2\\n 📄 Notifications You must be signed in to change notification settings\\n Fork 553\\n Star 8.4k\\n\\nKimi K2 is the large language model series developed by Moonshot AI team\\n\\n### License\\n\\nView license\\n\\n8.4k st...\\n\\n10. 
Quickstart with the Kimi API - Moonshot AI Open Platform\\n 🔗 https://platform.moonshot.ai/docs/guide/start-using-kimi-api\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nQuickstart with Kimi API\\n\\n# Quickstart with the Kimi API\\n\\nThe Kimi API...\", \"filters\": {\"topic\": \"general\"}, \"query\": \"kimi-k2-thinking thinking parameter disabled false true moonshot api documentation\", \"results\": [{\"index\": 1, \"title\": \"Moonshot AI Open Platform - Kimi Large Language Model API Service\", \"url\": \"https://platform.moonshot.ai/\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nWelcome\\n\\n# Welcome to Kimi API Docs\\n\\n# Kimi K2 Model\\n\\nA MoE architecture base model with powerful code and Agent capabilities, featuring 1T total parameters with 32B activated parameters and up to 256k context length. In benchmark tests for general knowledge reasoning, programming, mathematics, and Agent capabilities, K2 model outperforms other mainstream open-source models\\n\\n## Quick Start [...] Getting Started with Kimi K2\\n\\nKimi K2 quick start guide and basic conceptsConsole\\n\\nManage API keys, usage and billingUsing K2 Model in Software Agents\\n\\nExperience K2 model's code writing and tool calling capabilitiesPricing & Billing\\n\\nLearn about model pricing and billing methodsRate Limits\\n\\nLearn about API rate limits and quotasFAQ\\n\\nFind answers to common questions\\n\\n## Models and Pricing\\n\\nView all\\n\\nkimi-k2-turbo-preview\\n\\npreviewkimi-k2-thinking\\n\\nlong thinkkimi-latest\\n\\nImage understanding [...] ## More Resources\\n\\nK2 Model Details\\n\\nLearn about K2 model features and advantagesDevelopment Workbench\\n\\nInteractive interface to experience Kimi model capabilitiesWeb Search Tool\\n\\nUse Kimi's official web search toolBest Practices\\n\\nPrompt writing and application examples\\n\\nUser Manual\", \"score\": 0.63152665, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"[BUG] LiteLLM Moonshot AI Kimi K2 thinking fails on multi-turn ...\", \"url\": \"https://github.com/strands-agents/sdk-python/issues/1150\", \"content\": \"Moonshot API only accepts OpenAI-compatible content types: `\\\"text\\\"` and `\\\"image_url\\\"`. The `\\\"thinking\\\"` type works with Claude via Bedrock but is rejected by Moonshot and other OpenAI-compatible providers.\\n\\n## Possible Solution\\n\\nModify `src/strands/models/litellm.py` lines 95-102:\\n\\n```\\nif \\\"reasoningContent\\\" in content reasoning_text = content \\\"reasoningContent\\\" \\\"reasoningText\\\"# Use text type for OpenAI-compatible providers return \\\"type\\\" \\\"text\\\" \\\"text\\\" reasoning_text \\\"text\\\"\\n``` [...] opened on Nov 7, 2025\\n\\nIssue body actions\\n\\n## Checks\\n\\n I have updated to the latest minor and patch version of Strands\\n I have checked the documentation and this is not expected behavior\\n I have searched issues and there are no duplicates\\n\\n## Strands Version\\n\\n1.15.0\\n\\n## Python Version\\n\\n3.10.18\\n\\n## Operating System\\n\\nmacOS (Darwin 25.0.0)\\n\\n## Installation Method\\n\\npip install strands==1.15.0\\n\\n## Steps to Reproduce\\n\\n1. Install Strands and configure Moonshot AI via LiteLLM: [...] ```\\nfrom strands models import LiteLLMModel model = LiteLLMModel model_id =\\\"moonshot/kimi-k2-thinking\\\" client_args =\\n```\\n\\n1. 
Make initial request with tool calling:\\n\\n```\\nmessages = \\\"role\\\" \\\"user\\\" \\\"content\\\" \\\"text\\\" \\\"Analyze this target\\\" response = await model stream messages = messages tool_specs = tools\\n```\\n\\n1. Add response to conversation history and make second request:\", \"score\": 0.60861784, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"Kimi K2 Thinking - API, Providers, Stats - OpenRouter\", \"url\": \"https://openrouter.ai/moonshotai/kimi-k2-thinking\", \"content\": \"Kimi K2 Thinking is Moonshot AIs most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports 256 k-token context windows. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves [...] # MoonshotAI: Kimi K2 Thinking\\n\\n### moonshotai/kimi-k2-thinking\\n\\nCreated Nov 6, 2025262,144 context\\n\\n$0.60/M input tokens$2.50/M output tokens [...] For information about using third-party SDKs and frameworks with OpenRouter, please see ourframeworks documentation.\\n\\nSee the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.\", \"score\": 0.6052631, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"How to Use Kimi K2 Thinking API - Apidog\", \"url\": \"https://apidog.com/blog/kimi-k2-thinking-api/\", \"content\": \"As AI models push the boundaries of reasoning and agentic capabilities, Kimi K2 Thinking emerges as a standout innovation from Moonshot AI, blending open-source accessibility with enterprise-grade performance. This trillion-parameter thinking agent model redefines how developers interact with large language models, particularly through its robust API. Designed for tasks requiring deep inference and tool chaining, the Kimi K2 Thinking API allows seamless integration into applications, from [...] Kimi K2 Thinking represents Moonshot AI's bold step forward in open-source AI, launching as a specialized thinking agent model optimized for sequential reasoning and tool use. At its core, Kimi K2 Thinking is built to simulate human-like deliberation, processing queries through extended \\\"thinking\\\" tokens that enable multi-turn tool interactions without constant human input. This model, available via API for developers, excels in environments demanding prolonged context retention and adaptive [...] Securing access to the Kimi K2 Thinking API starts at platform.moonshot.ai. Sign up or log in, then navigate to the API console under \\\"API Keys.\\\" Click \\\"Create New Key,\\\" select permissions (e.g., chat completions, tool calls), and generate—copy the key immediately, as it's shown once.\\n\\nWith the key, configure your SDK as above. Test with a simple curl:\", \"score\": 0.5911811, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"Using the kimi-k2-thinking model - Moonshot AI Open Platform\", \"url\": \"https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The `kimi-k2-thinking` model is a general-purpose agentic reasoning model developed by Moonshot AI. 
Thanks to its strength in deep reasoning and multi-turn tool use, it can solve even the hardest problems.\\n\\n## \\n\\nYou can simply use it by switching the `model` parameter: [...] ### \\n\\nA: Yes, `reasoning_content` counts towards your input/output token quota. For detailed pricing, please refer to MoonshotAI's pricing documentation.\\n\\nKimi K2 QuickstartQuickstart with Kimi API [...] ```\\n$ curl \\\\\\n -H \\\"Content-Type: application/json\\\" \\\\\\n -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\\\n -d '{\\n \\\"model\\\": \\\"kimi-k2-thinking\\\",\\n \\\"messages\\\": [\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"hello\\\"}\\n ],\\n \\\"temperature\\\": 1.0\\n }'\\n \\n```\\n\\n## \\n\\nIn the `kimi-k2-thinking` model API response, we use the `reasoning_content` field as the carrier for the model's reasoning. About the `reasoning_content` field:\", \"score\": 0.59033114, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"Kimi-K2-Thinking Usage Guide - vLLM Recipes\", \"url\": \"https://docs.vllm.ai/projects/recipes/en/latest/moonshotai/Kimi-K2-Think.html\", \"content\": \"```\\nvllm serve moonshotai/Kimi-K2-Thinking \\\\ \\\\ --tensor-parallel-size 8 \\\\ 8 \\\\ --enable-auto-tool-choice \\\\ \\\\ --tool-call-parser kimi_k2 \\\\ \\\\ --reasoning-parser kimi_k2 \\\\ \\\\ --trust-remote-code \\n``` [...] Skip to content\\n\\n Installing vLLM\\n Launching Kimi-K2-Thinking with vLLM\\n Metrics\\n Benchmarking \\n\\n + TP8 Benchmark Output\\n + TP8+DCP8 Benchmark Output\\n DCP Gain Analysis\\n\\n# Kimi-K2-Thinking Usage Guide¶\\n\\nKimi K2 Thinking is an advanced large language model created by moonshotai. It offers the following highlights: [...] ```\\nvllm serve moonshotai/Kimi-K2-Thinking \\\\ \\\\ --tensor-parallel-size 8 \\\\ 8 \\\\ --decode-context-parallel-size 8 \\\\ 8 \\\\ --enable-auto-tool-choice \\\\ \\\\ --tool-call-parser kimi_k2 \\\\ \\\\ --reasoning-parser kimi_k2 \\\\ \\\\ --trust-remote-code \\n```\\n\\nThe `--reasoning-parser` flag specifies the reasoning parser to use for extracting reasoning content from the model output. \\n\\n## Metrics¶\\n\\nWe tested the `GSM8K` accuracy of 2 types of launch scripts (TP8 vs TP8+DCP8)\\n\\n TP8\", \"score\": 0.5565207, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"kimi-k2-preview - AI/ML API Documentation\", \"url\": \"https://docs.aimlapi.com/api-references/text-models-llm/moonshot/kimi-k2-preview\", \"content\": \"body: JSON.stringify({\\n\\n \\n \\n model: 'moonshot/kimi-k2-preview',\\n\\n \\n \\n messages:[ [...] },\\n\\n \\n \\n json={\\n\\n \\n \\n \\\"model\\\":\\\"moonshot/kimi-k2-preview\\\",\\n\\n \\n \\n \\\"messages\\\":[ [...] messages = [\\n\\n \\n \\n {\\\"role\\\": \\\"system\\\", \\\"content\\\": \\\"You are Kimi.\\\"},\\n\\n \\n \\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Please search for Moonshot AI Context Caching technology and tell me what it is in English.\\\"}\", \"score\": 0.512376, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"moonshotai/Kimi-K2-Thinking - Hugging Face\", \"url\": \"https://huggingface.co/moonshotai/Kimi-K2-Thinking\", \"content\": \"Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. 
It sets a new state-of-the-art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by dramatically scaling multi-step reasoning depth and maintaining stable tool-use across 200300 sequential calls. At the same time, K2 Thinking is a native INT4 quantization model with 256k context [...] messages.append({\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": tool_call_name,\\n \\\"content\\\": json.dumps(tool_result)\\n })\\n print(\\\"-\\\" 100)\\n print(choice.message.content) [...] ```\\n\\n> The recommended temperature for Kimi-K2-Thinking is `temperature = 1.0`.\\n> If no special instructions are required, the system prompt above is a good default.\\n\\n### Tool Calling\\n\\nKimi-K2-Thinking has the same tool calling settings as Kimi-K2-Instruct.\\n\\nTo enable them, you need to pass the list of available tools in each request, then the model will autonomously decide when and how to invoke them.\\n\\nThe following example demonstrates calling a weather tool end-to-end:\", \"score\": 0.5059329, \"published_date\": \"\"}, {\"index\": 9, \"title\": \"Kimi K2 is the large language model series developed by Moonshot ...\", \"url\": \"https://github.com/MoonshotAI/Kimi-K2\", \"content\": \"Notifications You must be signed in to change notification settings\\n Fork 553\\n Star 8.4k\\n\\nKimi K2 is the large language model series developed by Moonshot AI team\\n\\n### License\\n\\nView license\\n\\n8.4k stars 553 forks Branches Tags Activity\\n\\nStar\\n\\nNotifications You must be signed in to change notification settings\\n\\n# MoonshotAI/Kimi-K2\\n\\nBranchesTags\\n\\nOpen more actions menu\\n\\n## Folders and files [...] ## Contributors 10\\n\\nYou cant perform that action at this time. [...] Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.\\n\\n### Key Features\", \"score\": 0.47256237, \"published_date\": \"\"}, {\"index\": 10, \"title\": \"Quickstart with the Kimi API - Moonshot AI Open Platform\", \"url\": \"https://platform.moonshot.ai/docs/guide/start-using-kimi-api\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nQuickstart with Kimi API\\n\\n# Quickstart with the Kimi API\\n\\nThe Kimi API allows you to interact with the Kimi large language model. Here is a simple example code: [...] {\\\"role\\\": \\\"system\\\", \\\"content\\\": \\\"You are Kimi, an AI assistant provided by Moonshot AI. You are proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate answers. You will reject any requests involving terrorism, racism, or explicit content. Moonshot AI is a proper noun and should not be translated.\\\"},\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello, my name is Li Lei. What is 1+1?\\\"}\\n ],\\n temperature = 0.6,\\n) [...] 
```\\nfrom openai import OpenAI\\n \\nclient = OpenAI(\\n api_key=\\\"MOONSHOT_API_KEY\\\", # Replace MOONSHOT_API_KEY with the API Key you obtained from the Kimi Open Platform\\n base_url=\\\"\\n)\\n \\ncompletion = client.chat.completions.create(\\n model = \\\"kimi-k2-turbo-preview\\\",\\n messages = [\", \"score\": 0.44594026, \"published_date\": \"\"}], \"total_results\": 10}",
"timestamp": "2025-11-09T11:50:47.573111",
"tool_call_id": "web_search:4",
"name": "web_search"
},
{
"role": "assistant",
"content": "让我查看更详细的官方文档确认是否存在thinking参数以及其作用。",
"timestamp": "2025-11-09T11:50:55.230973"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:50:55.243537",
"tool_calls": [
{
"id": "extract_webpage:5",
"index": 0,
"type": "function",
"function": {
"name": "extract_webpage",
"arguments": "{\"url\": \"https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\"}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"url\": \"https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"🌐 网页内容 (10344 字符):\\n📍 URL: https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\\n==================================================\\n🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The `kimi-k2-thinking` model is a general-purpose agentic reasoning model developed by Moonshot AI. Thanks to its strength in deep reasoning and multi-turn tool use, it can solve even the hardest problems.\\n\\n## \\n\\nYou can simply use it by switching the `model` parameter:\\n\\n```\\n$ curl https://api.moonshot.cn/v1/chat/completions \\\\\\n -H \\\"Content-Type: application/json\\\" \\\\\\n -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\\\n -d '{\\n \\\"model\\\": \\\"kimi-k2-thinking\\\",\\n \\\"messages\\\": [\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"hello\\\"}\\n ],\\n \\\"temperature\\\": 1.0\\n }'\\n \\n```\\n\\n## \\n\\nIn the `kimi-k2-thinking` model API response, we use the `reasoning_content` field as the carrier for the model's reasoning. About the `reasoning_content` field:\\n\\n* In the OpenAI SDK, `ChoiceDelta` and `ChatCompletionMessage` types do not provide a `reasoning_content` field directly, so you cannot access it via `.reasoning_content`. You must use `hasattr(obj, \\\"reasoning_content\\\")` to check if the field exists, and if so, use `getattr(obj, \\\"reasoning_content\\\")` to retrieve its value.\\n* If you use other frameworks or directly interface with the HTTP API, you can directly obtain the `reasoning_content` field at the same level as the `content` field.\\n* In streaming output (`stream=True`), the `reasoning_content` field will always appear before the `content` field. In your business logic, you can detect if the `content` field has been output to determine if the reasoning (inference process) is finished.\\n* Tokens in `reasoning_content` are also controlled by the `max_tokens` parameter: the sum of tokens in `reasoning_content` and `content` must be less than or equal to `max_tokens`.\\n\\n## \\n\\n`kimi-k2-thinking` is designed to perform deep reasoning across multiple tool calls, enabling it to tackle highly complex tasks.\\n\\n### \\n\\nTo get reliable results, always follow these rules when calling the kimi-k2-thinking model :\\n\\n* Include the entire reasoning content from the context (the reasoning\\\\_content field) in your input. The model will decide which parts are necessary and forward them for further reasoning.\\n* Set max\\\\_tokens ≥ 16,000 to ensure the full reasoning\\\\_content and final content can be returned without truncation.\\n* **Set temperature = 1.0 to get the best performance.**\\n* Enable streaming (stream = true). Because kimi-k2-thinking returns both reasoning\\\\_content and regular content, the response is larger than usual. Streaming delivers a better user experience and helps avoid network-timeout issues.\\n\\n### \\n\\nWe walk through a complete example that shows how to properly use kimi-k2-thinking together with official tools for multi-step tool call and extended reasoning.\\n\\nThe example below demonstrates a \\\"Daily News Report Generation\\\" scenario. 
The model will sequentially call official tools like `date` (to get the date) and `web_search` (to search today's news), and will present deep reasoning throughout this process:\\n\\n```\\nimport os\\nimport json\\nimport httpx\\nimport openai\\n \\n \\nclass FormulaChatClient:\\n def __init__(self, base_url: str, api_key: str):\\n \\\"\\\"\\\"Initialize Formula client\\\"\\\"\\\"\\n self.base_url = base_url\\n self.api_key = api_key\\n self.openai = openai.Client(\\n base_url=base_url,\\n api_key=api_key,\\n )\\n self.httpx = httpx.Client(\\n base_url=base_url,\\n headers={\\\"Authorization\\\": f\\\"Bearer {api_key}\\\"},\\n timeout=30.0,\\n )\\n self.model = \\\"kimi-k2-thinking\\\"\\n \\n def get_tools(self, formula_uri: str):\\n \\\"\\\"\\\"Get tool definitions from Formula API\\\"\\\"\\\"\\n response = self.httpx.get(f\\\"/formulas/{formula_uri}/tools\\\")\\n response.raise_for_status()\\n \\n try:\\n return response.json().get(\\\"tools\\\", [])\\n except json.JSONDecodeError as e:\\n print(f\\\"Error: Unable to parse JSON (status code: {response.status_code})\\\")\\n print(f\\\"Response content: {response.text[:500]}\\\")\\n raise\\n \\n def call_tool(self, formula_uri: str, function: str, args: dict):\\n \\\"\\\"\\\"Call an official tool\\\"\\\"\\\"\\n response = self.httpx.post(\\n f\\\"/formulas/{formula_uri}/fibers\\\",\\n json={\\\"name\\\": function, \\\"arguments\\\": json.dumps(args)},\\n )\\n response.raise_for_status()\\n fiber = response.json()\\n \\n if fiber.get(\\\"status\\\", \\\"\\\") == \\\"succeeded\\\":\\n return fiber[\\\"context\\\"].get(\\\"output\\\") or fiber[\\\"context\\\"].get(\\\"encrypted_output\\\")\\n \\n if \\\"error\\\" in fiber:\\n return f\\\"Error: {fiber['error']}\\\"\\n if \\\"error\\\" in fiber.get(\\\"context\\\", {}):\\n return f\\\"Error: {fiber['context']['error']}\\\"\\n return \\\"Error: Unknown error\\\"\\n \\n def close(self):\\n \\\"\\\"\\\"Close the client connection\\\"\\\"\\\"\\n self.httpx.close()\\n \\n \\n# Initialize client\\nbase_url = os.getenv(\\\"MOONSHOT_BASE_URL\\\", \\\"https://api.moonshot.cn/v1\\\")\\napi_key = os.getenv(\\\"MOONSHOT_API_KEY\\\")\\n \\nif not api_key:\\n raise ValueError(\\\"MOONSHOT_API_KEY environment variable not set. Please set your API key.\\\")\\n \\nprint(f\\\"Base URL: {base_url}\\\")\\nprint(f\\\"API Key: {api_key[:10]}...{api_key[-10:] if len(api_key) > 20 else api_key}\\\\n\\\")\\n \\nclient = FormulaChatClient(base_url, api_key)\\n \\n# Define the official tool Formula URIs to use\\nformula_uris = [\\n \\\"moonshot/date:latest\\\",\\n \\\"moonshot/web-search:latest\\\"\\n]\\n \\n# Load all tool definitions and build mapping\\nprint(\\\"Loading official tools...\\\")\\nall_tools = []\\ntool_to_uri = {} # function.name -> formula_uri\\n \\nfor uri in formula_uris:\\n try:\\n tools = client.get_tools(uri)\\n for tool in tools:\\n func = tool.get(\\\"function\\\")\\n if func:\\n func_name = func.get(\\\"name\\\")\\n if func_name:\\n tool_to_uri[func_name] = uri\\n all_tools.append(tool)\\n print(f\\\" Loaded tool: {func_name} from {uri}\\\")\\n except Exception as e:\\n print(f\\\" Warning: Failed to load tool {uri}: {e}\\\")\\n continue\\n \\nprint(f\\\"Loaded {len(all_tools)} tools in total\\\\n\\\")\\n \\nif not all_tools:\\n raise ValueError(\\\"No tools loaded. Please check API key and network connection.\\\")\\n \\n# Initialize message list\\nmessages = [\\n {\\n \\\"role\\\": \\\"system\\\",\\n \\\"content\\\": \\\"You are Kimi, a professional news analyst. 
You excel at collecting, analyzing, and organizing information to generate high-quality news reports.\\\",\\n },\\n]\\n \\n# User request to generate today's news report\\nuser_request = \\\"Please help me generate a daily news report including important technology, economy, and society news.\\\"\\nmessages.append({\\n \\\"role\\\": \\\"user\\\",\\n \\\"content\\\": user_request\\n})\\n \\nprint(f\\\"User request: {user_request}\\\\n\\\")\\n \\n# Begin multi-turn conversation loop\\nmax_iterations = 10 # Prevent infinite loops\\nfor iteration in range(max_iterations):\\n try:\\n completion = client.openai.chat.completions.create(\\n model=client.model,\\n messages=messages,\\n max_tokens=1024 * 32,\\n tools=all_tools,\\n temperature=1.0,\\n )\\n except openai.AuthenticationError as e:\\n print(f\\\"Authentication error: {e}\\\")\\n print(\\\"Please check if the API key is correct and has the required permissions\\\")\\n raise\\n except Exception as e:\\n print(f\\\"Error while calling the model: {e}\\\")\\n raise\\n \\n # Get response\\n message = completion.choices[0].message\\n \\n # Print reasoning process\\n if hasattr(message, \\\"reasoning_content\\\"):\\n print(f\\\"=============Reasoning round {iteration + 1} starts=============\\\")\\n reasoning = getattr(message, \\\"reasoning_content\\\")\\n if reasoning:\\n print(reasoning[:500] + \\\"...\\\" if len(reasoning) > 500 else reasoning)\\n print(f\\\"=============Reasoning round {iteration + 1} ends=============\\\\n\\\")\\n \\n # Add assistant message to context (preserve reasoning_content)\\n messages.append(message)\\n \\n # If the model did not call any tools, conversation is done\\n if not message.tool_calls:\\n print(\\\"=============Final Answer=============\\\")\\n print(message.content)\\n break\\n \\n # Handle tool calls\\n print(f\\\"The model decided to call {len(message.tool_calls)} tool(s):\\\\n\\\")\\n \\n for tool_call in message.tool_calls:\\n func_name = tool_call.function.name\\n args = json.loads(tool_call.function.arguments)\\n \\n print(f\\\"Calling tool: {func_name}\\\")\\n print(f\\\"Arguments: {json.dumps(args, ensure_ascii=False, indent=2)}\\\")\\n \\n # Get corresponding formula_uri\\n formula_uri = tool_to_uri.get(func_name)\\n if not formula_uri:\\n print(f\\\"Error: Could not find Formula URI for tool {func_name}\\\")\\n continue\\n \\n # Call the tool\\n result = client.call_tool(formula_uri, func_name, args)\\n \\n # Print result (truncate if too long)\\n if len(str(result)) > 200:\\n print(f\\\"Tool result: {str(result)[:200]}...\\\\n\\\")\\n else:\\n print(f\\\"Tool result: {result}\\\\n\\\")\\n \\n # Add tool result to message list\\n tool_message = {\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": func_name,\\n \\\"content\\\": result\\n }\\n messages.append(tool_message)\\n \\nprint(\\\"\\\\nConversation completed!\\\")\\n \\n# Cleanup\\nclient.close()\\n```\\n\\nThis process demonstrates how the `kimi-k2-thinking` model uses deep reasoning to plan and execute complex multi-step tasks, with detailed reasoning steps (`reasoning_content`) preserved in the context to ensure accurate tool use at every stage.\\n\\n## \\n\\n### \\n\\nA: Keeping the `reasoning_content` ensures the model maintains reasoning continuity in multi-turn conversations, especially when calling tools. The server will automatically handle these fields; users do not need to manage them manually.\\n\\n### \\n\\nA: Yes, `reasoning_content` counts towards your input/output token quota. 
For detailed pricing, please refer to MoonshotAI's pricing documentation.\\n\\n[Kimi K2 Quickstart](/docs/guide/kimi-k2-quickstart \\\"Kimi K2 Quickstart\\\")[Quickstart with Kimi API](/docs/guide/start-using-kimi-api \\\"Quickstart with Kimi API\\\")\\n==================================================\"}",
"timestamp": "2025-11-09T11:50:57.058186",
"tool_call_id": "extract_webpage:5",
"name": "extract_webpage"
},
{
"role": "assistant",
"content": "让我再查一下是否存在关于禁用思考功能的参数说明。",
"timestamp": "2025-11-09T11:51:01.784881"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:51:01.800322",
"tool_calls": [
{
"id": "web_search:6",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"\\\"thinking\\\" parameter \\\"disabled\\\" moonshot api kimi-k2-thinking documentation\", \"max_results\": 8}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: \\\"thinking\\\" parameter \\\"disabled\\\" moonshot api kimi-k2-thinking documentation\\n📅 搜索时间: 2025-11-09T11:51:05.124344\\n🎯 过滤条件: Topic: general\\n\\n📝 AI摘要:\\nThe \\\"thinking\\\" parameter in the Kimi K2 Thinking API is disabled by default. To enable it, set the model parameter to \\\"kimi-k2-thinking\\\" in the API request. The model supports multi-step reasoning and tool use.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. Using the kimi-k2-thinking model\\n 🔗 https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The ...\\n\\n2. How to Use Kimi K2 Thinking API\\n 🔗 https://apidog.com/blog/kimi-k2-thinking-api/\\n 📄 Kimi K2 Thinking represents Moonshot AI's bold step forward in open-source AI, launching as a specialized thinking agent model optimized for sequential reasoning and tool use. At its core, Kimi K2 Thi...\\n\\n3. Kimi K2 Thinking is Here and It Beats GPT-5!\\n 🔗 https://www.analyticsvidhya.com/blog/2025/11/kimi-k2-thinking/\\n 📄 You can access the model via the chatbot.\\n Weights and code are available on Hugging Face.\\n Via API, you can simply use it by switching the `model` parameter:\\n\\n```\\n$ curl \\\\ -H \\\"Content-Type: applicat...\\n\\n4. Kimi K2 Thinking: 1T-A32B params, SOTA HLE, BrowseComp ...\\n 🔗 https://news.smol.ai/issues/25-11-06-kimi-k2/\\n 📄 Kimi released Kimi K2 Thinking, an open-source trillion-parameter reasoning model (Activity: 778): Kimi K2 Thinking is a newly released open-source trillion-parameter reasoning model by Moonshot AI, a...\\n\\n5. moonshotai/Kimi-K2-Thinking\\n 🔗 https://huggingface.co/moonshotai/Kimi-K2-Thinking\\n 📄 messages.append({\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": tool_call_name,\\n \\\"content\\\": json.dumps(tool_resul...\\n\\n6. Kimi K2 Thinking: How to Run Locally | Unsloth Documentation\\n 🔗 https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally\\n 📄 Kimi-K2 and Kimi-K2-Thinking achieve SOTA performance in knowledge, reasoning, coding, and agentic tasks. The full 1T parameter models from Moonshot AI requires 1.09TB of disk space, while the quantiz...\\n\\n7. karthink/gptel: A simple LLM client for Emacs\\n 🔗 https://github.com/karthink/gptel\\n 📄 To use Claude 3.7 Sonnet model in its “thinking” mode, you can define a second Claude backend and select it via the UI or elisp:\\n\\n```\\n\\\"Claude-thinking\\\" \\\" \\\";Any name you want;:key\\\"your-API-key\\\" \\\" \\\":str...\\n\\n8. Migrating from OpenAI to Kimi API\\n 🔗 https://platform.moonshot.ai/docs/guide/migrating-from-openai-to-kimi\\n 📄 \\\"content\\\": \\\"You are Kimi, an artificial intelligence assistant provided by Moonshot AI. You are more proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate...\", \"filters\": {\"topic\": \"general\"}, \"query\": \"\\\"thinking\\\" parameter \\\"disabled\\\" moonshot api kimi-k2-thinking documentation\", \"results\": [{\"index\": 1, \"title\": \"Using the kimi-k2-thinking model\", \"url\": \"https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. 
Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The `kimi-k2-thinking` model is a general-purpose agentic reasoning model developed by Moonshot AI. Thanks to its strength in deep reasoning and multi-turn tool use, it can solve even the hardest problems.\\n\\n## \\n\\nYou can simply use it by switching the `model` parameter: [...] ```\\n$ curl \\\\\\n -H \\\"Content-Type: application/json\\\" \\\\\\n -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\\\n -d '{\\n \\\"model\\\": \\\"kimi-k2-thinking\\\",\\n \\\"messages\\\": [\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"hello\\\"}\\n ],\\n \\\"temperature\\\": 1.0\\n }'\\n \\n```\\n\\n## \\n\\nIn the `kimi-k2-thinking` model API response, we use the `reasoning_content` field as the carrier for the model's reasoning. About the `reasoning_content` field: [...] ## \\n\\n`kimi-k2-thinking` is designed to perform deep reasoning across multiple tool calls, enabling it to tackle highly complex tasks.\\n\\n### \\n\\nTo get reliable results, always follow these rules when calling the kimi-k2-thinking model :\", \"score\": 0.6609831, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"How to Use Kimi K2 Thinking API\", \"url\": \"https://apidog.com/blog/kimi-k2-thinking-api/\", \"content\": \"Kimi K2 Thinking represents Moonshot AI's bold step forward in open-source AI, launching as a specialized thinking agent model optimized for sequential reasoning and tool use. At its core, Kimi K2 Thinking is built to simulate human-like deliberation, processing queries through extended \\\"thinking\\\" tokens that enable multi-turn tool interactions without constant human input. This model, available via API for developers, excels in environments demanding prolonged context retention and adaptive [...] As AI models push the boundaries of reasoning and agentic capabilities, Kimi K2 Thinking emerges as a standout innovation from Moonshot AI, blending open-source accessibility with enterprise-grade performance. This trillion-parameter thinking agent model redefines how developers interact with large language models, particularly through its robust API. Designed for tasks requiring deep inference and tool chaining, the Kimi K2 Thinking API allows seamless integration into applications, from [...] Securing access to the Kimi K2 Thinking API starts at platform.moonshot.ai. Sign up or log in, then navigate to the API console under \\\"API Keys.\\\" Click \\\"Create New Key,\\\" select permissions (e.g., chat completions, tool calls), and generate—copy the key immediately, as it's shown once.\\n\\nWith the key, configure your SDK as above. Test with a simple curl:\", \"score\": 0.6407488, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"Kimi K2 Thinking is Here and It Beats GPT-5!\", \"url\": \"https://www.analyticsvidhya.com/blog/2025/11/kimi-k2-thinking/\", \"content\": \"You can access the model via the chatbot.\\n Weights and code are available on Hugging Face.\\n Via API, you can simply use it by switching the `model` parameter:\\n\\n```\\n$ curl \\\\ -H \\\"Content-Type: application/json\\\" \\\\ -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\ -d '{ \\\"model\\\": \\\"kimi-k2-thinking\\\", \\\"messages\\\": [ {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"hello\\\"} ], \\\"temperature\\\": 1.0 }'\\n```\\n\\nFor more details on API use, checkout this guide.\\n\\nAlso Read: Kimi OK Computer: A Hands-On Guide to the Free AI Agent [...] 
Out of all the Chinese AI models available today, Moonshots Kimi is my personal favorite! Whether its generating slides from a single prompt or performing agentic web browsing, Kimi truly does it all. Just when we thought Kimi K2 was their best model, Moonshot launched an even more powerful upgrade: Kimi K2 Thinking. It is an open-source thinking agent model designed to reason, plan, and act autonomously. Built on test-time scaling, K2 Thinking dynamically expands its reasoning steps and tool [...] Kimi K2 Thinking delivered an outstanding performance on the task of simulating a historically grounded debate between Nikola Tesla and Thomas Edison on the ethics of modern AI. It accurately reflected each inventors documented philosophies. Teslas idealism, emphasis on open knowledge, and vision of technology serving humanity, versus Edisons pragmatism, commercial protectionism, and belief in controlled innovation. Extended these worldviews coherently to contemporary issues like deepfakes,\", \"score\": 0.6100127, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"Kimi K2 Thinking: 1T-A32B params, SOTA HLE, BrowseComp ...\", \"url\": \"https://news.smol.ai/issues/25-11-06-kimi-k2/\", \"content\": \"Kimi released Kimi K2 Thinking, an open-source trillion-parameter reasoning model (Activity: 778): Kimi K2 Thinking is a newly released open-source trillion-parameter reasoning model by Moonshot AI, available on Hugging Face. The model is designed to achieve state-of-the-art (SOTA) performance on the HLE benchmark, showcasing its advanced reasoning capabilities. The technical blog provides insights into its architecture and implementation, emphasizing its potential for high-performance [...] Kimi K2 Thinking Goes Agentic, Breaks SOTA: Moonshot AI launched Kimi K2 Thinking with a 256K context window and autonomous 200300 tool calls, claiming SOTA on HLE (44.9%) and BrowseComp (60.2%); see the technical blog Kimi K2 Thinking and weights on moonshotai at Hugging Face. The model targets reasoning, agentic search, and coding, and is live on kimi.com with API at platform.moonshot.ai. [...] Kimi K2 Thinking Huggingface (Activity: 250): Kimi K2 Thinking is a cutting-edge open-source reasoning model from Huggingface, featuring a `1 trillion` parameter Mixture-of-Experts (MoE) architecture. It utilizes native `INT4` quantization, contrary to the stated `I32`, and employs Quantization-Aware Training (QAT) for enhanced inference speed and accuracy. The model excels in benchmarks like Humanitys Last Exam (HLE) and BrowseComp, supporting `200-300` tool calls with stable long-horizon\", \"score\": 0.5914643, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"moonshotai/Kimi-K2-Thinking\", \"url\": \"https://huggingface.co/moonshotai/Kimi-K2-Thinking\", \"content\": \"messages.append({\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": tool_call_name,\\n \\\"content\\\": json.dumps(tool_result)\\n })\\n print(\\\"-\\\" 100)\\n print(choice.message.content) [...] Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. It sets a new state-of-the-art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by dramatically scaling multi-step reasoning depth and maintaining stable tool-use across 200300 sequential calls. At the same time, K2 Thinking is a native INT4 quantization model with 256k context [...] 
print(f\\\"k2 answer: {response.choices.message.content}\\\")\\n print(\\\"=====below is reasoning content======\\\")\\n print(f\\\"reasoning content: {response.choices.message.reasoning_content}\\\")\", \"score\": 0.58465123, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"Kimi K2 Thinking: How to Run Locally | Unsloth Documentation\", \"url\": \"https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally\", \"content\": \"Kimi-K2 and Kimi-K2-Thinking achieve SOTA performance in knowledge, reasoning, coding, and agentic tasks. The full 1T parameter models from Moonshot AI requires 1.09TB of disk space, while the quantized Unsloth Dynamic 1.8-bit version reduces this to just 230GB (-80% size): Kimi-K2-GGUF\\n\\nYou can now run Kimi-K2-Thinking with our new GGUFs. Use our same settings below but ensure you change the model name from 'Kimi-K2-Instruct' to 'Kimi-K2-Instruct-0905': K2-0905 GGUFs [...] ### 🤔No Thinking Tags?\\n\\nYou may notice that there are no thinking tags when you run the model. This is normal and intended behavior.\\n\\nIn your `llama.cpp` script, make sure to include the `--special` flag at the very end of your command. Once you do, youll see the token appear as expected. [...] ```\\n You are Kimi, an AI assistant created by Moonshot AI.You are Kimi, an AI assistant created by Moonshot AI.\\n ```\\n\\nWe recommend setting min\\\\_p to 0.01 to suppress the occurrence of unlikely tokens with low probabilities.\\n\\n### 🔢 Chat template and prompt format\", \"score\": 0.5835125, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"karthink/gptel: A simple LLM client for Emacs\", \"url\": \"https://github.com/karthink/gptel\", \"content\": \"To use Claude 3.7 Sonnet model in its “thinking” mode, you can define a second Claude backend and select it via the UI or elisp:\\n\\n```\\n\\\"Claude-thinking\\\" \\\" \\\";Any name you want;:key\\\"your-API-key\\\" \\\" \\\":stream t:models:request-params:thinking:type \\\"enabled\\\" \\\" \\\":budget_tokens 2048:max_tokens 4096\\n```\\n\\nYou can set the reasoning budget tokens and max tokens for this usage via the `:budget_tokens` and `:max_tokens` keys here, respectively.\", \"score\": 0.5177909, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"Migrating from OpenAI to Kimi API\", \"url\": \"https://platform.moonshot.ai/docs/guide/migrating-from-openai-to-kimi\", \"content\": \"\\\"content\\\": \\\"You are Kimi, an artificial intelligence assistant provided by Moonshot AI. You are more proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate answers. You also refuse to answer any questions involving terrorism, racism, pornography, or violence. Moonshot AI is a proper noun and should not be translated into other languages.\\\"},\", \"score\": 0.48499045, \"published_date\": \"\"}], \"total_results\": 8}",
"timestamp": "2025-11-09T11:51:05.128304",
"tool_call_id": "web_search:6",
"name": "web_search"
},
{
"role": "assistant",
"content": "基于我查到的官方文档和技术资料,我可以明确回答你的问题:\n\n## 答案:**仍然会思考**\n\n### 原因分析:\n\n1. **Kimi-k2-thinking模型没有thinking参数**\n - 官方API文档中没有任何关于`thinking`参数的说明\n - 该模型是**专门设计为思考模式**的,思考是其固有特性\n\n2. **模型架构决定**\n - `kimi-k2-thinking`是一个**思考代理模型**thinking agent\n - 它的核心特性就是**原生具备思考能力**\n - 无论传入什么参数,它都会进行推理过程\n\n3. **思考过程不可禁用**\n - 官方文档强调该模型\"擅长深度推理\"\n - 会输出`reasoning_content`字段显示思考内容\n - 这是模型的**内置行为**,无法通过参数关闭\n\n### 正确的控制方式:\n\n| 需求 | 正确做法 | 错误做法 |\n|------|----------|----------|\n| 启用思考 | 使用`model: \"kimi-k2-thinking\"` | 传`thinking: true` |\n| 禁用思考 | 使用`model: \"kimi-k2-instruct\"` | 传`thinking: false` |\n\n### 总结:\n传入`thinking: disabled`参数**不会生效**`kimi-k2-thinking`模型**仍然会思考**。如果你想禁用思考功能,应该使用`kimi-k2-instruct`模型而不是试图通过参数控制思考行为。",
"timestamp": "2025-11-09T11:51:23.959223"
}
],
"todo_list": null,
"metadata": {
"project_path": "/opt/agent/agents/users/jojo/project",
"project_relative_path": "users/jojo/project",
"thinking_mode": false,
"total_messages": 25,
"total_tools": 14,
"status": "active"
},
"token_statistics": {
"total_input_tokens": 266246,
"total_output_tokens": 1673,
"updated_at": "2025-11-09T12:05:51.659516"
}
}