
{
"id": "conv_20251109_113433_663",
"title": "在请求月之暗面的kimi-k2模型时是如何控制模型思考还是不思考的是通过请求时加入thinkin...",
"created_at": "2025-11-09T11:34:33.663230",
"updated_at": "2025-11-09T12:05:51.659274",
"messages": [
{
"role": "user",
"content": "在请求月之暗面的kimi-k2模型时是如何控制模型思考还是不思考的是通过请求时加入thinking参数还是通过修改模型的名称为kimi-k2-thinking帮我查一下",
"timestamp": "2025-11-09T11:35:10.641137"
},
{
"role": "assistant",
"content": "我来帮你查一下月之暗面Kimi-k2模型控制思考模式的机制。让我搜索相关的技术文档和使用说明。",
"timestamp": "2025-11-09T11:35:13.140118"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:35:13.147109",
"tool_calls": [
{
"id": "web_search:0",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"月之暗面 kimi k2 模型 thinking 参数 思考模式 控制方法\", \"max_results\": 10}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: 月之暗面 kimi k2 模型 thinking 参数 思考模式 控制方法\\n📅 搜索时间: 2025-11-09T11:35:18.395158\\n🎯 过滤条件: Topic: general\\n\\n📝 AI摘要:\\nKimi K2 Thinking is a powerful open-source model with advanced reasoning and tool use capabilities. It can execute up to 300 consecutive tool calls without human intervention. It's designed to perform complex tasks through long-term planning and reasoning.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. 使用kimi-k2-thinking思考模型 - Moonshot AI 开放平台- 月之暗面\\n 🔗 https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\\n 📄 > `kimi-k2-thinking` 模型是月之暗面提供的具有通用 Agentic 能力和推理能力的思考模型,它擅长深度推理,并可通过多步工具调用,帮助解决各类难题。\\n\\n基本示例` 来判断是否存在字段,如果存在,则使用 `getattr(obj, \\\"reasoning_content\\\")` 获取字段值\\n 如果你使用其他框架或自行通过 HTTP 接口对接,可以直接获取与 `content` ...\\n\\n2. 开源新纪录月之暗面Kimi K2实测超越GPT-5和Claude4.5,完全免费\\n 🔗 https://www.huxiu.com/article/4802189.html\\n 📄 对于大多数研究和企业应用这一条款相当于一个轻量级的署名要求同时保留了标准MIT协议的自由度。\\n\\n这使得K2 Thinking成为目前可用的最宽松授权的前沿级模型之一。\\n\\n新的基准测试领跑者\\n\\nKimi K2 Thinking是一个基于万亿参数的混合专家MoE模型每次推理激活320亿参数。\\n\\n它将长程推理与结构化工具使用相结合能够在无需人工干预的情况下执行200-300次连续的工具调用。...\\n\\n3. 详解Kimi K2 Thinking谢谢DeepSeek我先去干掉GPT-5了 - 36氪\\n 🔗 https://m.36kr.com/p/3543851834322816\\n 📄 Raschka的分析指出了一个关键事实Kimi K2 Thinking对DeepSeek核心架构的“继承”是显而易见的包括MoE机制、MLA多头潜在注意力等设计。这是在一个已被验证的基座上根据自身的目标进行了针对性的调整和优化。例如减少注意力头和激活参数量旨在降低推理成本而增加专家数量和词汇表则是为了增强模型的知识容量和表达能力。这种“站在巨人肩膀上”的做法是开源精神最直接的体...\\n\\n4. 离GPT-5最近的一次中国1万亿参数开源巨兽突然爆火 - 智源社区\\n 🔗 https://hub.baai.ac.cn/view/50171\\n 📄 # 离GPT-5最近的一次中国1万亿参数开源巨兽突然爆火\\n\\nMultimodalAIDL\\n\\n新智元 2025-11-08 14:20 分享\\n\\n### \\n\\n编辑倾倾 桃子\\n\\n##### 【新智元导读】Kimi K2 Thinking重磅开源1万亿「思考Agent模型」在推理、智能体基准上干翻GPT-5。关键还能连调300次工具直出3D模拟。\\n\\n昨天月之暗面发布全新模型Kimi K2 Th...\\n\\n5. Kimi 迄今能力最强开源思考模型月之暗面Kimi K2 Thinking 发布\\n 🔗 https://www.ithome.com/0/895/484.htm\\n 📄 据介绍,该模型是月之暗面基于“模型即 Agent”理念训练的新一代 Thinking Agent原生掌握“边思考边使用工具”的能力。在人类最后的考试Humanity's Last Exam、自主网络浏览能力BrowseComp、复杂信息收集推理SEAL-0等多项基准测试中表现达到 SOTA 水平,并在 Agentic 搜索、Agentic 编程、写作和综合推理能力等方面取得全面提升...\\n\\n6. 国产之光Kimi K2 Thinking 原创 - CSDN博客\\n 🔗 https://blog.csdn.net/twelveai/article/details/154575275\\n 📄 本文介绍Kimi K2-0905端侧推理SDK的零代码部署方案帮助开发者解决边缘AI项目中的高延迟问题。内容涵盖SDK安装、模型加载与推理调用、批处理优化等延迟优化技巧以及性能监控方法。通过本地模型加载、缓存策略等优化手段可实现42%的延迟降低。该SDK提供零代码接口简化部署流程开发者可快速验证边缘设备上的推理效果。\\n\\nKimi-k2-thinking模型发布\\n\\n架构师刘7的博客\\n\\n11...\\n\\n7. 刚刚月之暗面发布Kimi K2 Thinking模型藏师傅首测教你用Kimi 编程 ...\\n 🔗 https://m.aitntnews.com/newDetail.html?newId=19581\\n 📄 `uv tool install --python 3.13 kimi-cli`\\n\\n安装后在终端打开你的项目目录输入 kimi 就可以启动 Kimi CLI 了。\\n\\n首次启动后会让你选登录方式选第一个然后输入刚才从 KFC 获取的 API Key 就行。\\n\\n进去之后按 Tab 键就可以启用思考,这时候用的就是 K2-Thinking 模型了Claude Code 启用思考也是一样的操作。\\n\\n当然...\\n\\n8. Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友 - 量子位\\n 🔗 https://www.qbitai.com/2025/11/349957.html\\n 📄 在人类最后的考试HLE、测试自主网络浏览能力的BrowseComp和复杂信息收集推理基准测试SEAL-0等评测基准中Kimi K2 Thinking都刷新了SOTA超越GPT-5、Claude Sonnet 4.5Thinking等闭源模型。\\n\\nKimi K2 Thinking的代码和权重均遵循最为宽松的MIT协议。新模型也已第一时间上线kimi.com和最新版Kimi手机应用即刻...\\n\\n9. Kimi K2 Thinking模型发布并开源该模型哪些信息值得关注 - 知乎\\n 🔗 https://www.zhihu.com/question/1969558404759544488\\n 📄 HLE 和IMO (确切说是这两天谷歌新出的IMO answerbench都刷到了SOTA而且是SOTA不是“国产SOTA ”。值得一提的是IMO这个新bench在kimi出来之前sota居然是Deepseek而不是北美模型\\n\\n不知道heavy模式会不会开放给用户订阅如果开放的话正好可以看看国内愿意每个月花1000左右买sota的用户到底有多少。\\n\\n这个公司估值只有openai的0.5...\\n\\n10. 
Kimi K2 快速开始 - Moonshot AI 开放平台\\n 🔗 https://platform.moonshot.cn/docs/guide/kimi-k2-quickstart\\n 📄 🚀 最新发布 kimi-k2-0905-preview 模型,上下文窗口扩展至 256K代码能力更强快来体验吧\\n\\n文档\\n\\n入门指南\\n\\nKimi K2 快速开始\\n\\n# Kimi K2 快速开始\\n\\n## \\n\\nKimi K2 是月之暗面推出的新一代具备自主行动 (agentic) 能力的旗舰模型。采用 1T 总参数量、32B 激活参数的 MoE 架构设计。该模型在代码编程 (AI coding) ...\", \"filters\": {\"topic\": \"general\"}, \"query\": \"月之暗面 kimi k2 模型 thinking 参数 思考模式 控制方法\", \"results\": [{\"index\": 1, \"title\": \"使用kimi-k2-thinking思考模型 - Moonshot AI 开放平台- 月之暗面\", \"url\": \"https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"> `kimi-k2-thinking` 模型是月之暗面提供的具有通用 Agentic 能力和推理能力的思考模型,它擅长深度推理,并可通过多步工具调用,帮助解决各类难题。\\n\\n基本示例` 来判断是否存在字段,如果存在,则使用 `getattr(obj, \\\"reasoning_content\\\")` 获取字段值\\n 如果你使用其他框架或自行通过 HTTP 接口对接,可以直接获取与 `content` 字段同级的 `reasoning_content` 字段\\n 在流式输出(`stream=True`)的场合,`reasoning_content` 字段一定会先于 `content` 字段出现,你可以在业务代码中通过判断是否出现 `content` 字段来识别思考内容(或称推理过程)是否结束\\n `reasoning_content` 中包含的 Tokens 也受 `max_tokens` 参数控制,`reasoning_content` 的 Tokens 数加上 `content` 的 Tokens 数应小于等于 `max_tokens` [...] 多步工具调用,模型会根据实际情况选择把必要的思考内容送到模型进行推理。\\n 设置 `max_tokens>=16000` 以避免无法输出完整的 `reasoning_content` 和 `content`。\\n 设置 `temperature=1.0`,以获得最佳性能。\\n 使用流式输出(`stream=True``kimi-k2-thinking` 模型的输出内容包含了 `reasoning_content`,相比普通模型其输出内容更多,启用流式输出能获得更好的用户体验,同时一定程度避免网络超时问题。 [...] Kimi K2 快速开始\\n 使用 kimi-k2-thinking 思考模型\\n 开始使用 Kimi API\\n 从 OpenAI 迁移到 Kimi API\\n 使用 API 调试工具\\n 使用 Kimi API 进行多轮对话\\n 使用 Vision 视觉模型\\n 自动选择 Kimi 模型\\n 自动断线重连\\n 使用 Stream 流式输出\\n 使用 Tool Calls\\n 使用联网搜索 Tool\\n 使用 JSON Mode\\n 使用 Partial Mode\\n 使用 Kimi API 进行文件问答\\n 使用开发工作台调试模型\\n 在 ClaudeCode/Cline/RooCode 中使用 kimi-k2 模型\\n 在开发工作台中配置 ModelScope MCP 服务器\\n 在 Kimi API 中使用官方工具\\n 使用 Kimi CLI 调用 Kimi 大模型\", \"score\": 0.8325776, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"开源新纪录月之暗面Kimi K2实测超越GPT-5和Claude4.5,完全免费\", \"url\": \"https://www.huxiu.com/article/4802189.html\", \"content\": \"对于大多数研究和企业应用这一条款相当于一个轻量级的署名要求同时保留了标准MIT协议的自由度。\\n\\n这使得K2 Thinking成为目前可用的最宽松授权的前沿级模型之一。\\n\\n新的基准测试领跑者\\n\\nKimi K2 Thinking是一个基于万亿参数的混合专家MoE模型每次推理激活320亿参数。\\n\\n它将长程推理与结构化工具使用相结合能够在无需人工干预的情况下执行200-300次连续的工具调用。\\n\\n性能数据一览\\n\\n根据月之暗面公布的测试结果K2 Thinking取得了\\n\\n 44.9%-Humanitys Last ExamHLE达到业界最先进水平\\n 60.2%-BrowseComp智能体网络搜索和推理测试\\n 71.3%-SWE-Bench Verified和83.1%-LiveCodeBench v6关键编程评估\\n 56.3%-Seal-0真实世界信息检索基准\\n \\n\\n在这些任务中K2 Thinking持续超越GPT-5的相应得分并超过了几周前MiniMax AI发布的前开源领跑者MiniMax-M2。\\n\\n开源模型超越专有系统 [...] M2和K2 Thinking的快速接连发布展示了开源研究追赶前沿系统的速度有多快。MiniMax-M2证明了开源模型可以以一小部分计算成本接近GPT-5级别的智能体能力。月之暗面现在将这一前沿推进得更远将开源权重从平衡推向了彻底领先。\\n\\n两个模型都依赖稀疏激活来提高效率但K2 Thinking更高的激活数320亿vs 100亿活跃参数在各个领域产生了更强的推理保真度。其测试时缩放——扩展\\\"思维tokens\\\"和工具调用轮次——提供了可测量的性能提升无需重新训练这是MiniMax-M2中尚未观察到的特性。\\n\\n技术展望\\n\\n月之暗面报告称K2 Thinking支持原生INT4推理和256k token上下文性能下降微乎其微。其架构集成了量化、并行轨迹聚合“重度模式”以及针对推理任务调优的混合专家路由。\\n\\n在实践中这些优化使K2 Thinking能够维持复杂的规划循环——代码编译-测试-修复、搜索-分析-总结——跨越数百次工具调用。这种能力支撑了它在BrowseComp和SWE-Bench上的优异表现而推理连续性在这些任务中至关重要。\\n\\n对AI生态系统的巨大影响 [...] 
K2 Thinking的核心能力在于其显式推理轨迹。模型输出一个辅助字段reasoning\\\\_content在每个最终响应之前揭示中间逻辑。这种透明性在长时间多轮任务和多步骤工具调用中保持了连贯性。\\n\\n月之暗面发布的参考实现演示了模型如何自主执行\\\"每日新闻报告\\\"工作流:调用日期和网络搜索工具、分析检索内容、生成结构化输出——同时保持内部推理状态。\\n\\n这种端到端的自主性使模型能够在数百个步骤中进行规划、搜索、执行和综合证据反映了正在崛起的\\\"智能体AI\\\"系统类别,这些系统以最少的监督运行。\\n\\n效率与访问成本\\n\\n尽管规模达到万亿参数K2 Thinking的运行成本保持适中。月之暗面列出的使用价格为\\n\\n $0.15/100万tokens缓存命中\\n $0.60/100万tokens缓存未命中\\n $2.50/100万tokens输出\\n \\n\\n这些价格甚至优于MiniMax-M2的$0.30输入/$1.20输出定价——比GPT-5$1.25输入/$10输出低了一个数量级。\\n\\n对比背景开源权重加速\", \"score\": 0.7732339, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"详解Kimi K2 Thinking谢谢DeepSeek我先去干掉GPT-5了 - 36氪\", \"url\": \"https://m.36kr.com/p/3543851834322816\", \"content\": \"Raschka的分析指出了一个关键事实Kimi K2 Thinking对DeepSeek核心架构的“继承”是显而易见的包括MoE机制、MLA多头潜在注意力等设计。这是在一个已被验证的基座上根据自身的目标进行了针对性的调整和优化。例如减少注意力头和激活参数量旨在降低推理成本而增加专家数量和词汇表则是为了增强模型的知识容量和表达能力。这种“站在巨人肩膀上”的做法是开源精神最直接的体现。\\n\\n除了对DeepSeek架构的继承Kimi K2 Thinking的成果也离不开对整个开源生态成果的广泛“化用”。从底层用于加速注意力计算的FlashAttention到K2技术报告中提到的、为解决训练不稳定性而改进的MuonClip优化器再到各种数据处理和后训练方法论都整合了开源社区的集体智慧。\\n\\n如果说架构和开源技术决定了模型的骨架那么让其血肉丰满的则是月之暗面自身的工程实现能力。这主要体现在三个方面 [...] Kimi K2 Thinking是基于 Kimi K2 模型训练而来的专注于提升Agentic能力和推理能力。这是一个拥有1万亿总参数的混合专家模型Mixture-of-Experts, MoE每次推理激活约320亿参数支持256k的上下文窗口并采用了原生INT4量化技术。设计思路是在保持巨大模型规模的同时想办法控制计算成本和训练成本。据CNBC援引知情人士的报道该模型的训练成本仅为460万美元。作为对比DeepSeek披露的 V3训练成本租赁价正式训练阶段是560 万美元R1为29.4万美元。这里主要考虑的是GPU预训练费用不包括研发、基础设施等投资。\\n\\nKimi K2 Thinking的一个核心特性是Agent能力官方宣称它能够连续执行200-300次工具调用来解决复杂问题。Grok-4等闭源阵营广泛采用RL提升工具使用与长程规划但在开源模型中看到如此的实现还是第一次。它表明开源社区正在快速跟上智能体技术的前沿同时也对模型托管服务提出了更高的要求。 [...] 1、训练稳定性在长达15.5万亿tokens的预训练过程中Kimi K2 Thinking实现了“零loss spike损失尖峰”。这意味着训练过程极其稳定无需因模型崩溃而进行成本高昂的回滚。这对于万亿参数规模的模型来说是一项重要的工程成就。\\n\\n2、原生量化推理Kimi K2 Thinking支持原生INT4量化推理据称能在极小的性能损失下将推理速度提升约2倍并显著降低部署所需的GPU显存。这是将大参数模型从实验室推向广泛应用的关键。\\n\\n3、长程任务执行模型能够稳定执行200-300轮工具调用这不仅考验了模型的推理能力也检验了其系统鲁棒性。在长达数百步的交互中模型必须能处理各种异常背后需要一套复杂的工程机制。\", \"score\": 0.76826453, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"离GPT-5最近的一次中国1万亿参数开源巨兽突然爆火 - 智源社区\", \"url\": \"https://hub.baai.ac.cn/view/50171\", \"content\": \"# 离GPT-5最近的一次中国1万亿参数开源巨兽突然爆火\\n\\nMultimodalAIDL\\n\\n新智元 2025-11-08 14:20 分享\\n\\n### \\n\\n编辑倾倾 桃子\\n\\n##### 【新智元导读】Kimi K2 Thinking重磅开源1万亿「思考Agent模型」在推理、智能体基准上干翻GPT-5。关键还能连调300次工具直出3D模拟。\\n\\n昨天月之暗面发布全新模型Kimi K2 Thinking一上线就挤爆了服务器。\\n\\n思考是它的核心卖点自称是开源的「思考Agent模型」。\\n\\n它同样采用了MoE架构总参数约1万亿每次激活约320亿上下文256K token。\\n\\n在各大基准测试中Kimi K2 Thinking性能表现亮眼。\\n\\n尤其是在BrowseComp、HLE测试中实力完全碾压GPT-5、Claude Sonnet 4.5。\\n\\n在Tau2 Bench Telecom基准测试中K2 Thinking位列第一。\\n\\n最关键的是在无人干预情况下K2 Thinking可连续调用200-300次工具。 [...] 这意味着K2 Thinking的测试成绩就是它在真实环境下的表现而非理想化打分。\\n\\n它目前已经在kimi.com上线并开源API和模型权重开发者可以直接试用。\\n\\n从实验室到真实场景这个模型的「思考能力」明显超过了现有的其他模型。\\n\\n智能体编码一流300次工具调用\\n\\n这一次月之暗面没再让模型停留在论文里。\\n\\nK2 Thinking不是展示品而是一台真正能被人用起来的智能体。\\n\\n发布当天团队同步上线了 kimi.com聊天模式、开放了API接口还在Hugging Face公布了完整权重。\\n\\n开发者不需要等待内测邀请也不用注册繁琐流程任何人都能直接使用。\\n\\nK2 Thinking的从训练开始到优化再到上线周期不到半年。\\n\\n在这个动辄以年为单位更新的大模型时代这个速度意味着它已具备完整的工程化能力。\\n\\n打开kimi.com就能直接体验到K2 Thinking的思考过程。\\n\\n与一般聊天模型不同它在生成答案前会清晰地展示自己的推理链。\\n\\n此外研究人员特别提到K2 Thinking在软件和编码任务上进步显著。 [...] 
国外研究者Nathan Lambert 称它为:「开源模型距闭源前沿最近的一次。」\\n\\n这句话在技术圈广为流传人们也开始重新审视这款模型。\\n\\n不只是聊天工具K2 Thinking更像是一个会自己推理、自己动手的智能体。\\n\\n一款真正会思考的模型\\n\\nKimi K2 Thinking没有强调算力更大而是强调更会「思考」。\\n\\n这些配置让它在处理长文本、复杂任务时能维持更稳定的推理过程。\\n\\n苹果大牛Awni Hannun测试后惊叹道\\n\\n1万亿参数只用2台M3 Ultra芯片的Mac电脑即可流畅运行而且int4压缩后性能几乎无损。\\n\\n通过mlx-lm并行技术它生成了大约3500个token速度每秒15个token。\\n\\n但真正让人关注的是它的「思考能力」。\\n\\n如前所述K2 Thinking可以在一次任务中连续执行200到300次工具调用全程无需人工干预。\\n\\n有网友实测「工具调用」立即制作出如下的数学和物理讲解动画。\\n\\n不同于其他模型的胡编乱造它在面对复杂问题时会自己拆解步骤、搜索信息、调用外部工具、再整合结果。\", \"score\": 0.7485182, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"Kimi 迄今能力最强开源思考模型月之暗面Kimi K2 Thinking 发布\", \"url\": \"https://www.ithome.com/0/895/484.htm\", \"content\": \"据介绍,该模型是月之暗面基于“模型即 Agent”理念训练的新一代 Thinking Agent原生掌握“边思考边使用工具”的能力。在人类最后的考试Humanity's Last Exam、自主网络浏览能力BrowseComp、复杂信息收集推理SEAL-0等多项基准测试中表现达到 SOTA 水平,并在 Agentic 搜索、Agentic 编程、写作和综合推理能力等方面取得全面提升。\\n\\n该模型无需人类干预即可自主实现高达 300 轮的工具调用和持续稳定的多轮思考能力,从而帮助用户解决更复杂的问题。\\n\\nIT之家附 Hugging Face、ModelScope 部署链接如下:\\n\\n Hugging Face\\n ModelScope\\n\\n“人类最后的考试”是一项涵盖 100 多个专业领域的终极封闭式学术测试。在允许使用工具 —— 搜索、Python、网络浏览工具的同等情况下Kimi K2 Thinking 在这项基准评测中取得了 44.9% 的 SOTA 成绩。 [...] 在官方提供的另一个例子中Kimi K2 Thinking 经过两轮搜索和思考先根据股票回购的已知信息找到了这家制造快艇的公司然后在美国证券交易委员会SEC的官网上找到了股票回购公告信息得出了准确的答案\\n\\nKimi K2 Thinking 模型的编码能力也得到了增强,在多语言软件工程基准 SWE-Multilingual、SWE-bench 验证集和 Terminal 终端使用等基准测试中的表现有了进一步提升。\\n\\n月之暗面表示Kimi K2 Thinking 的通用基础能力也得到了升级: [...] 官方提供的示例中Kimi K2 Thinking 经过 5 轮搜索和推理,结合每轮搜索到的新信息,层层深入,最终推理出了答案:\\n\\n根据介绍在复杂搜索和浏览场景中Kimi K2 Thinking 模型也表现出色。BrowseComp 是由 OpenAI 发布的一个专门评估 AI Agent 网络浏览能力的基准测试,这项测试的初衷是衡量 AI Agent 在信息过载环境中展现出的坚持性与创造力,即能否像人类研究员一样“刨根问底”。在这项极具挑战的任务上,人类平均只能达到 29.2% 的成绩。Kimi K2 Thinking 在这项基准测试中展现出极强的钻研能力,以 60.2% 的成绩成为新的 SOTA 模型。\\n\\n在长程规划和自主搜索能力的驱动下Kimi K2 Thinking 可借助多达上百轮的“思考 → 搜索 → 浏览网页 → 思考 → 编程”动态循环,持续地提出并完善假设、验证证据、进行推理,并构建出逻辑一致的答案。这种边主动搜索边持续思考的能力,使 Kimi K2 Thinking 能够将模糊且开放式的问题分解为清晰、可执行的子任务。\", \"score\": 0.72814995, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"国产之光Kimi K2 Thinking 原创 - CSDN博客\", \"url\": \"https://blog.csdn.net/twelveai/article/details/154575275\", \"content\": \"本文介绍Kimi K2-0905端侧推理SDK的零代码部署方案帮助开发者解决边缘AI项目中的高延迟问题。内容涵盖SDK安装、模型加载与推理调用、批处理优化等延迟优化技巧以及性能监控方法。通过本地模型加载、缓存策略等优化手段可实现42%的延迟降低。该SDK提供零代码接口简化部署流程开发者可快速验证边缘设备上的推理效果。\\n\\nKimi-k2-thinking模型发布\\n\\n架构师刘7的博客\\n\\n11-06\\n\\n641\\n\\n月之暗面发布Kimi-k2-thinking模型强化推理与Agent任务能力 摘要MoonshotAI正式推出Kimi-k2-thinking模型该256K长文本模型专为深度推理和多轮工具调用优化。基于万亿参数MoE架构具备MuonClip优化器、大规模Agentic数据合成等创新技术在代码生成、数学推理等场景表现优异。提供标准版和Turbo高速版两种计费方案支持API接入和多框架兼容。此次发布延续了月之暗面从长文本到智能体的技术演进路径以低成本高性价比对标行业顶级推理模型推动AI智能体应用 [...] 目前Kimi K2 Thinking 已经在 Kimi 官网上线直接通过聊天模式就能使用但月之暗面也表示为了确保快速、轻量级体验Kimi 官网的聊天模式下仅选择性使用部分工具,并减少工具调用轮次。因此,在 Kimi 官网上聊天体验可能无法复现基准测试成绩。\\n\\n完整智能体模式 将在近期更新,以展示 K2 Thinking 的全部能力。你也可以通过 Kimi K2 Thinking API 接入使用。\\n\\n作为 AI 编程博主,比较关注 Kimi K2 Thinking 模型的编程能力,先看下官方关于编程能力的介绍:\\n\\nK2 Thinking 在编程与软件开发任务上展现出显著的进步。\\n\\nSWE-Bench Verified71.3%\\n\\n这些成绩表明K2 Thinking 在不同编程语言与智能体框架agent scaffolds 间具备极强的泛化能力。\\n\\n在 HTML、React 及依赖大量组件的前端开发任务中,模型表现尤为突出,能够将抽象的想法转化为完整可运行、响应迅速的产品。 [...] 
近日互联网上一则来自Kimi员工的推文引发了AI领域的广泛关注。该员工在社交平台X上发文称“我嗅到了一个伟大模型的到来”这一神秘的言论迅速在评论区引发热议人们纷纷猜测究竟是什么样的模型即将问世。经过深入挖掘大家才发现原来是Kimi团队悄然发布了全新的K2思考模型。与以往不同的是这次K2模型的发布异常低调官方并未进行大规模的宣传推广仅仅是在官方API文档中默默地更新了模型名称。\\n此次\\n\\n大模型学习必藏Kimi K2 Thinking开源发布性能直逼GPT-5\\n\\nstar\\\\_nwe的博客\\n\\n11-07\\n\\n589\\n\\n月之暗面发布开源大模型Kimi K2 Thinking多项基准测试性能超越GPT-5等闭源模型。该模型擅长多轮工具调用和持续思考具备自主网络浏览等先进能力。通过INT4量化技术实现万亿参数低成本运行已采用宽松MIT许可证开源。这一突破或标志着国产大模型发展的新拐点。\\n\\ndify智能体-Kimi-K2+Mermaid 神器,一键生成系统架构图.yml\\n\\n07-27\", \"score\": 0.7208996, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"刚刚月之暗面发布Kimi K2 Thinking模型藏师傅首测教你用Kimi 编程 ...\", \"url\": \"https://m.aitntnews.com/newDetail.html?newId=19581\", \"content\": \"`uv tool install --python 3.13 kimi-cli`\\n\\n安装后在终端打开你的项目目录输入 kimi 就可以启动 Kimi CLI 了。\\n\\n首次启动后会让你选登录方式选第一个然后输入刚才从 KFC 获取的 API Key 就行。\\n\\n进去之后按 Tab 键就可以启用思考,这时候用的就是 K2-Thinking 模型了Claude Code 启用思考也是一样的操作。\\n\\n当然这个依旧可以在 Cluade Code 里面使用如果嫌配置环境变量麻烦的话用我的《ai-claude-start》项目就行创建的时候填写下面内容。\\n\\n`URL: kimi-k2-thinkingAPI Key: 你刚才在 Kimi For Coding 获取的 API`\\n\\n好了接下来就是模型测试了。\\n\\n我们这次的测试逻辑改一下尽量少的测试模型单次生成的能力多测试实际使用中频繁迭代修改以及发现问题的能力。\\n\\n首先测试连续编辑和生成的能力随着要求越来越多越来越复杂 K2-Thinking 会不会出现崩溃和冲突修改的问题。 [...] 约到后面需求是越来越难的,比如拖拽功能以及最后的移动端适配大改版,中间还伴随着增加功能导致的 UI 变化,但是 K2-Thinking 全部都是一次搞定。\\n\\n第三次的时候由于增加功能导致的 UI 问题,也在第四次修复了,有点顶。\\n\\n我也用其他模型试了一下这个测试。Claude 4.5 可以搞定,但是 Codex 就不行了。\\n\\n下面是 Codex CLI 第七轮修改的样子,非常的凄凉,这产品几乎不可用,让他整废了。\\n\\n然后是考验 Kimi CLI 搜索能力的一个测试,让他搜索 Linear 这个产品的首页设计风格然后写一个网页复刻这个产品网页的设计风格看一下有多像。\\n\\n重点是要他输出他的思考和决策过程看一下思考带来的优势。\\n\\n模仿 Linear 的动效设计风格制作英文网页\\n\\n要求\\n\\n1.先搜索了解 Linear (linear.app) 的设计特点:\\n\\n 深色主题为主\\n 流畅的过渡动画\\n 磨砂玻璃效果glassmorphism\\n 微妙的渐变和光效\\n 简洁的排版\\n 调研 Linear 使用的前端和动效库方案,并且参考\\n\\n2.创建一个产品功能展示页面,包含: [...] Hero 区域:大标题 + 副标题 + CTA 按钮\\n 功能卡片3-4个鼠标悬停有光效跟随\\n 实现滚动视差\\n 背景网格动画\\n\\n3.技术要求:\\n\\n TailwindCSS\\n 颜色方案参考 Linear\\n 动画要流畅60fps\\n\\n4.核心特效:\\n\\n 鼠标移动时背景有微妙的渐变跟随\\n 卡片悬停时有发光边框\\n 滚动时元素渐显和位移\\n 按钮悬停有磁吸效果\\n\\n请先简单描述你搜索到的 Linear 设计特点,然后实现,我要看到你的完整思考和决策过程。\\n\\n来看一下过程他真的在调用搜索工具搜索一些 Linear 风格的设计文章,然后总结了 Linear 的设计特点拓展了原来的提示词和要求。\\n\\n这里面最难的是那个磁吸按钮以及卡片上鼠标 Hover 的光效,也是之前 Linear 的设计核心,他都搞定了。\\n\\n我也用这个提示词在 Claude Code 里面让 Claude 4.5 尝试了,磁吸按钮动效以及鼠标 Hover 的渐变效果 Claude 4.5 都没有搞定。\", \"score\": 0.7156829, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友 - 量子位\", \"url\": \"https://www.qbitai.com/2025/11/349957.html\", \"content\": \"在人类最后的考试HLE、测试自主网络浏览能力的BrowseComp和复杂信息收集推理基准测试SEAL-0等评测基准中Kimi K2 Thinking都刷新了SOTA超越GPT-5、Claude Sonnet 4.5Thinking等闭源模型。\\n\\nKimi K2 Thinking的代码和权重均遵循最为宽松的MIT协议。新模型也已第一时间上线kimi.com和最新版Kimi手机应用即刻就能实测体验。API可通过Kimi开放平台访问。\\n\\n# 技术细节\\n\\n官方提到K2 Thinking是月之暗面在Test-Time Scaling测试时扩展领域的最新进展通过同时扩展思考Token和工具调用轮次模型实现了更强的Agent和推理性能。\\n\\n智能体、推理能力全面提升\\n\\n体现在测试上在人类最后的考试HLE允许使用工具——搜索、Python、网络浏览工具的同等情况下Kimi K2 Thinking取得了44.9%的SOTA成绩。\\n\\n官方还放出了一个通过23次推理和工具调用K2 Thinking成功解决博士级别数学问题的示例。 [...] # Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友再次缩小开源闭源差距\\n\\n鱼羊 2025-11-07 09:42:41 来源:量子位\\n\\n> 鱼羊 发自 凹非寺\\n>\\n> 量子位 | 公众号 QbitAI\\n\\nKimi K2 Thinking现已发布并开源\\n\\n主打一个“模型即Agent”不仅是Kimi“迄今能力最强的开源思考模型”还掌握边思考边使用工具的能力——\\n\\n无需人工干预即可执行200-300次连续工具调用。\\n\\n作为今年最受关注的开源模型系列之一Kimi K2的Thinking版本一上线就成为热议对象再次缩小了开源模型与闭源模型的差距。\\n\\n 1TB参数激活参数32B采用INT4而非FP8。\\n 256K上下文窗口。\\n 更多专家更少head更多思考。\\n\\n△与DeepSeek R1的架构对比图源x@rasbt [...] 官方的解释是思考模型会产生极长的解码长度常规的量化手段往往会导致模型性能大幅下降。为了克服这一挑战他们在后训练阶段采用了量化感知训练QAT并对MoE组件应用了INT4纯权重weight-only量化。\\n\\n这使得Kimi K2 Thinking能够在复杂推理和Agentic任务中支持原生的INT4推理并将生成速度提升了约2倍。\\n\\n以及INT4对推理硬件的兼容性更强对国产加速计算芯片更友好。\\n\\np.s. 
Blackwell之前的英伟达GPU不支持FP4。\\n\\n# 上手实测\\n\\n更多测试实例可以在官方技术博客中查看我们也第一时间简单测试了一波仅开启长思考模式未联网。\\n\\n经典题\\n\\n> 一根7米长的甘蔗如何通过1×2米的门\\n\\n思考了将近5分钟Kimi给出的回答是\\n\\n思考的时间是有点长但Kimi K2 Thinking成功绕过了这道题目中的陷阱想到了门的长宽其实并不会限制甘蔗的通过。\\n\\n编程方面我们测试的题目是\\n\\n> 编写一个Python程序让一个小球在旋转的六边形内弹跳小球运动遵循物理规律\", \"score\": 0.70944256, \"published_date\": \"\"}, {\"index\": 9, \"title\": \"Kimi K2 Thinking模型发布并开源该模型哪些信息值得关注 - 知乎\", \"url\": \"https://www.zhihu.com/question/1969558404759544488\", \"content\": \"HLE 和IMO (确切说是这两天谷歌新出的IMO answerbench都刷到了SOTA而且是SOTA不是“国产SOTA ”。值得一提的是IMO这个新bench在kimi出来之前sota居然是Deepseek而不是北美模型\\n\\n不知道heavy模式会不会开放给用户订阅如果开放的话正好可以看看国内愿意每个月花1000左右买sota的用户到底有多少。\\n\\n这个公司估值只有openai的0.5%anthropic和grok的2%,朱啸虎还觉得有“泡沫”……\\n\\n对了QAT和DSA是互相独立可以叠加的可以降低成本的优化这波又把模型的cost降低了5倍左右。\\n\\n编辑于 2025-11-07 10:08\\n\\n赞同 13129 条评论\\n\\n分享\\n\\n收藏喜欢\\n\\nImage 7: 刘少伟\\n\\n刘少伟\\n\\nImage 8\\n\\n互联网行业 从业人员\\n\\n 关注\\n\\n谢邀 @知乎科技\\n\\n利益相关月之暗面infra团队工程师一枚深度参与了这次thinking模型int4量化方面的工作。 [...] Token控制K2T 其实在一些推理CoT中明显有进行控制的痕迹如#4 题就大量使用中文简写内容几乎不是给人阅读的。但奈何整体思考效率依然不高很多复杂问题偏向暴力解验算次数也偏多导致最终Token开销显著高于Grok 4是目前第二高。 [...] 指令遵循K2T 的指令遵循能力略差于Grok 4例如在简单指令#20 桌游模拟,复杂指令#40 代码推导得分均落后于Grok 4最低分与K2 相当。但也有一些惊喜,如在#30 日记整理题中题目要求了输出字数其他模型通常只预估一个概数很少能严格遵守而K2T 在思维链中严格的逐字计算了总字数,最终字符与要求只差 6 个字。如果提高字数要求K2T 这套方法自然是行不通的会导致Token超长此处仅用来说明其遵守能力。\\n 编程基本功在之前K2 的测试中提到K2 虽然编程基本功不足但具备在多轮环境中稳定改善的能力。K2T 在编程方面并没有相应的强化表现仍与K2 相当。或许是Kimi团队认为在Agent工作模式下知错能改的重要性大于不犯错。\\n 特殊字符在80%以上的输出文本中都混入了NBSP特殊字符而这在K2 基础模型中并不存在。合理怀疑是Kimi在后训练阶段故意加入用作某种标记。此外K2T 也有小概率输出英文,这属于国模的常见病症,不做讨论。\", \"score\": 0.6501347, \"published_date\": \"\"}, {\"index\": 10, \"title\": \"Kimi K2 快速开始 - Moonshot AI 开放平台\", \"url\": \"https://platform.moonshot.cn/docs/guide/kimi-k2-quickstart\", \"content\": \"🚀 最新发布 kimi-k2-0905-preview 模型,上下文窗口扩展至 256K代码能力更强快来体验吧\\n\\n文档\\n\\n入门指南\\n\\nKimi K2 快速开始\\n\\n# Kimi K2 快速开始\\n\\n## \\n\\nKimi K2 是月之暗面推出的新一代具备自主行动 (agentic) 能力的旗舰模型。采用 1T 总参数量、32B 激活参数的 MoE 架构设计。该模型在代码编程 (AI coding) 和智能体 (Agent) 搭建两大领域表现卓越。技术报告 (opens in a new tab)\\n\\n### \\n\\n 国内领先Kimi K2 是目前国内表现最优秀的 Coding 模型之一\\n 全栈支持:从前端到后端,从代码生成到开发运维、调试优化,全方位覆盖真实编程场景\\n 效率倍增:官方提供联网搜索等十余款开箱即用工具,配合精准的 tool call 调用能力,显著提升编程效率\\n\\n###\", \"score\": 0.6461259, \"published_date\": \"\"}], \"total_results\": 10}",
"timestamp": "2025-11-09T11:35:18.404611",
"tool_call_id": "web_search:0",
"name": "web_search"
},
{
"role": "assistant",
"content": "让我查看更详细的技术文档,特别是关于模型调用的具体参数说明。",
"timestamp": "2025-11-09T11:35:21.433885"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:35:21.440638",
"tool_calls": [
{
"id": "extract_webpage:1",
"index": 0,
"type": "function",
"function": {
"name": "extract_webpage",
"arguments": "{\"url\": \"https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\"}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"url\": \"https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"🌐 网页内容 (14645 字符):\\n📍 URL: https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model\\n==================================================\\n使用kimi-k2-thinking思考模型 - Moonshot AI 开放平台 - Kimi 大模型 API 服务\\n\\n===============\\n\\n🎉 最新发布 kimi-k2-thinking 模型,支持多步工具调用与思考,擅长解决更复杂的问题,快来体验吧!\\n\\n[![Image 1: Logo](https://platform.moonshot.cn/lightmode.svg)](https://platform.moonshot.cn/)[联系销售](https://platform.moonshot.cn/contact-sales)[Blog](https://platform.moonshot.cn/blog)[文档](https://platform.moonshot.cn/docs/overview)[开发工作台](https://platform.moonshot.cn/playground)[用户中心](https://platform.moonshot.cn/console)\\n\\n⌘K\\n\\n⌘K\\n\\n* [欢迎](https://platform.moonshot.cn/docs/overview)\\n* [使用手册](https://platform.moonshot.cn/docs/introduction)\\n* API 接口说明\\n\\n * [Chat](https://platform.moonshot.cn/docs/api/chat)\\n * [Tool Use](https://platform.moonshot.cn/docs/api/tool-use)\\n * [Partial Mode](https://platform.moonshot.cn/docs/api/partial)\\n * [文件接口](https://platform.moonshot.cn/docs/api/files)\\n * [计算 Token](https://platform.moonshot.cn/docs/api/estimate)\\n * [查询余额](https://platform.moonshot.cn/docs/api/balance)\\n\\n* 产品定价\\n\\n * [模型推理定价](https://platform.moonshot.cn/docs/pricing/chat)\\n * [联网搜索定价](https://platform.moonshot.cn/docs/pricing/tools)\\n * [充值与限速](https://platform.moonshot.cn/docs/pricing/limits)\\n * [常见问题](https://platform.moonshot.cn/docs/pricing/faq)\\n\\n* 入门指南\\n\\n * [Kimi K2 快速开始](https://platform.moonshot.cn/docs/guide/kimi-k2-quickstart)\\n * [使用 kimi-k2-thinking 思考模型](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model)\\n * [开始使用 Kimi API](https://platform.moonshot.cn/docs/guide/start-using-kimi-api)\\n * [从 OpenAI 迁移到 Kimi API](https://platform.moonshot.cn/docs/guide/migrating-from-openai-to-kimi)\\n * [使用 API 调试工具](https://platform.moonshot.cn/docs/guide/use-moonpalace)\\n * [使用 Kimi API 进行多轮对话](https://platform.moonshot.cn/docs/guide/engage-in-multi-turn-conversations-using-kimi-api)\\n * [使用 Vision 视觉模型](https://platform.moonshot.cn/docs/guide/use-kimi-vision-model)\\n * [自动选择 Kimi 模型](https://platform.moonshot.cn/docs/guide/choose-an-appropriate-kimi-model)\\n * [自动断线重连](https://platform.moonshot.cn/docs/guide/auto-reconnect)\\n * [使用 Stream 流式输出](https://platform.moonshot.cn/docs/guide/utilize-the-streaming-output-feature-of-kimi-api)\\n * [使用 Tool Calls](https://platform.moonshot.cn/docs/guide/use-kimi-api-to-complete-tool-calls)\\n * [使用联网搜索 Tool](https://platform.moonshot.cn/docs/guide/use-web-search)\\n * [使用 JSON Mode](https://platform.moonshot.cn/docs/guide/use-json-mode-feature-of-kimi-api)\\n * [使用 Partial Mode](https://platform.moonshot.cn/docs/guide/use-partial-mode-feature-of-kimi-api)\\n * [使用 Kimi API 进行文件问答](https://platform.moonshot.cn/docs/guide/use-kimi-api-for-file-based-qa)\\n * [使用开发工作台调试模型](https://platform.moonshot.cn/docs/guide/use-playground-to-debug-the-model)\\n * [在 ClaudeCode/Cline/RooCode 中使用 kimi-k2 模型](https://platform.moonshot.cn/docs/guide/agent-support)\\n * [在开发工作台中配置 ModelScope MCP 服务器](https://platform.moonshot.cn/docs/guide/configure-the-modelscope-mcp-server)\\n * [在 Kimi API 中使用官方工具](https://platform.moonshot.cn/docs/guide/use-official-tools)\\n * [使用 Kimi CLI 调用 Kimi 大模型](https://platform.moonshot.cn/docs/guide/kimi-cli-support)\\n * [Prompt 最佳实践](https://platform.moonshot.cn/docs/guide/prompt-best-practice)\\n * 
[组织管理最佳实践](https://platform.moonshot.cn/docs/guide/org-best-practice)\\n * [常见问题及解决方案](https://platform.moonshot.cn/docs/guide/faq)\\n\\n* 条款与协议\\n\\n * [平台服务协议](https://platform.moonshot.cn/docs/agreement/modeluse)\\n * [用户服务协议](https://platform.moonshot.cn/docs/agreement/userservice)\\n * [用户隐私协议](https://platform.moonshot.cn/docs/agreement/userprivacy)\\n * [充值协议](https://platform.moonshot.cn/docs/agreement/payment)\\n\\n* [Moonshot ↗ (opens in a new tab)](https://www.moonshot.cn/)\\n* [Changelog ↗ (opens in a new tab)](https://platform.moonshot.cn/blog/posts/changelog)\\n* [联系客服](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model) \\n* [开发者交流群](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model) \\n* [官方公众号](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model) \\n* [Global | platform.moonshot.ai↗ (opens in a new tab)](https://platform.moonshot.ai/)\\n\\n目录\\n\\n* [基本示例](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%9F%BA%E6%9C%AC%E7%A4%BA%E4%BE%8B)\\n* [输出思考内容](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E8%BE%93%E5%87%BA%E6%80%9D%E8%80%83%E5%86%85%E5%AE%B9)\\n* [多步工具调用](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%A4%9A%E6%AD%A5%E5%B7%A5%E5%85%B7%E8%B0%83%E7%94%A8)\\n* [使用须知](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E4%BD%BF%E7%94%A8%E9%A1%BB%E7%9F%A5)\\n* [完整示例](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%AE%8C%E6%95%B4%E7%A4%BA%E4%BE%8B)\\n* [常见问题](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)\\n* [Q1: 为什么需要保留 reasoning_content](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#q1-%E4%B8%BA%E4%BB%80%E4%B9%88%E9%9C%80%E8%A6%81%E4%BF%9D%E7%95%99-reasoning_content)\\n* [Q2: reasoning_content 会消耗额外的 token 吗?](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#q2-reasoning_content-%E4%BC%9A%E6%B6%88%E8%80%97%E9%A2%9D%E5%A4%96%E7%9A%84-token-%E5%90%97)\\n\\n回到顶部\\n\\n文档\\n\\n入门指南\\n\\n使用 kimi-k2-thinking 思考模型\\n\\n使用kimi-k2-thinking思考模型\\n======================\\n\\n> `kimi-k2-thinking` 模型是月之暗面提供的具有通用 Agentic 能力和推理能力的思考模型,它擅长深度推理,并可通过多步工具调用,帮助解决各类难题。\\n\\n基本示例[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%9F%BA%E6%9C%AC%E7%A4%BA%E4%BE%8B)\\n---------------------------------------------------------------------------------------------------------------\\n\\n你可以简单地通过更换 `model` 来使用它:\\n\\ncurl python\\n\\n```\\n$ curl https://api.moonshot.cn/v1/chat/completions \\\\\\n -H \\\"Content-Type: application/json\\\" \\\\\\n -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\\\n -d '{\\n \\\"model\\\": \\\"kimi-k2-thinking\\\",\\n \\\"messages\\\": [\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"你好\\\"}\\n ],\\n \\\"temperature\\\": 1.0\\n \\n }'\\n```\\n\\n输出思考内容[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E8%BE%93%E5%87%BA%E6%80%9D%E8%80%83%E5%86%85%E5%AE%B9)\\n-----------------------------------------------------------------------------------------------------------------------------------\\n\\n注意到在使用 `kimi-k2-thinking` 模型时,我们的 API 响应中使用了 `reasoning_content` 字段作为模型思考内容的载体,对于 `reasoning_content` 字段:\\n\\n* openai SDK 中的 `ChoiceDelta` 和 `ChatCompletionMessage` 类型并不提供 `reasoning_content` 字段,因此无法直接通过 `.reasoning_content` 的方式访问该字段,仅支持通过 `hasattr(obj, \\\"reasoning_content\\\")` 来判断是否存在字段,如果存在,则使用 `getattr(obj, \\\"reasoning_content\\\")` 获取字段值\\n* 如果你使用其他框架或自行通过 HTTP 
接口对接,可以直接获取与 `content` 字段同级的 `reasoning_content` 字段\\n* 在流式输出(`stream=True`)的场合,`reasoning_content` 字段一定会先于 `content` 字段出现,你可以在业务代码中通过判断是否出现 `content` 字段来识别思考内容(或称推理过程)是否结束\\n* `reasoning_content` 中包含的 Tokens 也受 `max_tokens` 参数控制,`reasoning_content` 的 Tokens 数加上 `content` 的 Tokens 数应小于等于 `max_tokens`\\n\\n多步工具调用[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%A4%9A%E6%AD%A5%E5%B7%A5%E5%85%B7%E8%B0%83%E7%94%A8)\\n-----------------------------------------------------------------------------------------------------------------------------------\\n\\n使用 `kimi-k2-thinking` 支持通过深度地推理进行多次工具调用,进而完成非常复杂的任务。\\n\\n### 使用须知[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E4%BD%BF%E7%94%A8%E9%A1%BB%E7%9F%A5)\\n\\n为确保最佳效果请务必按以下方式调用 kimi-k2-thinking 模型:\\n\\n* 输入应当包括上下文中所有的思考内容(reasoning_content字段),模型会根据实际情况选择把必要的思考内容送到模型进行推理。\\n* 设置 `max_tokens>=16000` 以避免无法输出完整的 `reasoning_content` 和 `content`。\\n* **设置 `temperature=1.0`,以获得最佳性能。**\\n* 使用流式输出(`stream=True``kimi-k2-thinking` 模型的输出内容包含了 `reasoning_content`,相比普通模型其输出内容更多,启用流式输出能获得更好的用户体验,同时一定程度避免网络超时问题。\\n\\n### 完整示例[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%AE%8C%E6%95%B4%E7%A4%BA%E4%BE%8B)\\n\\n下面的示例展示了一个\\\"今日新闻报告生成\\\"的场景,模型会依次调用 `date`(获取日期)和 `web_search`(搜索今日新闻)等官方工具,并在这个过程中展现深度思考过程:\\n\\n```\\nimport os\\nimport json\\nimport httpx\\nimport openai\\n \\n \\nclass FormulaChatClient:\\n def __init__(self, base_url: str, api_key: str):\\n \\\"\\\"\\\"初始化 Formula 客户端\\\"\\\"\\\"\\n self.base_url = base_url\\n self.api_key = api_key\\n self.openai = openai.Client(\\n base_url=base_url,\\n api_key=api_key,\\n )\\n self.httpx = httpx.Client(\\n base_url=base_url,\\n headers={\\\"Authorization\\\": f\\\"Bearer {api_key}\\\"},\\n timeout=30.0,\\n )\\n self.model = \\\"kimi-k2-thinking\\\"\\n \\n def get_tools(self, formula_uri: str):\\n \\\"\\\"\\\"从 Formula API 获取工具定义\\\"\\\"\\\"\\n response = self.httpx.get(f\\\"/formulas/{formula_uri}/tools\\\")\\n response.raise_for_status() # 检查 HTTP 状态码\\n \\n try:\\n return response.json().get(\\\"tools\\\", [])\\n except json.JSONDecodeError as e:\\n print(f\\\"错误: 无法解析响应为 JSON (状态码: {response.status_code})\\\")\\n print(f\\\"响应内容: {response.text[:500]}\\\")\\n raise\\n \\n def call_tool(self, formula_uri: str, function: str, args: dict):\\n \\\"\\\"\\\"调用官方工具\\\"\\\"\\\"\\n response = self.httpx.post(\\n f\\\"/formulas/{formula_uri}/fibers\\\",\\n json={\\\"name\\\": function, \\\"arguments\\\": json.dumps(args)},\\n )\\n response.raise_for_status() # 检查 HTTP 状态码\\n fiber = response.json()\\n \\n if fiber.get(\\\"status\\\", \\\"\\\") == \\\"succeeded\\\":\\n return fiber[\\\"context\\\"].get(\\\"output\\\") or fiber[\\\"context\\\"].get(\\\"encrypted_output\\\")\\n \\n if \\\"error\\\" in fiber:\\n return f\\\"Error: {fiber['error']}\\\"\\n if \\\"error\\\" in fiber.get(\\\"context\\\", {}):\\n return f\\\"Error: {fiber['context']['error']}\\\"\\n return \\\"Error: Unknown error\\\"\\n \\n def close(self):\\n \\\"\\\"\\\"关闭客户端连接\\\"\\\"\\\"\\n self.httpx.close()\\n \\n \\n# 初始化客户端\\nbase_url = os.getenv(\\\"MOONSHOT_BASE_URL\\\", \\\"https://api.moonshot.cn/v1\\\")\\napi_key = os.getenv(\\\"MOONSHOT_API_KEY\\\")\\n \\nif not api_key:\\n raise ValueError(\\\"MOONSHOT_API_KEY 环境变量未设置,请先设置 API 密钥\\\")\\n \\nprint(f\\\"Base URL: {base_url}\\\")\\nprint(f\\\"API Key: {api_key[:10]}...{api_key[-10:] if len(api_key) > 20 else api_key}\\\\n\\\")\\n \\nclient = FormulaChatClient(base_url, api_key)\\n \\n# 定义要使用的官方工具 Formula URI\\nformula_uris = 
[\\n \\\"moonshot/date:latest\\\",\\n \\\"moonshot/web-search:latest\\\"\\n]\\n \\n# 加载所有工具定义并建立映射\\nprint(\\\"正在加载官方工具...\\\")\\nall_tools = []\\ntool_to_uri = {} # function.name -> formula_uri 的映射\\n \\nfor uri in formula_uris:\\n try:\\n tools = client.get_tools(uri)\\n for tool in tools:\\n func = tool.get(\\\"function\\\")\\n if func:\\n func_name = func.get(\\\"name\\\")\\n if func_name:\\n tool_to_uri[func_name] = uri\\n all_tools.append(tool)\\n print(f\\\" 已加载工具: {func_name} from {uri}\\\")\\n except Exception as e:\\n print(f\\\" 警告: 加载工具 {uri} 失败: {e}\\\")\\n continue\\n \\nprint(f\\\"总共加载 {len(all_tools)} 个工具\\\\n\\\")\\n \\nif not all_tools:\\n raise ValueError(\\\"未能加载任何工具,请检查 API 密钥和网络连接\\\")\\n \\n# 初始化消息列表\\nmessages = [\\n {\\n \\\"role\\\": \\\"system\\\",\\n \\\"content\\\": \\\"你是 Kimi一个专业的新闻分析师。你擅长收集、分析和整理信息生成高质量的新闻报告。\\\",\\n },\\n]\\n \\n# 用户请求生成今日新闻报告\\nuser_request = \\\"请帮我生成一份今日新闻报告,包含重要的科技、经济和社会新闻。\\\"\\nmessages.append({\\n \\\"role\\\": \\\"user\\\",\\n \\\"content\\\": user_request\\n})\\n \\nprint(f\\\"用户请求: {user_request}\\\\n\\\")\\n \\n# 开始多轮对话循环\\nmax_iterations = 10 # 防止无限循环\\nfor iteration in range(max_iterations):\\n # 调用模型\\n try:\\n completion = client.openai.chat.completions.create(\\n model=client.model,\\n messages=messages,\\n max_tokens=1024 * 32,\\n tools=all_tools,\\n temperature=1.0,\\n )\\n except openai.AuthenticationError as e:\\n print(f\\\"认证错误: {e}\\\")\\n print(\\\"请检查 API key 是否正确,以及 API key 是否有权限访问该端点\\\")\\n raise\\n except Exception as e:\\n print(f\\\"调用模型时发生错误: {e}\\\")\\n raise\\n \\n # 获取响应\\n message = completion.choices[0].message\\n \\n # 打印思考过程\\n if hasattr(message, \\\"reasoning_content\\\"):\\n print(f\\\"=============第 {iteration + 1} 轮思考开始=============\\\")\\n reasoning = getattr(message, \\\"reasoning_content\\\")\\n if reasoning:\\n print(reasoning[:500] + \\\"...\\\" if len(reasoning) > 500 else reasoning)\\n print(f\\\"=============第 {iteration + 1} 轮思考结束=============\\\\n\\\")\\n \\n # 添加 assistant 消息到上下文(保留 reasoning_content\\n messages.append(message)\\n \\n # 如果模型没有调用工具,说明对话结束\\n if not message.tool_calls:\\n print(\\\"=============最终回答=============\\\")\\n print(message.content)\\n break\\n \\n # 处理工具调用\\n print(f\\\"模型决定调用 {len(message.tool_calls)} 个工具:\\\\n\\\")\\n \\n for tool_call in message.tool_calls:\\n func_name = tool_call.function.name\\n args = json.loads(tool_call.function.arguments)\\n \\n print(f\\\"调用工具: {func_name}\\\")\\n print(f\\\"参数: {json.dumps(args, ensure_ascii=False, indent=2)}\\\")\\n \\n # 获取对应的 formula_uri\\n formula_uri = tool_to_uri.get(func_name)\\n if not formula_uri:\\n print(f\\\"错误: 找不到工具 {func_name} 对应的 Formula URI\\\")\\n continue\\n \\n # 调用工具\\n result = client.call_tool(formula_uri, func_name, args)\\n \\n # 打印结果(截断过长内容)\\n if len(str(result)) > 200:\\n print(f\\\"工具结果: {str(result)[:200]}...\\\\n\\\")\\n else:\\n print(f\\\"工具结果: {result}\\\\n\\\")\\n \\n # 添加工具结果到消息列表\\n tool_message = {\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": func_name,\\n \\\"content\\\": result\\n }\\n messages.append(tool_message)\\n \\nprint(\\\"\\\\n对话完成\\\")\\n \\n# 清理资源\\nclient.close()\\n```\\n\\n整个过程展现了 `kimi-k2-thinking` 模型如何通过深度思考来规划和执行复杂的多步骤任务,每个步骤都有完整的推理过程(`reasoning_content`),并且思考内容会保留在上下文中以确保工具调用的准确性。\\n\\n常见问题[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98)\\n---------------------------------------------------------------------------------------------------------------\\n\\n### 
Q1: 为什么需要保留 reasoning_content[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#q1-%E4%B8%BA%E4%BB%80%E4%B9%88%E9%9C%80%E8%A6%81%E4%BF%9D%E7%95%99-reasoning_content)\\n\\nA: 保留 reasoning_content 可以确保模型在多轮对话中保持推理的连贯性,特别是在工具调用过程中。服务器会自动处理这些字段,用户无需手动管理。\\n\\n### Q2: reasoning_content 会消耗额外的 token 吗?[](https://platform.moonshot.cn/docs/guide/use-kimi-k2-thinking-model#q2-reasoning_content-%E4%BC%9A%E6%B6%88%E8%80%97%E9%A2%9D%E5%A4%96%E7%9A%84-token-%E5%90%97)\\n\\nA: 是的reasoning_content 会计入输入/输出 token 消耗。具体计费方式请参考 MoonshotAI 的定价文档。\\n\\nLast updated on 2025年11月6日\\n\\n[Kimi K2 快速开始](https://platform.moonshot.cn/docs/guide/kimi-k2-quickstart \\\"Kimi K2 快速开始\\\")[开始使用 Kimi API](https://platform.moonshot.cn/docs/guide/start-using-kimi-api \\\"开始使用 Kimi API\\\")\\n==================================================\"}",
"timestamp": "2025-11-09T11:35:23.249841",
"tool_call_id": "extract_webpage:1",
"name": "extract_webpage"
},
{
"role": "assistant",
"content": "让我再搜索一下是否有通过参数控制思考模式的相关信息。",
"timestamp": "2025-11-09T11:35:25.860543"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:35:25.866058",
"tool_calls": [
{
"id": "web_search:2",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"kimi k2 thinking 参数 control thinking parameter api moonshot\", \"max_results\": 8}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: kimi k2 thinking 参数 control thinking parameter api moonshot\\n📅 搜索时间: 2025-11-09T11:35:29.518123\\n🎯 过滤条件: Topic: general\\n\\n📝 AI摘要:\\nKimi K2 Thinking is Moonshot AI's advanced model with 32 billion activated parameters, supporting complex reasoning and tool use. It uses a Mixture-of-Experts architecture and has a 256k context length. The model is optimized for long-horizon reasoning and dynamic tool invocation.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. Kimi K2 Thinking - API, Providers, Stats - OpenRouter\\n 🔗 https://openrouter.ai/moonshotai/kimi-k2-thinking\\n 📄 Kimi K2 Thinking is Moonshot AIs most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) arc...\\n\\n2. Moonshot AI Open Platform - Kimi Large Language Model API Service\\n 🔗 https://platform.moonshot.ai/\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nWelcome\\n\\n# Welcome to Kimi API Docs\\n\\n# Kimi K2 Model\\n\\nA MoE architecture base model with powe...\\n\\n3. moonshotai/Kimi-K2-Thinking\\n 🔗 https://huggingface.co/moonshotai/Kimi-K2-Thinking\\n 📄 Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. I...\\n\\n4. Kimi K2 is the large language model series developed by Moonshot ...\\n 🔗 https://github.com/MoonshotAI/Kimi-K2\\n 📄 Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptiona...\\n\\n5. moonshotai/Kimi-K2-Instruct - Hugging Face\\n 🔗 https://huggingface.co/moonshotai/Kimi-K2-Instruct\\n 📄 Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptiona...\\n\\n6. Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友\\n 🔗 https://zhuanlan.zhihu.com/p/1970062429741385218\\n 📄 作为今年最受关注的开源模型系列之一Kimi K2的Thinking版本一上线就成为热议对象再次缩小了开源模型与闭源模型的差距。\\n\\nImage 6\\n\\n更多技术细节速览在此\\n\\n 1TB参数激活参数32B采用INT4而非FP8。\\n 256K上下文窗口。\\n 更多专家更少head更多思考。\\n\\nImage 7\\n\\n△与DeepSeek R1的架构对比图源x@rasbt\\n\\n在人类最后...\\n\\n7. Kimi K2: Open Agentic Intelligence\\n 🔗 https://moonshotai.github.io/Kimi-K2/\\n 📄 The Kimi Platform offers an OpenAI/Anthropic compatible interface, allowing for easy adaptation of your existing applications to Kimi K2. We encourage developers to explore our tool calling API for bu...\\n\\n8. Moonshot AI\\n 🔗 https://www.moonshot.ai/\\n 📄 Image 5Kimi Kimi is driven by Moonshot's proprietary LLM. It supports online search, deep thinking, multimodal reasoning, and long-form text conversations. Chat with KimiKimi Open Platform Kimi Open P...\", \"filters\": {\"topic\": \"general\"}, \"query\": \"kimi k2 thinking 参数 control thinking parameter api moonshot\", \"results\": [{\"index\": 1, \"title\": \"Kimi K2 Thinking - API, Providers, Stats - OpenRouter\", \"url\": \"https://openrouter.ai/moonshotai/kimi-k2-thinking\", \"content\": \"Kimi K2 Thinking is Moonshot AIs most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. 
Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports 256 k-token context windows. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves [...] # MoonshotAI: Kimi K2 Thinking\\n\\n### moonshotai/kimi-k2-thinking\\n\\nCreated Nov 6, 2025262,144 context\\n\\n$0.60/M input tokens$2.50/M output tokens [...] For information about using third-party SDKs and frameworks with OpenRouter, please see ourframeworks documentation.\\n\\nSee the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.\", \"score\": 0.85538095, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"Moonshot AI Open Platform - Kimi Large Language Model API Service\", \"url\": \"https://platform.moonshot.ai/\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nWelcome\\n\\n# Welcome to Kimi API Docs\\n\\n# Kimi K2 Model\\n\\nA MoE architecture base model with powerful code and Agent capabilities, featuring 1T total parameters with 32B activated parameters and up to 256k context length. In benchmark tests for general knowledge reasoning, programming, mathematics, and Agent capabilities, K2 model outperforms other mainstream open-source models\\n\\n## Quick Start [...] Getting Started with Kimi K2\\n\\nKimi K2 quick start guide and basic conceptsConsole\\n\\nManage API keys, usage and billingUsing K2 Model in Software Agents\\n\\nExperience K2 model's code writing and tool calling capabilitiesPricing & Billing\\n\\nLearn about model pricing and billing methodsRate Limits\\n\\nLearn about API rate limits and quotasFAQ\\n\\nFind answers to common questions\\n\\n## Models and Pricing\\n\\nView all\\n\\nkimi-k2-turbo-preview\\n\\npreviewkimi-k2-thinking\\n\\nlong thinkkimi-latest\\n\\nImage understanding [...] ## More Resources\\n\\nK2 Model Details\\n\\nLearn about K2 model features and advantagesDevelopment Workbench\\n\\nInteractive interface to experience Kimi model capabilitiesWeb Search Tool\\n\\nUse Kimi's official web search toolBest Practices\\n\\nPrompt writing and application examples\\n\\nUser Manual\", \"score\": 0.75442725, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"moonshotai/Kimi-K2-Thinking\", \"url\": \"https://huggingface.co/moonshotai/Kimi-K2-Thinking\", \"content\": \"Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. It sets a new state-of-the-art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by dramatically scaling multi-step reasoning depth and maintaining stable tool-use across 200300 sequential calls. At the same time, K2 Thinking is a native INT4 quantization model with 256k context [...] 
| | |\\n --- |\\n| Architecture | Mixture-of-Experts (MoE) |\\n| Total Parameters | 1T |\\n| Activated Parameters | 32B |\\n| Number of Layers (Dense layer included) | 61 |\\n| Number of Dense Layers | 1 |\\n| Attention Hidden Dimension | 7168 |\\n| MoE Hidden Dimension (per Expert) | 2048 |\\n| Number of Attention Heads | 64 |\\n| Number of Experts | 384 |\\n| Selected Experts per Token | 8 |\\n| Number of Shared Experts | 1 |\\n| Vocabulary Size | 160K |\\n| Context Length | 256K |\\n| Attention Mechanism | MLA | [...] messages.append({\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": tool_call_name,\\n \\\"content\\\": json.dumps(tool_result)\\n })\\n print(\\\"-\\\" 100)\\n print(choice.message.content)\", \"score\": 0.69870013, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"Kimi K2 is the large language model series developed by Moonshot ...\", \"url\": \"https://github.com/MoonshotAI/Kimi-K2\", \"content\": \"Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.\\n\\n### Key Features [...] |\\n| |\\n| Architecture | Mixture-of-Experts (MoE) |\\n| Total Parameters | 1T |\\n| Activated Parameters | 32B |\\n| Number of Layers (Dense layer included) | 61 |\\n| Number of Dense Layers | 1 |\\n| Attention Hidden Dimension | 7168 |\\n| MoE Hidden Dimension (per Expert) | 2048 |\\n| Number of Attention Heads | 64 |\\n| Number of Experts | 384 |\\n| Selected Experts per Token | 8 |\\n| Number of Shared Experts | 1 |\\n| Vocabulary Size | 160K |\\n| Context Length | 128K |\\n| Attention Mechanism | MLA | [...] Kimi-K2-Base: The foundation model, a strong start for researchers and builders who want full control for fine-tuning and custom solutions.\\n Kimi-K2-Instruct: The post-trained model best for drop-in, general-purpose chat and agentic experiences. It is a reflex-grade model without long thinking.\\n\\n## 2. Model Summary\", \"score\": 0.6929958, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"moonshotai/Kimi-K2-Instruct - Hugging Face\", \"url\": \"https://huggingface.co/moonshotai/Kimi-K2-Instruct\", \"content\": \"Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.\\n\\n### Key Features [...] | | |\\n --- |\\n| Architecture | Mixture-of-Experts (MoE) |\\n| Total Parameters | 1T |\\n| Activated Parameters | 32B |\\n| Number of Layers (Dense layer included) | 61 |\\n| Number of Dense Layers | 1 |\\n| Attention Hidden Dimension | 7168 |\\n| MoE Hidden Dimension (per Expert) | 2048 |\\n| Number of Attention Heads | 64 |\\n| Number of Experts | 384 |\\n| Selected Experts per Token | 8 |\\n| Number of Shared Experts | 1 |\\n| Vocabulary Size | 160K |\\n| Context Length | 128K |\\n| Attention Mechanism | MLA | [...] 
> You can access Kimi K2's API on , we provide OpenAI/Anthropic-compatible API for you.\\n>\\n> The Anthropic-compatible API maps temperature by `real_temperature = request_temperature 0.6` for better compatible with existing applications.\\n\\nOur model checkpoints are stored in the block-fp8 format, you can find it on Huggingface.\\n\\nCurrently, Kimi-K2 is recommended to run on the following inference engines:\\n\\n vLLM\\n SGLang\\n KTransformers\\n TensorRT-LLM\", \"score\": 0.6773251, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"Kimi K2 Thinking突袭智能体&推理能力超GPT-5网友\", \"url\": \"https://zhuanlan.zhihu.com/p/1970062429741385218\", \"content\": \"作为今年最受关注的开源模型系列之一Kimi K2的Thinking版本一上线就成为热议对象再次缩小了开源模型与闭源模型的差距。\\n\\nImage 6\\n\\n更多技术细节速览在此\\n\\n 1TB参数激活参数32B采用INT4而非FP8。\\n 256K上下文窗口。\\n 更多专家更少head更多思考。\\n\\nImage 7\\n\\n△与DeepSeek R1的架构对比图源x@rasbt\\n\\n在人类最后的考试HLE、测试自主网络浏览能力的BrowseComp和复杂信息收集推理基准测试SEAL-0等评测基准中Kimi K2 Thinking都刷新了SOTA超越GPT-5、Claude Sonnet 4.5Thinking等闭源模型。\\n\\nImage 8\\n\\nKimi K2 Thinking的代码和权重均遵循最为宽松的MIT协议。新模型也已第一时间上线\\n\\n技术细节\\n\\n官方提到K2 Thinking是月之暗面在Test-Time Scaling测试时扩展领域的最新进展通过同时扩展思考Token和工具调用轮次模型实现了更强的Agent和推理性能。\\n\\n智能体、推理能力全面提升\", \"score\": 0.6439794, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"Kimi K2: Open Agentic Intelligence\", \"url\": \"https://moonshotai.github.io/Kimi-K2/\", \"content\": \"The Kimi Platform offers an OpenAI/Anthropic compatible interface, allowing for easy adaptation of your existing applications to Kimi K2. We encourage developers to explore our tool calling API for building agent applications. For detailed information, visit platform.moonshot.ai.\\n\\n#### Serve Kimi K2 on your own [...] The Kimi Platform offers an OpenAI/Anthropic compatible interface, allowing for easy adaptation of your existing applications to Kimi K2. We encourage developers to explore our tool calling API for building agent applications. For detailed information, visit platform.moonshot.ai.\\n#### Serve Kimi K2 on your own [...] Kimi K2 is our latest Mixture-of-Experts model with 32 billion activated parameters and 1 trillion total parameters. It achieves state-of-the-art performance in frontier knowledge, math, and coding among non-thinking models. But it goes further — meticulously optimized for agentic tasks, Kimi K2 does not just answer; it acts.\\n\\nAnd now, it is within your reach. Today, we are open-sourcing:\", \"score\": 0.63234437, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"Moonshot AI\", \"url\": \"https://www.moonshot.ai/\", \"content\": \"Image 5Kimi Kimi is driven by Moonshot's proprietary LLM. It supports online search, deep thinking, multimodal reasoning, and long-form text conversations. Chat with KimiKimi Open Platform Kimi Open Platform supports flexible API calls, so you can easily integrate advanced features and give your applications a competitive edge. Experience Now\\n\\nLatest Research\", \"score\": 0.5973971, \"published_date\": \"\"}], \"total_results\": 8}",
"timestamp": "2025-11-09T11:35:29.523127",
"tool_call_id": "web_search:2",
"name": "web_search"
},
{
"role": "assistant",
"content": "让我查看GitHub上的官方文档了解更详细的API信息。",
"timestamp": "2025-11-09T11:35:33.478812"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:35:33.486337",
"tool_calls": [
{
"id": "extract_webpage:3",
"index": 0,
"type": "function",
"function": {
"name": "extract_webpage",
"arguments": "{\"url\": \"https://github.com/MoonshotAI/Kimi-K2\"}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"url\": \"https://github.com/MoonshotAI/Kimi-K2\", \"content\": \"🌐 网页内容 (17374 字符):\\n📍 URL: https://github.com/MoonshotAI/Kimi-K2\\n==================================================\\n[Skip to content](#start-of-content) \\n\\n\\n\\n## Navigation Menu\\n\\n[Sign in](/login?return_to=https%3A%2F%2Fgithub.com%2FMoonshotAI%2FKimi-K2) \\n\\nAppearance settings\\n\\n# Search code, repositories, users, issues, pull requests...\\n\\n[Search syntax tips](https://docs.github.com/search-github/github-code-search/understanding-github-code-search-syntax)\\n\\n[Sign in](/login?return_to=https%3A%2F%2Fgithub.com%2FMoonshotAI%2FKimi-K2)\\n\\n [Sign up](/signup?ref_cta=Sign+up&ref_loc=header+logged+out&ref_page=%2F%3Cuser-name%3E%2F%3Crepo-name%3E&source=header-repo&source_repo=MoonshotAI%2FKimi-K2) \\n\\nAppearance settings\\n\\nYou signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert\\n\\n{{ message }}\\n\\n[MoonshotAI](/MoonshotAI) / **[Kimi-K2](/MoonshotAI/Kimi-K2)** Public\\n\\n* [Notifications](/login?return_to=%2FMoonshotAI%2FKimi-K2) You must be signed in to change notification settings\\n* [Fork 553](/login?return_to=%2FMoonshotAI%2FKimi-K2)\\n* [Star 8.4k](/login?return_to=%2FMoonshotAI%2FKimi-K2)\\n\\nKimi K2 is the large language model series developed by Moonshot AI team\\n\\n### License\\n\\n[View license](/MoonshotAI/Kimi-K2/blob/main/LICENSE)\\n\\n[8.4k stars](/MoonshotAI/Kimi-K2/stargazers) [553 forks](/MoonshotAI/Kimi-K2/forks) [Branches](/MoonshotAI/Kimi-K2/branches) [Tags](/MoonshotAI/Kimi-K2/tags) [Activity](/MoonshotAI/Kimi-K2/activity)\\n\\n[Star](/login?return_to=%2FMoonshotAI%2FKimi-K2)\\n\\n[Notifications](/login?return_to=%2FMoonshotAI%2FKimi-K2) You must be signed in to change notification settings\\n\\n# MoonshotAI/Kimi-K2\\n\\n[Branches](/MoonshotAI/Kimi-K2/branches)[Tags](/MoonshotAI/Kimi-K2/tags)\\n\\nOpen more actions menu\\n\\n## Folders and files\\n\\n| Name | Name | Last commit message | Last commit date |\\n| --- | --- | --- | --- |\\n| Latest commit History[23 Commits](/MoonshotAI/Kimi-K2/commits/main/) |\\n| [docs](/MoonshotAI/Kimi-K2/tree/main/docs \\\"docs\\\") | [docs](/MoonshotAI/Kimi-K2/tree/main/docs \\\"docs\\\") | | |\\n| [figures](/MoonshotAI/Kimi-K2/tree/main/figures \\\"figures\\\") | [figures](/MoonshotAI/Kimi-K2/tree/main/figures \\\"figures\\\") | | |\\n| [LICENSE](/MoonshotAI/Kimi-K2/blob/main/LICENSE \\\"LICENSE\\\") | [LICENSE](/MoonshotAI/Kimi-K2/blob/main/LICENSE \\\"LICENSE\\\") | | |\\n| [README.md](/MoonshotAI/Kimi-K2/blob/main/README.md \\\"README.md\\\") | [README.md](/MoonshotAI/Kimi-K2/blob/main/README.md \\\"README.md\\\") | | |\\n| [tech\\\\_report.pdf](/MoonshotAI/Kimi-K2/blob/main/tech_report.pdf \\\"tech_report.pdf\\\") | [tech\\\\_report.pdf](/MoonshotAI/Kimi-K2/blob/main/tech_report.pdf \\\"tech_report.pdf\\\") | | |\\n| |\\n\\n## Repository files navigation\\n\\n---\\n\\n**📰  [Tech Blog](https://moonshotai.github.io/Kimi-K2/)**     |     **📄  [Full Report](https://www.arxiv.org/abs/2507.20534)**\\n\\n## 1. Model Introduction\\n\\nKimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. 
Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.\\n\\n### Key Features\\n\\n* Large-Scale Training: Pre-trained a 1T parameter MoE model on 15.5T tokens with zero training instability.\\n* MuonClip Optimizer: We apply the Muon optimizer to an unprecedented scale, and develop novel optimization techniques to resolve instabilities while scaling up.\\n* Agentic Intelligence: Specifically designed for tool use, reasoning, and autonomous problem-solving.\\n\\n### Model Variants\\n\\n* **Kimi-K2-Base**: The foundation model, a strong start for researchers and builders who want full control for fine-tuning and custom solutions.\\n* **Kimi-K2-Instruct**: The post-trained model best for drop-in, general-purpose chat and agentic experiences. It is a reflex-grade model without long thinking.\\n\\n## 2. Model Summary\\n\\n|\\n| |\\n| **Architecture** | Mixture-of-Experts (MoE) |\\n| **Total Parameters** | 1T |\\n| **Activated Parameters** | 32B |\\n| **Number of Layers** (Dense layer included) | 61 |\\n| **Number of Dense Layers** | 1 |\\n| **Attention Hidden Dimension** | 7168 |\\n| **MoE Hidden Dimension** (per Expert) | 2048 |\\n| **Number of Attention Heads** | 64 |\\n| **Number of Experts** | 384 |\\n| **Selected Experts per Token** | 8 |\\n| **Number of Shared Experts** | 1 |\\n| **Vocabulary Size** | 160K |\\n| **Context Length** | 128K |\\n| **Attention Mechanism** | MLA |\\n| **Activation Function** | SwiGLU |\\n\\n## 3. Evaluation Results\\n\\n#### Instruction model evaluation results\\n\\n| Benchmark | Metric | Kimi K2 Instruct | DeepSeek-V3-0324 | Qwen3-235B-A22B (non-thinking) | Claude Sonnet 4 (w/o extended thinking) | Claude Opus 4 (w/o extended thinking) | GPT-4.1 | Gemini 2.5 Flash Preview (05-20) |\\n| --- | --- | --- | --- | --- | --- | --- | --- | --- |\\n| **Coding Tasks** |\\n| LiveCodeBench v6 (Aug 24 - May 25) | Pass@1 | **53.7** | 46.9 | 37.0 | 48.5 | 47.4 | 44.7 | 44.7 |\\n| OJBench | Pass@1 | **27.1** | 24.0 | 11.3 | 15.3 | 19.6 | 19.5 | 19.5 |\\n| MultiPL-E | Pass@1 | **85.7** | 83.1 | 78.2 | 88.6 | **89.6** | 86.7 | 85.6 |\\n| SWE-bench Verified (Agentless Coding) | Single Patch w/o Test (Acc) | **51.8** | 36.6 | 39.4 | 50.2 | **53.0** | 40.8 | 32.6 |\\n| SWE-bench Verified (Agentic Coding) | Single Attempt (Acc) | **65.8** | 38.8 | 34.4 | **72.7**\\\\* | 72.5\\\\* | 54.6 | — |\\n| Multiple Attempts (Acc) | **71.6** | — | — | **80.2** | 79.4\\\\* | — | — |\\n| SWE-bench Multilingual (Agentic Coding) | Single Attempt (Acc) | **47.3** | 25.8 | 20.9 | **51.0** | — | 31.5 | — |\\n| TerminalBench | Inhouse Framework (Acc) | **30.0** | — | — | 35.5 | **43.2** | 8.3 | — |\\n| Terminus (Acc) | **25.0** | 16.3 | 6.6 | — | — | **30.3** | 16.8 |\\n| Aider-Polyglot | Acc | 60.0 | 55.1 | **61.8** | 56.4 | **70.7** | 52.4 | 44.0 |\\n| **Tool Use Tasks** |\\n| Tau2 retail | Avg@4 | **70.6** | 69.1 | 57.0 | 75.0 | **81.8** | 74.8 | 64.3 |\\n| Tau2 airline | Avg@4 | **56.5** | 39.0 | 26.5 | 55.5 | **60.0** | 54.5 | 42.5 |\\n| Tau2 telecom | Avg@4 | **65.8** | 32.5 | 22.1 | 45.2 | 57.0 | 38.6 | 16.9 |\\n| AceBench | Acc | **76.5** | 72.7 | 70.5 | 76.2 | 75.6 | **80.1** | 74.5 |\\n| **Math & STEM Tasks** |\\n| AIME 2024 | Avg@64 | **69.6** | 59.4\\\\* | 40.1\\\\* | 43.4 | 48.2 | 46.5 | 61.3 |\\n| AIME 2025 | Avg@64 | **49.5** | 46.7 | 24.7\\\\* | 33.1\\\\* | 33.9\\\\* | 37.0 | 46.6 |\\n| MATH-500 | Acc | **97.4** | 94.0\\\\* | 91.2\\\\* | 94.0 | 94.4 | 
92.4 | 95.4 |\\n| HMMT 2025 | Avg@32 | **38.8** | 27.5 | 11.9 | 15.9 | 15.9 | 19.4 | 34.7 |\\n| CNMO 2024 | Avg@16 | 74.3 | **74.7** | 48.6 | 60.4 | 57.6 | 56.6 | **75.0** |\\n| PolyMath-en | Avg@4 | **65.1** | 59.5 | 51.9 | 52.8 | 49.8 | 54.0 | 49.9 |\\n| ZebraLogic | Acc | **89.0** | 84.0 | 37.7\\\\* | 73.7 | 59.3 | 58.5 | 57.9 |\\n| AutoLogi | Acc | **89.5** | 88.9 | 83.3 | **89.8** | 86.1 | 88.2 | 84.1 |\\n| GPQA-Diamond | Avg@8 | **75.1** | 68.4\\\\* | 62.9\\\\* | 70.0\\\\* | 74.9\\\\* | 66.3 | 68.2 |\\n| SuperGPQA | Acc | **57.2** | 53.7 | 50.2 | 55.7 | 56.5 | 50.8 | 49.6 |\\n| Humanity's Last Exam (Text Only) | - | 4.7 | 5.2 | **5.7** | 5.8 | **7.1** | 3.7 | 5.6 |\\n| **General Tasks** |\\n| MMLU | EM | **89.5** | 89.4 | 87.0 | 91.5 | **92.9** | 90.4 | 90.1 |\\n| MMLU-Redux | EM | **92.7** | 90.5 | 89.2 | 93.6 | **94.2** | 92.4 | 90.6 |\\n| MMLU-Pro | EM | 81.1 | **81.2**\\\\* | 77.3 | 83.7 | **86.6** | 81.8 | 79.4 |\\n| IFEval | Prompt Strict | **89.8** | 81.1 | 83.2\\\\* | 87.6 | 87.4 | 88.0 | 84.3 |\\n| Multi-Challenge | Acc | **54.1** | 31.4 | 34.0 | 46.8 | 49.0 | 36.4 | 39.5 |\\n| SimpleQA | Correct | **31.0** | 27.7 | 13.2 | 15.9 | 22.8 | **42.3** | 23.3 |\\n| Livebench | Pass@1 | **76.4** | 72.4 | 67.6 | 74.8 | 74.6 | 69.8 | 67.8 |\\n\\n • Bold denotes global SOTA, and underlined denotes open-source SOTA. \\n • Data points marked with \\\\* are directly from the model's tech report or blog. \\n • All metrics, except for SWE-bench Verified (Agentless), are evaluated with an 8k output token length. SWE-bench Verified (Agentless) is limited to a 16k output token length. \\n • Kimi K2 achieves 65.8% pass@1 on the SWE-bench Verified tests with bash/editor tools (single-attempt patches, no test-time compute). It also achieves a 47.3% pass@1 on the SWE-bench Multilingual tests under the same conditions. Additionally, we report results on SWE-bench Verified tests (71.6%) that leverage parallel test-time compute by sampling multiple sequences and selecting the single best via an internal scoring model. \\n • To ensure the stability of the evaluation, we employed avg@k on the AIME, HMMT, CNMO, PolyMath-en, GPQA-Diamond, EvalPlus, Tau2. \\n • Some data points have been omitted due to prohibitively expensive evaluation costs. \\n\\n---\\n\\n#### Base model evaluation results\\n\\n| Benchmark | Metric | Shot | Kimi K2 Base | Deepseek-V3-Base | Qwen2.5-72B | Llama 4 Maverick |\\n| --- | --- | --- | --- | --- | --- | --- |\\n| **General Tasks** |\\n| MMLU | EM | 5-shot | **87.8** | 87.1 | 86.1 | 84.9 |\\n| MMLU-pro | EM | 5-shot | **69.2** | 60.6 | 62.8 | 63.5 |\\n| MMLU-redux-2.0 | EM | 5-shot | **90.2** | 89.5 | 87.8 | 88.2 |\\n| SimpleQA | Correct | 5-shot | **35.3** | 26.5 | 10.3 | 23.7 |\\n| TriviaQA | EM | 5-shot | **85.1** | 84.1 | 76.0 | 79.3 |\\n| GPQA-Diamond | Avg@8 | 5-shot | 48.1 | **50.5** | 40.8 | 49.4 |\\n| SuperGPQA | EM | 5-shot | **44.7** | 39.2 | 34.2 | 38.8 |\\n| **Coding Tasks** |\\n| LiveCodeBench v6 | Pass@1 | 1-shot | **26.3** | 22.9 | 21.1 | 25.1 |\\n| EvalPlus | Pass@1 | - | **80.3** | 65.6 | 66.0 | 65.5 |\\n| **Mathematics Tasks** |\\n| MATH | EM | 4-shot | **70.2** | 60.1 | 61.0 | 63.0 |\\n| GSM8k | EM | 8-shot | **92.1** | 91.7 | 90.4 | 86.3 |\\n| **Chinese Tasks** |\\n| C-Eval | EM | 5-shot | **92.5** | 90.0 | 90.9 | 80.9 |\\n| CSimpleQA | Correct | 5-shot | **77.6** | 72.1 | 50.5 | 53.5 |\\n\\n • We only evaluate open-source pretrained models in this work. 
We report results for Qwen2.5-72B because the base checkpoint for Qwen3-235B-A22B was not open-sourced at the time of our study. \\n • All models are evaluated using the same evaluation protocol. \\n\\n## 4. Deployment\\n\\nNote\\n\\nYou can access Kimi K2's API on <https://platform.moonshot.ai> , we provide OpenAI/Anthropic-compatible API for you.\\n\\nThe Anthropic-compatible API maps temperature by `real_temperature = request_temperature * 0.6` for better compatiblity with existing applications.\\n\\nOur model checkpoints are stored in block-fp8 format, you can find it on [Huggingface](https://huggingface.co/moonshotai/Kimi-K2-Instruct).\\n\\nCurrently, it is recommended to run Kimi-K2 on the following inference engines:\\n\\n* vLLM\\n* SGLang\\n* KTransformers\\n* TensorRT-LLM\\n\\nDeployment examples for vLLM and SGLang can be found in the [Model Deployment Guide](/MoonshotAI/Kimi-K2/blob/main/docs/deploy_guidance.md).\\n\\n---\\n\\n## 5. Model Usage\\n\\n### Chat Completion\\n\\nOnce the local inference service is set up, you can interact with it through the chat endpoint:\\n\\n```\\ndef simple_chat client OpenAI model_name str messages = \\\"role\\\" \\\"system\\\" \\\"content\\\"\\\"You are Kimi, an AI assistant created by Moonshot AI.\\\" \\\"role\\\" \\\"user\\\" \\\"content\\\" \\\"type\\\" \\\"text\\\" \\\"text\\\"\\\"Please give a brief self-introduction.\\\" response = client chat completions create model = model_name messages = messages stream = False temperature =0.6 max_tokens = 256 print response choices 0 message content\\n```\\n\\nNote\\n\\nThe recommended temperature for Kimi-K2-Instruct is `temperature = 0.6`. If no special instructions are required, the system prompt above is a good default.\\n\\n---\\n\\n### Tool Calling\\n\\nKimi-K2-Instruct has strong tool-calling capabilities. To enable them, you need to pass the list of available tools in each request, then the model will autonomously decide when and how to invoke them.\\n\\nThe following example demonstrates calling a weather tool end-to-end:\\n\\n```\\n# Your tool implementation def get_weather city str -> dict return \\\"weather\\\" \\\"Sunny\\\" # Tool schema definition tools = \\\"type\\\" \\\"function\\\" \\\"function\\\" \\\"name\\\" \\\"get_weather\\\" \\\"description\\\"\\\"Retrieve current weather information. Call this when the user asks about the weather.\\\" \\\"parameters\\\" \\\"type\\\" \\\"object\\\" \\\"required\\\" \\\"city\\\" \\\"properties\\\" \\\"city\\\" \\\"type\\\" \\\"string\\\" \\\"description\\\" \\\"Name of the city\\\" # Map tool names to their implementations tool_map = \\\"get_weather\\\" get_weather def tool_call_with_client client OpenAI model_name str messages = \\\"role\\\" \\\"system\\\" \\\"content\\\"\\\"You are Kimi, an AI assistant created by Moonshot AI.\\\" \\\"role\\\" \\\"user\\\" \\\"content\\\"\\\"What's the weather like in Beijing today? 
Use the tool to check.\\\" finish_reason = None while finish_reason is None or finish_reason == \\\"tool_calls\\\" completion = client chat completions create model = model_name messages = messages temperature =0.6 tools = tools # tool list defined above tool_choice = \\\"auto\\\" choice = completion choices 0 finish_reason = choice finish_reason if finish_reason == \\\"tool_calls\\\" messages append choice message for tool_call in choice message tool_calls tool_call_name = tool_call function name tool_call_arguments = json loads tool_call function arguments tool_function = tool_map tool_call_name tool_result = tool_function** tool_call_arguments print\\\"tool_result:\\\" tool_result messages append \\\"role\\\" \\\"tool\\\" \\\"tool_call_id\\\" tool_call id \\\"name\\\" tool_call_name \\\"content\\\" json dumps tool_result print\\\"-\\\"* 100 print choice message content\\n```\\n\\nThe `tool_call_with_client` function implements the pipeline from user query to tool execution. This pipeline requires the inference engine to support Kimi-K2s native tool-parsing logic. For streaming output and manual tool-parsing, see the [Tool Calling Guide](/MoonshotAI/Kimi-K2/blob/main/docs/tool_call_guidance.md).\\n\\n---\\n\\n## 6. License\\n\\nBoth the code and the model weights are released under the [Modified MIT License](/MoonshotAI/Kimi-K2/blob/main/LICENSE).\\n\\n---\\n\\n## 7. Citation\\n\\n```\\n@misc{kimiteam2025kimik2openagentic, title={Kimi K2: Open Agentic Intelligence}, author={Kimi Team and Yifan Bai and Yiping Bao and Guanduo Chen and Jiahao Chen and Ningxin Chen and Ruijue Chen and Yanru Chen and Yuankun Chen and Yutian Chen and Zhuofu Chen and Jialei Cui and Hao Ding and Mengnan Dong and Angang Du and Chenzhuang Du and Dikang Du and Yulun Du and Yu Fan and Yichen Feng and Kelin Fu and Bofei Gao and Hongcheng Gao and Peizhong Gao and Tong Gao and Xinran Gu and Longyu Guan and Haiqing Guo and Jianhang Guo and Hao Hu and Xiaoru Hao and Tianhong He and Weiran He and Wenyang He and Chao Hong and Yangyang Hu and Zhenxing Hu and Weixiao Huang and Zhiqi Huang and Zihao Huang and Tao Jiang and Zhejun Jiang and Xinyi Jin and Yongsheng Kang and Guokun Lai and Cheng Li and Fang Li and Haoyang Li and Ming Li and Wentao Li and Yanhao Li and Yiwei Li and Zhaowei Li and Zheming Li and Hongzhan Lin and Xiaohan Lin and Zongyu Lin and Chengyin Liu and Chenyu Liu and Hongzhang Liu and Jingyuan Liu and Junqi Liu and Liang Liu and Shaowei Liu and T. Y. Liu and Tianwei Liu and Weizhou Liu and Yangyang Liu and Yibo Liu and Yiping Liu and Yue Liu and Zhengying Liu and Enzhe Lu and Lijun Lu and Shengling Ma and Xinyu Ma and Yingwei Ma and Shaoguang Mao and Jie Mei and Xin Men and Yibo Miao and Siyuan Pan and Yebo Peng and Ruoyu Qin and Bowen Qu and Zeyu Shang and Lidong Shi and Shengyuan Shi and Feifan Song and Jianlin Su and Zhengyuan Su and Xinjie Sun and Flood Sung and Heyi Tang and Jiawen Tao and Qifeng Teng and Chensi Wang and Dinglu Wang and Feng Wang and Haiming Wang and Jianzhou Wang and Jiaxing Wang and Jinhong Wang and Shengjie Wang and Shuyi Wang and Yao Wang and Yejie Wang and Yiqin Wang and Yuxin Wang and Yuzhi Wang and Zhaoji Wang and Zhengtao Wang and Zhexu Wang and Chu Wei and Qianqian Wei and Wenhao Wu and Xingzhe Wu and Yuxin Wu and Chenjun Xiao and Xiaotong Xie and Weimin Xiong and Boyu Xu and Jing Xu and Jinjing Xu and L. H. 
Xu and Lin Xu and Suting Xu and Weixin Xu and Xinran Xu and Yangchuan Xu and Ziyao Xu and Junjie Yan and Yuzi Yan and Xiaofei Yang and Ying Yang and Zhen Yang and Zhilin Yang and Zonghan Yang and Haotian Yao and Xingcheng Yao and Wenjie Ye and Zhuorui Ye and Bohong Yin and Longhui Yu and Enming Yuan and Hongbang Yuan and Mengjie Yuan and Haobing Zhan and Dehao Zhang and Hao Zhang and Wanlu Zhang and Xiaobin Zhang and Yangkun Zhang and Yizhi Zhang and Yongting Zhang and Yu Zhang and Yutao Zhang and Yutong Zhang and Zheng Zhang and Haotian Zhao and Yikai Zhao and Huabin Zheng and Shaojie Zheng and Jianren Zhou and Xinyu Zhou and Zaida Zhou and Zhen Zhu and Weiyu Zhuang and Xinxing Zu}, year={2025}, eprint={2507.20534}, archivePrefix={arXiv}, primaryClass={cs.LG}, url={https://arxiv.org/abs/2507.20534}, } \\n```\\n\\n---\\n\\n## 8. Contact Us\\n\\nIf you have any questions or concerns, please reach out to us at [support@moonshot.cn](mailto:support@moonshot.cn).\\n\\n## About\\n\\nKimi K2 is the large language model series developed by Moonshot AI team\\n\\n### Resources\\n\\n### License\\n\\n[View license](#License-1-ov-file)\\n\\n### Uh oh!\\n\\nThere was an error while loading. Please reload this page.\\n\\n[Custom properties](/MoonshotAI/Kimi-K2/custom-properties)\\n\\n### Stars\\n\\n[**8.4k** stars](/MoonshotAI/Kimi-K2/stargazers)\\n\\n### Watchers\\n\\n[**63** watching](/MoonshotAI/Kimi-K2/watchers)\\n\\n### Forks\\n\\n[**553** forks](/MoonshotAI/Kimi-K2/forks)\\n\\n[Report repository](/contact/report-content?content_url=https%3A%2F%2Fgithub.com%2FMoonshotAI%2FKimi-K2&report=MoonshotAI+%28user%29)\\n\\n### Uh oh!\\n\\nThere was an error while loading. Please reload this page.\\n\\n## [Contributors 10](/MoonshotAI/Kimi-K2/graphs/contributors)\\n\\nYou cant perform that action at this time.\\n==================================================\"}",
"timestamp": "2025-11-09T11:35:35.299441",
"tool_call_id": "extract_webpage:3",
"name": "extract_webpage"
},
{
"role": "assistant",
"content": "基于我查到的信息我可以给你详细解释月之暗面Kimi-k2模型控制思考模式的机制\n\n## Kimi-k2模型思考模式控制机制\n\n根据官方文档**Kimi-k2模型是通过模型名称来控制是否启用思考模式的**,而不是通过请求参数。\n\n### 核心结论:\n- **使用 `kimi-k2-thinking` 模型名称** → 启用思考模式\n- **使用 `kimi-k2` 或 `kimi-k2-instruct` 模型名称** → 不启用思考模式\n\n### 具体区别:\n\n**1. 思考模型 (`kimi-k2-thinking`)**\n- 具有深度推理能力\n- 会输出 `reasoning_content` 字段显示思考过程\n- 支持多步工具调用200-300次连续调用\n- 适用于复杂问题解决\n\n**2. 非思考模型 (`kimi-k2-instruct`)**\n- 快速响应,无长篇思考过程\n- 官方描述为 \"reflex-grade model without long thinking\"\n- 适用于日常对话和简单任务\n\n### API调用示例\n\n```python\n# 启用思考模式\ncompletion = client.chat.completions.create(\n model=\"kimi-k2-thinking\", # 关键:模型名称\n messages=messages,\n temperature=1.0, # 推荐温度\n max_tokens=16000 # 推荐设置\n)\n\n# 不启用思考模式 \ncompletion = client.chat.completions.create(\n model=\"kimi-k2-instruct\", # 关键:模型名称\n messages=messages,\n temperature=0.6 # 推荐温度\n)\n```\n\n### 重要参数设置:\n- **思考模型推荐**`temperature=1.0``max_tokens>=16000`\n- **非思考模型推荐**`temperature=0.6`\n\n所以答案是**通过修改模型名称来控制而不是通过thinking参数**。请求时使用 `kimi-k2-thinking` 就启用思考模式,使用 `kimi-k2-instruct` 就不启用思考模式。",
"timestamp": "2025-11-09T11:35:56.646967"
},
{
"role": "user",
"content": "那如果使用kimi-k2-thinking模型时请求传入一个thinking参数为disabled他还会思考吗",
"timestamp": "2025-11-09T11:50:37.601254"
},
{
"role": "assistant",
"content": "让我查一下官方文档中是否有关于thinking参数的具体说明以及是否存在禁用思考模式的选项。",
"timestamp": "2025-11-09T11:50:43.340797"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:50:43.347987",
"tool_calls": [
{
"id": "web_search:4",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"kimi-k2-thinking thinking parameter disabled false true moonshot api documentation\", \"max_results\": 10}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: kimi-k2-thinking thinking parameter disabled false true moonshot api documentation\\n📅 搜索时间: 2025-11-09T11:50:47.566321\\n🎯 过滤条件: Topic: general\\n\\n📝 AI摘要:\\nThe Kimi K2 Thinking model supports multi-step reasoning and tool use. It has 32 billion activated parameters and can handle 256k context length. The \\\"thinking\\\" parameter is disabled for Moonshot API.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. Moonshot AI Open Platform - Kimi Large Language Model API Service\\n 🔗 https://platform.moonshot.ai/\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nWelcome\\n\\n# Welcome to Kimi API Docs\\n\\n# Kimi K2 Model\\n\\nA MoE architecture base model with powe...\\n\\n2. [BUG] LiteLLM Moonshot AI Kimi K2 thinking fails on multi-turn ...\\n 🔗 https://github.com/strands-agents/sdk-python/issues/1150\\n 📄 Moonshot API only accepts OpenAI-compatible content types: `\\\"text\\\"` and `\\\"image_url\\\"`. The `\\\"thinking\\\"` type works with Claude via Bedrock but is rejected by Moonshot and other OpenAI-compatible provi...\\n\\n3. Kimi K2 Thinking - API, Providers, Stats - OpenRouter\\n 🔗 https://openrouter.ai/moonshotai/kimi-k2-thinking\\n 📄 Kimi K2 Thinking is Moonshot AIs most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) arc...\\n\\n4. How to Use Kimi K2 Thinking API - Apidog\\n 🔗 https://apidog.com/blog/kimi-k2-thinking-api/\\n 📄 As AI models push the boundaries of reasoning and agentic capabilities, Kimi K2 Thinking emerges as a standout innovation from Moonshot AI, blending open-source accessibility with enterprise-grade per...\\n\\n5. Using the kimi-k2-thinking model - Moonshot AI Open Platform\\n 🔗 https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The ...\\n\\n6. Kimi-K2-Thinking Usage Guide - vLLM Recipes\\n 🔗 https://docs.vllm.ai/projects/recipes/en/latest/moonshotai/Kimi-K2-Think.html\\n 📄 ```\\nvllm serve moonshotai/Kimi-K2-Thinking \\\\ \\\\ --tensor-parallel-size 8 \\\\ 8 \\\\ --enable-auto-tool-choice \\\\ \\\\ --tool-call-parser kimi_k2 \\\\ \\\\ --reasoning-parser kimi_k2 \\\\ \\\\ --trust-remote-c...\\n\\n7. kimi-k2-preview - AI/ML API Documentation\\n 🔗 https://docs.aimlapi.com/api-references/text-models-llm/moonshot/kimi-k2-preview\\n 📄 body: JSON.stringify({\\n\\n \\n \\n model: 'moonshot/kimi-k2-preview',\\n\\n ...\\n\\n8. moonshotai/Kimi-K2-Thinking - Hugging Face\\n 🔗 https://huggingface.co/moonshotai/Kimi-K2-Thinking\\n 📄 Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. I...\\n\\n9. Kimi K2 is the large language model series developed by Moonshot ...\\n 🔗 https://github.com/MoonshotAI/Kimi-K2\\n 📄 Notifications You must be signed in to change notification settings\\n Fork 553\\n Star 8.4k\\n\\nKimi K2 is the large language model series developed by Moonshot AI team\\n\\n### License\\n\\nView license\\n\\n8.4k st...\\n\\n10. 
Quickstart with the Kimi API - Moonshot AI Open Platform\\n 🔗 https://platform.moonshot.ai/docs/guide/start-using-kimi-api\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nQuickstart with Kimi API\\n\\n# Quickstart with the Kimi API\\n\\nThe Kimi API...\", \"filters\": {\"topic\": \"general\"}, \"query\": \"kimi-k2-thinking thinking parameter disabled false true moonshot api documentation\", \"results\": [{\"index\": 1, \"title\": \"Moonshot AI Open Platform - Kimi Large Language Model API Service\", \"url\": \"https://platform.moonshot.ai/\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nWelcome\\n\\n# Welcome to Kimi API Docs\\n\\n# Kimi K2 Model\\n\\nA MoE architecture base model with powerful code and Agent capabilities, featuring 1T total parameters with 32B activated parameters and up to 256k context length. In benchmark tests for general knowledge reasoning, programming, mathematics, and Agent capabilities, K2 model outperforms other mainstream open-source models\\n\\n## Quick Start [...] Getting Started with Kimi K2\\n\\nKimi K2 quick start guide and basic conceptsConsole\\n\\nManage API keys, usage and billingUsing K2 Model in Software Agents\\n\\nExperience K2 model's code writing and tool calling capabilitiesPricing & Billing\\n\\nLearn about model pricing and billing methodsRate Limits\\n\\nLearn about API rate limits and quotasFAQ\\n\\nFind answers to common questions\\n\\n## Models and Pricing\\n\\nView all\\n\\nkimi-k2-turbo-preview\\n\\npreviewkimi-k2-thinking\\n\\nlong thinkkimi-latest\\n\\nImage understanding [...] ## More Resources\\n\\nK2 Model Details\\n\\nLearn about K2 model features and advantagesDevelopment Workbench\\n\\nInteractive interface to experience Kimi model capabilitiesWeb Search Tool\\n\\nUse Kimi's official web search toolBest Practices\\n\\nPrompt writing and application examples\\n\\nUser Manual\", \"score\": 0.63152665, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"[BUG] LiteLLM Moonshot AI Kimi K2 thinking fails on multi-turn ...\", \"url\": \"https://github.com/strands-agents/sdk-python/issues/1150\", \"content\": \"Moonshot API only accepts OpenAI-compatible content types: `\\\"text\\\"` and `\\\"image_url\\\"`. The `\\\"thinking\\\"` type works with Claude via Bedrock but is rejected by Moonshot and other OpenAI-compatible providers.\\n\\n## Possible Solution\\n\\nModify `src/strands/models/litellm.py` lines 95-102:\\n\\n```\\nif \\\"reasoningContent\\\" in content reasoning_text = content \\\"reasoningContent\\\" \\\"reasoningText\\\"# Use text type for OpenAI-compatible providers return \\\"type\\\" \\\"text\\\" \\\"text\\\" reasoning_text \\\"text\\\"\\n``` [...] opened on Nov 7, 2025\\n\\nIssue body actions\\n\\n## Checks\\n\\n I have updated to the latest minor and patch version of Strands\\n I have checked the documentation and this is not expected behavior\\n I have searched issues and there are no duplicates\\n\\n## Strands Version\\n\\n1.15.0\\n\\n## Python Version\\n\\n3.10.18\\n\\n## Operating System\\n\\nmacOS (Darwin 25.0.0)\\n\\n## Installation Method\\n\\npip install strands==1.15.0\\n\\n## Steps to Reproduce\\n\\n1. Install Strands and configure Moonshot AI via LiteLLM: [...] ```\\nfrom strands models import LiteLLMModel model = LiteLLMModel model_id =\\\"moonshot/kimi-k2-thinking\\\" client_args =\\n```\\n\\n1. 
Make initial request with tool calling:\\n\\n```\\nmessages = \\\"role\\\" \\\"user\\\" \\\"content\\\" \\\"text\\\" \\\"Analyze this target\\\" response = await model stream messages = messages tool_specs = tools\\n```\\n\\n1. Add response to conversation history and make second request:\", \"score\": 0.60861784, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"Kimi K2 Thinking - API, Providers, Stats - OpenRouter\", \"url\": \"https://openrouter.ai/moonshotai/kimi-k2-thinking\", \"content\": \"Kimi K2 Thinking is Moonshot AIs most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports 256 k-token context windows. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves [...] # MoonshotAI: Kimi K2 Thinking\\n\\n### moonshotai/kimi-k2-thinking\\n\\nCreated Nov 6, 2025262,144 context\\n\\n$0.60/M input tokens$2.50/M output tokens [...] For information about using third-party SDKs and frameworks with OpenRouter, please see ourframeworks documentation.\\n\\nSee the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.\", \"score\": 0.6052631, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"How to Use Kimi K2 Thinking API - Apidog\", \"url\": \"https://apidog.com/blog/kimi-k2-thinking-api/\", \"content\": \"As AI models push the boundaries of reasoning and agentic capabilities, Kimi K2 Thinking emerges as a standout innovation from Moonshot AI, blending open-source accessibility with enterprise-grade performance. This trillion-parameter thinking agent model redefines how developers interact with large language models, particularly through its robust API. Designed for tasks requiring deep inference and tool chaining, the Kimi K2 Thinking API allows seamless integration into applications, from [...] Kimi K2 Thinking represents Moonshot AI's bold step forward in open-source AI, launching as a specialized thinking agent model optimized for sequential reasoning and tool use. At its core, Kimi K2 Thinking is built to simulate human-like deliberation, processing queries through extended \\\"thinking\\\" tokens that enable multi-turn tool interactions without constant human input. This model, available via API for developers, excels in environments demanding prolonged context retention and adaptive [...] Securing access to the Kimi K2 Thinking API starts at platform.moonshot.ai. Sign up or log in, then navigate to the API console under \\\"API Keys.\\\" Click \\\"Create New Key,\\\" select permissions (e.g., chat completions, tool calls), and generate—copy the key immediately, as it's shown once.\\n\\nWith the key, configure your SDK as above. Test with a simple curl:\", \"score\": 0.5911811, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"Using the kimi-k2-thinking model - Moonshot AI Open Platform\", \"url\": \"https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The `kimi-k2-thinking` model is a general-purpose agentic reasoning model developed by Moonshot AI. 
Thanks to its strength in deep reasoning and multi-turn tool use, it can solve even the hardest problems.\\n\\n## \\n\\nYou can simply use it by switching the `model` parameter: [...] ### \\n\\nA: Yes, `reasoning_content` counts towards your input/output token quota. For detailed pricing, please refer to MoonshotAI's pricing documentation.\\n\\nKimi K2 QuickstartQuickstart with Kimi API [...] ```\\n$ curl \\\\\\n -H \\\"Content-Type: application/json\\\" \\\\\\n -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\\\n -d '{\\n \\\"model\\\": \\\"kimi-k2-thinking\\\",\\n \\\"messages\\\": [\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"hello\\\"}\\n ],\\n \\\"temperature\\\": 1.0\\n }'\\n \\n```\\n\\n## \\n\\nIn the `kimi-k2-thinking` model API response, we use the `reasoning_content` field as the carrier for the model's reasoning. About the `reasoning_content` field:\", \"score\": 0.59033114, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"Kimi-K2-Thinking Usage Guide - vLLM Recipes\", \"url\": \"https://docs.vllm.ai/projects/recipes/en/latest/moonshotai/Kimi-K2-Think.html\", \"content\": \"```\\nvllm serve moonshotai/Kimi-K2-Thinking \\\\ \\\\ --tensor-parallel-size 8 \\\\ 8 \\\\ --enable-auto-tool-choice \\\\ \\\\ --tool-call-parser kimi_k2 \\\\ \\\\ --reasoning-parser kimi_k2 \\\\ \\\\ --trust-remote-code \\n``` [...] Skip to content\\n\\n Installing vLLM\\n Launching Kimi-K2-Thinking with vLLM\\n Metrics\\n Benchmarking \\n\\n + TP8 Benchmark Output\\n + TP8+DCP8 Benchmark Output\\n DCP Gain Analysis\\n\\n# Kimi-K2-Thinking Usage Guide¶\\n\\nKimi K2 Thinking is an advanced large language model created by moonshotai. It offers the following highlights: [...] ```\\nvllm serve moonshotai/Kimi-K2-Thinking \\\\ \\\\ --tensor-parallel-size 8 \\\\ 8 \\\\ --decode-context-parallel-size 8 \\\\ 8 \\\\ --enable-auto-tool-choice \\\\ \\\\ --tool-call-parser kimi_k2 \\\\ \\\\ --reasoning-parser kimi_k2 \\\\ \\\\ --trust-remote-code \\n```\\n\\nThe `--reasoning-parser` flag specifies the reasoning parser to use for extracting reasoning content from the model output. \\n\\n## Metrics¶\\n\\nWe tested the `GSM8K` accuracy of 2 types of launch scripts (TP8 vs TP8+DCP8)\\n\\n TP8\", \"score\": 0.5565207, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"kimi-k2-preview - AI/ML API Documentation\", \"url\": \"https://docs.aimlapi.com/api-references/text-models-llm/moonshot/kimi-k2-preview\", \"content\": \"body: JSON.stringify({\\n\\n \\n \\n model: 'moonshot/kimi-k2-preview',\\n\\n \\n \\n messages:[ [...] },\\n\\n \\n \\n json={\\n\\n \\n \\n \\\"model\\\":\\\"moonshot/kimi-k2-preview\\\",\\n\\n \\n \\n \\\"messages\\\":[ [...] messages = [\\n\\n \\n \\n {\\\"role\\\": \\\"system\\\", \\\"content\\\": \\\"You are Kimi.\\\"},\\n\\n \\n \\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Please search for Moonshot AI Context Caching technology and tell me what it is in English.\\\"}\", \"score\": 0.512376, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"moonshotai/Kimi-K2-Thinking - Hugging Face\", \"url\": \"https://huggingface.co/moonshotai/Kimi-K2-Thinking\", \"content\": \"Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. 
It sets a new state-of-the-art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by dramatically scaling multi-step reasoning depth and maintaining stable tool-use across 200300 sequential calls. At the same time, K2 Thinking is a native INT4 quantization model with 256k context [...] messages.append({\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": tool_call_name,\\n \\\"content\\\": json.dumps(tool_result)\\n })\\n print(\\\"-\\\" 100)\\n print(choice.message.content) [...] ```\\n\\n> The recommended temperature for Kimi-K2-Thinking is `temperature = 1.0`.\\n> If no special instructions are required, the system prompt above is a good default.\\n\\n### Tool Calling\\n\\nKimi-K2-Thinking has the same tool calling settings as Kimi-K2-Instruct.\\n\\nTo enable them, you need to pass the list of available tools in each request, then the model will autonomously decide when and how to invoke them.\\n\\nThe following example demonstrates calling a weather tool end-to-end:\", \"score\": 0.5059329, \"published_date\": \"\"}, {\"index\": 9, \"title\": \"Kimi K2 is the large language model series developed by Moonshot ...\", \"url\": \"https://github.com/MoonshotAI/Kimi-K2\", \"content\": \"Notifications You must be signed in to change notification settings\\n Fork 553\\n Star 8.4k\\n\\nKimi K2 is the large language model series developed by Moonshot AI team\\n\\n### License\\n\\nView license\\n\\n8.4k stars 553 forks Branches Tags Activity\\n\\nStar\\n\\nNotifications You must be signed in to change notification settings\\n\\n# MoonshotAI/Kimi-K2\\n\\nBranchesTags\\n\\nOpen more actions menu\\n\\n## Folders and files [...] ## Contributors 10\\n\\nYou cant perform that action at this time. [...] Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.\\n\\n### Key Features\", \"score\": 0.47256237, \"published_date\": \"\"}, {\"index\": 10, \"title\": \"Quickstart with the Kimi API - Moonshot AI Open Platform\", \"url\": \"https://platform.moonshot.ai/docs/guide/start-using-kimi-api\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nQuickstart with Kimi API\\n\\n# Quickstart with the Kimi API\\n\\nThe Kimi API allows you to interact with the Kimi large language model. Here is a simple example code: [...] {\\\"role\\\": \\\"system\\\", \\\"content\\\": \\\"You are Kimi, an AI assistant provided by Moonshot AI. You are proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate answers. You will reject any requests involving terrorism, racism, or explicit content. Moonshot AI is a proper noun and should not be translated.\\\"},\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Hello, my name is Li Lei. What is 1+1?\\\"}\\n ],\\n temperature = 0.6,\\n) [...] 
```\\nfrom openai import OpenAI\\n \\nclient = OpenAI(\\n api_key=\\\"MOONSHOT_API_KEY\\\", # Replace MOONSHOT_API_KEY with the API Key you obtained from the Kimi Open Platform\\n base_url=\\\"\\n)\\n \\ncompletion = client.chat.completions.create(\\n model = \\\"kimi-k2-turbo-preview\\\",\\n messages = [\", \"score\": 0.44594026, \"published_date\": \"\"}], \"total_results\": 10}",
"timestamp": "2025-11-09T11:50:47.573111",
"tool_call_id": "web_search:4",
"name": "web_search"
},
{
"role": "assistant",
"content": "让我查看更详细的官方文档确认是否存在thinking参数以及其作用。",
"timestamp": "2025-11-09T11:50:55.230973"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:50:55.243537",
"tool_calls": [
{
"id": "extract_webpage:5",
"index": 0,
"type": "function",
"function": {
"name": "extract_webpage",
"arguments": "{\"url\": \"https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\"}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"url\": \"https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"🌐 网页内容 (10344 字符):\\n📍 URL: https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\\n==================================================\\n🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The `kimi-k2-thinking` model is a general-purpose agentic reasoning model developed by Moonshot AI. Thanks to its strength in deep reasoning and multi-turn tool use, it can solve even the hardest problems.\\n\\n## \\n\\nYou can simply use it by switching the `model` parameter:\\n\\n```\\n$ curl https://api.moonshot.cn/v1/chat/completions \\\\\\n -H \\\"Content-Type: application/json\\\" \\\\\\n -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\\\n -d '{\\n \\\"model\\\": \\\"kimi-k2-thinking\\\",\\n \\\"messages\\\": [\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"hello\\\"}\\n ],\\n \\\"temperature\\\": 1.0\\n }'\\n \\n```\\n\\n## \\n\\nIn the `kimi-k2-thinking` model API response, we use the `reasoning_content` field as the carrier for the model's reasoning. About the `reasoning_content` field:\\n\\n* In the OpenAI SDK, `ChoiceDelta` and `ChatCompletionMessage` types do not provide a `reasoning_content` field directly, so you cannot access it via `.reasoning_content`. You must use `hasattr(obj, \\\"reasoning_content\\\")` to check if the field exists, and if so, use `getattr(obj, \\\"reasoning_content\\\")` to retrieve its value.\\n* If you use other frameworks or directly interface with the HTTP API, you can directly obtain the `reasoning_content` field at the same level as the `content` field.\\n* In streaming output (`stream=True`), the `reasoning_content` field will always appear before the `content` field. In your business logic, you can detect if the `content` field has been output to determine if the reasoning (inference process) is finished.\\n* Tokens in `reasoning_content` are also controlled by the `max_tokens` parameter: the sum of tokens in `reasoning_content` and `content` must be less than or equal to `max_tokens`.\\n\\n## \\n\\n`kimi-k2-thinking` is designed to perform deep reasoning across multiple tool calls, enabling it to tackle highly complex tasks.\\n\\n### \\n\\nTo get reliable results, always follow these rules when calling the kimi-k2-thinking model :\\n\\n* Include the entire reasoning content from the context (the reasoning\\\\_content field) in your input. The model will decide which parts are necessary and forward them for further reasoning.\\n* Set max\\\\_tokens ≥ 16,000 to ensure the full reasoning\\\\_content and final content can be returned without truncation.\\n* **Set temperature = 1.0 to get the best performance.**\\n* Enable streaming (stream = true). Because kimi-k2-thinking returns both reasoning\\\\_content and regular content, the response is larger than usual. Streaming delivers a better user experience and helps avoid network-timeout issues.\\n\\n### \\n\\nWe walk through a complete example that shows how to properly use kimi-k2-thinking together with official tools for multi-step tool call and extended reasoning.\\n\\nThe example below demonstrates a \\\"Daily News Report Generation\\\" scenario. 
The model will sequentially call official tools like `date` (to get the date) and `web_search` (to search today's news), and will present deep reasoning throughout this process:\\n\\n```\\nimport os\\nimport json\\nimport httpx\\nimport openai\\n \\n \\nclass FormulaChatClient:\\n def __init__(self, base_url: str, api_key: str):\\n \\\"\\\"\\\"Initialize Formula client\\\"\\\"\\\"\\n self.base_url = base_url\\n self.api_key = api_key\\n self.openai = openai.Client(\\n base_url=base_url,\\n api_key=api_key,\\n )\\n self.httpx = httpx.Client(\\n base_url=base_url,\\n headers={\\\"Authorization\\\": f\\\"Bearer {api_key}\\\"},\\n timeout=30.0,\\n )\\n self.model = \\\"kimi-k2-thinking\\\"\\n \\n def get_tools(self, formula_uri: str):\\n \\\"\\\"\\\"Get tool definitions from Formula API\\\"\\\"\\\"\\n response = self.httpx.get(f\\\"/formulas/{formula_uri}/tools\\\")\\n response.raise_for_status()\\n \\n try:\\n return response.json().get(\\\"tools\\\", [])\\n except json.JSONDecodeError as e:\\n print(f\\\"Error: Unable to parse JSON (status code: {response.status_code})\\\")\\n print(f\\\"Response content: {response.text[:500]}\\\")\\n raise\\n \\n def call_tool(self, formula_uri: str, function: str, args: dict):\\n \\\"\\\"\\\"Call an official tool\\\"\\\"\\\"\\n response = self.httpx.post(\\n f\\\"/formulas/{formula_uri}/fibers\\\",\\n json={\\\"name\\\": function, \\\"arguments\\\": json.dumps(args)},\\n )\\n response.raise_for_status()\\n fiber = response.json()\\n \\n if fiber.get(\\\"status\\\", \\\"\\\") == \\\"succeeded\\\":\\n return fiber[\\\"context\\\"].get(\\\"output\\\") or fiber[\\\"context\\\"].get(\\\"encrypted_output\\\")\\n \\n if \\\"error\\\" in fiber:\\n return f\\\"Error: {fiber['error']}\\\"\\n if \\\"error\\\" in fiber.get(\\\"context\\\", {}):\\n return f\\\"Error: {fiber['context']['error']}\\\"\\n return \\\"Error: Unknown error\\\"\\n \\n def close(self):\\n \\\"\\\"\\\"Close the client connection\\\"\\\"\\\"\\n self.httpx.close()\\n \\n \\n# Initialize client\\nbase_url = os.getenv(\\\"MOONSHOT_BASE_URL\\\", \\\"https://api.moonshot.cn/v1\\\")\\napi_key = os.getenv(\\\"MOONSHOT_API_KEY\\\")\\n \\nif not api_key:\\n raise ValueError(\\\"MOONSHOT_API_KEY environment variable not set. Please set your API key.\\\")\\n \\nprint(f\\\"Base URL: {base_url}\\\")\\nprint(f\\\"API Key: {api_key[:10]}...{api_key[-10:] if len(api_key) > 20 else api_key}\\\\n\\\")\\n \\nclient = FormulaChatClient(base_url, api_key)\\n \\n# Define the official tool Formula URIs to use\\nformula_uris = [\\n \\\"moonshot/date:latest\\\",\\n \\\"moonshot/web-search:latest\\\"\\n]\\n \\n# Load all tool definitions and build mapping\\nprint(\\\"Loading official tools...\\\")\\nall_tools = []\\ntool_to_uri = {} # function.name -> formula_uri\\n \\nfor uri in formula_uris:\\n try:\\n tools = client.get_tools(uri)\\n for tool in tools:\\n func = tool.get(\\\"function\\\")\\n if func:\\n func_name = func.get(\\\"name\\\")\\n if func_name:\\n tool_to_uri[func_name] = uri\\n all_tools.append(tool)\\n print(f\\\" Loaded tool: {func_name} from {uri}\\\")\\n except Exception as e:\\n print(f\\\" Warning: Failed to load tool {uri}: {e}\\\")\\n continue\\n \\nprint(f\\\"Loaded {len(all_tools)} tools in total\\\\n\\\")\\n \\nif not all_tools:\\n raise ValueError(\\\"No tools loaded. Please check API key and network connection.\\\")\\n \\n# Initialize message list\\nmessages = [\\n {\\n \\\"role\\\": \\\"system\\\",\\n \\\"content\\\": \\\"You are Kimi, a professional news analyst. 
You excel at collecting, analyzing, and organizing information to generate high-quality news reports.\\\",\\n },\\n]\\n \\n# User request to generate today's news report\\nuser_request = \\\"Please help me generate a daily news report including important technology, economy, and society news.\\\"\\nmessages.append({\\n \\\"role\\\": \\\"user\\\",\\n \\\"content\\\": user_request\\n})\\n \\nprint(f\\\"User request: {user_request}\\\\n\\\")\\n \\n# Begin multi-turn conversation loop\\nmax_iterations = 10 # Prevent infinite loops\\nfor iteration in range(max_iterations):\\n try:\\n completion = client.openai.chat.completions.create(\\n model=client.model,\\n messages=messages,\\n max_tokens=1024 * 32,\\n tools=all_tools,\\n temperature=1.0,\\n )\\n except openai.AuthenticationError as e:\\n print(f\\\"Authentication error: {e}\\\")\\n print(\\\"Please check if the API key is correct and has the required permissions\\\")\\n raise\\n except Exception as e:\\n print(f\\\"Error while calling the model: {e}\\\")\\n raise\\n \\n # Get response\\n message = completion.choices[0].message\\n \\n # Print reasoning process\\n if hasattr(message, \\\"reasoning_content\\\"):\\n print(f\\\"=============Reasoning round {iteration + 1} starts=============\\\")\\n reasoning = getattr(message, \\\"reasoning_content\\\")\\n if reasoning:\\n print(reasoning[:500] + \\\"...\\\" if len(reasoning) > 500 else reasoning)\\n print(f\\\"=============Reasoning round {iteration + 1} ends=============\\\\n\\\")\\n \\n # Add assistant message to context (preserve reasoning_content)\\n messages.append(message)\\n \\n # If the model did not call any tools, conversation is done\\n if not message.tool_calls:\\n print(\\\"=============Final Answer=============\\\")\\n print(message.content)\\n break\\n \\n # Handle tool calls\\n print(f\\\"The model decided to call {len(message.tool_calls)} tool(s):\\\\n\\\")\\n \\n for tool_call in message.tool_calls:\\n func_name = tool_call.function.name\\n args = json.loads(tool_call.function.arguments)\\n \\n print(f\\\"Calling tool: {func_name}\\\")\\n print(f\\\"Arguments: {json.dumps(args, ensure_ascii=False, indent=2)}\\\")\\n \\n # Get corresponding formula_uri\\n formula_uri = tool_to_uri.get(func_name)\\n if not formula_uri:\\n print(f\\\"Error: Could not find Formula URI for tool {func_name}\\\")\\n continue\\n \\n # Call the tool\\n result = client.call_tool(formula_uri, func_name, args)\\n \\n # Print result (truncate if too long)\\n if len(str(result)) > 200:\\n print(f\\\"Tool result: {str(result)[:200]}...\\\\n\\\")\\n else:\\n print(f\\\"Tool result: {result}\\\\n\\\")\\n \\n # Add tool result to message list\\n tool_message = {\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": func_name,\\n \\\"content\\\": result\\n }\\n messages.append(tool_message)\\n \\nprint(\\\"\\\\nConversation completed!\\\")\\n \\n# Cleanup\\nclient.close()\\n```\\n\\nThis process demonstrates how the `kimi-k2-thinking` model uses deep reasoning to plan and execute complex multi-step tasks, with detailed reasoning steps (`reasoning_content`) preserved in the context to ensure accurate tool use at every stage.\\n\\n## \\n\\n### \\n\\nA: Keeping the `reasoning_content` ensures the model maintains reasoning continuity in multi-turn conversations, especially when calling tools. The server will automatically handle these fields; users do not need to manage them manually.\\n\\n### \\n\\nA: Yes, `reasoning_content` counts towards your input/output token quota. 
For detailed pricing, please refer to MoonshotAI's pricing documentation.\\n\\n[Kimi K2 Quickstart](/docs/guide/kimi-k2-quickstart \\\"Kimi K2 Quickstart\\\")[Quickstart with Kimi API](/docs/guide/start-using-kimi-api \\\"Quickstart with Kimi API\\\")\\n==================================================\"}",
"timestamp": "2025-11-09T11:50:57.058186",
"tool_call_id": "extract_webpage:5",
"name": "extract_webpage"
},
{
"role": "assistant",
"content": "让我再查一下是否存在关于禁用思考功能的参数说明。",
"timestamp": "2025-11-09T11:51:01.784881"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-09T11:51:01.800322",
"tool_calls": [
{
"id": "web_search:6",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"\\\"thinking\\\" parameter \\\"disabled\\\" moonshot api kimi-k2-thinking documentation\", \"max_results\": 8}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: \\\"thinking\\\" parameter \\\"disabled\\\" moonshot api kimi-k2-thinking documentation\\n📅 搜索时间: 2025-11-09T11:51:05.124344\\n🎯 过滤条件: Topic: general\\n\\n📝 AI摘要:\\nThe \\\"thinking\\\" parameter in the Kimi K2 Thinking API is disabled by default. To enable it, set the model parameter to \\\"kimi-k2-thinking\\\" in the API request. The model supports multi-step reasoning and tool use.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. Using the kimi-k2-thinking model\\n 🔗 https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\\n 📄 🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The ...\\n\\n2. How to Use Kimi K2 Thinking API\\n 🔗 https://apidog.com/blog/kimi-k2-thinking-api/\\n 📄 Kimi K2 Thinking represents Moonshot AI's bold step forward in open-source AI, launching as a specialized thinking agent model optimized for sequential reasoning and tool use. At its core, Kimi K2 Thi...\\n\\n3. Kimi K2 Thinking is Here and It Beats GPT-5!\\n 🔗 https://www.analyticsvidhya.com/blog/2025/11/kimi-k2-thinking/\\n 📄 You can access the model via the chatbot.\\n Weights and code are available on Hugging Face.\\n Via API, you can simply use it by switching the `model` parameter:\\n\\n```\\n$ curl \\\\ -H \\\"Content-Type: applicat...\\n\\n4. Kimi K2 Thinking: 1T-A32B params, SOTA HLE, BrowseComp ...\\n 🔗 https://news.smol.ai/issues/25-11-06-kimi-k2/\\n 📄 Kimi released Kimi K2 Thinking, an open-source trillion-parameter reasoning model (Activity: 778): Kimi K2 Thinking is a newly released open-source trillion-parameter reasoning model by Moonshot AI, a...\\n\\n5. moonshotai/Kimi-K2-Thinking\\n 🔗 https://huggingface.co/moonshotai/Kimi-K2-Thinking\\n 📄 messages.append({\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": tool_call_name,\\n \\\"content\\\": json.dumps(tool_resul...\\n\\n6. Kimi K2 Thinking: How to Run Locally | Unsloth Documentation\\n 🔗 https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally\\n 📄 Kimi-K2 and Kimi-K2-Thinking achieve SOTA performance in knowledge, reasoning, coding, and agentic tasks. The full 1T parameter models from Moonshot AI requires 1.09TB of disk space, while the quantiz...\\n\\n7. karthink/gptel: A simple LLM client for Emacs\\n 🔗 https://github.com/karthink/gptel\\n 📄 To use Claude 3.7 Sonnet model in its “thinking” mode, you can define a second Claude backend and select it via the UI or elisp:\\n\\n```\\n\\\"Claude-thinking\\\" \\\" \\\";Any name you want;:key\\\"your-API-key\\\" \\\" \\\":str...\\n\\n8. Migrating from OpenAI to Kimi API\\n 🔗 https://platform.moonshot.ai/docs/guide/migrating-from-openai-to-kimi\\n 📄 \\\"content\\\": \\\"You are Kimi, an artificial intelligence assistant provided by Moonshot AI. You are more proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate...\", \"filters\": {\"topic\": \"general\"}, \"query\": \"\\\"thinking\\\" parameter \\\"disabled\\\" moonshot api kimi-k2-thinking documentation\", \"results\": [{\"index\": 1, \"title\": \"Using the kimi-k2-thinking model\", \"url\": \"https://platform.moonshot.ai/docs/guide/use-kimi-k2-thinking-model\", \"content\": \"🎉 New kimi-k2-thinking model released! Now supports multi-step reasoning with tool use. 
Try it now!\\n\\nDocs\\n\\nGetting Started Guide\\n\\nUse kimi-k2-thinking Model\\n\\n# Using the kimi-k2-thinking model\\n\\n> The `kimi-k2-thinking` model is a general-purpose agentic reasoning model developed by Moonshot AI. Thanks to its strength in deep reasoning and multi-turn tool use, it can solve even the hardest problems.\\n\\n## \\n\\nYou can simply use it by switching the `model` parameter: [...] ```\\n$ curl \\\\\\n -H \\\"Content-Type: application/json\\\" \\\\\\n -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\\\n -d '{\\n \\\"model\\\": \\\"kimi-k2-thinking\\\",\\n \\\"messages\\\": [\\n {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"hello\\\"}\\n ],\\n \\\"temperature\\\": 1.0\\n }'\\n \\n```\\n\\n## \\n\\nIn the `kimi-k2-thinking` model API response, we use the `reasoning_content` field as the carrier for the model's reasoning. About the `reasoning_content` field: [...] ## \\n\\n`kimi-k2-thinking` is designed to perform deep reasoning across multiple tool calls, enabling it to tackle highly complex tasks.\\n\\n### \\n\\nTo get reliable results, always follow these rules when calling the kimi-k2-thinking model :\", \"score\": 0.6609831, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"How to Use Kimi K2 Thinking API\", \"url\": \"https://apidog.com/blog/kimi-k2-thinking-api/\", \"content\": \"Kimi K2 Thinking represents Moonshot AI's bold step forward in open-source AI, launching as a specialized thinking agent model optimized for sequential reasoning and tool use. At its core, Kimi K2 Thinking is built to simulate human-like deliberation, processing queries through extended \\\"thinking\\\" tokens that enable multi-turn tool interactions without constant human input. This model, available via API for developers, excels in environments demanding prolonged context retention and adaptive [...] As AI models push the boundaries of reasoning and agentic capabilities, Kimi K2 Thinking emerges as a standout innovation from Moonshot AI, blending open-source accessibility with enterprise-grade performance. This trillion-parameter thinking agent model redefines how developers interact with large language models, particularly through its robust API. Designed for tasks requiring deep inference and tool chaining, the Kimi K2 Thinking API allows seamless integration into applications, from [...] Securing access to the Kimi K2 Thinking API starts at platform.moonshot.ai. Sign up or log in, then navigate to the API console under \\\"API Keys.\\\" Click \\\"Create New Key,\\\" select permissions (e.g., chat completions, tool calls), and generate—copy the key immediately, as it's shown once.\\n\\nWith the key, configure your SDK as above. Test with a simple curl:\", \"score\": 0.6407488, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"Kimi K2 Thinking is Here and It Beats GPT-5!\", \"url\": \"https://www.analyticsvidhya.com/blog/2025/11/kimi-k2-thinking/\", \"content\": \"You can access the model via the chatbot.\\n Weights and code are available on Hugging Face.\\n Via API, you can simply use it by switching the `model` parameter:\\n\\n```\\n$ curl \\\\ -H \\\"Content-Type: application/json\\\" \\\\ -H \\\"Authorization: Bearer $MOONSHOT_API_KEY\\\" \\\\ -d '{ \\\"model\\\": \\\"kimi-k2-thinking\\\", \\\"messages\\\": [ {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"hello\\\"} ], \\\"temperature\\\": 1.0 }'\\n```\\n\\nFor more details on API use, checkout this guide.\\n\\nAlso Read: Kimi OK Computer: A Hands-On Guide to the Free AI Agent [...] 
Out of all the Chinese AI models available today, Moonshots Kimi is my personal favorite! Whether its generating slides from a single prompt or performing agentic web browsing, Kimi truly does it all. Just when we thought Kimi K2 was their best model, Moonshot launched an even more powerful upgrade: Kimi K2 Thinking. It is an open-source thinking agent model designed to reason, plan, and act autonomously. Built on test-time scaling, K2 Thinking dynamically expands its reasoning steps and tool [...] Kimi K2 Thinking delivered an outstanding performance on the task of simulating a historically grounded debate between Nikola Tesla and Thomas Edison on the ethics of modern AI. It accurately reflected each inventors documented philosophies. Teslas idealism, emphasis on open knowledge, and vision of technology serving humanity, versus Edisons pragmatism, commercial protectionism, and belief in controlled innovation. Extended these worldviews coherently to contemporary issues like deepfakes,\", \"score\": 0.6100127, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"Kimi K2 Thinking: 1T-A32B params, SOTA HLE, BrowseComp ...\", \"url\": \"https://news.smol.ai/issues/25-11-06-kimi-k2/\", \"content\": \"Kimi released Kimi K2 Thinking, an open-source trillion-parameter reasoning model (Activity: 778): Kimi K2 Thinking is a newly released open-source trillion-parameter reasoning model by Moonshot AI, available on Hugging Face. The model is designed to achieve state-of-the-art (SOTA) performance on the HLE benchmark, showcasing its advanced reasoning capabilities. The technical blog provides insights into its architecture and implementation, emphasizing its potential for high-performance [...] Kimi K2 Thinking Goes Agentic, Breaks SOTA: Moonshot AI launched Kimi K2 Thinking with a 256K context window and autonomous 200300 tool calls, claiming SOTA on HLE (44.9%) and BrowseComp (60.2%); see the technical blog Kimi K2 Thinking and weights on moonshotai at Hugging Face. The model targets reasoning, agentic search, and coding, and is live on kimi.com with API at platform.moonshot.ai. [...] Kimi K2 Thinking Huggingface (Activity: 250): Kimi K2 Thinking is a cutting-edge open-source reasoning model from Huggingface, featuring a `1 trillion` parameter Mixture-of-Experts (MoE) architecture. It utilizes native `INT4` quantization, contrary to the stated `I32`, and employs Quantization-Aware Training (QAT) for enhanced inference speed and accuracy. The model excels in benchmarks like Humanitys Last Exam (HLE) and BrowseComp, supporting `200-300` tool calls with stable long-horizon\", \"score\": 0.5914643, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"moonshotai/Kimi-K2-Thinking\", \"url\": \"https://huggingface.co/moonshotai/Kimi-K2-Thinking\", \"content\": \"messages.append({\\n \\\"role\\\": \\\"tool\\\",\\n \\\"tool_call_id\\\": tool_call.id,\\n \\\"name\\\": tool_call_name,\\n \\\"content\\\": json.dumps(tool_result)\\n })\\n print(\\\"-\\\" 100)\\n print(choice.message.content) [...] Kimi K2 Thinking is the latest, most capable version of open-source thinking model. Starting with Kimi K2, we built it as a thinking agent that reasons step-by-step while dynamically invoking tools. It sets a new state-of-the-art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by dramatically scaling multi-step reasoning depth and maintaining stable tool-use across 200300 sequential calls. At the same time, K2 Thinking is a native INT4 quantization model with 256k context [...] 
print(f\\\"k2 answer: {response.choices.message.content}\\\")\\n print(\\\"=====below is reasoning content======\\\")\\n print(f\\\"reasoning content: {response.choices.message.reasoning_content}\\\")\", \"score\": 0.58465123, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"Kimi K2 Thinking: How to Run Locally | Unsloth Documentation\", \"url\": \"https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally\", \"content\": \"Kimi-K2 and Kimi-K2-Thinking achieve SOTA performance in knowledge, reasoning, coding, and agentic tasks. The full 1T parameter models from Moonshot AI requires 1.09TB of disk space, while the quantized Unsloth Dynamic 1.8-bit version reduces this to just 230GB (-80% size): Kimi-K2-GGUF\\n\\nYou can now run Kimi-K2-Thinking with our new GGUFs. Use our same settings below but ensure you change the model name from 'Kimi-K2-Instruct' to 'Kimi-K2-Instruct-0905': K2-0905 GGUFs [...] ### 🤔No Thinking Tags?\\n\\nYou may notice that there are no thinking tags when you run the model. This is normal and intended behavior.\\n\\nIn your `llama.cpp` script, make sure to include the `--special` flag at the very end of your command. Once you do, youll see the token appear as expected. [...] ```\\n You are Kimi, an AI assistant created by Moonshot AI.You are Kimi, an AI assistant created by Moonshot AI.\\n ```\\n\\nWe recommend setting min\\\\_p to 0.01 to suppress the occurrence of unlikely tokens with low probabilities.\\n\\n### 🔢 Chat template and prompt format\", \"score\": 0.5835125, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"karthink/gptel: A simple LLM client for Emacs\", \"url\": \"https://github.com/karthink/gptel\", \"content\": \"To use Claude 3.7 Sonnet model in its “thinking” mode, you can define a second Claude backend and select it via the UI or elisp:\\n\\n```\\n\\\"Claude-thinking\\\" \\\" \\\";Any name you want;:key\\\"your-API-key\\\" \\\" \\\":stream t:models:request-params:thinking:type \\\"enabled\\\" \\\" \\\":budget_tokens 2048:max_tokens 4096\\n```\\n\\nYou can set the reasoning budget tokens and max tokens for this usage via the `:budget_tokens` and `:max_tokens` keys here, respectively.\", \"score\": 0.5177909, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"Migrating from OpenAI to Kimi API\", \"url\": \"https://platform.moonshot.ai/docs/guide/migrating-from-openai-to-kimi\", \"content\": \"\\\"content\\\": \\\"You are Kimi, an artificial intelligence assistant provided by Moonshot AI. You are more proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate answers. You also refuse to answer any questions involving terrorism, racism, pornography, or violence. Moonshot AI is a proper noun and should not be translated into other languages.\\\"},\", \"score\": 0.48499045, \"published_date\": \"\"}], \"total_results\": 8}",
"timestamp": "2025-11-09T11:51:05.128304",
"tool_call_id": "web_search:6",
"name": "web_search"
},
{
"role": "assistant",
"content": "基于我查到的官方文档和技术资料,我可以明确回答你的问题:\n\n## 答案:**仍然会思考**\n\n### 原因分析:\n\n1. **Kimi-k2-thinking模型没有thinking参数**\n - 官方API文档中没有任何关于`thinking`参数的说明\n - 该模型是**专门设计为思考模式**的,思考是其固有特性\n\n2. **模型架构决定**\n - `kimi-k2-thinking`是一个**思考代理模型**thinking agent\n - 它的核心特性就是**原生具备思考能力**\n - 无论传入什么参数,它都会进行推理过程\n\n3. **思考过程不可禁用**\n - 官方文档强调该模型\"擅长深度推理\"\n - 会输出`reasoning_content`字段显示思考内容\n - 这是模型的**内置行为**,无法通过参数关闭\n\n### 正确的控制方式:\n\n| 需求 | 正确做法 | 错误做法 |\n|------|----------|----------|\n| 启用思考 | 使用`model: \"kimi-k2-thinking\"` | 传`thinking: true` |\n| 禁用思考 | 使用`model: \"kimi-k2-instruct\"` | 传`thinking: false` |\n\n### 总结:\n传入`thinking: disabled`参数**不会生效**`kimi-k2-thinking`模型**仍然会思考**。如果你想禁用思考功能,应该使用`kimi-k2-instruct`模型而不是试图通过参数控制思考行为。",
"timestamp": "2025-11-09T11:51:23.959223"
}
],
"todo_list": null,
"metadata": {
"project_path": "/opt/agent/agents/users/jojo/project",
"project_relative_path": "users/jojo/project",
"thinking_mode": false,
"total_messages": 25,
"total_tools": 14,
"status": "active"
},
"token_statistics": {
"total_input_tokens": 266246,
"total_output_tokens": 1673,
"updated_at": "2025-11-09T12:05:51.659516"
}
}