{
"id": "conv_20251103_135635_404",
"title": "I heard cursor recently released its own model, but it was allegedly found to be fine-tuned from a deepseek or qwen model...",
"created_at": "2025-11-03T13:56:35.405015",
"updated_at": "2025-11-03T15:46:45.579555",
"messages": [
{
"role": "user",
"content": "I heard cursor recently released its own model, but it was allegedly discovered to be fine-tuned from a deepseek or qwen model. Is that true?",
"timestamp": "2025-11-03T13:57:21.994350"
},
{
"role": "assistant",
"content": "Let me search for information on whether Cursor's latest model was fine-tuned from a DeepSeek or Qwen model.",
"timestamp": "2025-11-03T13:57:24.393553"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-03T13:57:24.396854",
"tool_calls": [
{
"id": "web_search:0",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"Cursor IDE latest model deepseek qwen fine tuning controversy\", \"time_range\": \"week\", \"max_results\": 10}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: Cursor IDE latest model deepseek qwen fine tuning controversy\\n📅 搜索时间: 2025-11-03T13:57:30.085520\\n🎯 过滤条件: Topic: general | Time Range: week\\n\\n📝 AI摘要:\\nCursor's latest model, Composer, is a proprietary mixture-of-experts model fine-tuned for coding, not derived from existing open-source models. The controversy surrounds its claims of 4x speed improvement over similar models. Cursor has been evasive about the base model's origins.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. Cursor Composer: Breaking Down the 4x Speed Claim in AI ...\\n 🔗 https://medium.com/@meshuggah22/cursor-composer-breaking-down-the-4x-speed-claim-in-ai-assisted-coding-b0b0f911bfe0\\n 📄 Based on what I’ve managed to find Cursor’s Composer isn’t a fine-tuned Llama or Qwen model. While the team does not share a lot of details about the architecture of the model, what’s clear is their t...\\n\\n2. Fine Tune Qwen2.5-VL-32B in 3 Days Complete Hands On Tutorial\\n 🔗 https://skywork.ai/blog/llm/fine-tune-qwen2-5-vl-32b-in-3-days-complete-hands-on-tutorial/\\n 📄 Cold test set (50 items, 2025‑10‑25, not seen in training):\\n\\nAnecdote: on a cluttered dashboard with overlapping tooltips, base model guessed “flat trend.” The fine‑tuned model said “up 6.2% week‑over...\\n\\n3. Simon Willison on llms\\n 🔗 https://simonwillison.net/tags/llms/\\n 📄 DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running on the NVIDIA ...\\n\\n4. Simon Willison on coding-agents\\n 🔗 https://simonwillison.net/tags/coding-agents/\\n 📄 DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running on the NVIDIA ...\\n\\n5. New prompt injection papers: Agents Rule of Two and The Attacker ...\\n 🔗 https://simonw.substack.com/p/new-prompt-injection-papers-agents\\n 📄 One detail that’s notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM?\\n\\nCursor resear...\\n\\n6. Simon Willison on generative-ai\\n 🔗 https://simonwillison.net/tags/generative-ai/\\n 📄 One detail that's notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM?\\n\\nCursor resear...\\n\\n7. not much happened today | AINews - AI News (Smol)\\n 🔗 https://news.smol.ai/issues/25-10-30-not-much/\\n 📄 4. Long-Context Engineering: Kimi’s Linear Attention Push\\n\\n`Kimi Delta Attention`\\n\\n# Discord: High level Discord summaries\\n\\n## Perplexity AI Discord\\n\\n## LMArena Discord\\n\\n## Cursor Community Discord\\n\\n#...\\n\\n8. Cursor 2.0 and Composer: how a multi-agent rethink surprised AI ...\\n 🔗 https://www.cometapi.com/cursor-2-0-what-changed-and-why-it-matters/\\n 📄 Cursor’s new model, Composer, is a frontier coding model trained and optimized specifically for agentic interactions inside the Cursor environment. It emphasizes throughput and responsiveness — the es...\\n\\n9. Do Large Language Models Grasp The Grammar? Evidence ... 
- arXiv\\n 🔗 https://arxiv.org/html/2510.24856v1\\n 📄 Qwen/Qwen3-4B-Instruct-2507\\n0.15 ± 0.13\\n8.12 ± 7.04\\n26.74 ± 7.55\\n0.86 ± 0.01\\n0.90 ± 0.01\\n0.82 ± 0.01\\n0.52 ± 0.01\\n\\nQwen/Qwen3-30B-A3B-Instruct-2507\\n0.26 ± 0.13\\n14.40 ± 10.05\\n35.84 ± 8.85\\n0.85 ± 0.01\\n0....\\n\\n10. Untitled\\n 🔗 http://www.williballenthin.com/homepage/\\n 📄 Cursor released their own model this week: Composer-1. Here's Simon Willison's take on it. Personally, I've played around with it and while the speed is impressive it also isn't that smart and I wonde...\", \"filters\": {\"topic\": \"general\", \"time_range\": \"week\"}, \"query\": \"Cursor IDE latest model deepseek qwen fine tuning controversy\", \"results\": [{\"index\": 1, \"title\": \"Cursor Composer: Breaking Down the 4x Speed Claim in AI ...\", \"url\": \"https://medium.com/@meshuggah22/cursor-composer-breaking-down-the-4x-speed-claim-in-ai-assisted-coding-b0b0f911bfe0\", \"content\": \"Based on what I’ve managed to find Cursor’s Composer isn’t a fine-tuned Llama or Qwen model. While the team does not share a lot of details about the architecture of the model, what’s clear is their training approach:\\n\\nReinforcement Learning in Production Environments [...] Sign up\\n\\nSign in\\n\\nSign up\\n\\nSign in\\n\\nMember-only story\\n\\n# Cursor Composer: Breaking Down the 4x Speed Claim in AI-Assisted Coding\\n\\nPawel\\n\\n--\\n\\nShare\\n\\nCursor has just released version 2.0 alongside Composer — their first proprietary language model. This isn’t just another API wrapper around Claude or GPT. This is Cursor training their own mixture-of-experts model from the ground up, optimized specifically for one thing: making you code faster. [...] Their headline claim? 4x faster than similarly intelligent models. But in an industry drowning in benchmark hype, what does that actually mean? After diving deep into the technical details, community feedback, and preparing a systematic test suite, here’s what you need to know about Cursor’s bold move into model development.\\n\\n### What Makes Composer Different\", \"score\": 0.65940565, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"Fine Tune Qwen2.5-VL-32B in 3 Days Complete Hands On Tutorial\", \"url\": \"https://skywork.ai/blog/llm/fine-tune-qwen2-5-vl-32b-in-3-days-complete-hands-on-tutorial/\", \"content\": \"Cold test set (50 items, 2025‑10‑25, not seen in training):\\n\\nAnecdote: on a cluttered dashboard with overlapping tooltips, base model guessed “flat trend.” The fine‑tuned model said “up 6.2% week‑over‑week,” which matched the ground truth. Not perfect everywhere, but on my “annoying real stuff,” it felt notably more reliable.\\n\\nNote: these are my numbers on my data. If you work with docs heavy on OCR, you’ll need more samples and stricter preprocessing. [...] Would I do it again? Yep. With a weekend, Colab, and 1–2k labeled examples, qwen2 5 vl fine tune turned from “cool demo” into a sharper teammate. If you try it, send me your weirdest screenshots, I’m curious how it handles them.\\n\\nPrevious posts:\\n\\n### About The Author\\n\\n#### Camille\\n\\nQwen3-VL GUI Automation 2025 Visual Agent Revolution\\n\\n## Related Posts\\n\\nDécouvrez comment DeepSeek-V3.2-Exp for Customer Service\\n\\n### Découvrez comment DeepSeek-V3.2-Exp for Customer Service [...] I’m Camille. 
On October 24, 2025, I opened a folder of messy screenshots from a product demo and thought, “Can I make Qwen2.5‑VL actually understand these without babying it?” That tiny annoyance kicked off a weekend rabbit hole: Qwen2.5 vl fine-tune with LoRA, a small custom dataset, and a Gradio front end to see the results live. Not sponsored, just honest results and a few “oh wow” moments.\\n\\n## Why Fine-Tune Qwen2.5-VL?\", \"score\": 0.5957048, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"Simon Willison on llms\", \"url\": \"https://simonwillison.net/tags/llms/\", \"content\": \"DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running on the NVIDIA Spark by having Claude Code effectively brute force the challenge of getting it working on that particular hardware. [...] DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running on the NVIDIA Spark by having Claude Code effectively brute force the challenge of getting it working on that particular hardware.\\n\\n... [1,971 words] [...] One detail that's notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM?\\n\\nCursor researcher Sasha Rush has been answering questions on Hacker News, but has so far been evasive in answering questions about the base model. When directly asked \\\"is Composer a fine tune of an existing open source base model?\\\" they replied:\", \"score\": 0.56200886, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"Simon Willison on coding-agents\", \"url\": \"https://simonwillison.net/tags/coding-agents/\", \"content\": \"DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running on the NVIDIA Spark by having Claude Code effectively brute force the challenge of getting it working on that particular hardware.\\n\\n... [1,971 words] [...] One detail that's notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM?\\n\\nCursor researcher Sasha Rush has been answering questions on Hacker News, but has so far been evasive in answering questions about the base model. When directly asked \\\"is Composer a fine tune of an existing open source base model?\\\" they replied: [...] DeepSeek released a new model yesterday: DeepSeek-OCR, a 6.6GB model fine-tuned specifically for OCR. They released it as model weights that run using PyTorch and CUDA. I got it running on the NVIDIA Spark by having Claude Code effectively brute force the challenge of getting it working on that particular hardware.\\n\\nAndrej Karpathy — AGI is still a decade away\\n(via)\\nExtremely high signal 2 hour 25 minute (!) 
conversation between Andrej Karpathy and Dwarkesh Patel.\", \"score\": 0.5081298, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"New prompt injection papers: Agents Rule of Two and The Attacker ...\", \"url\": \"https://simonw.substack.com/p/new-prompt-injection-papers-agents\", \"content\": \"One detail that’s notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM?\\n\\nCursor researcher Sasha Rush has been answering questions on Hacker News, but has so far been evasive in answering questions about the base model. When directly asked “is Composer a fine tune of an existing open source base model?” they replied: [...] It’s interesting to see Cursor investing resources in training their own code-specific model - similar to GPT-5-Codex or Qwen3-Coder. From their post:\\n\\nComposer is a mixture-of-experts (MoE) language model supporting long-context generation and understanding. It is specialized for software engineering through reinforcement learning (RL) in a diverse range of development environments. [...] [...] Link 2025-10-29 Composer: Building a fast frontier model with RL:\\n\\nCursor released Cursor 2.0 today, with a refreshed UI focused on agentic coding (and running agents in parallel) and a new model that’s unique to Cursor called Composer 1.\\n\\nAs far as I can tell there’s no way to call the model directly via an API, so I fired up “Ask” mode in Cursor’s chat side panel and asked it to “Generate an SVG of a pelican riding a bicycle”:\", \"score\": 0.5035894, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"Simon Willison on generative-ai\", \"url\": \"https://simonwillison.net/tags/generative-ai/\", \"content\": \"One detail that's notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM?\\n\\nCursor researcher Sasha Rush has been answering questions on Hacker News, but has so far been evasive in answering questions about the base model. When directly asked \\\"is Composer a fine tune of an existing open source base model?\\\" they replied: [...] It's interesting to see Cursor investing resources in training their own code-specific model - similar to GPT-5-Codex or Qwen3-Coder. From their post:\\n\\nComposer is a mixture-of-experts (MoE) language model supporting long-context generation and understanding. It is specialized for software engineering through reinforcement learning (RL) in a diverse range of development environments. [...] [...] Our primary focus is on RL post-training. We think that is the best way to get the model to be a strong interactive agent.\\n\\nSasha did confirm that rumors of an earlier Cursor preview model, Cheetah, being based on a model by xAI's Grok were \\\"Straight up untrue.\\\"\\n\\n# 29th October 2025,\\n8:45 pm\\n/ llm-release, cursor, pelican-riding-a-bicycle, generative-ai, ai, llms, parallel-agents, coding-agents, ai-assisted-programming\\n\\n### Hacking the WiFi-enabled color screen GitHub Universe conference badge\", \"score\": 0.4854295, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"not much happened today | AINews - AI News (Smol)\", \"url\": \"https://news.smol.ai/issues/25-10-30-not-much/\", \"content\": \"4. 
Long-Context Engineering: Kimi’s Linear Attention Push\\n\\n`Kimi Delta Attention`\\n\\n# Discord: High level Discord summaries\\n\\n## Perplexity AI Discord\\n\\n## LMArena Discord\\n\\n## Cursor Community Discord\\n\\n## Unsloth AI (Daniel Han) Discord\\n\\n## OpenRouter Discord\\n\\n## HuggingFace Discord\\n\\n## Modular (Mojo 🔥) Discord\\n\\n## LM Studio Discord\\n\\n`qwen/qwen3-vl-4b`\\n\\n## GPU MODE Discord\\n\\n`DeviceScan`\\n`single_pass.bin`\\n`quantize_`\\n\\n## Yannick Kilcher Discord\\n\\n## DSPy Discord\\n\\n## Latent Space Discord [...] ### Unsloth AI (Daniel Han) ▷ #general (221 messages🔥🔥):\\n\\n`RTX 8000 Turing Cards, Qwen3 finetuning, Kimi-Linear-48B-A3B-Instruct Model, Qwen 3 VL, GLM 4.6 model`\\n\\n`RTX 8000 Turing Cards, Qwen3 finetuning, Kimi-Linear-48B-A3B-Instruct Model, Qwen 3 VL, GLM 4.6 model`\\n\\n### Unsloth AI (Daniel Han) ▷ #off-topic (103 messages🔥🔥):\\n\\n`Backend latency improvements, VAE data sample requirements, Colab UI updates, Elon Musk's Grokipedia, Probabilistic computing` [...] `Qwen3VLCausalLMOutputWithPast and hidden states, Unsloth environment flags for debugging, triton_kernels installation issues, Offline loading with Unsloth, Mapping part of training stuck`\\n`Qwen3VLCausalLMOutputWithPast`\\n`os.environ[\\\"UNSLOTH_RETURN_HIDDEN_STATES\\\"] = \\\"1\\\"`\\n`os.environ[\\\"UNSLOTH_RETURN_LOGITS\\\"] = \\\"1\\\"`\\n`No module named 'triton_kernels'`\\n`triton_kernels`\\n`UNSLOTH_DISABLE_STATISTICS=\\\"\\\"`\\n`HF_HUB_OFFLINE=\\\"1\\\"`\\n`TRANSFORMERS_OFFLINE=\\\"1\\\"`\\n`local_files_only = True`\\n`dataset_text_field`\", \"score\": 0.36039954, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"Cursor 2.0 and Composer: how a multi-agent rethink surprised AI ...\", \"url\": \"https://www.cometapi.com/cursor-2-0-what-changed-and-why-it-matters/\", \"content\": \"Cursor’s new model, Composer, is a frontier coding model trained and optimized specifically for agentic interactions inside the Cursor environment. It emphasizes throughput and responsiveness — the essential properties for short, iterative coding turns inside an editor — over raw, expensive token-perplexity benchmarks. The company reports Composer achieves roughly 4× generation speed compared to similarly capable models and completes most interactive turns in under 30 seconds in their internal [...] Cursor, the AI-first code editor and agent platform, released Cursor 2.0 on October 29, 2025 — a major update that pairs a purpose-built coding model (named Composer) with a reimagined, agent-centered interface and a set of platform upgrades aimed at making agentic coding faster, safer, and more collaborative. The company positions this release as a strategic shift from plugging in third-party models to shipping its own trained model and tooling that are optimized end-to-end for “agentic” [...] AI assistants for code have moved beyond single-turn completions toward longer, multi-step workflows (plan → edit → test → iterate). That creates new bottlenecks — latency, context fidelity across large repos, safe execution of shell/CI commands, and the human work of reviewing agent changes. Cursor 2.0 directly targets those bottlenecks by shipping both a model tuned for agentic workflows and UI/architecture primitives to run many agents concurrently and safely. In short: Cursor is trying to\", \"score\": 0.35414317, \"published_date\": \"\"}, {\"index\": 9, \"title\": \"Do Large Language Models Grasp The Grammar? Evidence ... 
- arXiv\", \"url\": \"https://arxiv.org/html/2510.24856v1\", \"content\": \"Qwen/Qwen3-4B-Instruct-2507\\n0.15 ± 0.13\\n8.12 ± 7.04\\n26.74 ± 7.55\\n0.86 ± 0.01\\n0.90 ± 0.01\\n0.82 ± 0.01\\n0.52 ± 0.01\\n\\nQwen/Qwen3-30B-A3B-Instruct-2507\\n0.26 ± 0.13\\n14.40 ± 10.05\\n35.84 ± 8.85\\n0.85 ± 0.01\\n0.95 ± 0.01\\n0.81 ± 0.01\\n0.54 ± 0.01\\n\\nQwen/Qwen3-Next-80B-A3B-Instruct\\n0.26 ± 0.12\\n19.15 ± 11.46\\n40.31 ± 9.59\\n0.92 ± 0.01\\n0.95 ± 0.01\\n0.88 ± 0.01\\n0.57 ± 0.01\\n\\nQwen/Qwen3-Next-80B-A3B-Thinking\\n0.24 ± 0.09\\n18.53 ± 10.60\\n40.41 ± 8.70\\n0.95 ± 0.01\\n0.95 ± 0.01\\n0.91 ± 0.01\\n0.58 ± 0.01 [...] GPT-5\\n∼\\\\sim300B\\n2025-08\\n\\ngoogle/gemma-2-2b-it\\n2B\\n2024-06\\n\\ngoogle/gemma-2-9b-it\\n9B\\n2024-06\\n\\ngoogle/gemma-3-4b-it\\n4B\\n2025-03\\n\\ngoogle/gemma-3-12b-it\\n12B\\n2025-03\\n\\ngoogle/gemma-3-27b-it\\n27B\\n2025-03\\n\\nmeta-llama/Llama-3.2-1B-Instruct\\n1B\\n2024-04\\n\\nmeta-llama/Llama-3.2-3B-Instruct\\n3B\\n2024-04\\n\\nmeta-llama/Llama-3.3-70B-Instruct\\n70B\\n2024-10\\n\\nQwen/Qwen3-4B-Instruct-2507\\n4B\\n2025-07\\n\\nQwen/Qwen3-30B-A3B-Instruct-2507\\n30B\\n2025-07\\n\\nQwen/Qwen3-Next-80B-A3B-Instruct\\n80B\\n2025-09 [...] grammatical understanding. However, as model size increases, translation capabilities strengthen and can possibly surpass grammatical understanding. Moreover, translation improvements appear to encounter capacity-related bottlenecks beyond a certain scale, at which point both grammatical understanding and translation plateau, as exemplified by transitioning from Gemma-3-12B to 27B, or Qwen3-30B to larger variants.\", \"score\": 0.1799484, \"published_date\": \"\"}, {\"index\": 10, \"title\": \"Untitled\", \"url\": \"http://www.williballenthin.com/homepage/\", \"content\": \"Cursor released their own model this week: Composer-1. Here's Simon Willison's take on it. Personally, I've played around with it and while the speed is impressive it also isn't that smart and I wonder how that will shake out. We've done experiments with fast-but-not-as-smart models in Amp and when I went to measure the difference between a slow-but-smart and a fast-but-dumb model, the slow model turned out to be faster every time, when measuring prompt-to-result time, because the [...] Thought this was the perfect opportunity to create an actual useful app and do\\nbulk PDF processing with vision models. Spun up qwen2.5vl:32b on Ollama and\\nbuilt a pipeline:\\n\\nTook about 3-4 hours to process everything locally. Found the proof I needed\\non page 287 of their Declaration. Sent them the evidence, but ofc still\\nwaiting to hear back.\\n\\nFinally justified the purpose of this rig lol. [...] Both Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? | Hey, what's going on? Are Chinese models saving American startups? submitted by /u/Successful-Newt1517 \\n(\\n( \\n---|---\\n\\nBoth Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models?\\n\\n[](\\n\\ngenerated: November 03, 2025 at 03:28:46\", \"score\": 0.13455996, \"published_date\": \"\"}], \"total_results\": 10}",
"timestamp": "2025-11-03T13:57:30.089658",
"tool_call_id": "web_search:0",
"name": "web_search"
},
{
"role": "assistant",
"content": "Let me search for more specific information about this controversy:",
"timestamp": "2025-11-03T13:57:32.397146"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-03T13:57:32.400966",
"tool_calls": [
{
"id": "web_search:1",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"Cursor Composer model base model deepseek qwen speculation controversy\", \"time_range\": \"week\", \"max_results\": 8}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: Cursor Composer model base model deepseek qwen speculation controversy\\n📅 搜索时间: 2025-11-03T13:57:36.517534\\n🎯 过滤条件: Topic: general | Time Range: week\\n\\n📝 AI摘要:\\nCursor Composer is a proprietary language model by Cursor, and speculation exists about its base model origins. Controversy surrounds its potential ties to Chinese AI labs. Cursor's latest model focuses on agentic coding capabilities.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. Both Cursor and Cognition (Windsurf) new models are speculated to ...\\n 🔗 https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/both_cursor_and_cognition_windsurf_new_models_are/\\n 📄 Both Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? Skip to main contentBoth Cursor and Cognition (Windsurf) new models are speculated to be built on Chi...\\n\\n2. Cursor Composer: Breaking Down the 4x Speed Claim in AI ...\\n 🔗 https://medium.com/@meshuggah22/cursor-composer-breaking-down-the-4x-speed-claim-in-ai-assisted-coding-b0b0f911bfe0\\n 📄 # Cursor Composer: Breaking Down the 4x Speed Claim in AI-Assisted Coding Cursor has just released version 2.0 alongside Composer — their first proprietary language model. This is Cursor training thei...\\n\\n3. New prompt injection papers: Agents Rule of Two and The Attacker ...\\n 🔗 https://simonw.substack.com/p/new-prompt-injection-papers-agents\\n 📄 Cursor released Cursor 2.0 today, with a refreshed UI focused on agentic coding (and running agents in parallel) and a new model that’s unique to Cursor called **Composer 1**. This is in line with a n...\\n\\n4. Simon Willison on generative-ai\\n 🔗 https://simonwillison.net/tags/generative-ai/\\n 📄 [6:11 pm](https://simonwillison.net/2025/Sep/29/claude-sonnet-4-5/ \\\"Permalink for \\\\\\\"Claude Sonnet 4.5 is probably the \\\\\\\"best coding model in the world\\\\\\\" (at least for now)\\\\\\\"\\\") / [29th September 2025](...\\n\\n5. not much happened today | AINews - AI News (Smol)\\n 🔗 https://news.smol.ai/issues/25-10-30-not-much/\\n 📄 The guide is designed to share insights on what strategies have been effective and which have not, aiming to help practitioners build reliable LLMs. The playbook is available on Hugging Face’s platfor...\\n\\n6. AI #140: Trying To Hold The Line - Zvi Mowshowitz | Substack\\n 🔗 https://thezvi.substack.com/p/ai-140-trying-to-hold-the-line\\n 📄 We would like to do more—we would like to build an AI factory that can make 1 gigawatt per week of new capacity, at a greatly reduced cost relative to today—but that will require more confidence in fu...\\n\\n7. Daniel D. Gutierrez, Principal Analyst & Resident Data Scientist\\n 🔗 https://radicaldatascience.wordpress.com/author/amuletc/\\n 📄 [Radical Data Science ==================== News and Industry Analysis for Data Science, Machine Learning, AI and Deep Learning -------------------------------------------------------------------------...\\n\\n8. 
Simon Willison on coding-agents\\n 🔗 https://simonwillison.net/tags/coding-agents/\\n 📄 # 14th October 2025, 9:26 pm / ai, generative-ai, llms, ai-assisted-programming, model-context-protocol, coding-agents, claude-code, codex-cli, parallel-agents # 10th October 2025, 11:30 pm / plugins,...\", \"filters\": {\"topic\": \"general\", \"time_range\": \"week\"}, \"query\": \"Cursor Composer model base model deepseek qwen speculation controversy\", \"results\": [{\"index\": 1, \"title\": \"Both Cursor and Cognition (Windsurf) new models are speculated to ...\", \"url\": \"https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/both_cursor_and_cognition_windsurf_new_models_are/\", \"content\": \"Both Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? Skip to main contentBoth Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? Both Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? Image 5: r/LocalLLaMA - Both Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? Image 6: r/LocalLLaMA - Both Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? Image 7: r/LocalLLaMA - Both Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? Which US open model could compete with Chinese ones?\", \"score\": 0.9855, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"Cursor Composer: Breaking Down the 4x Speed Claim in AI ...\", \"url\": \"https://medium.com/@meshuggah22/cursor-composer-breaking-down-the-4x-speed-claim-in-ai-assisted-coding-b0b0f911bfe0\", \"content\": \"# Cursor Composer: Breaking Down the 4x Speed Claim in AI-Assisted Coding Cursor has just released version 2.0 alongside Composer — their first proprietary language model. This is Cursor training their own mixture-of-experts model from the ground up, optimized specifically for one thing: making you code faster. If you’re new to Medium, create a new account to read this story on us. ## I Built an AI App in 10 Minutes With Google’s New Tool, Opal. ### From a simple idea to a working AI mini-app without a single line of code. ### Claude Code v2.0.28 introduces specialized planning subagents, resumable context, and dynamic model selection — transforming coding This new technique unlocks 2× more creativity from ANY AI model — no training required…\", \"score\": 0.98529, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"New prompt injection papers: Agents Rule of Two and The Attacker ...\", \"url\": \"https://simonw.substack.com/p/new-prompt-injection-papers-agents\", \"content\": \"Cursor released Cursor 2.0 today, with a refreshed UI focused on agentic coding (and running agents in parallel) and a new model that’s unique to Cursor called **Composer 1**. This is in line with a number of recent open weights model releases from Chinese AI labs which focus on agentic capabilities, likely pointing to a heavy post-training emphasis on RL. 
This is a notable trend: if you want to build a really great agentic coding tool there’s clearly a lot to be said for using reinforcement learning to fine-tune a model against your own custom set of tools using large numbers of sandboxed simulated coding environments as part of that process.\", \"score\": 0.98262, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"Simon Willison on generative-ai\", \"url\": \"https://simonwillison.net/tags/generative-ai/\", \"content\": \"[6:11 pm](https://simonwillison.net/2025/Sep/29/claude-sonnet-4-5/ \\\"Permalink for \\\\\\\"Claude Sonnet 4.5 is probably the \\\\\\\"best coding model in the world\\\\\\\" (at least for now)\\\\\\\"\\\") / [29th September 2025](https://simonwillison.net/2025/Sep/29/) / [anthropic](https://simonwillison.net/tags/anthropic/), [claude](https://simonwillison.net/tags/claude/), [code-interpreter](https://simonwillison.net/tags/code-interpreter/), [llm-reasoning](https://simonwillison.net/tags/llm-reasoning/), [llm-tool-use](https://simonwillison.net/tags/llm-tool-use/), [llm-pricing](https://simonwillison.net/tags/llm-pricing/), [pelican-riding-a-bicycle](https://simonwillison.net/tags/pelican-riding-a-bicycle/), [ai](https://simonwillison.net/tags/ai/), [llms](https://simonwillison.net/tags/llms/), [llm-release](https://simonwillison.net/tags/llm-release/), [generative-ai](https://simonwillison.net/tags/generative-ai/), [llm](https://simonwillison.net/tags/llm/) [#](https://simonwillison.net/2025/Sep/27/video-models-are-zero-shot-learners-and-reasoners/)[27th September 2025](https://simonwillison.net/2025/Sep/27/), [11:59 pm](https://simonwillison.net/2025/Sep/27/video-models-are-zero-shot-learners-and-reasoners/) / [gemini](https://simonwillison.net/tags/gemini/), [google](https://simonwillison.net/tags/google/), [generative-ai](https://simonwillison.net/tags/generative-ai/), [video](https://simonwillison.net/tags/video/), [ai](https://simonwillison.net/tags/ai/), [llms](https://simonwillison.net/tags/llms/), [video-models](https://simonwillison.net/tags/video-models/), [paper-review](https://simonwillison.net/tags/paper-review/) [#](https://simonwillison.net/2025/Sep/15/gpt-5-codex/)[15th September 2025](https://simonwillison.net/2025/Sep/15/), [6:55 pm](https://simonwillison.net/2025/Sep/15/gpt-5-codex/) / [openai](https://simonwillison.net/tags/openai/), [pelican-riding-a-bicycle](https://simonwillison.net/tags/pelican-riding-a-bicycle/), [ai](https://simonwillison.net/tags/ai/), [llms](https://simonwillison.net/tags/llms/), [llm-release](https://simonwillison.net/tags/llm-release/), [gpt-5](https://simonwillison.net/tags/gpt-5/), [codex-cli](https://simonwillison.net/tags/codex-cli/), [ai-assisted-programming](https://simonwillison.net/tags/ai-assisted-programming/), [generative-ai](https://simonwillison.net/tags/generative-ai/), [coding-agents](https://simonwillison.net/tags/coding-agents/) [#](https://simonwillison.net/2025/Sep/14/models-can-prompt/)[14th September 2025](https://simonwillison.net/2025/Sep/14/), [8:25 pm](https://simonwillison.net/2025/Sep/14/models-can-prompt/) / [prompt-engineering](https://simonwillison.net/tags/prompt-engineering/), [llms](https://simonwillison.net/tags/llms/), [ai](https://simonwillison.net/tags/ai/), [generative-ai](https://simonwillison.net/tags/generative-ai/), [gpt-5](https://simonwillison.net/tags/gpt-5/), [anthropic](https://simonwillison.net/tags/anthropic/), [claude](https://simonwillison.net/tags/claude/), [claude-code](https://simonwillison.net/tags/claude-code/), 
[claude-4](https://simonwillison.net/tags/claude-4/) [GPT-5 rate limits here](https://platform.openai.com/docs/models/gpt-5) show tier 5 stays at 40M tokens per minute. The [GPT-5 mini rate limits](https://platform.openai.com/docs/models/gpt-5-mini) for tiers 2 through 5 are 2M, 4M, 10M and 180M TPM respectively. [#](https://simonwillison.net/2025/Sep/12/gpt-5-rate-limits/)[12th September 2025](https://simonwillison.net/2025/Sep/12/), [11:14 pm](https://simonwillison.net/2025/Sep/12/gpt-5-rate-limits/) / [gemini](https://simonwillison.net/tags/gemini/), [anthropic](https://simonwillison.net/tags/anthropic/), [generative-ai](https://simonwillison.net/tags/generative-ai/), [openai](https://simonwillison.net/tags/openai/), [llm-pricing](https://simonwillison.net/tags/llm-pricing/), [ai](https://simonwillison.net/tags/ai/), [llms](https://simonwillison.net/tags/llms/), [gpt-5](https://simonwillison.net/tags/gpt-5/)\", \"score\": 0.98239, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"not much happened today | AINews - AI News (Smol)\", \"url\": \"https://news.smol.ai/issues/25-10-30-not-much/\", \"content\": \"The guide is designed to share insights on what strategies have been effective and which have not, aiming to help practitioners build reliable LLMs. The playbook is available on Hugging Face’s platform and is structured to cover model architecture, data curation, and infrastructure considerations, providing a detailed roadmap for LLM development.** The community response is positive, with users expressing gratitude and appreciation for the detailed insights into parallelism and advanced training techniques provided by Hugging Face. * **AI Safety Discussions**: Members debated the risk of **AI** leading to **self-harm**, with some suggesting that models instructed to protect a computer might provide harmful instructions to eliminate threats. * **Windows Users Face Local Model Training Troubles**: A member asked what IDE other members use to train models locally, citing dependency issues on Windows.\", \"score\": 0.98009, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"AI #140: Trying To Hold The Line - Zvi Mowshowitz | Substack\", \"url\": \"https://thezvi.substack.com/p/ai-140-trying-to-hold-the-line\", \"content\": \"We would like to do more—we would like to build an AI factory that can make 1 gigawatt per week of new capacity, at a greatly reduced cost relative to today—but that will require more confidence in future models, revenue, and technological/financial innovation. Your mission sure looks in large part like becoming a platform consumer product company, and then building a true automated AI researcher in a little over two years, and the nonprofit’s initial deployments also don’t seem aimed at the mission. > Nevertheless, these findings challenge some common intuitions about what language models are capable of—and since we found that the most capable models we tested (Claude Opus 4 and 4.1) performed the best on our tests of introspection, we think it’s likely that AI models’ introspective capabilities will continue to grow more sophisticated in the future.\", \"score\": 0.97658, \"published_date\": \"\"}, {\"index\": 7, \"title\": \"Daniel D. 
Gutierrez, Principal Analyst & Resident Data Scientist\", \"url\": \"https://radicaldatascience.wordpress.com/author/amuletc/\", \"content\": \"[Radical Data Science ==================== News and Industry Analysis for Data Science, Machine Learning, AI and Deep Learning -----------------------------------------------------------------------------------](https://radicaldatascience.wordpress.com/ \\\"Radical Data Science \\\") * [AI Industry Influencer Services](https://radicaldatascience.wordpress.com/advertise-with-rds/) Gutierrez, Principal Analyst & Resident Data Scientist](https://radicaldatascience.wordpress.com/author/amuletc/ \\\"Daniel D. Gutierrez, Principal Analyst & Resident Data Scientist](https://radicaldatascience.wordpress.com/author/amuletc/ \\\"Posts by Daniel D. [Hitachi agrees to acquire German data and AI services firm synvert to accelerate Agentic AI and Physical AI development, driving global expansion of HMAX business](https://radicaldatascience.wordpress.com/2025/09/23/hitachi-agrees-to-acquire-german-data-and-ai-services-firm-synvert-to-accelerate-agentic-ai-and-physical-ai-development-driving-global-expansion-of-hmax-business/) [Sep 23](https://radicaldatascience.wordpress.com/2025/09/23/hitachi-agrees-to-acquire-german-data-and-ai-services-firm-synvert-to-accelerate-agentic-ai-and-physical-ai-development-driving-global-expansion-of-hmax-business/ \\\"Permanent Link to Hitachi agrees to acquire German data and AI services firm synvert to accelerate Agentic AI and Physical AI development, driving global expansion of HMAX business\\\") Gutierrez, Principal Analyst & Resident Data Scientist](https://radicaldatascience.wordpress.com/author/amuletc/ \\\"Posts by Daniel D. [Leave a comment](https://radicaldatascience.wordpress.com/2025/09/23/hitachi-agrees-to-acquire-german-data-and-ai-services-firm-synvert-to-accelerate-agentic-ai-and-physical-ai-development-driving-global-expansion-of-hmax-business/#respond) * [AI](https://radicaldatascience.wordpress.com/category/ai/) * [AI Industry Latest News](https://radicaldatascience.wordpress.com/category/ai-industry-latest-news/) * [AI News Briefs](https://radicaldatascience.wordpress.com/category/ai-news-briefs/)\", \"score\": 0.96896, \"published_date\": \"\"}, {\"index\": 8, \"title\": \"Simon Willison on coding-agents\", \"url\": \"https://simonwillison.net/tags/coding-agents/\", \"content\": \"# 14th October 2025, 9:26 pm / ai, generative-ai, llms, ai-assisted-programming, model-context-protocol, coding-agents, claude-code, codex-cli, parallel-agents # 10th October 2025, 11:30 pm / plugins, ai, prompt-engineering, generative-ai, llms, ai-assisted-programming, anthropic, claude, vibe-coding, coding-agents, claude-code, sub-agents, jesse-vincent, skills 12:06 pm / 5th October 2025 / ai, generative-ai, llms, ai-assisted-programming, ai-agents, coding-agents, claude-code, async-coding-agents, jules, codex-cli, parallel-agents, jesse-vincent Coding agents like Anthropic’s Claude Code and OpenAI’s Codex CLI represent a genuine step change in how useful LLMs can be for producing working code. 
# 25th September 2025, 11:58 pm / github, microsoft, ai, generative-ai, github-copilot, llms, ai-assisted-programming, ai-agents, coding-agents, claude-code, codex-cli, disclosures # 15th September 2025, 6:55 pm / code-review, ai, openai, generative-ai, llms, ai-assisted-programming, pelican-riding-a-bicycle, llm-release, coding-agents, async-coding-agents, gpt-5, codex-cli, theo-browne\", \"score\": 0.96757, \"published_date\": \"\"}], \"total_results\": 8}",
"timestamp": "2025-11-03T13:57:36.521296",
"tool_call_id": "web_search:1",
"name": "web_search"
},
{
"role": "assistant",
"content": "Let me extract the content of that Reddit discussion; it looks directly relevant:",
"timestamp": "2025-11-03T13:57:40.810442"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-03T13:57:40.816235",
"tool_calls": [
{
"id": "extract_webpage:2",
"index": 0,
"type": "function",
"function": {
"name": "extract_webpage",
"arguments": "{\"url\": \"https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/both_cursor_and_cognition_windsurf_new_models_are/\"}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"url\": \"https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/both_cursor_and_cognition_windsurf_new_models_are/\", \"content\": \"🌐 网页内容 (22172 字符):\\n📍 URL: https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/both_cursor_and_cognition_windsurf_new_models_are/\\n==================================================\\nBoth Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? : r/LocalLLaMA\\n\\n===============\\n[Skip to main content](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/both_cursor_and_cognition_windsurf_new_models_are/#main-content)Both Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models? : r/LocalLLaMA\\n\\nOpen menu Open navigation[](https://www.reddit.com/)Go to Reddit Home\\n\\nr/LocalLLaMA A chip A close button\\n\\n[Log In](https://www.reddit.com/login/)Log in to Reddit\\n\\nExpand user menu Open settings menu\\n\\n[ Go to LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/)\\n\\n[r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/)\\n\\n[r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/)\\n\\n[https://x.com/localllamasub](https://x.com/localllamasub)\\n\\nSubreddit to discuss AI & Llama, the large language model created by Meta AI.\\n\\n* * *\\n\\n555K Members\\n\\n•3d ago\\n\\n[Successful-Newt1517](https://www.reddit.com/user/Successful-Newt1517/)\\n\\nBoth Cursor and Cognition (Windsurf) new models are speculated to be built on Chinese base models?\\n==================================================================================================\\n\\n[Discussion](https://www.reddit.com/r/LocalLLaMA/?f=flair_name%3A%22Discussion%22)\\n\\n\\n\\n\\n\\nHey, what's going on? Are Chinese models saving American startups?\\n\\n Read more \\n\\n Share \\n\\n[microsoft365](https://www.reddit.com/user/microsoft365/)•Promoted\\n\\nGetting ready for a tough negotiation? Microsoft 365 Copilot Chat helps you prep smarter to close the deal.\\n\\nLearn More\\n\\n m365copilot.com \\n\\n\\n\\n Collapse video player \\n\\n[u/vast_ai](https://www.reddit.com/user/vast_ai/)•Promoted\\n\\nFine-tune Llama or Mistral at scale with multi-GPU clusters. Train more tokens per dollar.\\n\\nBook Now\\n\\n vast.ai \\n\\n\\n\\n[u/ClassActionNotice](https://www.reddit.com/user/ClassActionNotice/)•Promoted\\n\\nLegal Notice - $1.5 Billion Anthropic Copyright Settlement - If your book was in LibGen or PiLiMi, you may be eligible for a payment FILE A CLAIM\\n\\nLearn More\\n\\n anthropiccopyrightsettlement.com \\n\\n\\n\\n Sort by: \\n\\nBest\\n\\nOpen comment sort options\\n\\n* Best \\n* Top \\n* New \\n* Controversial \\n* Old \\n* Q&A \\n\\n[](https://www.reddit.com/user/WithoutReason1729/)\\n\\n[WithoutReason1729](https://www.reddit.com/user/WithoutReason1729/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmdwkm0/)\\n\\nYour post is getting popular and we just featured it on our Discord! [Come check it out!](https://discord.gg/PgFhZ8cnWW)\\n\\nYou've also been given a special flair for your contribution. 
We appreciate your post!\\n\\n_I am a bot and this action was performed automatically._\\n\\n[](https://www.reddit.com/user/TheRealGentlefox/)\\n\\n[TheRealGentlefox](https://www.reddit.com/user/TheRealGentlefox/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcepzg/)\\n\\nI refuse to believe that anyone who uses that font writes code.\\n\\n[](https://www.reddit.com/user/overand/)\\n\\n[overand](https://www.reddit.com/user/overand/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmeokpk/)\\n\\nI'm not even sure what font it is - it's definitely monospaced, and it's not comic sans. On digging further, I believe it's [Comic Code](https://tosche.net/fonts/comic-code).\\n\\n1 more reply\\n\\n1 more reply\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmeokpk/?force-legacy-sct=1)\\n\\n1 more reply\\n\\n1 more reply\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcepzg/?force-legacy-sct=1)\\n\\n[](https://www.reddit.com/user/SnooPaintings8639/)\\n\\n[SnooPaintings8639](https://www.reddit.com/user/SnooPaintings8639/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcaius/)\\n\\nIsn't this obvious? Which US open model could compete with Chinese ones? They're either too small, or too censored.\\n\\n[](https://www.reddit.com/user/nullmove/)\\n\\n[nullmove](https://www.reddit.com/user/nullmove/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcgz8x/)\\n\\nTop 1% Commenter\\n\\nNevertheless it's interesting that in China there are DoorDash or Tinder equivalent companies dropping base models from scratch. In USA Windsurf is worth 10B, more than makers of e.g. GLM, on top of whom they are just slapping some prompt and UI.\\n\\n[](https://www.reddit.com/user/coder543/)\\n\\n[coder543](https://www.reddit.com/user/coder543/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmd4fi6/)\\n\\nWindsurf was set to be acquired for $3 billion. Where are you getting $10 billion? Cognition is worth $10 billion after acquiring what was left of Windsurf, but they are more than just Windsurf.\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmd4fi6/?force-legacy-sct=1)\\n\\n10 more replies\\n\\n10 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcgz8x/?force-legacy-sct=1)\\n\\n[](https://www.reddit.com/user/ihexx/)\\n\\n[ihexx](https://www.reddit.com/user/ihexx/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcf4pj/)\\n\\nThere's GPT OSS and... ... ... i guess that's about it\\n\\n[](https://www.reddit.com/user/ParthProLegend/)\\n\\n[ParthProLegend](https://www.reddit.com/user/ParthProLegend/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcjol4/)\\n\\nIt's GuidelinePT OSS\\n\\n[](https://www.reddit.com/user/fish312/)\\n\\n[fish312](https://www.reddit.com/user/fish312/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcoc1o/)\\n\\nTop 1% Commenter\\n\\nGPT-OSS can't even beat GLM Air. 
GLM4.6 would run circles around it.\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcoc1o/?force-legacy-sct=1)\\n\\n13 more replies\\n\\n13 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcf4pj/?force-legacy-sct=1)\\n\\n[](https://www.reddit.com/user/Ok_Investigator_5036/)\\n\\n[Ok_Investigator_5036](https://www.reddit.com/user/Ok_Investigator_5036/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmce8ck/)\\n\\nChinese models are basically the budget kings — GLM’s giving you Claude-level vibes at “two noodles and a dumpling” pricing.\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmce8ck/?force-legacy-sct=1)\\n\\n1 more reply\\n\\n1 more reply\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcaius/?force-legacy-sct=1)\\n\\n[u/vast_ai](https://www.reddit.com/user/vast_ai/)•Promoted\\n\\nFine-tune Llama or Mistral at scale with multi-GPU clusters. Train more tokens per dollar.\\n\\nBook Now\\n\\n vast.ai \\n\\n\\n\\n[](https://www.reddit.com/user/Thick-Protection-458/)\\n\\n[Thick-Protection-458](https://www.reddit.com/user/Thick-Protection-458/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcbbwo/)\\n\\nWell, imagine I am making some customized model.\\n\\nWhat do I have on the table?\\n\\n- Custom model from scratch. May be good for narrow task, but not generally. And the whole point of agents is adaptation in general cases - otherwise strict workflow makes way more sense.\\n\\n- Open models.\\n\\n- Now which open models exists? Failure of Llama 4? Outdated llama 3.3? Some Mistral stuff, maybe? All the rest is Chinese ones. Except for gpt-oss, which was released... well, probably well after when guys started development already\\n\\n[](https://www.reddit.com/user/GreenGreasyGreasels/)\\n\\n[GreenGreasyGreasels](https://www.reddit.com/user/GreenGreasyGreasels/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcbxbh/)\\n\\n> Except for gpt-oss, which was released... well, probably well after when guys started development already\\n\\ngpt-oss-120b: August 5, 2025\\n\\nGLM-4.6: October 2, 2025\\n\\n[](https://www.reddit.com/user/Thick-Protection-458/)\\n\\n[Thick-Protection-458](https://www.reddit.com/user/Thick-Protection-458/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmccg33/)\\n\\nYep, but starting developing with one family would still probably be easier than changing it meanwhile.\\n\\n1 more reply\\n\\n1 more reply\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmccg33/?force-legacy-sct=1)\\n\\n1 more reply\\n\\n1 more reply\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcbxbh/?force-legacy-sct=1)\\n\\n4 more replies\\n\\n4 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcbbwo/?force-legacy-sct=1)\\n\\n[](https://www.reddit.com/user/SrijSriv211/)\\n\\n[SrijSriv211](https://www.reddit.com/user/SrijSriv211/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcco0z/)\\n\\nFinetuning open Chinese models is neither shocking nor a problem tbh. 
Right now there aren't great open weight models from America but Chinese have a lot of them.\\n\\nAlso as long as they are providing some real value I don't think it should be concerning.\\n\\n[](https://www.reddit.com/user/Fast-Satisfaction482/)\\n\\n[Fast-Satisfaction482](https://www.reddit.com/user/Fast-Satisfaction482/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcd3ya/)\\n\\nMaybe they're not concerned by potential issues with the model itself but rather by the implications that Chinese models are the obvious choice now and not American models.\\n\\n19 more replies\\n\\n19 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcd3ya/?force-legacy-sct=1)\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcco0z/?force-legacy-sct=1)\\n\\n[](https://www.reddit.com/user/nrkishere/)\\n\\n[nrkishere](https://www.reddit.com/user/nrkishere/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmccw03/)\\n\\nThis is the same guy who claimed \\\"coordinated attack on american companies\\\" by deepseek in January btw. Truth is, only chinese companies are making capable enough models for complex use cases.\\n\\n[](https://www.reddit.com/user/SnooPaintings8639/)\\n\\n[SnooPaintings8639](https://www.reddit.com/user/SnooPaintings8639/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcfg3b/)\\n\\nI can only imagine how would gpt OSS react if it was to build process management module, with functions like 'kill_child(process_id, child_id)'.\\n\\n[](https://www.reddit.com/user/ihexx/)\\n\\n[ihexx](https://www.reddit.com/user/ihexx/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcfvh5/)\\n\\nI'm sorry, Dave. I'm afraid I can't do that\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcfg3b/?force-legacy-sct=1)\\n\\n[](https://www.reddit.com/user/Ok_Investigator_5036/)\\n\\n[Ok_Investigator_5036](https://www.reddit.com/user/Ok_Investigator_5036/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmck4yb/)\\n\\nNot sure if this is legit — has anyone using Windsurf been able to replicate it?\\n\\n[](https://preview.redd.it/both-cursor-and-cognition-windsurf-new-models-are-v0-q9q431n3gfyf1.png?width=764&auto=webp&s=7f95bb08322f31bef7c129033fd619b61ae0b060)\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmck4yb/?force-legacy-sct=1)\\n\\n[](https://www.reddit.com/user/one-wandering-mind/)\\n\\n[one-wandering-mind](https://www.reddit.com/user/one-wandering-mind/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcthsi/)\\n\\nMaybe , but outputting a Chinese character doesn't seem like good evidence. O3 does that and probably a lot of others. Just remembering o3\\n\\n[](https://www.reddit.com/user/egomarker/)\\n\\n[egomarker](https://www.reddit.com/user/egomarker/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcvi7l/)\\n\\nEveryone knows it's GLM 4.6. 
They say on the blog they used top open-source model + Cerebras can run only a very limited subset of LLM, and they just deprecated Qwen Coder in favor of GLM.\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcthsi/?force-legacy-sct=1)\\n\\n[u/ClassActionNotice](https://www.reddit.com/user/ClassActionNotice/)•Promoted\\n\\nLegal Notice - $1.5 Billion Anthropic Copyright Settlement - If your book was in LibGen or PiLiMi, you may be eligible for a payment FILE A CLAIM\\n\\nLearn More\\n\\n anthropiccopyrightsettlement.com \\n\\n\\n\\n[](https://www.reddit.com/user/AnomalyNexus/)\\n\\n[AnomalyNexus](https://www.reddit.com/user/AnomalyNexus/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcykvn/)\\n\\nOh that font is a warcrime\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcykvn/?force-legacy-sct=1)\\n\\n[](https://www.reddit.com/user/keepthepace/)\\n\\n[keepthepace](https://www.reddit.com/user/keepthepace/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmea38e/)\\n\\nChinese outputs are not necessarily due to a Chinese base model. It could simply be the sign that they use e.g. Qwen to generate fine tuning synthetic data. Or that they trained on genuine, human, Chinese inputs that they gathered.\\n\\nNot everyone codes in English, mes bons amis.\\n\\n[](https://www.reddit.com/user/kkb294/)\\n\\n[kkb294](https://www.reddit.com/user/kkb294/)\\n\\n•[3d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcilad/)\\n\\nSee the sounding and tone of the message and some of the comments in this thread.\\n\\nWhen any Chinese models exhibited the traces of GPT distillation, they made a huge ruckus on data theft and copyright infringement. But, now everyone is saying what's wrong.? Open source means they have to expect this, bla bla bla.!\\n\\nHipocracy at its best🤦♂️\\n\\n[](https://www.reddit.com/user/zipperlein/)\\n\\n[zipperlein](https://www.reddit.com/user/zipperlein/)\\n\\n•[2d ago](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcoxfg/)\\n\\nI can't understand people who cry about model distillation when Meta and others have literally torrented tens of thousands of non-free books .\\n\\n1 more reply\\n\\n1 more reply\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcoxfg/?force-legacy-sct=1)\\n\\n2 more replies\\n\\n2 more replies\\n\\n[More replies](https://www.reddit.com/r/LocalLLaMA/comments/1okppxs/comment/nmcilad/?force-legacy-sct=1)\\n\\nRelated Answers Section\\n=======================\\n\\n Related Answers \\n\\n[Best models for windsurf](https://www.reddit.com/answers/42083c13-b67e-4206-8501-9420c3fde5eb/?q=Best%20models%20for%20windsurf&source=PDP)\\n\\n[Windsurf model reviews](https://www.reddit.com/answers/b7ea0f76-2205-4d38-9a65-ed18895444fd/?q=Windsurf%20model%20reviews&source=PDP)\\n\\n[Cognition AI news and updates](https://www.reddit.com/answers/d3ac4551-6934-45b7-a581-99982980230e/?q=Cognition%20AI%20news%20and%20updates&source=PDP)\\n\\n[Windsurf acquisition news](https://www.reddit.com/answers/5324efb8-221a-4ac1-885f-01c60a0e8cd0/?q=Windsurf%20acquisition%20news&source=PDP)\\n\\n[Optimizing LLaMA models for local deployment](https://www.reddit.com/answers/b80d4b7e-7031-44e0-a6d2-5c2021fd70a1/?q=Optimizing%20LLaMA%20models%20for%20local%20deployment&source=PDP)\\n\\nNew to Reddit? 
\\nCreate your account and connect with a world of communities.\\n\\n Continue with Email \\n\\n Continue With Phone Number \\n\\nBy continuing, you agree to our[User Agreement](https://www.redditinc.com/policies/user-agreement)and acknowledge that you understand the[Privacy Policy](https://www.redditinc.com/policies/privacy-policy). \\n\\n\\n\\n[https://x.com/localllamasub](https://x.com/localllamasub)\\n\\n Public \\n\\nAnyone can view, post, and comment to this community\\n\\n0 0\\n\\n[Reddit Rules](https://www.redditinc.com/policies/content-policy)[Privacy Policy](https://www.reddit.com/policies/privacy-policy)[User Agreement](https://www.redditinc.com/policies/user-agreement)[Accessibility](https://support.reddithelp.com/hc/sections/38303584022676-Accessibility)[Reddit, Inc. © 2025. All rights reserved.](https://redditinc.com/)\\n\\nExpand Navigation Collapse Navigation\\n\\n\\n==================================================\"}",
"timestamp": "2025-11-03T13:57:42.626937",
"tool_call_id": "extract_webpage:2",
"name": "extract_webpage"
},
{
"role": "assistant",
"content": "Now let me search for more information about Cursor's official response:",
"timestamp": "2025-11-03T13:57:47.889638"
},
{
"role": "assistant",
"content": "",
"timestamp": "2025-11-03T13:57:47.894174",
"tool_calls": [
{
"id": "web_search:3",
"index": 0,
"type": "function",
"function": {
"name": "web_search",
"arguments": "{\"query\": \"Cursor Sasha Rush base model deepseek qwen response Hacker News\", \"time_range\": \"week\", \"max_results\": 6}"
}
}
]
},
{
"role": "tool",
"content": "{\"success\": true, \"summary\": \"🔍 搜索查询: Cursor Sasha Rush base model deepseek qwen response Hacker News\\n📅 搜索时间: 2025-11-03T13:57:52.142485\\n🎯 过滤条件: Topic: general | Time Range: week\\n\\n📝 AI摘要:\\nCursor's Composer is a custom language model trained from scratch, not based on existing open-source models. The model uses reinforcement learning for post-training. Composer is designed for speed in coding tasks.\\n\\n---\\n\\n📊 搜索结果:\\n\\n1. New prompt injection papers: Agents Rule of Two and The Attacker ...\\n 🔗 https://simonw.substack.com/p/new-prompt-injection-papers-agents\\n 📄 One detail that’s notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM?\\n\\nCursor resear...\\n\\n2. Blogmarks that use markdown - Simon Willison's Weblog\\n 🔗 https://simonwillison.net/dashboard/blogmarks-that-use-markdown/\\n 📄 terminal commands. At our scale, teaching the model to effectively call these tools requires running hundreds of thousands of concurrent sandboxed coding environments in the cloud. One detail that's n...\\n\\n3. The application layer strikes back: Cursor's custom LLM - TechTalks\\n 🔗 https://bdtechtalks.substack.com/p/the-application-layer-strikes-back\\n 📄 Cursor was able to develop Composer because its platform and the model were “co-designed.” According to Sasha Rush, researcher at Cursor, the team built the model and the Cursor environment together, ...\\n\\n4. Sasha Rush (@srush_nlp) / Posts / X\\n 🔗 https://x.com/srush_nlp?lang=en\\n 📄 Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they’ve created my favorite AI systems. We’re now building frontier RL models at scale in real-world coding environ...\\n\\n5. Cursor Composer: Breaking Down the 4x Speed Claim in AI ...\\n 🔗 https://medium.com/@meshuggah22/cursor-composer-breaking-down-the-4x-speed-claim-in-ai-assisted-coding-b0b0f911bfe0\\n 📄 Based on what I’ve managed to find Cursor’s Composer isn’t a fine-tuned Llama or Qwen model. While the team does not share a lot of details about the architecture of the model, what’s clear is their t...\\n\\n6. Simon Willison on generative-ai\\n 🔗 https://simonwillison.net/tags/generative-ai/\\n 📄 > Firstly, we are open-sourcing the flagship model of this series: Qwen3-VL-235B-A22B, available in both Instruct and Thinking versions. The Instruct version matches or even exceeds Gemini 2.5 Pro in ...\", \"filters\": {\"topic\": \"general\", \"time_range\": \"week\"}, \"query\": \"Cursor Sasha Rush base model deepseek qwen response Hacker News\", \"results\": [{\"index\": 1, \"title\": \"New prompt injection papers: Agents Rule of Two and The Attacker ...\", \"url\": \"https://simonw.substack.com/p/new-prompt-injection-papers-agents\", \"content\": \"One detail that’s notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM?\\n\\nCursor researcher Sasha Rush has been answering questions on Hacker News, but has so far been evasive in answering questions about the base model. When directly asked “is Composer a fine tune of an existing open source base model?” they replied: [...] > Our primary focus is on RL post-training. 
We think that is the best way to get the model to be a strong interactive agent.\\n\\nSasha did confirm that rumors of an earlier Cursor preview model, Cheetah, being based on a model by xAI’s Grok were “Straight up untrue.”\\n\\nLink 2025-10-29 MiniMax M2 & Agent: Ingenious in Simplicity:\\n\\nMiniMax M2 was released on Monday 27th October by MiniMax, a Chinese AI lab founded in December 2021. [...] The notable thing about Composer-1 is that it is designed to be fast. The pelican certainly came back quickly, and in their announcement they describe it as being “4x faster than similarly intelligent models”.\\n\\nIt’s interesting to see Cursor investing resources in training their own code-specific model - similar to GPT-5-Codex or Qwen3-Coder. From their post:\", \"score\": 0.767638, \"published_date\": \"\"}, {\"index\": 2, \"title\": \"Blogmarks that use markdown - Simon Willison's Weblog\", \"url\": \"https://simonwillison.net/dashboard/blogmarks-that-use-markdown/\", \"content\": \"terminal commands. At our scale, teaching the model to effectively call these tools requires running hundreds of thousands of concurrent sandboxed coding environments in the cloud. One detail that's notably absent from their description: did they train the model from scratch, or did they start with an existing open-weights model such as something from Qwen or GLM? Cursor researcher Sasha Rush has been answering questions on Hacker [News]( but has so far been evasive in answering questions about [...] the base model. When directly asked \\\"is Composer a fine tune of an existing open source base model?\\\" they replied: > Our primary focus is on RL post-training. We think that is the best way to get the model to be a strong interactive agent. Sasha did [confirm]( that rumors of an earlier Cursor preview model, Cheetah, being based on a model by xAI's Grok were \\\"Straight up untrue.\\\" | | Hacker News | 2025-10-29 20:45:53+00:00 | | True | [...] code-specific model - similar to [GPT-5-Codex]( or [Qwen3-Coder]( From their post: > Composer is a mixture-of-experts (MoE) language model supporting long-context generation and understanding. It is specialized for software engineering through reinforcement learning (RL) in a diverse range of development environments. [...] >> Efficient training of large MoE models requires significant investment into building infrastructure and systems research. We built custom training infrastructure\", \"score\": 0.6152965, \"published_date\": \"\"}, {\"index\": 3, \"title\": \"The application layer strikes back: Cursor's custom LLM - TechTalks\", \"url\": \"https://bdtechtalks.substack.com/p/the-application-layer-strikes-back\", \"content\": \"Cursor was able to develop Composer because its platform and the model were “co-designed.” According to Sasha Rush, researcher at Cursor, the team built the model and the Cursor environment together, which allowed them to run the training agent at the necessary scale for production-level tasks. [...] Composer was possibly built on top of a base open source model. It was fine-tuned through reinforcement learning (RL) within a diverse range of development environments (RL has become an important part of post-training of reasoning models such as DeepSeek-R1). During training, the model was given access to a set of production tools, including file editors, semantic search, and terminal commands (the basic set of tools software engineers use), and was tasked with solving real software [...] 
Basically, when the speed of response passes a certain threshold, then you can unlock new capabilities, such as a model that is processing thousands of tokens in the background and giving you near-real-time feedback or suggestions on your code.\\n\\nImage 5\", \"score\": 0.4048609, \"published_date\": \"\"}, {\"index\": 4, \"title\": \"Sasha Rush (@srush_nlp) / Posts / X\", \"url\": \"https://x.com/srush_nlp?lang=en\", \"content\": \"Some personal news: I recently joined Cursor. Cursor is a small, ambitious team, and they’ve created my favorite AI systems. We’re now building frontier RL models at scale in real-world coding environments. Excited for how good coding is going to be.\\n\\nImage 9\\n\\n140\\n\\n123\\n\\n2.8K\\n\\n329K\\n\\nImage 10\\n\\nSasha Rush\\n\\n@srush_nlp\\n\\n·\\n\\nOct 8, 2024\\n\\nUpdating all my NeurIPS papers.\\n\\nImage 11: Image\\n\\n38\\n\\n197\\n\\n2.4K\\n\\n249K\\n\\nImage 12\\n\\nSasha Rush\\n\\n@srush_nlp\\n\\n·\\n\\nOct 17, 2023 [...] Accessibility\\n\\n|\\n\\nAds info\\n\\n|\\n\\nMore\\n\\n© 2025 X Corp. [...] For some reason, I needed xkcd formatting to convince my brain what Large means.\\n\\nImage 16: I originally had the \\\"Gettysburg Address\\\", but after reading 10^4 tokens about Taylor Swift, I couldn't get Blank Space out of my head.\\n\\n25\\n\\n215\\n\\n1.9K\\n\\n602K\\n\\nNew to X?\\n\\nSign up now to get your own personalized timeline!\\n\\nSign up with Apple\\n\\nCreate account\\n\\nBy signing up, you agree to the Terms of Service and Privacy Policy, including Cookie Use.\\n\\nTerms of Service\\n\\n|\\n\\nPrivacy Policy\\n\\n|\\n\\nCookie Policy\\n\\n|\", \"score\": 0.3846353, \"published_date\": \"\"}, {\"index\": 5, \"title\": \"Cursor Composer: Breaking Down the 4x Speed Claim in AI ...\", \"url\": \"https://medium.com/@meshuggah22/cursor-composer-breaking-down-the-4x-speed-claim-in-ai-assisted-coding-b0b0f911bfe0\", \"content\": \"Based on what I’ve managed to find Cursor’s Composer isn’t a fine-tuned Llama or Qwen model. While the team does not share a lot of details about the architecture of the model, what’s clear is their training approach:\\n\\nReinforcement Learning in Production Environments [...] ## Claude Skills: The AI Feature That Actually Solves a Real Problem\\n\\n### Yesterday, Anthropic quietly released what might be the most practical AI feature of 2025. It’s not flashier models or better benchmarks…\\n\\nOct 17\\n\\n181\\n\\n9\\n\\nPawel\\n\\n## DeepSeek-OCR: The New OCR Breakthrough\\n\\n### DeepSeek released DeepSeek-OCR yesterday. And it’s not just another OCR model. It’s a proof-of-concept that could fundamentally change how…\\n\\nOct 21\\n\\n33\\n\\n1\\n\\nPawel [...] Sitemap\\n\\nOpen in app\\n\\nSign in\\n\\nSign in\\n\\nMember-only story\\n\\n# Cursor Composer: Breaking Down the 4x Speed Claim in AI-Assisted Coding\\n\\n5 min read1 day ago\\n\\nCursor has just released version 2.0 alongside Composer — their first proprietary language model. This isn’t just another API wrapper around Claude or GPT. This is Cursor training their own mixture-of-experts model from the ground up, optimized specifically for one thing: making you code faster.\", \"score\": 0.3462102, \"published_date\": \"\"}, {\"index\": 6, \"title\": \"Simon Willison on generative-ai\", \"url\": \"https://simonwillison.net/tags/generative-ai/\", \"content\": \"> Firstly, we are open-sourcing the flagship model of this series: Qwen3-VL-235B-A22B, available in both Instruct and Thinking versions. 
The Instruct version matches or even exceeds Gemini 2.5 Pro in major visual perception benchmarks. The Thinking version achieves state-of-the-art results across many multimodal reasoning benchmarks.\\n\\nBold claims against Gemini 2.5 Pro, which are supported by a flurry of self-reported benchmarks. [...] This initial model is _enormous_. On Hugging Face both Qwen3-VL-235B-A22B-Instruct and Qwen3-VL-235B-A22B-Thinking are 235B parameters and weigh 471 GB. Not something I'm going to be able to run on my 64GB Mac!\\n\\nThe Qwen 2.5 VL family included models at 72B, 32B, 7B and 3B sizes. Given the rate Qwen are shipping models at the moment I wouldn't be surprised to see smaller Qwen 3 VL models show up in just the next few days. [...] 120b-high (~59%), and Gemini-2.5 flash-thinking (~50%). Mid-range models (~$0.1–$2): GPT-5 minimal (~79%), GPT-5 high (~86%), Qwen3 max (~62%), GPT-4.1 (~60%), DeepSeek-v3.1 (~82%), GLM 4.5 (~70%), and Kimi k2-0905 (~82%). High-cost models (>$5): Claude-Sonnet 4-thinking-16k (~87%) and Claude-Opus 4.1-thinking-16k (~99%). Overall, GPT-5 high and Claude models dominate the top-right, while budget models like GPT-4.1-mini and Grok code-fast-1 balance lower cost with moderate performance.\", \"score\": 0.25151023, \"published_date\": \"\"}], \"total_results\": 6}",
"timestamp": "2025-11-03T13:57:52.145824",
"tool_call_id": "web_search:3",
"name": "web_search"
},
{
"role": "assistant",
"content": "基于我搜索到的信息,我来为您总结一下关于Cursor最新模型争议的现状:\n\n## Cursor Composer模型争议总结\n\n### **基本情况**\n- Cursor最近发布了2.0版本,同时推出了自己的专有语言模型**Composer**\n- Cursor声称这是一个从零开始训练的混合专家模型(MoE),专门针对编程任务优化\n- 官方宣称相比同类智能模型有4倍的速度提升\n\n### **争议焦点**\n**主要质疑:Composer是否真的从零开始训练?**\n\n1. **社区猜测**:很多人认为Composer可能是基于中国开源模型(如DeepSeek、Qwen或GLM)微调而来\n\n2. **官方回应模糊**:\n - Cursor研究员Sasha Rush在Hacker News上回答问题时,对基础模型来源问题回避\n - 当被直接问及\"Composer是否是现有开源基础模型的微调版本\"时,他回答:\"我们的主要关注点是RL后训练。我们认为这是让模型成为强大交互代理的最佳方式。\"\n - 这种回避态度加剧了社区的猜疑\n\n### **支持质疑的证据**\n1. **技术趋势**:最近多家中国AI实验室发布了专注于代理能力的开源模型\n2. **模型表现**:有用户报告在使用中出现了中文字符输出\n3. **时间线**:中国模型如GLM-4.6、Qwen系列在性能上确实很强,且开源\n4. **市场现实**:正如Reddit用户所说,\"现在没有足够强大的美国开源模型,但中国有很多\"\n\n### **官方立场**\n- Cursor坚称Composer是\"从零开始训练\"的专有模型\n- 强调使用了强化学习在生产环境中进行训练\n- 否认了之前关于\"Cheetah\"模型基于xAI Grok的传言\n\n### **业内观点**\n1. **技术博主Simon Willison**指出:Cursor的描述中明显缺少一个细节——他们是从零开始训练模型,还是基于现有的开源权重模型?\n2. **社区共识**:即使基于中国开源模型进行微调,这在技术上是合理的选择,因为中国模型在代码任务上表现确实优秀\n\n### **结论**\n目前**没有确凿证据**证明Cursor Composer是基于DeepSeek或Qwen模型微调的,但:\n- 官方的回避态度确实引发了合理怀疑\n- 基于中国开源模型进行微调在技术上是合理的选择\n- 这种争议反映了当前AI领域对于模型溯源和透明度的关注\n\n这更多是一个关于**透明度**和**开源精神**的讨论,而不是技术能力的问题。",
"timestamp": "2025-11-03T13:58:07.592851"
}
],
"todo_list": null,
"metadata": {
"project_path": "/opt/agent/agents/users/jojo/project",
"project_relative_path": "users/jojo/project",
"thinking_mode": false,
"total_messages": 14,
"total_tools": 8,
"status": "active"
},
"token_statistics": {
"total_input_tokens": 101371,
"total_output_tokens": 1247,
"updated_at": "2025-11-03T15:46:45.579800"
}
} |