就在十几个小时前,DeepSeek 发布了一篇新论文,主题为《Conditional Memory via Scalable Lookup:A New Axis of Sparsity for Large Language Models》,与北京大学合作完成,作者中同样有梁文锋署名。 简单总结一波这项新研究要解决的问题:目前大语言模型主要通过混合专家(MoE)来 ...
The Opensource DeepSeek R1 model and the distilled local versions are shaking up the AI community. The Deepseek models are the best performing open source models and are highly useful as agents and ...
Threat actors are taking advantage of the rise in popularity of the DeepSeek to promote two malicious infostealer packages on the Python Package Index (PyPI), where they impersonated developer tools ...
前者聚焦平衡实用,适用于日常问答、通用Agent任务、真实应用场景下的工具调用。 推理达GPT-5水平,略低于Gemini-3.0-Pro。 后者主打极致推理,推理基准性能媲美Gemini-3.0-Pro。 还一把斩获IMO 2025、CMO 2025、ICPC World Finals 2025、IOI 2025金牌。 划重点,ICPC达到人类选手 ...
2月11日,深度求索(DeepSeek)悄悄地对其旗舰模型进行灰度测试。 据科创板日报报道,多名用户反馈,DeepSeek在网页端和APP端进行了版本更新,支持最高1M(百万)Token的上下文长度。而去年8月发布的DeepSeekV3.1上下文长度拓展至128K。 记者实测中发现,DeepSeek在 ...
Note: While there are moral reasons you might want DeepSeek to discuss historical events that are taboo in China, jailbreaking chatbots has the potential to lead to illegal material. Digital Trends ...
South Korean officials on Saturday temporarily restricted Chinese AI Lab DeepSeek’s app from being downloaded from app stores in the country pending an assessment of how the Chinese company handles ...
DeepSeek is set to become the default decision-making tool for local government officials in China. In several towns, high-level officials have recently instructed their staff on using the technology, ...
来自MSN

DeepSeek,最新发布!

DeepSeek发布新论文,梁文锋参与署名。 1月1日消息,DeepSeek发布了一篇新论文,提出了一种名为mHC(流形约束超连接)的新架构。该研究旨在解决传统超连接在大规模模型训练中的不稳定性问题,同时保持其显著的性能增益。这篇论文的第一作者有三位:Zhenda Xie ...