
AI News 25 Feb 2025

Anthropic has launched Claude 3.7 Sonnet, which it claims is its most intelligent model to date and the first generally available hybrid reasoning model. It offers two thinking modes: near-instant responses and a new "extended thinking" mode for step-by-step reasoning. The model is designed for coding, agentic workflows, and content generation, and supports use cases such as RAG (Retrieval-Augmented Generation), forecasting, and targeted marketing. It is accessible across Claude plans, the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. Pricing is unchanged from previous versions at $3 per million input tokens and $15 per million output tokens; the free tier, however, does not include extended thinking.
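Since extended thinking is a per-request setting in the API, the difference between the two modes is just an extra field in the request body. A minimal sketch of building such a payload (the `thinking` block mirrors Anthropic's documented shape, but treat the exact field names and model string as assumptions to verify against the current API reference):

```python
def build_request(prompt, thinking_budget=None):
    """Build a Messages API-style payload for Claude 3.7 Sonnet.

    thinking_budget: if given, enable extended thinking with that many
    tokens reserved for step-by-step reasoning before the final answer.
    """
    payload = {
        "model": "claude-3-7-sonnet-20250219",  # assumed model id; check the docs
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
    }
    if thinking_budget is not None:
        # Extended thinking mode: same model, same endpoint, just an
        # additional "thinking" block with a token budget.
        payload["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
    return payload

fast = build_request("Summarize MoE training in one sentence.")
deep = build_request("Work through this proof step by step.", thinking_budget=2048)
print("thinking" in fast, deep["thinking"]["budget_tokens"])
```

The key design point is that hybrid reasoning is a dial, not a separate model: the same request either returns near-instantly or spends its thinking budget first.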

Claude Code is a new agentic coding tool launched by Anthropic alongside Claude 3.7 Sonnet. It provides Claude-powered code assistance, file operations, and task execution directly from the terminal. Claude Code also functions as a Model Context Protocol (MCP) client, allowing users to extend it with servers such as Sentry, GitHub, or web search. It aims to streamline developer workflows by executing routine tasks, explaining complex code, and handling Git workflows through natural-language commands.
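As an MCP client, Claude Code reads a configuration that maps server names to launch commands. A sketch of what such a config might look like, following the standard MCP client config shape (the file name, server package, and environment variable are illustrative assumptions; consult the Claude Code and MCP documentation for the exact format):

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "<your-token>" }
    }
  }
}
```

Each entry declares how to start one MCP server; once registered, its tools become available to the agent alongside Claude Code's built-in file and shell capabilities.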

During DeepSeek's Open Source Week, the company released DeepEP, an expert-parallel communication library for efficient Mixture of Experts (MoE) training. DeepEP features FP8 dispatch support and optimized intranode and internode communication to streamline both training and inference. DeepSeek also released FlashMLA, an optimized MLA decoding kernel for Hopper GPUs with BF16 support and a paged KV cache using a block size of 64; it reports up to 3000 GB/s in memory-bound and 580 TFLOPS in compute-bound configurations on the H800. More broadly, the Multi-head Latent Attention (MLA) architecture promises a 5-10x inference speedup by compressing the KV cache, potentially reshaping future LLM architectures. Overall, the releases reflect DeepSeek's commitment to open-source AI and aim to foster community engagement and innovation.
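The speedup from MLA comes largely from shrinking the per-token KV cache: instead of caching full key and value vectors for every head, it caches one compressed latent vector per token. A back-of-the-envelope sketch of the memory difference (all dimensions here are illustrative, not DeepSeek's actual configuration):

```python
def kv_cache_bytes(n_layers, seq_len, per_token_floats, bytes_per_float=2):
    # Total KV-cache size: one entry per layer per cached token,
    # assuming 2-byte (BF16/FP16) storage by default.
    return n_layers * seq_len * per_token_floats * bytes_per_float

n_layers, seq_len = 60, 32_768
n_heads, head_dim = 32, 128

# Standard multi-head attention caches full K and V vectors per token:
# 2 (K and V) * n_heads * head_dim floats.
standard = kv_cache_bytes(n_layers, seq_len, 2 * n_heads * head_dim)

# MLA caches a single compressed latent vector per token instead
# (latent dimension chosen purely for illustration).
latent_dim = 512
mla = kv_cache_bytes(n_layers, seq_len, latent_dim)

print(f"standard: {standard / 2**30:.1f} GiB, "
      f"MLA: {mla / 2**30:.2f} GiB, ratio: {standard / mla:.0f}x")
```

With these toy dimensions the compressed cache is 16x smaller, which is the kind of reduction that lets memory-bound decoding serve far longer contexts, and larger batches, on the same GPU.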

Qwen AI previewed QwQ-Max-Preview, a reasoning model built on Qwen2.5-Max. On LiveCodeBench, QwQ-Max-Preview performs on par with o1-medium. Qwen plans to release the model open-source under Apache 2.0 and to ship Android and iOS apps, signaling a push for accessible and powerful reasoning models.

Perplexity AI is set to launch Comet, a new agentic browser described as a "Browser for Agentic Search." CEO Aravind Srinivas asked users for feedback on desired features, highlighted the engineering undertaking behind Comet, and invited people to join the team.





Claude 3.7 Sonnet and Claude Code post: https://www.anthropic.com/news/claude...

FlashMLA repo: https://github.com/deepseek-ai/FlashMLA
DeepEP repo: https://github.com/deepseek-ai/DeepEP

QwQ-Max post: https://qwenlm.github.io/blog/qwq-max...
