State of AI: China, Q2 2025 Highlights Report (English) - Artificial Analysis
State of AI: China
Artificial Analysis
Q2 2025 Highlights Report
Full report available to Premium Access subscribers

Artificial Analysis is a leading, independent AI benchmarking and insights provider. We support engineers and companies to understand AI capabilities and make critical decisions about their AI strategy.

Our data, insights and publications are grounded in our comprehensive benchmarking of AI technologies and use cases. This includes everything from hourly performance testing of language model APIs to millions of votes in our crowd-sourced arenas.

Our public website, artificialanalysis.ai, is widely referenced by companies leading innovation in AI. To discuss this report, our publications, or our services, please get in touch at contact@artificialanalysis.ai.

China’s leading AI labs are now closer than ever to US leaders, with the lead decreasing from more than a year to less than three months

US & China: Frontier Language Model Intelligence, Over Time

Artificial Analysis Intelligence Index incorporates 7 evaluations: MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, MATH-500

Commentary
• The performance gap between US and Chinese frontier models since the release of ChatGPT in 2022 has remained persistent, but is as narrow now as it has ever been
• DeepSeek’s open weights R1 (May 2025) model leads the Chinese AI labs, while OpenAI’s o3 leads models released by US AI Labs
• DeepSeek and Alibaba have primarily driven the Chinese frontier, while advances in the US frontier have been primarily driven by OpenAI

[Chart: US & China: Frontier Language Model Intelligence, Over Time. Y-axis: Artificial Analysis Intelligence Index. Models labeled: Qwen3 235B A22B, Alibaba; Grok 3 mini reasoning (high), xAI; o1, OpenAI; o1-preview, OpenAI; GPT-4o, OpenAI; GPT-4 Turbo, OpenAI; GPT-4, OpenAI; GPT-3.5 Turbo, OpenAI; Seed-Thinking-v1.5, ByteDance; DeepSeek R1 (Jan. ’25); DeepSeek V3; Qwen2.5 Plus, Alibaba; GLM-4-Plus, Zhipu AI; Baichuan 4, Baichuan; Step 2, StepFun; Qwen Chat 14B, Alibaba; Qwen Chat 72B, Alibaba; DeepSeek LLM 67B; Qwen Chat 7B, Alibaba; Qwen1.5 Chat 72B, Alibaba; Qwen1.5 Chat 110B, Alibaba; Yi-Large, 01.AI; Qwen2 72B, Alibaba; o3-mini (high), OpenAI; o3, OpenAI; Gemini 2.5 Pro, Google; DeepSeek R1 (0528, May ’25)]

Note: Some results were estimated based on company claims and comparable results
Source: Artificial Analysis Intelligence Index

The Chinese open weights frontier surpassed the US in November 2024 with Alibaba’s release of QwQ 32B Preview, R1 consolidated this lead

US & China: Open Weights Frontier Language Model Intelligence, Over Time

Commentary
• The Chinese open weights frontier surpassed the US in November 2024 with the release of QwQ 32B Preview (overtaking Meta’s Llama 3.1 405B)
• The open weights leadership of Chinese AI labs is reflective of the approach of the top Chinese AI labs to often release the weights of their flagship models. This contrasts with the top US AI labs, which generally do not release the weights of their leading models, e.g., OpenAI, Anthropic and Google
• China’s DeepSeek R1 (January 2025) was the first open weights reasoning mod
Report format: PDF, 3.86 MB, 17 pages.