State of AI in China, Q1 2025 (original edition)
State of AI: China
Artificial Analysis
Q1 2025

Artificial Analysis is a leading, independent AI benchmarking and insights provider. We help engineers and companies understand AI capabilities and make critical decisions about their AI strategy.

Our data, insights and publications are grounded in our comprehensive benchmarking of AI technologies and use cases. This includes everything from hourly performance testing of language model APIs to millions of votes in our crowd-sourced arenas.

Our public website, artificialanalysis.ai, is widely referenced by companies leading innovation in AI. To discuss this report, our publications, or our services, please get in touch at contact@artificialanalysis.ai.

US & China: Frontier Language Model Intelligence, Over Time [1]

Chinese AI labs have progressively caught up to US AI labs; models from Chinese labs are now approaching o1-level intelligence with the release of DeepSeek’s R1 model.

Key Trends

Closing the gap: The final months of 2024 saw numerous highly performant models emerge from top Chinese AI labs. As a result, the gap between the intelligence of models from Chinese AI labs and those from US AI labs has narrowed. Several Chinese models are now competitive with models from the top US labs.

Open models close in on the frontier labs: Open-weights models, led by those from DeepSeek and Alibaba, have approached o1-level intelligence.

Reasoning models quickly becoming commonplace: Reasoning models (which “think” before answering) were first introduced by OpenAI in 3Q24. Within months, Chinese competitors, led by DeepSeek, had largely replicated the intelligence of o1. Several AI labs in China now have a frontier-level reasoning model.

[Figure: Frontier language models by origin (USA, China). x-axis: Model Release Date, 4Q22 to 2Q25. y-axis: Artificial Analysis Intelligence Index [1], ticks from 15 to 95. Labeled models: OpenAI GPT-3.5 Turbo; OpenAI GPT-4; OpenAI GPT-4 Turbo; GPT-4o; OpenAI o1-preview; OpenAI o1; OpenAI o3 [2]; Anthropic Claude Sonnet (Jun ‘24); DeepSeek V2; DeepSeek V3; DeepSeek R1; Alibaba Qwen Chat 7B [3]; Alibaba Qwen Chat 72B [3]; Alibaba Qwen 2 Instruct 72B; Alibaba Qwen 2.5 Instruct 72B. Additional lab labels following the chart: OpenAI, Anthropic, Meta, Google.]

1. Artificial Analysis Intelligence Index: average across a range of language model intelligence and reasoning evaluation datasets. Currently includes MMLU, GPQA Diamond, MATH-500 & HumanEval. Release date is based on first public launch of the model.
2. o3 Intelligence Index estimated by scaling measured Intelligence Index of o1.
3. Estimated based on company claims and comparable results where available, not yet independently benchmarked by Artificial Analysis.
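
Footnote 1 defines the Artificial Analysis Intelligence Index as an average across MMLU, GPQA Diamond, MATH-500 and HumanEval. As a rough illustration of that definition, the sketch below computes an unweighted mean of four per-evaluation scores. Equal weighting is an assumption (the report says only "average"), the function name `intelligence_index` is made up for this example, and the scores are hypothetical placeholders rather than figures from the report.

```python
from statistics import mean

# Evaluations named in the report's footnote 1.
EVALS = ("MMLU", "GPQA Diamond", "MATH-500", "HumanEval")


def intelligence_index(scores):
    """Return the unweighted mean of per-evaluation scores (0-100 scale).

    Equal weighting is an assumption; the report only says "average".
    """
    missing = [name for name in EVALS if name not in scores]
    if missing:
        raise ValueError(f"missing evaluation scores: {missing}")
    return mean(scores[name] for name in EVALS)


# Hypothetical scores for illustration only; not measurements from the report.
example_scores = {
    "MMLU": 80.0,
    "GPQA Diamond": 50.0,
    "MATH-500": 90.0,
    "HumanEval": 84.0,
}

example_index = intelligence_index(example_scores)
print(example_index)  # 76.0
```

Under this assumption a single weak evaluation pulls the index down in proportion to its share of the average, which is one reason a composite like this can move noticeably when the set of included evaluations changes.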
State of AI in China, Q1 2025 (original edition). The report is a 14-page PDF, 1.24 MB in size.
