Digital Economy

Kimi K2: what the new Chinese 'miracle' in artificial intelligence teaches us

The new model impresses for many reasons.

7 August 2025

5 min read

A Chinese company founded by former employees of American big tech is China's latest surprise in artificial intelligence. It is a story that says a lot about what innovation has become in China amid the ongoing geopolitical tensions. The Kimi K2 model is surprising for many reasons.

What is Kimi K2

According to benchmarks, Kimi K2 is currently the most powerful open source AI model. The company behind it is Moonshot AI, a start-up founded in 2023 by Yang Zhilin, Zhou Xinyu and Wu Yuxin with financial backing from Alibaba's Vision Plus fund. Yang and Wu have worked for Meta and Google, specifically in the field of artificial intelligence. Yang is among the main authors of Transformer-XL and XLNet, two models that are milestones in the evolution of large language models (LLMs).

Anyone can use it at www.kimi.com (on the Chinese company's servers) or download it (from Hugging Face, among other places) to run and customise on their own infrastructure, just as with DeepSeek, the previous Chinese AI champion.

Kimi K2 can process up to two million tokens as input. Tokens are the units of text, roughly word fragments, in which AI models measure the data they process.

In direct comparison with OpenAI's GPT-4o or Anthropic's Claude 3.5 Sonnet - both currently limited to 128,000 tokens - Kimi K2 can thus process more than 15 times as much text, while maintaining contextual consistency, inferential capabilities and semantic precision on a much larger scale. This means that it can read, understand and synthesise documents as long as an encyclopaedia, or the entire body of case law relevant to a legal case, without having to split or compress them.
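As a rough check on the "more than 15 times" figure, the ratio can be worked out from the two context sizes quoted above. This is only an illustrative sketch: the words-per-token factor is a common rule of thumb for English text and an assumption here, not a figure from Moonshot AI.

```python
# Context-window sizes in tokens, as quoted in this article.
KIMI_K2_CONTEXT = 2_000_000
RIVAL_CONTEXT = 128_000  # figure quoted for GPT-4o and Claude 3.5 Sonnet

# Assumption: about 0.75 English words per token (rule of thumb, not a vendor figure).
WORDS_PER_TOKEN = 0.75


def context_ratio(larger: int, smaller: int) -> float:
    """Return how many times larger one context window is than another."""
    return larger / smaller


def approx_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * words_per_token)


if __name__ == "__main__":
    print(f"Ratio: {context_ratio(KIMI_K2_CONTEXT, RIVAL_CONTEXT):.3f}x")
    print(f"Approximate words: {approx_words(KIMI_K2_CONTEXT):,}")
```

On these figures the ratio is 15.625, which is where "more than 15 times" comes from.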

In particular, Moonshot AI points out that it is better at coding than Sonnet, the current standard for this application, and is cheaper to use. That is on paper, however: for now there is still a lack of tools for integrating it into other systems, so Kimi K2 is less usable in practice.

Kimi charges only 15 cents per million input tokens and $2.50 per million output tokens, according to its website.

Claude Opus 4 costs 100 times as much for input ($15 per million tokens) and 30 times as much for output ($75 per million tokens). And OpenAI? GPT-4.1 charges $2 per million input tokens and $8 per million output tokens.
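The multiples follow directly from the listed prices. Here is a minimal sketch that uses only the per-million-token prices quoted in this article; the workload (10 million input tokens and 2 million output tokens) is a hypothetical example, not a measured one.

```python
# Prices in US dollars per million tokens, as quoted in this article.
PRICES = {
    "Kimi K2": {"input": 0.15, "output": 2.50},
    "Claude Opus 4": {"input": 15.00, "output": 75.00},
    "GPT-4.1": {"input": 2.00, "output": 8.00},
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of a workload, using the price table above."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def price_multiple(model: str, baseline: str, direction: str) -> float:
    """How many times `model`'s price is of `baseline`'s, for 'input' or 'output'."""
    return PRICES[model][direction] / PRICES[baseline][direction]


if __name__ == "__main__":
    # Hypothetical workload: 10M tokens in, 2M tokens out.
    for name in PRICES:
        print(f"{name}: ${workload_cost(name, 10_000_000, 2_000_000):.2f}")
    for direction in ("input", "output"):
        multiple = price_multiple("Claude Opus 4", "Kimi K2", direction)
        print(f"Claude Opus 4 vs Kimi K2, {direction}: {multiple:.0f}x")
```

For this hypothetical workload the bill comes to about $6.50 with Kimi K2, $36 with GPT-4.1 and $300 with Claude Opus 4.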

Moonshot AI stated on GitHub that developers can use K2 as they wish, with the only requirement being that they display 'Kimi K2' on the user interface if their commercial product or service has more than 100 million monthly active users or generates monthly revenues of at least $20 million.
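The attribution condition boils down to a simple either/or test. The sketch below encodes the thresholds as summarised in this article; the function and parameter names are hypothetical, and the licence text published by Moonshot AI remains the authoritative source.

```python
# Thresholds as summarised in this article (not copied from the licence text).
MAU_THRESHOLD = 100_000_000         # monthly active users
REVENUE_THRESHOLD_USD = 20_000_000  # monthly revenue in US dollars


def must_display_kimi_k2(monthly_active_users: int, monthly_revenue_usd: float) -> bool:
    """Return True if, per the article's summary, 'Kimi K2' must appear in the UI.

    The requirement applies with more than 100 million monthly active users
    OR monthly revenue of at least $20 million.
    """
    return (
        monthly_active_users > MAU_THRESHOLD
        or monthly_revenue_usd >= REVENUE_THRESHOLD_USD
    )


if __name__ == "__main__":
    # A small start-up falls well below both thresholds.
    print(must_display_kimi_k2(50_000, 10_000))
    # A large platform crosses the user threshold alone.
    print(must_display_kimi_k2(150_000_000, 0))
```

In practice, the great majority of developers fall below both thresholds and can use the model without any attribution requirement.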

The technical aspects of Kimi K2

According to Moonshot AI, Kimi K2 is not simply an LLM with a broader context, but an architecturally redesigned model. Based on an optimised version of the Transformer framework, the system adopts advanced attention scaling techniques, with dynamic temporal compression of weights to maintain computational efficiency. The model has been trained on a multilingual corpus of over 50 trillion tokens, using a proprietary pipeline with data cleaning and deep deduplication.

On the performance side, benchmarks provided by the Moonshot team show competitive results. Kimi K2 achieved a score of 88.1 on MMLU (Massive Multitask Language Understanding), in line with GPT-4 and Claude 3 Opus, and a score of 76.5 on HumanEval, a test that measures the ability to generate working code. Particularly noteworthy is the performance on LongBench, a test suite designed to evaluate the ability of models to maintain consistency over long texts: Kimi K2 achieved an average score of 83.2 per cent, outperforming Claude 3.5 Sonnet (76 per cent) and Gemini 1.5 Pro (78 per cent).

Equally strategic is the approach to deployment. Kimi K2 is designed to run in full cloud mode, but Moonshot is already testing compressed versions for on-device or edge computing, with an estimated 40 per cent lower consumption than US models for the same number of tokens processed. This opens up the possibility of integration in low-latency environments such as mobile devices, robotics and autonomous vehicles.

The geopolitical significance

Behind the technical showcase lies a much deeper industrial agenda.

After years in which Chinese models were uncompetitive with Western ones, Beijing is now ready to compete on a level playing field. Kimi K2 sends an unequivocal message: China now has the training capabilities, human capital, engineering skills and software platforms to compete on advanced generative AI.

The model is part of a rapidly expanding ecosystem. In the first half of 2025, Chinese private investment in generative AI exceeded $15 billion, with at least 11 multi-modal models at an advanced stage of development.

China's only Achilles' heel is now chips, especially the efficient, high-performance ones used for inference, that is, the ones needed to run models and offer them in the cloud to users worldwide. The consequence is that DeepSeek and Kimi K2, in the 'original' versions hosted on their developers' own servers, are slower than ChatGPT. This is apparent from our own testing, and Moonshot AI has admitted as much on X.

The limited access to chips also explains another feature of the Chinese AI sector that has puzzled outside observers: its dedication to open source releases. Openness is not just a way to tap into the worldwide collaboration of AI engineers and researchers.

DeepSeek V3 and Kimi K2 are both available through third-party hosting services such as New York-based Hugging Face, and can be downloaded and run on users' own hardware. Thus, even if the company does not have the computing power to serve customers directly, its models can still be served elsewhere. Open source releases also make it possible to circumvent US export restrictions on hardware: if DeepSeek cannot easily buy Nvidia chips, Hugging Face can. This could change now that President Trump has given Nvidia the green light to sell its most inference-friendly chip, the H20, to China.

In any case, thanks to its openness, Kimi K2 could prove to be not only a technical advance, but also a soft power asset for the Chinese government. In non-Western contexts - from Central Asia to the Middle East - the model is presented as a local, cheap and scalable alternative to OpenAI's or Google's solutions, at a time when digital geopolitics is increasingly intertwined with technological sovereignty.

For Europe, this acceleration poses urgent questions. While regulation - such as the AI Act - has moved swiftly, an industrial initiative capable of producing models on the same scale is still lacking. Neither the French approach with Mistral nor the German approach with Aleph Alpha is enough to close the gap with those who dominate today in terms of computational power and data availability.