Analysts blinded by DeepSeek and OpenAI are missing the depth of data revealed by Liftr Insights
In January 2025, the announcement of a new AI model called DeepSeek R1 significantly impacted U.S. financial markets. This China-developed large language model raised investor alarm with its performance and the claim that it was built for about US$6 million, compared with the hundreds of millions invested in better-known U.S. frontier models such as OpenAI's GPT series and Google's Gemini.
It’s important to state that China’s AI aspirations go beyond DeepSeek. For instance, a Liftr Insights report shows that as of the end of the previous quarter, Alibaba Cloud and Tencent Cloud together accounted for 31.4% of all AI-focused instance types across the major cloud providers. In contrast, AWS accounted for only 20.2%, lower than Alibaba Cloud's share alone.

The Emergence of DeepSeek
Of the 1.2 million AI and ML models hosted on the Hugging Face hub in December 2024, only 28.4% are tagged as natural language processing.
The emergence of DeepSeek, and of Chinese AI more broadly, challenges common beliefs about AI. In contrast to some of these misperceptions, Liftr data reports show:
- It’s not all about Natural Language Processing (NLP). For the past two years, NLP and broad-based frontier models have captured the lion’s share of media, analyst, and investor attention in AI. But DeepSeek’s rapid ascent from a field of more than a million AI and ML models points to a much bigger and more vibrant AI ecosystem. Of the 1.2 million AI and ML models hosted on the Hugging Face hub in December 2024, only 28.4% are tagged as natural language processing. This is obscured by the fact that 61.2% of downloads were NLP-focused, because a few popular models dominate the attention. That means there are about 900,000 other AI and machine learning models accomplishing different objectives. For example, 5.6% of these other models are focused on computer vision, which we may delve into in a future article.

Among models that explicitly state their parameter counts, Liftr data show that 90.3% are trained on (or based on models trained on) a medium number of parameters.
- You don’t need to be big and spend big to use AI. That may have been true in the past, when models were less mature and accurate. Brute-force efforts to scale training and inference networks may have been useful for frontier LLMs, but DeepSeek was trained using distillation. This approach “shrinks” models to be very small, efficient, and highly focused on specific tasks.
As with DeepSeek, we are seeing strong results from smaller models by other authors. For example, IBM’s Granite models are available in multiple sizes, from 3 billion parameters up to over 34 billion, depending on project requirements. Among models that explicitly state their parameter counts, Liftr data show that 90.3% are trained on (or based on models trained on) a medium number of parameters. Some small models can even run on a CPU alone or on a GPU-enabled laptop such as a MacBook Pro. Of course, your performance may vary.
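DeepSeek's exact distillation recipe is not public, but the core idea behind distillation can be sketched in a few lines. The following is a minimal, illustrative NumPy implementation of the classic soft-label distillation loss (in the style of Hinton et al.), where a small "student" model learns to match a larger "teacher" model's softened output distribution; the temperature value and the toy logits are assumptions for demonstration only.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher temperature yields softer distributions."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened outputs.

    Scaled by temperature**2 so gradient magnitudes stay comparable
    across temperature settings, as in the standard formulation.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_log_probs = np.log(softmax(student_logits, temperature) + 1e-12)
    return -temperature**2 * np.mean(
        np.sum(teacher_probs * student_log_probs, axis=-1)
    )

# Toy example: a confident "teacher" vs. a "student" still learning.
teacher = np.array([[4.0, 1.0, 0.5]])
student = np.array([[2.0, 1.5, 1.0]])
loss = distillation_loss(student, teacher, temperature=2.0)
```

In practice, a training loop would minimize this loss (often blended with the ordinary hard-label loss) so the student absorbs the teacher's behavior at a fraction of the parameter count, which is what makes the small, task-focused models discussed above possible.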

Liftr’s data shows AI-specific instances already comprise 6.0% of all workloads on leading clouds.
- You don't need a cloud to get started. Cloud service providers have been at the forefront of investing in and marketing AI-based solutions, and they have a huge incentive to do so: according to Liftr’s cloud data, AI instances can cost 20% to 30% more per hour than CPU-only instances of similar size. The marketing push is working, as Liftr’s data shows AI-specific instances already comprise 6.0% of all workloads on leading clouds. So enterprises may have come to believe that the only way to access these models is through a cloud-based deployment. This is not true. DeepSeek, IBM, NVIDIA, and Meta all allow their models to be deployed in clouds or on-premises in enterprise data centers. And with smaller models available, private deployment is more feasible than ever.
Making Sense of a Million Models
While Hugging Face and other platforms provide a great service connecting enterprises to models, open-source AI and ML present many of the same challenges as other open-source software projects. Enterprises want some assurance that a model is viable, stable, and has a vibrant set of contributors. They should also understand whether other models or tools complement or compete with a given model.
When considering your AI strategy, it’s better to understand the data and the trends than to simply trust what the mass market and well-monied vendors tell you.
Digging Deeper into AI
At Liftr Insights, we aim to help enterprises and investors make data-driven decisions based on reliable and well-curated market analytics. Liftr has six years of processes and experience curating data about the top nine cloud providers, semiconductors, and AI. These insights have helped enterprises and investors decide which capabilities fit them, including regional availability, price/performance, and the availability of the right instances.
When considering your AI strategy, it’s better to understand the data and the trends than to simply trust what the mass market and well-monied vendors tell you. In this AI series, we will discuss valuable insights we are seeing in the AI data. Our next article will discuss one of the most prolific types of machine-learning models, one that, given the emphasis on Natural Language Processing, you have probably never heard of.
To stay on top of news about AI like the above (including the next in the series of articles about AI), subscribe to the newsletter today.