AI cloud provider Nebius on Wednesday announced Token Factory, a new platform that runs leading open-source models while also providing the computing power they need. The company says the platform lets enterprises deploy and optimize open-source and custom models at scale with enterprise-grade reliability and control.

The platform supports all major open models on the market, including DeepSeek, OpenAI's GPT-OSS, Meta's Llama, Nvidia's Nemotron, and Qwen. Token Factory is available now with support for more than 60 open-source models. Customers can also host their own models.

Meanwhile, current Nebius AI users will automatically be upgraded to Token Factory.

The new offering competes directly with Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). The company also faces competition from newer startups such as Fireworks and Baseten, which offer similar services.

Nebius says Token Factory is optimized for efficiency, delivering sub-second latency, autoscaling throughput, and 99.9% uptime, even for workloads that exceed hundreds of millions of requests per minute.
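To put the 99.9% uptime figure in perspective, the short sketch below converts an uptime percentage into the downtime it would permit. It is illustrative arithmetic only: the function name `allowed_downtime_minutes` is hypothetical, and the measurement window (a 30-day month or a 365-day year) is an assumption, since Nebius has not said here how it measures uptime.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Return the minutes of downtime permitted over a period at a given uptime.

    Assumption: uptime is measured against wall-clock time over the whole
    period. Real service agreements may define the window differently.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


if __name__ == "__main__":
    # Measurement windows chosen for illustration, not taken from Nebius.
    for label, days in [("30-day month", 30), ("365-day year", 365)]:
        minutes = allowed_downtime_minutes(99.9, days)
        print(f"{label}: {minutes:.1f} minutes of downtime allowed")
```

Under those assumptions, 99.9% uptime would leave room for roughly 43 minutes of downtime in a 30-day month, or just under nine hours over a year.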

After sanctions on Yandex, Nebius split off from the search engine in 2024 and has emerged as one of the leading neocloud providers. The company sells AI cloud capacity from data centers it has built in the US, Europe, and Israel, according to a Bloomberg report.

AI infrastructure providers like Nebius that sell software services on top of their cloud capacity could earn wider profit margins. But Nebius co-founder and Chief Business Officer Roman Chernin says his firm is less attracted by a possible margin boost than by the chance to bring in more customers with a wider array of products.

"Simply having infrastructure is far from enough. We want to become a large enterprise, but we do not wish to be merely a utility company," Chernin said in an interview.