AI hyperscaler Nscale launches Serverless Inference Platform

Nscale makes AI accessible to organisations of all sizes, enabling cost-efficient AI model deployment with a simple pay-as-you-go model.

London, UK – 02/04/2025 – Nscale, the hyperscaler engineered for AI, today announced the launch of its Serverless Inference platform, the first public on-demand offering within its broader AI infrastructure suite. This new public cloud service allows developers and enterprises to quickly deploy leading AI models at scale without managing the underlying infrastructure, complementing Nscale’s established private cloud solutions tailored for large-scale enterprise AI workloads.

The platform’s token-based, pay-as-you-go pricing ensures users only pay for what they consume, eliminating idle capacity costs and reducing financial barriers to experimenting and deploying generative AI models.

“Launching our Serverless Inference platform marks Nscale’s expansion into public, on-demand AI services, making AI model deployment simple and cost-effective,” said Daniel Bathurst, Chief Product Officer at Nscale. “While our private cloud remains ideal for large-scale enterprise workloads, this new serverless option enables more developers to experiment with and scale inference workloads. With upcoming features set to include dedicated endpoints, fine-tuning capabilities and the ability to support custom model hosting, we’re proud to offer sovereign, European AI infrastructure to meet rapidly growing inference demand.

Users can immediately access popular generative AI models such as Meta’s Llama, Alibaba’s Qwen, and DeepSeek through OpenAI-compatible APIs or via Nscale’s intuitive web console. The broader Nscale platform provides comprehensive functionality, including Slurm and Kubernetes orchestration, observability, and multi-tenant security. These features deliver the reliability, performance, and compliance required for enterprise AI workloads.

Register here to get instant access to a range of AI models in the Nscale ecosystem today.

About Nscale

Nscale is the hyperscaler engineered for AI, delivering compute to the generative AI market at scale. Through its fully vertically integrated suite of AI services and compute – across its 60MW renewable energy-powered data centre in Norway and a pipeline of over 1.3GW of greenfield data centres across Europe and North America – Nscale enables customers to run efficient and scalable AI training, fine-tuning, and inferencing workloads.

Send us your news/press releases to info@impactnews-wire.com