The race to deploy AI models is heating up, but the hunt for the perfect chip is far from over. General Compute, a new inference neocloud, has found a potential game-changer in SambaNova’s air-cooled chips, which promise 600 tokens per second, compared with 250 on GPUs.
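To put those two throughput figures in perspective, here is a rough back-of-the-envelope sketch. It assumes a steady generation rate and ignores prompt processing, batching and network latency; the 1,000-token response length and the function name are illustrative assumptions, not figures from General Compute or SambaNova.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to emit num_tokens at a constant rate (a simplifying assumption)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


SAMBANOVA_TPS = 600.0    # tokens per second, as quoted above
GPU_TPS = 250.0          # tokens per second, as quoted above
RESPONSE_TOKENS = 1_000  # hypothetical response length, chosen for illustration

speedup = SAMBANOVA_TPS / GPU_TPS
sambanova_time = generation_seconds(RESPONSE_TOKENS, SAMBANOVA_TPS)
gpu_time = generation_seconds(RESPONSE_TOKENS, GPU_TPS)

if __name__ == "__main__":
    print(f"Speed-up: {speedup:.1f}x")
    print(f"{RESPONSE_TOKENS} tokens: {sambanova_time:.2f}s vs {gpu_time:.2f}s")
```

On these assumptions the quoted rates amount to a 2.4-fold difference, a gap that compounds when several models or agents hand work to one another in sequence.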
‘We’re looking at a future where multiple models and agents coexist,’ says CEO Finn Puklowski. ‘Speed will be the key to success.’ General Compute is already deploying these chips in existing data centres, saving on costly infrastructure upgrades.
The demand for inference-specific chips has rocketed; Nvidia’s acquisition of Groq and Cerebras’ IPO highlight this shift. Meanwhile, SambaNova’s flexible architecture sets it apart, offering a more efficient way to store context during calculations. This is crucial as models become more complex and data-intensive.
General Compute isn’t just about chips; it’s also about strategic partnerships. The company is pursuing colocation deals with both data centre providers and crypto miners, repurposing existing infrastructure for faster inference. This could be a win-win, especially when producing a bitcoin costs more than the coin is worth.
The future of AI computing is not just about hardware but also about how it is deployed. As models become ubiquitous, the speed at which they generate responses will determine competitiveness. General Compute’s focus on efficiency and flexibility could put it ahead in this rapidly evolving landscape.