Positron AI, the premier company for American-made semiconductors and inference hardware, announced the close of a $51.6 million oversubscribed Series A funding round, bringing its total capital raised this year to over $75 million. The round was led by Valor Equity Partners, Atreides Management and DFJ Growth. Additional investment came from Flume Ventures, which includes tech icon Scott McNealy, Resilience Reserve, 1517 Fund and Unless.
This new funding will support the continued deployment of Positron’s first-generation product, Atlas, and accelerate the rollout of its second-generation products in 2026.
With global tech firms projected to spend over $320 billion on AI infrastructure in 2025, enterprises face intensifying cost pressures, power ceilings and chronic shortages of NVIDIA GPUs. Positron’s purpose-built alternative delivers cost and efficiency advantages that come from specialization. The company is currently shipping its first-generation product, Atlas, which delivers 3.5x better performance-per-dollar and up to 66% lower power consumption than NVIDIA’s H100. Unlike general-purpose GPUs, Atlas is designed solely to accelerate and serve generative AI applications.
“The early benefits of AI are coming at a very high cost – it is expensive and energy-intensive to train AI models and to deliver curated results, or inference, to end users. Improving the cost and energy efficiency of AI inference is where the greatest market opportunity lies, and this is where Positron is focused,” said Randy Glein, co-founder and managing partner at DFJ Growth. “By generating 3x more tokens per watt than existing GPUs, Positron multiplies the revenue potential of data centers. Positron’s innovative approach to AI inference chip and memory architecture removes existing bottlenecks on performance and democratizes access to the world’s information and knowledge.”
Also Read: Coherent Introduces 100G Transimpedance Amplifiers for 400G/800G Transceivers
Positron Atlas’s memory-optimized FPGA-based architecture achieves 93% bandwidth utilization, compared to the typical 10–30% in GPU-based systems, and supports up to half a trillion-parameter models in a single 2-kilowatt server. It’s fully compatible with Hugging Face transformer models and serves inference requests through an OpenAI API compatible endpoint. Atlas is powered by chips fabricated in the U.S. and is already deployed in production environments, enabling LLM hosting, generative agents and enterprise copilots with significantly lower latency and reduced hardware overhead.
“Memory bandwidth and capacity are two of the key limiters for scaling AI inference workloads for next-generation models,” said Dylan Patel, founder and CEO of SemiAnalysis, and an advisor and investor in Positron. SemiAnalysis is a leading research firm specializing in semiconductors and AI infrastructure that provides detailed insights into the full compute stack. “Positron is taking a unique approach to the memory scaling problem, and with its next-generation chip, can deliver more than an order of magnitude greater high-speed memory capacity per chip than incumbent or upstart silicon providers.”
SOURCE: Businesswire