Investing.com -- Alibaba Group Holding Ltd. has launched a new artificial intelligence model called Qwen3-Next, designed to significantly improve efficiency in both training and inference.
The new model features a hybrid attention mechanism, a highly sparse Mixture-of-Experts (MoE) structure, training-stability-friendly optimizations, and a multi-token prediction mechanism for faster inference.
Alibaba’s Qwen3-Next-80B-A3B-Base model contains 80 billion parameters but activates only 3 billion during inference. The company claims the base model achieves performance comparable to or slightly better than the dense Qwen3-32B model while using less than 10% of the GPU hours needed to train Qwen3-32B.
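To make the 80-billion-versus-3-billion figure concrete, the sketch below shows how a sparse Mixture-of-Experts layer routes each token to only a small top-k subset of experts, so most of the layer's weights stay idle on any given token. This is a minimal illustration in PyTorch, not Alibaba's implementation; the expert count, top-k value, and dimensions are arbitrary assumptions chosen only to demonstrate the mechanism.

# Minimal sparse Mixture-of-Experts layer (illustrative only, not Qwen3-Next's
# actual implementation). Each token is routed to a small top-k subset of
# experts, so only a fraction of the layer's parameters is used per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=256, d_ff=1024, num_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                                # x: (num_tokens, d_model)
        scores = self.router(x)                          # (num_tokens, num_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)             # normalize over the chosen experts only
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle,
        # which is why the "active" parameter count is far below the total.
        for k in range(self.top_k):
            for e in chosen[:, k].unique().tolist():
                mask = chosen[:, k] == e
                out[mask] += weights[mask, k].unsqueeze(1) * self.experts[e](x[mask])
        return out

layer = SparseMoELayer()
tokens = torch.randn(4, 256)                             # 4 token embeddings
print(layer(tokens).shape)                               # torch.Size([4, 256])

With 16 experts and top-2 routing, this toy layer runs only an eighth of its expert parameters per token; Qwen3-Next pushes the same idea much further, activating roughly 3 billion of its 80 billion parameters.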
For inference with context lengths exceeding 32,000 tokens, the new model delivers more than 10 times higher throughput compared to previous versions.
Alibaba has also released two post-trained versions: Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking. The company reports solving stability and efficiency issues in reinforcement learning training caused by the hybrid attention and high-sparsity MoE architecture.
The Instruct version performs comparably to Alibaba’s flagship model Qwen3-235B-A22B-Instruct-2507 and shows advantages in tasks requiring ultra-long context of up to 256,000 tokens. The Thinking version excels at complex reasoning tasks, reportedly outperforming higher-cost models like Qwen3-30B-A3B-Thinking-2507 and Qwen3-32B-Thinking.
Alibaba has made Qwen3-Next available on Hugging Face and ModelScope. Users can access the Qwen3-Next service through Alibaba Cloud Model Studio and NVIDIA API Catalog.
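For readers who want to try the released checkpoints, the sketch below shows how a model published on Hugging Face is typically loaded through the transformers library. It is a minimal illustration rather than official Alibaba documentation: the repository id, the required transformers version, and the hardware needed for an 80-billion-parameter checkpoint are assumptions that should be checked against the model card.

# Illustrative only: loading a Qwen3-Next checkpoint with the standard
# Hugging Face transformers API. The repository id below is assumed from the
# article; confirm the exact name on Hugging Face or ModelScope.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # spread the weights across available devices
)

messages = [{"role": "user", "content": "Explain sparse Mixture-of-Experts models in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))

As the article notes, the same models can instead be called as a hosted service through Alibaba Cloud Model Studio or the NVIDIA API Catalog, which avoids downloading the weights locally.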
This article was generated with the support of AI and reviewed by an editor. For more information see our T&C.