Ptechhub

Latest Alibaba AI model demos AI improvements | Computer Weekly

By Computer Weekly
March 7, 2025

Just two months after the tech world was upended by the DeepSeek-R1 AI model, Alibaba Cloud has introduced QwQ-32B, an open source large language model (LLM).

The Chinese cloud giant describes the new model as “a compact reasoning model” which uses only 32 billion parameters, yet is capable of delivering performance comparable to other large language AI models that use larger numbers of parameters.

On its website, Alibaba Cloud published performance benchmarks which suggest that the new model is comparable to AI models from DeepSeek and OpenAI. These benchmarks include AIME 24 (mathematical reasoning), LiveCodeBench (coding proficiency), LiveBench (test set contamination and objective evaluation), IFEval (instruction-following ability), and BFCL (tool and function-calling capabilities).

By using continuous reinforcement learning (RL) scaling, Alibaba claimed the QwQ-32B model demonstrates significant improvements in mathematical reasoning and coding proficiency.

In a blog post, the company said QwQ-32B, which uses 32 billion parameters, achieves performance comparable to DeepSeek-R1, which uses 671 billion parameters. Alibaba said that this shows the effectiveness of RL when applied to robust foundation models pretrained on extensive world knowledge.

“We have integrated agent-related capabilities into the reasoning model, enabling it to think critically while utilising tools and adapting its reasoning based on environmental feedback,” Alibaba said in the blog post. 

Alibaba said QwQ-32B demonstrates the effectiveness of using reinforcement learning (RL) to enhance reasoning capabilities. With this approach to AI training, a reinforcement learning AI agent is able to perceive and interpret its environment, as well as take actions and learn through trial and error. Reinforcement learning is one of several approaches developers use to train machine learning systems. Alibaba used RL to make its model more efficient.
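The trial-and-error loop described above can be illustrated with a toy example. The sketch below is not Alibaba's training setup; it is a minimal epsilon-greedy "multi-armed bandit", a textbook RL exercise, in which an agent repeatedly picks an action, observes a reward and updates its estimate of how good that action is. The function name `run_bandit`, the reward probabilities and the parameter values are all hypothetical choices made for illustration.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn by trial and error which action pays off most often.

    reward_probs: chance of a reward of 1.0 for each action (hidden from the agent).
    Returns the agent's value estimates and how often it chose each action.
    """
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)    # times each action was tried
    values = [0.0] * len(reward_probs)  # running average reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(values)), key=values.__getitem__)

        # The environment responds with a reward of 1.0 or 0.0.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incrementally update the average reward for the chosen action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


if __name__ == "__main__":
    values, counts = run_bandit([0.2, 0.5, 0.8])
    print("estimated values:", [round(v, 2) for v in values])
    print("times chosen:", counts)
```

Training an LLM with RL is far more involved, but the same idea applies: outputs that earn higher rewards become more likely over time.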

“We have not only witnessed the immense potential of scaled RL, but also recognised the untapped possibilities within pretrained language models,” Alibaba said. “As we work towards developing the next generation of Qwen, we are confident that combining stronger foundation models with RL powered by scaled computational resources will propel us closer to achieving Artificial General Intelligence [AGI].”

Alibaba said it is actively exploring the integration of agents with RL to enable what it describes as “long-horizon reasoning” which, according to Alibaba, will eventually lead to greater intelligence with inference-time scaling.

The QwQ-32B model was trained using rewards from a general reward model and rule-based verifiers, enhancing its general capabilities. According to Alibaba, these include better instruction-following, alignment with human preferences and improved agent performance.
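Alibaba has not published the code for its verifiers in the material quoted here, so the following is only a sketch of what a rule-based verifier could look like for maths problems. It assumes, purely for illustration, that the model is asked to put its final answer inside `\boxed{...}` and that the reward is 1.0 for an exact match with a known reference answer and 0.0 otherwise. The function names `extract_final_answer` and `math_reward` are hypothetical.

```python
import re
from typing import Optional


def extract_final_answer(response: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in the response, if any."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", response)
    return matches[-1].strip() if matches else None


def math_reward(response: str, reference: str) -> float:
    """Rule-based reward: 1.0 if the final answer matches the reference, else 0.0."""
    answer = extract_final_answer(response)
    if answer is not None and answer == reference.strip():
        return 1.0
    return 0.0


if __name__ == "__main__":
    print(math_reward("Adding the two gives \\boxed{42}.", "42"))
    print(math_reward("I believe the answer is 41.", "42"))
```

The appeal of a check like this is that it needs no learned model to score an answer: the rule either passes or fails, which makes the reward hard to game. A general reward model, by contrast, is typically used for qualities that cannot be checked mechanically, such as helpfulness.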

China’s DeepSeek, which has been generally available since the start of the year, demonstrates the effectiveness of RL by delivering benchmark results comparable to those of rival US large language models. Its R1 LLM can rival US artificial intelligence without resorting to the latest GPU hardware.

The fact that Alibaba’s QwQ-32B model also uses RL is no coincidence. The US has banned the export of high-end AI accelerator chips – such as the Nvidia H100 graphics processor – to China, which means Chinese AI developers have had to look at alternative approaches to making their models work. Using RL does appear to deliver benchmark results comparable to those that models such as OpenAI’s are able to achieve.

What is interesting about the QwQ-32B model is that it uses significantly fewer parameters to achieve similar results to DeepSeek, which effectively means that it should be able to run on less powerful AI acceleration hardware.
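A rough back-of-the-envelope calculation shows why the parameter count matters for hardware. The sketch below uses the two parameter counts quoted in this article and assumes, as an illustration only, that each parameter is stored as a 16-bit number (2 bytes). It counts the memory needed to hold the weights alone and ignores everything else a running model needs, so the figures are lower bounds rather than hardware requirements. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory needed to hold the model weights, in GB (10**9 bytes).

    Assumes every parameter takes bytes_per_param bytes; 2.0 corresponds to
    16-bit storage. Working memory used during inference is not included.
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts as quoted in the article.
MODELS = {
    "QwQ-32B": 32e9,
    "DeepSeek-R1": 671e9,
}

for name, params in MODELS.items():
    print(f"{name}: about {weight_memory_gb(params):,.0f} GB of weights at 16-bit")
```

Under these assumptions the smaller model's weights take roughly a twentieth of the space, which is the basis for the expectation that it can run on more modest accelerators.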



Copyright © 2025 | Powered By Porpholio
