Cerebras breaks record for largest AI model trained on a single device



This, Cerebras says, can cut the engineering time needed to run large-scale natural language processing models from months to minutes, making them cheaper.

Cerebras Systems claims to have achieved a new feat by training AI models with up to 20 billion parameters on a single CS-2 system.

The AI company says using a single device can cut engineering effort and time from months to minutes when training natural language processing (NLP) models.

NLP enables computers to process human language as text or speech data and understand its full meaning.

Cerebras CEO Andrew Feldman said larger NLP models are more accurate, but that only a select few companies have the resources and expertise to do the “laborious work” needed.

“As a result, only a few companies were able to train large-scale NLP models,” Feldman said. “It was too expensive and time consuming and not accessible to the rest of the industry.”

Cerebras said the latest results will help eliminate one of the “most painful aspects” of training large-scale NLP models, which typically involves distributing the model across hundreds or thousands of different GPUs.

The company added that the process of partitioning the model across GPUs is unique to each pairing of neural network and compute cluster, so the work cannot be ported to other clusters or other neural networks.
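To illustrate why a partitioning plan is tied to one model and one cluster, here is a minimal sketch in Python. It is not Cerebras's or any GPU vendor's method: the function name `partition_layers`, the contiguous-block strategy, and the layer and device counts are all assumptions chosen for illustration. Real partitioning also has to account for memory limits, interconnect bandwidth and tensor shapes, which this sketch ignores.

```python
def partition_layers(num_layers: int, num_devices: int) -> list[range]:
    """Split a model's layers into contiguous, near-equal blocks, one per device.

    Hypothetical helper for illustration only; real systems also weigh
    memory, bandwidth and per-layer cost when placing layers.
    """
    if num_devices < 1 or num_layers < num_devices:
        raise ValueError("need at least one layer per device")
    base, extra = divmod(num_layers, num_devices)
    plan = []
    start = 0
    for device in range(num_devices):
        # The first `extra` devices take one additional layer each.
        size = base + (1 if device < extra else 0)
        plan.append(range(start, start + size))
        start += size
    return plan


# Same model, two different clusters (assumed sizes).
plan_a = partition_layers(num_layers=24, num_devices=8)
plan_b = partition_layers(num_layers=24, num_devices=5)
# Different model on the first cluster.
plan_c = partition_layers(num_layers=32, num_devices=8)

for name, plan in (("A", plan_a), ("B", plan_b), ("C", plan_c)):
    print(name, [len(block) for block in plan])
```

Changing either the number of layers or the number of devices yields a different plan, so even in this simplified picture the work done for one model and cluster pairing does not carry over to another.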

Cerebras says that the WSE-2 processor enables large-scale models to be trained on a single device. It is 56 times larger than the largest GPU, with 2.55 trillion more transistors and 100 times as many compute cores.

Last year, Cerebras said it was using the WSE-2 processor to power a new cluster of chips that could “unlock brain-scale neural networks.”

The AI company said this processor would mean that a single CS-2 could support models with hundreds of billions or even trillions of parameters. Intersect360 Research CRO Dan Olds said this could give organizations “an easy and inexpensive on-ramp to major league NLP.”

“Cerebras ushered in an exciting new era in AI by making large-scale language models cost-effective and easily accessible,” said Olds. “It will be exciting to see new applications and discoveries as CS-2 customers train their GPT-3 and GPT-J class models on massive data sets.”


Updated, written and published by ACC Fresno