Our First Series of Generative AI Models


Takeaways

We announce the first series of Liquid Foundation Models (LFMs), a new generation of generative AI models built from first principles.

Our 1B, 3B, and 40B LFMs achieve state-of-the-art performance in terms of quality at each scale, while maintaining a smaller memory footprint and more efficient inference.

Try LFMs today on Liquid Playground, Lambda (Chat UI and API), Perplexity Labs, and soon on Cerebras Inference. The LFM stack is being optimized for NVIDIA, AMD, Qualcomm, Cerebras, and Apple hardware.

We build personal, edge, and on-premise AI solutions for enterprises of any size.

We are scaling LFMs and expect to introduce new and better capabilities across various industries, such as financial services, biotechnology, and consumer electronics.

Try Liquid

At Liquid AI, we build new methods for designing powerful AI systems over which we have significant control. We design them the same way engineers built engines, cars, and airplanes: from first principles. Our mission is to create best-in-class, intelligent, and efficient systems at every scale – systems designed to process large amounts of sequential multimodal data, to enable advanced reasoning, and to achieve reliable decision-making.

Today, we introduce the first generation of Liquid Foundation Models (LFMs). LFMs are large neural networks built with computational units deeply rooted in the theory of dynamical systems, signal processing, and numerical linear algebra. This unique blend allows us to leverage decades of theoretical advances in these fields in our quest to enable intelligence at every scale. LFMs are general-purpose AI models that can be used to model any kind of sequential data, including video, audio, text, time series, and signals.

Our name “Liquid” pays homage to our roots in dynamic and adaptive learning systems.

Introducing the First Generation of Language LFMs

We are proud to release our first series of language models:

A dense 1.3B model, ideal for highly resource-constrained environments.

A dense 3.1B model, optimized for edge deployment.

A 40.3B Mixture of Experts (MoE) model, designed for tackling more complex tasks.

Architecture work cannot happen in a vacuum – our goal is to develop useful models that are competitive with the current best-in-class LLMs. In doing so, we hope to show that model performance isn’t just about scale – it’s also about innovation.

State-of-the-Art Performance

We report the results of our fine-tuned LFMs and compare them with similarly sized language models using EleutherAI’s lm-evaluation-harness v0.4. Unless specified otherwise, we compare against other fine-tuned models.
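For context, the sketch below shows how a comparison like this can be run with lm-evaluation-harness v0.4; the checkpoint name and task list are placeholders rather than the exact configuration used for these results.

```python
# Hypothetical sketch of a comparison run with EleutherAI's lm-evaluation-harness v0.4.
# The checkpoint name and task list are placeholders, not the exact setup used here.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                    # Hugging Face backend
    model_args="pretrained=org/some-3b-model",     # placeholder model identifier
    tasks=["arc_challenge", "hellaswag", "mmlu"],  # example benchmark tasks
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```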

LFM-1B achieves the highest scores across various benchmarks in the 1B category, making it the new state-of-the-art model at this size. This is the first time a non-GPT architecture significantly outperforms transformer-based models.

Benchmark comparison of LFM-1B against Stable LM 2 (Stability, 1.6B), SmolLM (Hugging Face, 1.7B), and R Gemma 2 (Google, Base 2.7B).

LFM-3B delivers incredible performance for its size. It not only ranks first among 3B-parameter transformers, hybrids, and RNN models, but also outperforms the previous generation of 7B and 13B models. It is also on par with Phi-3.5-mini on multiple benchmarks, while being 18.4% smaller. LFM-3B is the ideal choice for mobile and other edge text-based applications.

Benchmark comparison of LFM-3B against Mistral-7B v0.3 (Mistral AI, 7B) and Mistral Nemo (Mistral AI, 12.2B).

*Scores reported by the developers. All other scores were calculated with the same evaluation harness we used for our own models.
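As a quick sanity check on the size comparison above: assuming roughly 3.8B parameters for Phi-3.5-mini and 3.1B for LFM-3B, the relative reduction works out to about 18.4%.

```python
# Back-of-the-envelope check of the "18.4% smaller" figure, assuming
# ~3.8B parameters for Phi-3.5-mini and ~3.1B for LFM-3B.
phi_35_mini_params = 3.8e9
lfm_3b_params = 3.1e9

reduction = (phi_35_mini_params - lfm_3b_params) / phi_35_mini_params
print(f"LFM-3B is {reduction:.1%} smaller")  # -> 18.4% smaller
```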

LFM-40B offers a new balance between model size and output quality. It leverages 12B activated parameters at inference. Its performance is comparable to models larger than itself, while its MoE architecture enables higher throughput and deployment on more cost-effective hardware.
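The split between total and activated parameters in a sparse MoE can be illustrated with a rough accounting like the one below; the expert count, routing top-k, and shared-parameter budget are hypothetical numbers chosen only to land near the 40.3B/12B figures, not the actual LFM-40B configuration.

```python
# Illustrative parameter accounting for a sparse Mixture-of-Experts model.
# All numbers are hypothetical and chosen only to approximate a ~40B-total /
# ~12B-activated split; they are NOT the real LFM-40B configuration.
shared_params = 2.3e9          # weights every token passes through (hypothetical)
expert_params_total = 38.0e9   # weights split across all experts (hypothetical)
num_experts = 32               # experts per MoE layer (hypothetical)
active_experts = 8             # experts routed to each token (hypothetical top-k)

params_per_expert = expert_params_total / num_experts
total_params = shared_params + expert_params_total
active_params = shared_params + active_experts * params_per_expert

print(f"total:     {total_params / 1e9:.1f}B")   # ~40.3B
print(f"activated: {active_params / 1e9:.1f}B")  # ~11.8B per token
```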


LFMs are Memory-Efficient

LFMs have a reduced memory footprint compared to transformer architectures. This is particularly true for long inputs, where the KV cache in transformer-based LLMs grows linearly with sequence length. By efficiently compressing inputs, LFMs can process longer sequences on the same hardware. For example, compared to other 3B-class models, LFMs maintain a minimal memory footprint.

Fig. 2. Total inference memory footprint of different language models vs. the input+generation length.
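To make the memory argument concrete, here is a rough estimate of how a transformer’s KV cache grows with sequence length compared with a fixed-size recurrent state; the layer count, head configuration, and state size are illustrative assumptions, not measurements of LFM-3B or any specific model.

```python
# Rough memory estimate: a transformer's KV cache grows linearly with sequence
# length, while a compressed recurrent/structured state stays constant.
# Layer counts, head dimensions, and state size are illustrative assumptions.
def kv_cache_bytes(seq_len, layers=32, kv_heads=8, head_dim=128, dtype_bytes=2):
    # Keys and values (2x), stored per layer for every token in the sequence.
    return 2 * layers * kv_heads * head_dim * dtype_bytes * seq_len

def fixed_state_bytes(layers=32, state_dim=4096, dtype_bytes=2):
    # A fixed-size state whose memory does not depend on sequence length.
    return layers * state_dim * dtype_bytes

for seq_len in (2_048, 32_768):
    print(f"{seq_len:>6} tokens: "
          f"KV cache ~ {kv_cache_bytes(seq_len) / 2**30:.2f} GiB, "
          f"fixed state ~ {fixed_state_bytes() / 2**20:.2f} MiB")
```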

LFMs Truly Exploit their Context Length

In this preview release, we have optimized our models to deliver a best-in-class 32k token context length, pushing the boundaries of efficiency for our size. This was verified on the RULER benchmark, where a length is considered “effective” when its corresponding score is higher than 85.6 [Hsieh et al. 2024 – RULER]. The following table compares several models at different context lengths.

Context-length comparison includes Phi-3.5 (Microsoft, 3.8B).
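The RULER criterion is easy to apply in practice: the effective context length is the longest evaluated length whose score still exceeds 85.6. A minimal sketch with made-up scores:

```python
# Effective context length under the RULER criterion (Hsieh et al., 2024):
# the largest evaluated length whose score exceeds 85.6.
# The example scores below are made up for illustration.
THRESHOLD = 85.6

def effective_length(scores_by_length):
    passing = [length for length, score in scores_by_length.items() if score > THRESHOLD]
    return max(passing) if passing else 0

example_scores = {4_096: 92.1, 8_192: 90.3, 16_384: 88.0, 32_768: 86.2, 65_536: 71.4}
print(effective_length(example_scores))  # -> 32768
```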

This highly efficient context window enables long-context tasks on edge devices for the first time. For developers, it unlocks new applications, including document analysis and summarization, more meaningful interactions with context-aware chatbots, and improved Retrieval-Augmented Generation (RAG) performance.

Our goal is to continue scaling LFMs across model size, train/test-time compute, and context length. Beyond our language LFMs, we have designed models for various data modalities, domains, and applications that we plan to release in the coming months.

Advancing the Pareto Frontier of Large AI Models

To achieve these results, we optimized our pre- and post-training pipelines and infrastructure to ensure our models excel across five criteria.

Reimagining Model Architectures

Building on a long line of research into designing expressive and efficient learning systems, we have developed a new design space for foundation models, centered on different modalities and hardware requirements. Our goal is to explore ways to build foundation models beyond Generative Pre-trained Transformers (GPTs).

With LFMs, we put into practice new principles and methods guiding model design, developed by our team over the past months.

LFMs are composed of structured operators.

LFM architectures are controllable.

LFMs are adaptive and can serve as the substrate for AI at every scale.

Fig. 3. Our architectures feature custom computational units arranged in depth groups (targeted weight sharing), with additional featurizer interconnections (feature sharing).

Liquid’s design space is primarily defined by the featurization and the footprint of architectures and their core operators. Featurization refers to the process of converting input data (e.g., text, audio, images, video) into a structured set of features or vectors that are used to modulate computation inside the model in an adaptive manner. For example, audio and time series data generally require less featurization in operators due to their lower information density, compared to language and multi-modal data. The other key dimension is the computational complexity of the operators. Being able to traverse and cover this design space of structured adaptive operators allows us to maximize performance under controlled computational requirements.

Fig. 4. We built the foundations of a new design space for computational units, enabling customization to different modalities and hardware requirements.

At their core, LFMs are built with computational units that can be expressed as adaptive linear operators whose actions are determined by their inputs. The LFM design framework unifies and subsumes a wide range of existing computational units in deep learning, providing a systematic approach to exploring the space of architectures. Specifically, our analysis informs model building by improving three key aspects: the token-mixing structure (how the operator mixes embeddings in the input sequence), the channel-mixing structure (how it mixes channel dimensions), and featurization, which is responsible for modulating computation based on the input context.
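As an illustration of these three aspects (and not Liquid’s actual architecture), the sketch below shows a minimal block in which a featurizer derives gates from the input that modulate a token-mixing step and a channel-mixing step:

```python
import torch
import torch.nn as nn

class AdaptiveMixerBlock(nn.Module):
    """Minimal sketch of an input-adaptive linear operator.

    A featurizer maps the input to gates that modulate a token-mixing step
    (mixing along the sequence) and a channel-mixing step (mixing along the
    feature dimension). This illustrates the concepts described above and is
    not Liquid AI's actual LFM architecture.
    """

    def __init__(self, d_model: int, seq_len: int):
        super().__init__()
        self.featurizer = nn.Linear(d_model, 2 * d_model)  # produces modulation gates
        self.token_mix = nn.Linear(seq_len, seq_len)        # mixes across positions
        self.channel_mix = nn.Linear(d_model, d_model)      # mixes across channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:     # x: (batch, seq_len, d_model)
        gate_tok, gate_ch = self.featurizer(x).chunk(2, dim=-1)
        # Token mixing: operate along the sequence axis, modulated by input-derived gates.
        x = x + torch.sigmoid(gate_tok) * self.token_mix(x.transpose(1, 2)).transpose(1, 2)
        # Channel mixing: operate along the feature axis, with its own gates.
        x = x + torch.sigmoid(gate_ch) * self.channel_mix(x)
        return x

block = AdaptiveMixerBlock(d_model=64, seq_len=128)
out = block(torch.randn(2, 128, 64))
print(out.shape)  # torch.Size([2, 128, 64])
```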

Join us as an early adopter of LFMs

As we are still in the early stages of this journey, we welcome the opportunity to collaborate and to discover the strengths and weaknesses of these systems together.

What are Language LFMs good at today:

  • General and expert knowledge
  • Mathematics and logical reasoning
  • Efficient and effective long-context tasks
  • Their primary language is English, with secondary multilingual capabilities in Spanish, French, German, Chinese, Arabic, Japanese, and Korean

What are Language LFMs not good at today:

  • Zero-shot code tasks
  • Precise numerical calculations
  • Time-sensitive information
  • Counting r’s in the word “Strawberry”!
  • Human preference optimization techniques have not been applied extensively to our models yet.

At Liquid AI, we take an open-science approach. We have contributed, and will continue to contribute, to the advancement of the AI field by openly publishing our findings and methods through scientific and technical reports. As part of this commitment, we will release relevant data and models produced by our research efforts to the broader AI community. We have dedicated a lot of time and resources to developing these architectures, so we’re not open-sourcing our models at the moment. This allows us to continue building on our progress and maintain our edge in the competitive AI landscape.

If your enterprise is looking to experience the forefront of AI, we invite you to get in touch with us. If this aligns with your personal goals and ambitions, we invite you to join our team and drive this vision forward. We are very early in this journey and are actively innovating across various aspects of foundation model development and deployment. We invite enthusiastic users to share their experience as well as criticism, and to join our red-teaming efforts to improve the capabilities of our models.

Share your feedback

Liquid Product Launch Event

October 23, 2024  |  Cambridge, MA 

Come join us at MIT Kresge, Cambridge, MA on October 23rd, 2024, to learn more about Liquid as we unveil more products and progress on LFMs and their applications in consumer electronics, finance, healthcare, biotechnology, and more!

RSVP Here
