New Open-Source ‘Falcon’ AI Language Model Overtakes Meta and Google

Falcon 180B: The Powerful New Open-Source Large Language Model

The artificial intelligence (AI) community has recently welcomed Falcon 180B, an open-source large language model (LLM) that has set new standards in terms of size and performance. With an impressive 180 billion parameters trained on a vast amount of data, Falcon 180B has surpassed previous open-source LLMs on multiple fronts.

Announced by the Hugging Face AI community in a blog post, Falcon 180B is now available on the Hugging Face Hub. The model builds on the earlier Falcon series of open-source LLMs and incorporates features such as multi-query attention. It was trained on 3.5 trillion tokens, making it the longest single-epoch pretraining run for an open-source model to date.

Training ran on up to 4,096 GPUs simultaneously for a total of approximately 7 million GPU hours, with Amazon SageMaker used for training and refinement. The scale of Falcon 180B is clearest in comparison with other models. It is roughly 2.5 times the size of Meta’s LLaMA 2, which launched earlier this year with 70 billion parameters trained on 2 trillion tokens.
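
For readers who want to check the scale figures above, here is a minimal back-of-the-envelope sketch. It uses only the numbers quoted in this article. The wall-clock estimate rests on an assumption the article does not make: that all 4,096 GPUs were busy for the entire run. Treat that figure as a lower bound, not a reported training time.

```python
# Figures as quoted in the article.
FALCON_PARAMS = 180e9   # Falcon 180B parameters
LLAMA2_PARAMS = 70e9    # LLaMA 2 parameters
FALCON_TOKENS = 3.5e12  # Falcon 180B training tokens
LLAMA2_TOKENS = 2e12    # LLaMA 2 training tokens
GPU_COUNT = 4096        # GPUs used simultaneously
GPU_HOURS = 7e6         # approximate total GPU hours


def scale_summary():
    """Return size ratios and a rough wall-clock estimate for the training run."""
    param_ratio = FALCON_PARAMS / LLAMA2_PARAMS
    token_ratio = FALCON_TOKENS / LLAMA2_TOKENS
    # Assumption: every GPU is in use for the whole run, so
    # wall-clock hours = total GPU hours / number of GPUs.
    wall_clock_hours = GPU_HOURS / GPU_COUNT
    return {
        "param_ratio": param_ratio,
        "token_ratio": token_ratio,
        "wall_clock_days": wall_clock_hours / 24,
    }


if __name__ == "__main__":
    for name, value in scale_summary().items():
        print(f"{name}: {value:.2f}")
```

Under that assumption, the run would have taken a little over 71 days of continuous training. The parameter ratio works out to about 2.57, which is where the "2.5 times" figure comes from, and Falcon 180B saw 1.75 times as many training tokens as LLaMA 2.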

Falcon 180B not only surpasses LLaMA 2 but also outperforms other open models in both scale and benchmark performance across a range of natural language processing (NLP) tasks. It scores 68.74 points on the Hugging Face leaderboard for open-access models and comes close to commercial models like Google’s PaLM-2 in evaluations such as the HellaSwag benchmark.

Specifically, Falcon 180B matches or exceeds PaLM-2 Medium on commonly used benchmarks including HellaSwag, LAMBADA, WebQuestions, Winogrande, and more. It is considered to be on par with Google’s PaLM-2 Large, a strong showing for an open-source model measured against the offerings of industry giants.

While Falcon 180B is more powerful than the free version of ChatGPT, it falls slightly short of the capabilities of the paid “plus” service. In other words, the model sits somewhere between GPT-3.5 and GPT-4, depending on the evaluation benchmark. Now that it has been openly released, further fine-tuning by the community is eagerly anticipated.

The release of Falcon 180B marks another leap forward in the rapid progress of LLMs. Beyond scaling up parameter counts, techniques like LoRA, weight randomization, and Nvidia’s Perfusion have made training large AI models significantly more efficient.
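
To give a rough sense of why a technique like LoRA (low-rank adaptation) cuts fine-tuning cost, the sketch below compares the number of trainable values in a single weight matrix under full fine-tuning and under LoRA. The matrix size and rank are hypothetical values chosen for illustration. They are not Falcon 180B’s actual configuration, which this article does not describe.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable values when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable values under LoRA.

    The original matrix stays frozen; only two small matrices are trained,
    B (d_out x rank) and A (rank x d_in), whose product is added to it.
    """
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Hypothetical layer size and rank, for illustration only.
    d_out, d_in, rank = 8192, 8192, 16
    full = full_finetune_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(f"full fine-tuning: {full:,} trainable values")
    print(f"LoRA (rank {rank}): {lora:,} trainable values")
    print(f"reduction factor: {full // lora}x")
```

With these illustrative numbers, LoRA trains 256 times fewer values for that one layer. The saving grows as the matrices get larger or the chosen rank gets smaller.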

With Falcon 180B now freely available on Hugging Face, researchers expect the model to further evolve and improve as the community develops additional enhancements. The initial showcases of Falcon 180B’s advanced natural language capabilities are truly exciting and highlight the significant developments happening in the open-source AI sphere.

