Llama (language model)

Subsequent versions of Llama were made accessible outside academia and released under licenses that permitted some commercial use.

Alongside the release of Llama 3, Meta added virtual assistant features to Facebook and WhatsApp in select regions and launched a standalone website.

The release of ChatGPT and its surprise success drew increased attention to large language models.

In contrast to other industry responses to ChatGPT, Meta's chief AI scientist Yann LeCun stated that large language models are best for aiding with writing.

LLaMA was announced on February 24, 2023, via a blog post and a paper describing the model's training, architecture, and performance.

On March 20, Meta filed a DMCA takedown request for copyright infringement against a repository containing a script that downloaded LLaMA from a mirror, and GitHub complied the next day.

The accompanying preprint[26] also mentions a model with 34B parameters that might be released in the future upon satisfying safety targets.

Meta AI's testing in April 2024 showed that Llama 3 70B beat Gemini Pro 1.5 and Claude 3 Sonnet on most benchmarks.

Meta also announced plans to make Llama 3 multilingual and multimodal, better at coding and reasoning, and to increase its context window.

LLaMA 1 foundational models were trained on a data set of 1.4 trillion tokens drawn from publicly available data sources.[2]

On April 17, 2023, TogetherAI launched a project named RedPajama to reproduce and distribute an open-source version of the LLaMA dataset.

Llama 2-Chat was additionally fine-tuned on 27,540 prompt-response pairs created for this project, which performed better than larger but lower-quality third-party datasets.

For AI alignment, reinforcement learning from human feedback (RLHF) was used with a combination of 1,418,091 Meta examples and seven smaller datasets.

In a lawsuit brought by Richard Kadrey and others against Meta Platforms, CEO Mark Zuckerberg was alleged to have authorized the use of copyrighted content from Library Genesis to train Llama AI models and to have concealed the company's actions by removing copyright markers from the data.

For AI alignment, human annotators wrote prompts and then compared two model outputs in a binary protocol, recording their confidence in each comparison and assigning separate safety labels with veto power.

Two separate reward models, one for safety and one for helpfulness, were trained from these preferences and used for reinforcement learning from human feedback (RLHF).
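As a rough illustration, a reward model of this kind can be trained with a Bradley-Terry-style objective over the annotators' pairwise choices. The sketch below is a minimal PyTorch version of that idea, not Meta's code: `reward_model` stands in for any network that scores a (prompt, response) pair, and the optional `margin` is one plausible way the recorded confidence levels could be folded in.

```python
import torch.nn.functional as F

# Hypothetical sketch: training a reward model from binary preference
# pairs. `reward_model` is any module returning a scalar score per
# (prompt, response) batch element; all names here are illustrative.

def preference_loss(reward_model, prompts, chosen, rejected, margin=0.0):
    """Bradley-Terry-style loss: push the score of the preferred
    response above the rejected one. A larger `margin` can encode a
    higher annotator confidence level."""
    r_chosen = reward_model(prompts, chosen)      # shape: (batch,)
    r_rejected = reward_model(prompts, rejected)  # shape: (batch,)
    return -F.logsigmoid(r_chosen - r_rejected - margin).mean()
```

Training two such models separately, one fit to the safety labels and one to helpfulness preferences, matches the split described above; their scores then drive the RLHF policy-optimization stage.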

Multi-turn consistency was accomplished using the new "Ghost attention" technique during training, which concatenates relevant instructions to each new user message but zeros out the loss function for tokens in the prompt (earlier parts of the dialog).
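A minimal sketch of the loss-masking half of that technique, assuming a standard next-token cross-entropy setup in PyTorch (the function and variable names are hypothetical, not from Meta's training code):

```python
import torch.nn.functional as F

# Hypothetical sketch: zero out the training loss for every token that
# precedes the final assistant response, so the concatenated instruction
# and earlier dialog turns are seen by the model but not trained on.

def masked_next_token_loss(logits, token_ids, response_start):
    labels = token_ids.clone()
    labels[:, :response_start] = -100   # -100 is ignored by cross_entropy
    shift_logits = logits[:, :-1, :]    # position t predicts token t+1
    shift_labels = labels[:, 1:]
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
    )
```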

The model files were officially removed on March 21, 2023, over hosting costs and safety concerns, though the code and paper remain online for reference.

Zoom used Meta Llama 2 to create an AI Companion that can summarize meetings, provide helpful presentation tips, and assist with message responses.

The format focuses on supporting different quantization types, which can reduce memory usage and increase speed at the expense of lower model precision.
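As a simplified illustration of the idea behind these quantization types (not the actual GGUF encoding, whose block formats are more involved), blockwise 8-bit quantization stores each weight as an int8 plus one float scale per block, cutting memory roughly fourfold versus float32 at the cost of rounding error:

```python
import numpy as np

# Illustrative blockwise symmetric 8-bit quantization; the real GGUF
# quantization types differ in layout and add further variants.

def quantize_q8(weights, block_size=32):
    blocks = weights.reshape(-1, block_size)
    scales = np.maximum(np.abs(blocks).max(axis=1, keepdims=True), 1e-8) / 127.0
    q = np.round(blocks / scales).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize_q8(q, scales):
    return (q.astype(np.float32) * scales).reshape(-1)

w = np.random.randn(1024).astype(np.float32)
q, s = quantize_q8(w)
print("max abs error:", np.abs(w - dequantize_q8(q, s)).max())
```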

llamafile, created by Justine Tunney, is an open-source tool that bundles llama.cpp with a model into a single executable file.

Tunney et al. introduced new optimized matrix multiplication kernels for x86 and ARM CPUs, improving prompt evaluation performance for FP16 and 8-bit quantized data types.
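The core idea behind such kernels, computing the product in tiles so the working set stays in cache, can be sketched in a few lines (illustrative only; the actual kernels are hand-tuned with SIMD intrinsics for each data type):

```python
import numpy as np

# Illustrative loop tiling for C = A @ B; real optimized kernels unroll
# these loops and use vector instructions, but the blocking is the same.

def matmul_blocked(A, B, tile=64):
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n), dtype=np.float32)
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C

A = np.random.randn(128, 96).astype(np.float32)
B = np.random.randn(96, 64).astype(np.float32)
assert np.allclose(matmul_blocked(A, B), A @ B, atol=1e-4)
```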

Some experts contend that future models may do more to facilitate causing damage than to defend against it, for example by making it relatively easy to engineer advanced bioweapons without specialized knowledge.

Image: an AI-generated picture of a llama and a robot looking toward each other, generated by Meta AI Imagine (powered by Llama 3) from the prompt "A representation of Meta AI and Llama".