What is an LLM?
Large Language Models
Large language models are trained on large amounts of text data, typically using a technique called “unsupervised learning.” This means that the model is given no explicit labels telling it what to learn; instead, it is left to discover patterns in the data by itself.
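As a toy illustration of this self-supervised setup, the sketch below derives (input, target) pairs directly from raw text with no human labeling. The tiny corpus and the use of a simple bigram count model in place of a neural network are assumptions for illustration only; real LLMs learn from far longer contexts.

```python
from collections import Counter, defaultdict

# Toy corpus (assumed for illustration): the "training data" is just raw text.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Self-supervised objective: predict each word from the word before it.
# The (input, target) pairs come from the text itself -- no labels are given.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

# The model has discovered a pattern on its own: in this corpus,
# "cat" is the most likely word to follow "the".
print(bigram_counts["the"].most_common(1))  # [('cat', 2)]
```

The same idea scales up: an LLM's training signal is next-token prediction over raw text, so no manual annotation is needed.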
- Popular LLM architectures include:
    - GPT (Generative Pre-trained Transformer)
    - BERT (Bidirectional Encoder Representations from Transformers)
    - T5 (Text-to-Text Transfer Transformer)
- Characteristics
    - Large-scale: they have huge numbers of parameters and are trained on massive text corpora.
    - Deep learning architecture: they are trained with deep learning techniques, such as artificial neural networks, which allow them to learn complex language patterns.
    - Pre-trained and fine-tuned: a pre-trained LLM is fine-tuned to perform specific tasks.
    - Contextual: they don’t just use a word in isolation but also its surrounding context to better understand its meaning. This helps produce more human-like language.
    - Generative: they can generate new text.
    - High-quality output: the generated text can be difficult to distinguish from human-written text.
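The generative characteristic above can be sketched with the same toy bigram model: new text is produced by repeatedly predicting the next word from the previous one. The corpus and the count-based model are assumptions for illustration; a real LLM samples from a neural network's probability distribution over its whole vocabulary.

```python
from collections import Counter, defaultdict

# Build a toy bigram model from an assumed illustrative corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()
model = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    model[prev][nxt] += 1

def generate(start, length=5):
    """Generate new text by repeatedly predicting the most likely next word."""
    words = [start]
    for _ in range(length):
        candidates = model[words[-1]].most_common(1)
        if not candidates:  # dead end: this word never had a continuation
            break
        words.append(candidates[0][0])
    return " ".join(words)

print(generate("the"))
```

Greedy prediction like this quickly loops on a tiny corpus; real LLMs avoid that by sampling from much richer, context-dependent distributions.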
LLM Optimization
Optimizing Large Language Models (LLMs)
General LLM Concepts
General Intelligence
- AI is any system that exhibits behavior that can be interpreted as human-like intelligence.