What is an LLM?
Large Language Models
Large language models are trained on large amounts of text data, typically using a technique called “unsupervised learning.” This means that the model is given no explicit labels telling it what to learn; instead, it is left to discover patterns in the data by itself.
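As a toy illustration of this self-supervised setup, the sketch below derives (input, target) pairs directly from raw text with no human labeling. The tiny corpus and the use of a simple bigram count model in place of a neural network are assumptions for illustration only; real LLMs learn from far longer contexts.

```python
from collections import Counter, defaultdict

# Toy corpus (assumed for illustration): the "training data" is just raw text.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Self-supervised objective: predict each word from the word before it.
# The (input, target) pairs come from the text itself -- no labels are given.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

# The model has discovered a pattern on its own: in this corpus,
# "cat" is the most likely word to follow "the".
print(bigram_counts["the"].most_common(1))  # [('cat', 2)]
```

The same idea scales up: an LLM's training signal is next-token prediction over raw text, so no manual annotation is needed.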
- Popular LLM architectures include:
    - GPT (Generative Pre-trained Transformer)
    - BERT (Bidirectional Encoder Representations from Transformers)
    - T5 (Text-to-Text Transfer Transformer)
- Characteristics
    - Large-scale: they have huge numbers of parameters and are trained on massive text corpora.
    - Deep learning architecture: they are trained with deep learning techniques, such as artificial neural networks, which allow them to learn complex language patterns.
    - Pre-trained and fine-tuned: a pre-trained LLM is fine-tuned to perform specific tasks.
    - Contextual: they don’t just use a word in isolation but also its surrounding context to better understand its meaning. This helps produce more human-like language.
    - Generative: they can generate new text.
    - High-quality output: the generated text can be difficult to distinguish from human-written text.
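The generative characteristic above can be sketched with the same toy bigram model: new text is produced by repeatedly predicting the next word from the previous one. The corpus and the count-based model are assumptions for illustration; a real LLM samples from a neural network's probability distribution over its whole vocabulary.

```python
from collections import Counter, defaultdict

# Build a toy bigram model from an assumed illustrative corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()
model = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    model[prev][nxt] += 1

def generate(start, length=5):
    """Generate new text by repeatedly predicting the most likely next word."""
    words = [start]
    for _ in range(length):
        candidates = model[words[-1]].most_common(1)
        if not candidates:  # dead end: this word never had a continuation
            break
        words.append(candidates[0][0])
    return " ".join(words)

print(generate("the"))
```

Greedy prediction like this quickly loops on a tiny corpus; real LLMs avoid that by sampling from much richer, context-dependent distributions.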
LLM Optimization
Optimizing Large Language Models (LLMs)
General LLM Concepts
General Intelligence
- AI is any system that exhibits behavior that can be interpreted as human-like intelligence.