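- A large language model (LLM) is essentially just two files:
  - A parameters file, which holds the weights of the neural network.
  - A run file, which contains the code that runs those parameters.
    - The run file can be written in any programming language, but C is often used because of its simplicity.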
- The magic is in the parameters.
- The parameters are basically a compression of the internet: training starts from a large amount of text collected from the internet.
- Training requires a GPU cluster built for heavy workloads, running for several days, to compress that data into the parameters, which form a comparatively very small file.
- This compression is lossy: the parameters do not hold an exact copy of the training text.
- The neural network tries to predict the next word in a sequence. This prediction task forces the network to learn a lot about the world.
- When generating text, the network "dreams" content that resembles what it was trained on.
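
The two-file idea and next-word prediction described above can be sketched with a toy model. This is only an illustration under loud assumptions: it uses simple word-pair counts instead of a neural network, and the names `train`, `predict_next`, `dream`, and `corpus` are hypothetical, not taken from any real LLM codebase. Here `params` plays the role of the parameters file and the functions play the role of the run file.

```python
import random
from collections import Counter, defaultdict


def train(text):
    """'Compress' the text into parameters: counts of which word follows which.

    Assumption: a bigram count table stands in for neural network weights.
    """
    params = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        params[current][following] += 1
    return params


def predict_next(params, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = params.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def dream(params, start, length, seed=0):
    """Generate up to `length` words by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = params.get(out[-1])
        if not followers:
            break  # nothing was ever seen after this word
        candidates = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(out)


# Hypothetical tiny "internet" used as training data.
corpus = "the cat sat on the mat . the cat ate the fish ."
params = train(corpus)

print(predict_next(params, "the"))  # → cat
print(dream(params, "the", 8))
```

The count table is lossy in the same sense as the notes describe: it keeps which words tend to follow which, but the original sentences cannot be recovered from it exactly. A real LLM replaces the count table with a neural network that has far more parameters.
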
Why are LLMs revolutionary?
- Enable things that previously seemed impossible.
- Generate text beyond average human writing ability.
- Demonstrate human-like patterns of complex reasoning and understanding.
- Versatile across many language tasks at once.
- Unprecedented crossover with society and other fields.