Introduction to Tokens
In the realm of language models like ChatGPT, the term 'token' refers to the smallest unit of text that the model can understand and process. A token can be as short as a single character, like 'a', or as long as a whole word, like 'apple'; longer or less common words are often split into several sub-word tokens.
Think of tokens as the building blocks of a conversation. Just like a Lego structure is made up of individual Lego bricks, a conversation or any piece of text is made up of individual tokens.
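To see this in practice, here is a minimal sketch of splitting a sentence into tokens. It assumes the open-source tiktoken library and its built-in "cl100k_base" encoding; the exact splits shown in the comments are illustrative, since they vary from tokenizer to tokenizer.

```python
# A minimal sketch of tokenization, assuming the open-source
# `tiktoken` library (pip install tiktoken).
import tiktoken

# "cl100k_base" is one of tiktoken's built-in encodings.
enc = tiktoken.get_encoding("cl100k_base")

text = "I have an apple."
token_ids = enc.encode(text)  # a list of integer token ids

# Decode each id individually to see how the text was split.
pieces = [enc.decode([tid]) for tid in token_ids]
print(token_ids)  # a short list of integer ids
print(pieces)     # e.g. ['I', ' have', ' an', ' apple', '.']
```

Notice that common words typically map to a single token each, often with the leading space attached.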
How LLMs Use Tokens
Large Language Models (LLMs) like ChatGPT read and generate text one token at a time. When the model is given a prompt, it reads the prompt token by token and generates its response the same way.
Additionally, LLMs have a maximum number of tokens they can process at once, known as the context window. For instance, the original GPT-3 has a context window of 2,048 tokens.
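Here is a sketch of how an application might guard against that limit before sending a prompt, again assuming tiktoken. The "gpt2" encoding is an assumption, chosen because GPT-3 inherited GPT-2's byte-pair tokenizer; the function and its reserve parameter are illustrative, not part of any official API.

```python
# A sketch of enforcing a 2048-token context limit, assuming tiktoken.
import tiktoken

MAX_TOKENS = 2048  # the limit covers prompt and response together

# GPT-3 used a GPT-2-style byte-pair encoding (an assumption for this sketch).
enc = tiktoken.get_encoding("gpt2")

def fits_in_context(prompt: str, reserved_for_response: int = 256) -> bool:
    """Check that the prompt leaves enough room for the model's response."""
    prompt_tokens = len(enc.encode(prompt))
    return prompt_tokens + reserved_for_response <= MAX_TOKENS
```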
Understanding Token Count
The concept of tokens is critical because everything that an LLM like ChatGPT does, from understanding a prompt to generating a response, is influenced by the token count. The total number of tokens in a conversation, including both the prompt and the AI-generated response, should not exceed the model's token limit.
To provide a sense of scale, GPT-3's 2,048-token context window corresponds to roughly 1,500 words, or about 3-4 pages of a typical book. In comparison, a novel like "The Great Gatsby" runs to around 50,000 words, which works out to well over 60,000 tokens, far beyond the model's context window.
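The gap between word counts and token counts can be checked directly. The snippet below is a rough illustration, again assuming tiktoken; English prose tends to average around 1.3 tokens per word, which is why a 50,000-word novel needs well over 50,000 tokens.

```python
# Comparing word count to token count for a short passage,
# assuming tiktoken and the "cl100k_base" encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# The opening line of "The Great Gatsby".
passage = (
    "In my younger and more vulnerable years my father gave me some "
    "advice that I've been turning over in my mind ever since."
)

words = len(passage.split())
tokens = len(enc.encode(passage))
print(f"{words} words -> {tokens} tokens "
      f"(about {tokens / words:.2f} tokens per word)")
```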
Difference Between Prompt and Sample Tokens
Prompt tokens are those that the user inputs to communicate with the AI. Sample tokens, on the other hand, are the tokens that the AI generates as its response.
For instance, if you ask the AI "What is the weather today?", the tokens in your question are the prompt tokens. The AI's response, such as "I'm an AI and don't have real-time access to current events, including weather," would be the sample tokens.
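Both counts draw from the same budget: prompt tokens plus sample tokens must fit inside the context window. Here is an illustrative sketch of that bookkeeping, assuming tiktoken and the 2,048-token limit from earlier (the function name is hypothetical):

```python
# Budgeting prompt tokens versus sample tokens, assuming tiktoken.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_LIMIT = 2048  # prompt tokens + sample tokens must fit here

def max_sample_tokens(prompt: str) -> int:
    """Tokens left over for the model's response once the prompt is counted."""
    return CONTEXT_LIMIT - len(enc.encode(prompt))

prompt = "What is the weather today?"
print(f"Prompt tokens: {len(enc.encode(prompt))}")
print(f"Room left for sample tokens: {max_sample_tokens(prompt)}")
```

The OpenAI API reports these same quantities back to you as prompt_tokens and completion_tokens in each response's usage field.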
Key Points to Remember
Tokens are the smallest units of text that an AI model can process.
Language models read and generate text one token at a time.
The total number of tokens in a conversation, including both the prompt and the AI-generated response, should not exceed the model's token limit.
Prompt tokens are what the user inputs, and sample tokens are what the AI generates as a response.