The currency of AI: (Almost) everything you need to know about tokens

Written by Gregor Blaj, June 2026

Ever wondered what exactly AI tokens are and why they are a hot topic of conversation? Essentially, AI tokens are the building blocks used by AI language models like ChatGPT, Claude, or Copilot to read, understand, and create text. They are also the basic units of AI trade; in other words, they are how AI usage is measured and ultimately billed, particularly for API access where AI is integrated into other applications.

This is why token efficiency matters: more tokens consumed means a higher invoice for AI.

A simple way to understand the basic principle is to think of tokens the way your brain breaks a sentence into manageable pieces when reading a report or email. The AI chops everything in those sentences into small units called tokens.

The tokens aren't always whole words. A token could be:

A full short word (like "the" or "hello").
Part of a longer word (e.g., "dark" + "ness" for "darkness").
Punctuation, spaces, or even single letters in some cases.

A handy rule of thumb for English text: 1 token is around 4 characters, so 100 tokens is about 75 words, and a typical paragraph might be around 100 tokens. (This varies by language and the specific AI system.)
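
The rule of thumb above can be turned into a quick estimator. This is a minimal sketch in Python: the function names are made up for illustration, and the ratios are the approximations quoted above, so treat the results as ballpark figures rather than what a provider would actually bill.

```python
# Rough token estimates based on the rule of thumb above.
# These ratios are approximations for English text, not exact counts.
CHARS_PER_TOKEN = 4          # about 4 characters per token
WORDS_PER_100_TOKENS = 75    # about 75 words per 100 tokens


def estimate_tokens_from_chars(text: str) -> int:
    """Estimate token count from character length, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def estimate_tokens_from_words(word_count: int) -> int:
    """Estimate token count from a word count."""
    return round(word_count * 100 / WORDS_PER_100_TOKENS)


if __name__ == "__main__":
    sample = "Tokens are the building blocks of AI language models."
    print(estimate_tokens_from_chars(sample))
    print(estimate_tokens_from_words(750))  # a 750-word report
```

For real budgeting, the token-counting tools offered by each provider give exact figures for their own models.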

When you ask a question or paste a document into an AI system:

The system breaks the input into tokens (input tokens).
The AI "thinks" using those tokens and generates a reply as output tokens.
It converts those output tokens back into readable words.
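
The first and last steps above (text to tokens, tokens back to text) can be illustrated with a toy tokeniser. This is only a sketch under simplifying assumptions: the tiny vocabulary is invented for the example, and the "longest match first" rule is a stand-in for the learned vocabularies that real AI systems use.

```python
import string

# Toy vocabulary, invented for this example. Real models use vocabularies
# of many thousands of entries learned from data.
VOCAB = ["dark", "ness", "the", "hello", " ", ".", "!"] + list(string.ascii_lowercase)
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def encode(text: str) -> list[int]:
    """Split text into token IDs, taking the longest matching entry each time."""
    ids = []
    pos = 0
    while pos < len(text):
        match = max(
            (tok for tok in VOCAB if text.startswith(tok, pos)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no token for {text[pos]!r}")
        ids.append(TOKEN_TO_ID[match])
        pos += len(match)
    return ids


def decode(ids: list[int]) -> str:
    """Turn token IDs back into readable text."""
    return "".join(VOCAB[i] for i in ids)


if __name__ == "__main__":
    ids = encode("hello darkness.")
    print([VOCAB[i] for i in ids])  # the pieces the text was split into
    print(decode(ids))              # the original text, reassembled
```

Here "darkness" is split into "dark" and "ness", while a word that is not in the vocabulary (such as "cat") falls back to single letters, which is why unusual words tend to use more tokens.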

The token system is also how AI learned language in the first place. During training, it processed vast numbers of tokens (billions or even trillions for modern models) and figured out the patterns between them. Always bear in mind that every AI model is, in effect, a pattern-recognition machine; it does not ‘think’ like we do.

Why tokens matter for your business

Tokens are the meter that decides the bill, much like kilowatt-hours on an electricity invoice or gigabytes on a data plan. Most AI services charge by tokens used, both those sent in and what the AI sends back.

This quickly becomes significant because AI usage typically scales fast. A customer-service chatbot handling thousands of chats per day, a legal team summarising contracts, a marketing team generating reports, or even your employees experimenting with AI: all of these rack up token usage.

Without an understanding of tokens, surprise costs can emerge. Some companies have reported (unanticipated) six-figure bills from heavy AI use, with the cost coming from two main sources:

Input tokens: Everything fed into the AI, including prompts, previous conversation history, uploaded documents, and so on. These are usually cheaper.
Output tokens: The AI’s reply. These cost more because the AI generates its reply one token at a time, which takes more processing power.

Prices are quoted per million tokens and vary by model. More advanced models cost more, even within the same provider (e.g. Anthropic’s Opus vs Haiku). Typical pricing is a few dollars per million input tokens, and up to $20–$30 per million output tokens.
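
To see how per-million pricing turns into a monthly bill, here is a short worked example. The prices and the chatbot workload are illustrative assumptions, not any provider's actual rates, and the function name is made up for this sketch.

```python
# Illustrative prices only (assumptions, not any provider's actual rates).
INPUT_PRICE_PER_M = 3.00    # dollars per million input tokens
OUTPUT_PRICE_PER_M = 15.00  # dollars per million output tokens


def monthly_cost(requests_per_day: int, input_tokens: int, output_tokens: int,
                 days: int = 30) -> float:
    """Estimated monthly bill in dollars for a steady daily workload."""
    total_requests = requests_per_day * days
    total_input = total_requests * input_tokens
    total_output = total_requests * output_tokens
    return (total_input * INPUT_PRICE_PER_M
            + total_output * OUTPUT_PRICE_PER_M) / 1_000_000


if __name__ == "__main__":
    # Hypothetical chatbot: 5,000 chats a day, 2,000 tokens in and 500 out per chat.
    print(f"${monthly_cost(5_000, 2_000, 500):,.2f}")
```

With these assumed numbers, the output side makes up more than half of the bill even though each chat sends four times as many tokens in as it gets back.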

Token management explained

Can tokens be managed? Absolutely. Should tokens be managed? Also absolutely; a sudden ten-thousand-dollar bill will raise eyebrows and potentially have other consequences.

There isn’t unlimited free use, particularly with API-connected AI, so smart practices are necessary to keep costs predictable and low.

This should be approached like any other expense: measure, track, optimise, and set budgets.

There are practical ways to control token spend:

Educate your people: Have them read this blog.
Prompt engineering matters: Short, clear prompts use fewer tokens. Instead of a rambling 500-word instruction, say exactly what you need. This dramatically cuts input tokens.
Summarise first: Rather than pasting a 50-page report every time, ask the AI to summarise, then work with the shorter version.
Limit AI’s output: Ask for precisely what you want, for example “Answer in 3 bullet points” or “Keep under 200 words.” This controls output tokens.
Reuse context (caching): Mark repeated info (like company policies) so the AI doesn’t re-process it at full price every time. Providers bill cached input tokens at a discount, which can reduce the cost of that repeated input by 75–90%.
Model selection: Use cheaper, fast models for simple tasks (e.g., “fix spelling”) and save the expensive models for tougher jobs.
Monitor and set limits: Most platforms give dashboards showing token use per team or project. Set alerts or daily caps.
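
The "monitor and set limits" practice in the list above can be sketched as a simple daily budget check. This is a hypothetical helper, not a feature of any particular platform: the class name, the 80% alert level, and the numbers are all assumptions, and in practice the usage figures would come from your provider's API responses or dashboard.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    daily_cap: int                 # maximum tokens allowed per day
    alert_fraction: float = 0.8    # warn once this share of the cap is used
    used: int = 0                  # tokens consumed so far today

    def record(self, input_tokens: int, output_tokens: int) -> str:
        """Add one request's usage and return 'ok', 'alert', or 'blocked'."""
        requested = input_tokens + output_tokens
        if self.used + requested > self.daily_cap:
            return "blocked"       # would exceed the cap, so it is not counted
        self.used += requested
        if self.used >= self.daily_cap * self.alert_fraction:
            return "alert"
        return "ok"


if __name__ == "__main__":
    budget = TokenBudget(daily_cap=1_000)
    print(budget.record(300, 200))   # well under the cap
    print(budget.record(250, 100))   # crosses the alert level
    print(budget.record(100, 100))   # would go over the cap
```

The same idea extends to a budget per team or per project, which makes it easier to see where the tokens are going.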

Businesses that manage these factors well can cut AI costs by 30 to 70% while getting the same results.
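
As one example of where such savings come from, the arithmetic below compares the input cost of a workload with and without a caching discount. It is a simplified sketch: the price, the workload, and the 90% discount are assumptions chosen for illustration, and it ignores details such as any charge for first storing the content or caches expiring.

```python
def input_cost(requests: int, repeated_tokens: int, fresh_tokens: int,
               price_per_m: float, cache_discount: float = 0.0) -> float:
    """Input-token cost in dollars; repeated tokens are billed at a discount."""
    repeated_price = price_per_m * (1 - cache_discount)
    per_request = repeated_tokens * repeated_price + fresh_tokens * price_per_m
    return requests * per_request / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 10,000-token policy document sent with every
    # request, plus a 200-token question, 1,000 times, at $3 per million tokens.
    without_cache = input_cost(1_000, 10_000, 200, 3.00)
    with_cache = input_cost(1_000, 10_000, 200, 3.00, cache_discount=0.9)
    print(f"without caching: ${without_cache:.2f}")
    print(f"with caching:    ${with_cache:.2f}")
```

Under these assumptions the input cost drops from roughly $30.60 to roughly $3.60, because almost all of each request is the repeated document.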

AI companies handle tokens differently

This is the important bit, because your costs very much depend on which brand of AI is in use.

Almost every major AI company offering API access uses tokens for billing because it aligns charges with work done.

OpenAI (GPT models like GPT-4o or newer): Token-based pricing on the API, with charges for both input and output tokens. Caching discounts and tools to count tokens in advance are available, as are subscription plans.
Anthropic (Claude models): Token-based pricing on the API. Anthropic offers strong prompt caching and clear pricing tiers (e.g., the cheaper “Haiku” model vs. the more powerful “Opus”). Subscription plans are also available.
Microsoft Copilot: Often bundled into Microsoft 365 subscriptions rather than pure per-token billing. For everyday business users (Copilot in Word, Excel, Teams, etc.), the flat per-user monthly fee covers usage, with Microsoft handling token tracking behind the scenes. For custom agents or advanced features in Copilot Studio, a credits system applies (e.g., certain tools cost credits per 1,000 tokens processed). GitHub Copilot (for coding) has shifted toward usage-based AI credits that map directly to token consumption. Broadly, Copilot makes costs predictable and easy for office teams, but heavy custom use can still require monitoring credits or extra packs.
Google Gemini: Token-based on the API, like OpenAI and Anthropic, with rates varying by model and context length. Subscription plans are also available.

Complex tasks can quickly burn through an unexpected volume of tokens, causing either a sudden spike in bills or a usage allowance running out ahead of schedule. This happens when an agent is given an instruction that kicks off multiple ‘behind the scenes’ actions; for example, ‘find the bug and fix it’ might result in 50 hidden API calls.
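
A rough calculation shows why those hidden calls add up. The sketch below assumes each call re-sends the whole conversation so far, and that the conversation grows by a fixed amount per call; both the starting size and the growth figure are hypothetical, and real agents vary in how they manage their context.

```python
def agent_input_tokens(calls: int, base_context: int, growth_per_call: int) -> int:
    """Total input tokens when every call re-sends a context that grows each step."""
    return sum(base_context + i * growth_per_call for i in range(calls))


if __name__ == "__main__":
    # Hypothetical task: 50 calls, starting from a 2,000-token context that
    # grows by 1,000 tokens (tool results and replies) after each call.
    print(f"{agent_input_tokens(50, 2_000, 1_000):,} input tokens")
```

Under these assumptions, a single instruction consumes more than a million input tokens, most of it the same history being sent again and again.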

This has resulted in two distinct billing models.

1. Dynamic Throttling

With subscriptions in Claude Code, Cursor, and Antigravity CLI, a flat rate is paid with a limit tied to a rolling compute pool. Different tasks consume the limit at different rates, with the net effect that while the price is fixed, token availability is highly dynamic (and therefore unpredictable).

2. Dynamic Pricing

This is a true pay-as-you-go model. Bypassing consumer subscriptions and connecting Anthropic, OpenAI, or Google API keys straight into a CLI or IDE means paying per input and output token. This suits heavy power users, as it avoids throttling, which can interfere with delivery schedules.
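
One way to compare the two models is to work out the monthly token volume at which pay-as-you-go would cost the same as a flat fee. The sketch below uses made-up numbers: the $100 fee and the $5 "blended" price per million tokens (an average across input and output) are assumptions, not real plan prices.

```python
def break_even_tokens(monthly_fee: float, blended_price_per_m: float) -> float:
    """Monthly token volume at which pay-as-you-go costs the same as a flat fee."""
    return monthly_fee / blended_price_per_m * 1_000_000


if __name__ == "__main__":
    # Hypothetical figures: a $100 monthly subscription versus pay-as-you-go
    # at an average of $5 per million tokens.
    print(f"{break_even_tokens(100, 5):,.0f} tokens per month")
```

Below that volume, pay-as-you-go is the cheaper option on these assumptions; above it, the flat fee costs less, provided its limits do not throttle the work.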

The bottom line: Know tokens and the implications

Tokens are the currency of AI. Understanding them sets the scene for managing their use, so explain how tokens work to your teams. It’s not that different from, for example, metered data usage. Using AI wisely contributes to keeping it affordable.

Mastering token management is also part of turning AI from expensive experimentation into a cost-effective competitive advantage.

As always, feel free to get in touch to discuss further.

About Gregor Blaj

Gregor Blaj is the Technical Director at Lancom Technology with expertise spanning systems engineering, project management and customer relationship management. Gregor is also an AWS Certified Solutions Architect Professional and an Azure Solutions Architect Expert.
