Estimating Token Usage
When planning a Brim project, one of the most common questions is:
“How many tokens will this project use?”
The short answer is: token usage is difficult to predict precisely.
However, you can create a practical estimate to support planning, budgeting, and optimization decisions.
Why Token Usage Is Difficult to Predict
Token usage in Brim depends on several interacting factors across your abstraction workflow, including:
- Document length (how much text exists per patient)
- Number of documents per patient
- Total number of patients
- Number of variables defined
- Variable structure (including dependent variables and hierarchies)
- Iteration cycles during optimization and validation
Because Brim supports iterative, human-in-the-loop abstraction, token usage naturally evolves as projects mature.
A Conservative Estimation Formula
To help teams plan early, Brim commonly uses the following conservative estimate:
Total Token Usage ≈ total_patients × characters_per_patient × number_of_variables_defined
What each component means
Total Patients
The number of patients included in your project.
Characters per Patient
The approximate total text volume in characters across all uploaded documents for a single patient.
Number of Variables Defined
The total number of variables generated during abstraction.
✅ The conservative estimate overestimates usage, in part because it counts every character as a token even though a token typically spans several characters of English text. For a recent group of 75 real-world projects:
- All projects had usage equal to or less than the conservative estimate.
- The conservative estimate overestimated usage by an average of 20x relative to real token usage.
- Some projects were overestimated by as much as 200x.
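The formula above is easy to sketch as a small helper. This is illustrative only, not a Brim API; the function name and signature are ours.

```python
def conservative_token_estimate(total_patients: int,
                                chars_per_patient: int,
                                variables_defined: int) -> int:
    """Upper-bound token estimate.

    Treats every character as a full token and assumes every variable
    reads the complete per-patient text, which is why real usage is
    typically far lower than this figure.
    """
    return total_patients * chars_per_patient * variables_defined


# 500 patients, ~25,000 characters per patient, 40 variables
print(conservative_token_estimate(500, 25_000, 40))  # 500000000
```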
Example Conservative Estimate
If your project includes:
- 500 patients
- ~25,000 characters per patient
- 40 defined variables
Your conservative estimate would be:
500 × 25,000 × 40 = 500,000,000 tokens
At $0.40 per million tokens, the 500,000,000-token estimate works out to $200 in compute, so a conservative budget for this project would be under $300.
This provides a useful upper-bound planning estimate before running generation.
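Carrying the same example through to cost is simple arithmetic. The variable names below are illustrative, and the rate is the example $0.40-per-million figure used above.

```python
# Conservative estimate: 500 patients x 25,000 characters x 40 variables
tokens = 500 * 25_000 * 40               # 500,000,000 tokens

price_per_million_usd = 0.40             # example rate from the text above
cost_usd = tokens / 1_000_000 * price_per_million_usd

print(f"${cost_usd:.2f}")                # $200.00
```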
Factors That Increase Token Usage
You may see higher token usage when:
- Variables reference large portions of source documents
- Projects contain long clinical narratives
- Variables are regenerated multiple times during iteration
How to Reduce Token Usage
You can manage and optimize token consumption in Brim by:
- Designing clear, focused variables
- Using structured variable hierarchies
- Iterating intentionally rather than regenerating broadly
- Reviewing Variable Scorecard feedback before regeneration
Important Note
Token estimation should be treated as a planning tool, not a precise forecast.
The most accurate understanding of usage comes from:
- Running small pilot generations
- Monitoring token usage during early iterations
- Refining variables before generation
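One way to turn a pilot run into a project-level forecast is simple linear scaling: divide observed pilot usage by the number of pilot patients, then multiply by the full cohort size. The function and sample numbers below are hypothetical, and the linearity assumption only holds when per-patient text volume is roughly uniform across the cohort.

```python
def extrapolate_from_pilot(pilot_tokens: int,
                           pilot_patients: int,
                           total_patients: int) -> int:
    """Scale observed pilot token usage linearly to the full cohort.

    Assumes the pilot patients are representative of the whole project
    in document volume and variable behavior.
    """
    return round(pilot_tokens / pilot_patients * total_patients)


# Hypothetical pilot: 20 patients consumed 2,000,000 tokens
print(extrapolate_from_pilot(2_000_000, 20, 500))  # 50000000
```

Comparing this pilot-based figure against the conservative estimate gives a much tighter planning range before scaling up generation.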
Recommended Workflow
We recommend the following approach:
- Estimate token usage conservatively using the formula above.
- Set a per-project Token Limit based on that estimate and your project's budget.
- Run a small pilot cohort.
- Review token usage results frequently.
- Optimize variables and hierarchy structure.
- Scale generation gradually.
If you’d like help estimating token usage for a specific project, contact support@brimanalytics.com.