Customizing LLM Usage by Project
This guide walks Brim Admins through adding, managing, and assigning Large Language Models (LLMs) in their Brim instance. Proper configuration allows you to optimize performance, control costs, and tailor AI behavior to different projects and workflows.
Overview
Brim lets you:
- Configure multiple LLM models at the environment level
- Choose different models per project
- Set token usage limits per project to control cost and performance
There are two main areas to configure LLMs:
- Admin → LLM Config: Manage available models and environment defaults
- Admin → Projects: Assign models and token limits to individual projects
Managing Available LLM Models
Navigate to:
Admin → LLM Config
At the top of the page, you’ll see the Available LLM Models table.
Here you can:
- View all models available in your Brim environment
- Add new models
- Edit existing models
- Delete models that are no longer needed
These models become selectable options for projects, variables, and dependent variables.
Viewing Existing Models
The Available LLM Models table shows:
| Column | Description |
|---|---|
| Display Name | Human-readable name shown in dropdowns |
| Model ID | Provider-specific identifier used in API calls |
| Provider | LLM provider (e.g., azure-openai) |
| Context Window | Maximum total input tokens supported |
| Max Output | Maximum tokens the model can generate |
| Active | Whether the model can currently be selected |
| Actions | Edit or delete the model |
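Context Window and Max Output are measured in tokens, not characters. To get a feel for how much text fits in a given window, you can count tokens locally. The sketch below uses the tiktoken library, which covers OpenAI-family models; other providers ship their own tokenizers, so treat it as an approximation for them.

```python
# pip install tiktoken
import tiktoken

# o200k_base is the encoding used by GPT-4o-family models;
# pick the encoding that matches your model family.
enc = tiktoken.get_encoding("o200k_base")

prompt = "Summarize the patient's discharge note in two sentences."
n_tokens = len(enc.encode(prompt))

# This count, plus any context Brim adds, must fit inside the
# model's context window; the reply must fit within Max Output.
print(f"{n_tokens} tokens")
```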
Adding a New Model
To add a new model:
First, set up the model and make it available with your LLM provider. Once you're ready to add it to Brim, go to Admin → LLM Config and click the Add New Model button at the top of the screen.
You’ll see the Add LLM Model dialog, which will ask you to fill in:
- Model ID. The exact identifier used by your LLM provider. Examples: gpt-4o-mini, gpt-4.1-mini, global.anthropic.claude-haiku-4-5-20251001-v1:0
- Display Name. A human-friendly name shown in dropdowns in Brim. Examples: GPT-4o Mini, Claude Haiku 4.5 (AWS).
- Provider. The configured LLM provider type (typically auto-filled from your environment). Example: azure-openai.
- Context Window (tokens). Maximum total tokens (prompt + context) the model supports in its context window.
- We recommend using the exact context window listed by your LLM provider for this model.
- You can enter a smaller value to be safe, but a larger value risks errors from Brim overflowing the model's context window.
- Max Output Tokens. Maximum number of tokens the model can generate in a single response.
- We recommend using the exact max output tokens listed by your LLM provider for this model.
- You can enter a smaller value to be safe, but a larger value risks errors if Brim asks the model to generate more tokens than it allows.
- Model is active and available for selection. Enables this model to be selected at the project level.
Click Save to add the model.
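Before or right after adding a model, it can save debugging time to confirm that the Model ID actually responds at your provider. The sketch below assumes an Azure OpenAI deployment and uses the official openai Python SDK; the endpoint URL and environment variable name are placeholders (the URL is the example from this guide), not values Brim reads.

```python
# pip install openai
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],          # placeholder variable name
    api_version="2024-06-01",
    azure_endpoint="https://some-org-prefix.azure.com",  # example URL from this guide
)

# For Azure OpenAI, "model" is the deployment name -- it must match
# the Model ID you enter in Brim exactly.
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=5,
)
print(resp.choices[0].message.content)
```

If this call fails, fix the deployment before adding the model to Brim; if the Model ID in Brim drifts from the deployment name, generations will error out.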
Editing a Model
To update a model’s configuration:
- Click Edit next to the model
- Modify any fields
- Click Save
Note that if you edit a model so that it no longer matches your LLM provider's configuration (e.g., the Model ID), generations may fail with an error. Changes apply immediately to future generations with this model.
Deleting a Model
To remove a model:
- Click Delete next to the model
- Confirm the deletion
⚠️ If a model is currently assigned to any projects, update those projects to use a different model before deleting it.
Environment-Level Defaults
At the bottom of Admin → LLM Config, you’ll find a table listing the environment defaults for your instance.
| Setting | Description | Example |
|---|---|---|
| LLM Type | The LLM provider set up during deployment | azure-openai |
| LLM Model | The unique ID of the default model set up during deployment | gpt-4o-mini |
| LLM Advanced Model | The unique ID of the model set up for advanced reasoning during deployment | gpt-4.1-mini |
| LLM API URL | The URL used to communicate with the LLM provider via API | https://some-org-prefix.azure.com |
| LLM Embedding Type | The provider of the model used for embedding in Brim, which lets Brim search for relevant text in notes in a context-aware way | azure-openai |
| LLM Embedding Model | The unique ID of the model used for embedding in Brim | text-embedding-3-small |
| LLM Embedding URL | The URL used to communicate with the LLM provider for embedding via API | https://some-org-prefix.azure.com |
| Azure OpenAI API Key Set | Whether the deployment was set up with an Azure OpenAI API key | Yes |
| OpenAI API Key Set | Whether the deployment was set up with an OpenAI API key | Yes |
| Replicate API Token Set | Whether the deployment was set up with a Replicate API token | No |
These values act as fallback defaults and determine the baseline behavior of your Brim environment.
Most users will not need to change these unless setting up new infrastructure or providers. These defaults can only be changed by updating the deployment's environment configuration.
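If you do change the embedding settings during deployment, you can verify them with the same kind of connectivity check. The sketch below assumes an Azure OpenAI embedding deployment named text-embedding-3-small, matching the example in the table above; it is a manual check, not something Brim runs for you.

```python
# pip install openai
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_API_KEY"],          # placeholder variable name
    api_version="2024-06-01",
    azure_endpoint="https://some-org-prefix.azure.com",  # example URL from this guide
)

resp = client.embeddings.create(
    model="text-embedding-3-small",  # must match the deployed embedding model ID
    input="sanity-check sentence",
)
print(len(resp.data[0].embedding))   # text-embedding-3-small returns 1536 dimensions
```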
Assigning Models Per Project
You can set the LLM Model and Advanced LLM Model on a per-project basis. You can also set token limits per project.
This allows you to:
- Use faster, lower-cost models for routine work
- Reserve advanced models for complex abstraction
- Control costs and latency per project
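As a rough illustration of the cost side of that tradeoff, a back-of-the-envelope estimate can guide which model a project should default to. The model names and per-token prices below are made-up placeholders; substitute your provider's actual rates.

```python
# Hypothetical prices in dollars per 1M tokens -- replace with real rates.
PRICES = {
    "fast-model":     {"input": 0.15, "output": 0.60},
    "advanced-model": {"input": 3.00, "output": 12.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for a single generation."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A project doing 10,000 generations at ~2,000 input / 500 output tokens each:
for model in PRICES:
    total = 10_000 * estimate_cost(model, 2_000, 500)
    print(f"{model}: ~${total:,.2f}")  # fast-model: ~$6.00, advanced-model: ~$120.00
```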
To customize LLM setup by project:
- Navigate to: Admin → Projects
- Select a project to configure its AI settings.
For each project, you can configure:
- LLM Model. Default model used for most generations
- Advanced LLM Model Option. Higher-capability model used for complex tasks
- Maximum Tokens per Generation. Hard cap on the number of tokens used per project.
- The default is 1 billion tokens, but you can configure it per project.
- Leaving this field blank gives the project no limit.
- If a project hits its token limit, all further token usage is blocked until the limit is increased.
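To make the limit's behavior concrete, here is a minimal sketch of how a cumulative per-project cap behaves. It is illustrative only, not Brim's actual implementation.

```python
class ProjectTokenBudget:
    """Illustrative cumulative token cap (not Brim's actual code)."""

    def __init__(self, limit: int | None):
        self.limit = limit  # None = no limit (blank field in Brim)
        self.used = 0

    def try_spend(self, tokens: int) -> bool:
        if self.limit is not None and self.used + tokens > self.limit:
            return False    # generation blocked until the limit is raised
        self.used += tokens
        return True

budget = ProjectTokenBudget(limit=1_000_000_000)  # the 1-billion-token default
print(budget.try_spend(2_500))  # True: well under the cap
```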
Best Practices
- Add multiple models to allow experimentation and cost control
- Use advanced models sparingly, only for the hardest abstraction tasks
- Set conservative token limits to prevent runaway usage
- Name models clearly so project owners understand performance vs cost tradeoffs