Data Aggregation in Brim

To support evidence-based chart abstraction at scale, Brim starts with the smallest pieces of evidence and aggregates them upward into higher-level, logic-based variables. Understanding how this works can help you design variables that take advantage of the process.

A generation consists of several ordered steps:

Generation Step	What happens for a variable	Notes
Prepare Generation	Brim creates generation tasks and adds them to the task queue.	If large tasks are already running and few GPUs are available, the new generation tasks may have to wait.
Variable Generation	Brim creates a list of evidence snippets that might be pertinent for each variable. Then Brim uses the variable instructions to assign labels to any relevant evidence snippets and discard the rest.	If the variable has a scope of "Many per note", it is done generating after this step.
Variable Aggregation	Brim takes the list of "Many per note" labels and uses the variable aggregation instructions to assign labels either for each note ("One per note" variables) or for each patient ("One per patient" variables).	Selecting "Show all aggregated results" in the label table for a "One per note" or "One per patient" variable shows all of the underlying labels.
Dependent Variable Generation	Brim takes the list of all labeled variables and uses the dependent variable instructions to assign labels at the "One per patient" level.	If a dependent variable has dependent variables as inputs, Brim will run the dependent variables in the correct order.
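
The roll-up from "Many per note" labels to "One per note" and "One per patient" labels can be pictured with a small sketch. This is an illustration only, not Brim's implementation: Brim applies the variable aggregation instructions you write, whereas the sketch substitutes a simple majority vote. The names SnippetLabel, most_common, and aggregate are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SnippetLabel:
    """One "Many per note" label: a value assigned to one evidence snippet."""
    patient_id: str
    note_id: str
    value: str


def most_common(values):
    """Stand-in aggregation rule (an assumption): pick the most frequent value.

    Ties resolve to the value that was seen first.
    """
    return Counter(values).most_common(1)[0][0]


def aggregate(labels, scope, rule=most_common):
    """Roll snippet-level labels up to one label per note or per patient."""
    if scope not in ("One per note", "One per patient"):
        raise ValueError(f"unsupported scope: {scope!r}")
    groups = defaultdict(list)
    for label in labels:
        if scope == "One per note":
            key = (label.patient_id, label.note_id)
        else:
            key = label.patient_id
        groups[key].append(label.value)
    return {key: rule(values) for key, values in groups.items()}


# Made-up example data: four snippet labels across two patients.
labels = [
    SnippetLabel("p1", "n1", "current smoker"),
    SnippetLabel("p1", "n1", "current smoker"),
    SnippetLabel("p1", "n2", "former smoker"),
    SnippetLabel("p2", "n3", "never smoker"),
]
per_note = aggregate(labels, "One per note")
per_patient = aggregate(labels, "One per patient")
```

In this sketch both scopes aggregate directly from the same snippet-level labels, which is why those underlying labels remain available to inspect behind each aggregated result.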

Propagation of abstractor corrections

Since each step takes the output of the previous steps as its input, manually changing the value of a label in an early step does not automatically update later steps. You will need to re-run generation on the later steps to see the change propagate. You can do this by running generation for the specific variable you want refreshed, or by running a generation that does not overwrite human-reviewed labels.
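
As a rough mental model of propagation, the sketch below finds every variable downstream of a corrected one, orders them so inputs run before the variables that use them, and re-runs them while leaving human-reviewed labels alone. It is a sketch under stated assumptions, not Brim's code: the variable names, the DEPENDS_ON map, and the toy compute rule are all hypothetical. Ordering uses TopologicalSorter from Python's standard graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each variable -> the variables it takes as inputs.
DEPENDS_ON = {
    "smoking_mentions": set(),
    "smoking_status": {"smoking_mentions"},
    "pack_years": set(),
    "high_risk": {"smoking_status", "pack_years"},
    "screening_eligible": {"high_risk"},
}


def downstream_in_order(changed, depends_on):
    """Return variables that depend on `changed`, directly or not, in run order."""
    order = TopologicalSorter(depends_on).static_order()  # inputs come first
    stale = {changed}
    result = []
    for name in order:
        if name != changed and depends_on[name] & stale:
            stale.add(name)
            result.append(name)
    return result


def regenerate(variables, labels, human_reviewed, compute):
    """Re-run `variables` in order, leaving human-reviewed labels untouched."""
    for name in variables:
        if name in human_reviewed:
            continue
        labels[name] = compute(name, labels)
    return labels


def compute(name, labels):
    """Toy stand-in for dependent variable instructions (illustrative only)."""
    if name == "high_risk":
        risky = (labels["smoking_status"] == "current smoker"
                 and int(labels["pack_years"]) >= 20)
        return "yes" if risky else "no"
    if name == "screening_eligible":
        return labels["high_risk"]
    return labels[name]


labels = {
    "smoking_mentions": "smokes daily",
    "smoking_status": "former smoker",
    "pack_years": "30",
    "high_risk": "no",
    "screening_eligible": "no",
}
# An abstractor corrects an early label by hand.
labels["smoking_status"] = "current smoker"
human_reviewed = {"smoking_status"}

to_rerun = downstream_in_order("smoking_status", DEPENDS_ON)
labels = regenerate(to_rerun, labels, human_reviewed, compute)
```

Until the downstream variables are re-run, high_risk and screening_eligible still hold values computed from the old smoking_status label, which is the situation the paragraph above describes.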