Data Aggregation in Brim
To support evidence-based chart abstraction at scale, Brim starts with the smallest pieces of evidence and aggregates upward into higher-level, logic-based variables. Understanding how this works can help you build variables that take advantage of the process.
A generation consists of several ordered steps:
| Generation Step | What happens for a variable | Elements Used |
|---|---|---|
| 1 | Brim creates generation tasks and adds them to the task queue. If large tasks are already running and few GPUs are available, the new generation tasks may have to wait. | |
| 2 | Brim creates a list of evidence snippets that might be pertinent for each variable. Then Brim uses the Variable Instructions to assign labels to any relevant evidence snippets and discards the rest. If the variable has a scope of "Many per note" and no input variables, it is done generating after this step. | Variable Instructions; Variable Definition; Notes, Note Titles, and Note Dates; if a variable has input variables, those variables' names, values, and evidence. |
| 3 | Brim takes the list of "Many per note" labels and uses the variable aggregation instructions to assign a label at either the "One per note" or the "One per patient" level. Selecting "Show all inputs" in the label table for a "One per note" or "One per patient" variable shows all of the underlying labels. | Variable Aggregation Strategy or Custom Aggregation Instructions; value for "Only use true value in aggregation"; pre-aggregation variable values, with evidence snippets. |
| 4 | Brim takes the list of all labeled variables and uses the dependent variable instructions to assign labels at the "One per patient" level. If a dependent variable takes other dependent variables as inputs, Brim runs them in the correct order, so that inputs are generated first. | Dependent Variable Instructions; Dependent Variable Definition; variable names, values, and evidence for input variables. |
| 5 | Brim compares the generated labels with any uploaded validation datasets and calculates agreement metrics. | Variable and Dependent Variable names, scopes, and values; validation datasets. |
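
The aggregation and dependent-variable steps above can be pictured with a small Python sketch. This is only an illustration, not Brim's implementation: the "most common value" strategy, the `aggregate` helper, and the variable names (`tumor_size`, `stage_group`, and so on) are all assumptions made for the example. The ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from collections import Counter, defaultdict
from graphlib import TopologicalSorter

# Hypothetical "Many per note" labels: (patient_id, note_id, value).
labels = [
    ("p1", "n1", "yes"),
    ("p1", "n1", "yes"),
    ("p1", "n1", "no"),
    ("p1", "n2", "no"),
    ("p1", "n2", "no"),
]


def aggregate(rows, key):
    """Group label values by key(row) and keep the most common value per group.

    "Most common value" is an assumed strategy chosen for illustration only.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {k: Counter(v).most_common(1)[0][0] for k, v in groups.items()}


# Aggregate the same underlying labels at two different scopes.
per_note = aggregate(labels, key=lambda r: (r[0], r[1]))
per_patient = aggregate(labels, key=lambda r: r[0])

# Hypothetical dependent variables, each mapped to the inputs it reads.
dependencies = {
    "stage_group": {"tumor_size", "node_status"},
    "eligible": {"stage_group", "age"},
}

# static_order() yields every variable after all of its inputs.
order = list(TopologicalSorter(dependencies).static_order())

print(per_note)     # {('p1', 'n1'): 'yes', ('p1', 'n2'): 'no'}
print(per_patient)  # {'p1': 'no'}
print(order)        # inputs always appear before the variables that use them
```

In this sketch the same five underlying labels produce different answers at the note and patient levels, which is why inspecting the underlying labels can be useful when an aggregated value looks surprising.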
Propagation of abstractor corrections
Because each step takes the output of the previous steps as input, a manual change to a label in an early step does not reach later steps on its own. To see the change propagate, re-run generation for the later steps. You can do this by running generation for the specific variable you want refreshed, or by running a generation that does not overwrite human-reviewed labels.
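
As a rough mental model of this propagation, the sketch below finds which variables sit downstream of a corrected label and would need to be regenerated, while leaving human-reviewed labels alone. The `used_by` map, the `downstream` helper, the `human_reviewed` flag, and the variable names are hypothetical and chosen only for illustration; they do not describe Brim's internal data model.

```python
from collections import deque

# Hypothetical map from each variable to the variables that use it as an input.
used_by = {
    "tumor_size": ["stage_group"],
    "node_status": ["stage_group"],
    "stage_group": ["eligible"],
    "age": ["eligible"],
}

# Hypothetical label store; tumor_size was manually corrected by an abstractor.
labels = {
    "tumor_size": {"human_reviewed": True},
    "stage_group": {"human_reviewed": False},
    "eligible": {"human_reviewed": False},
}


def downstream(changed, used_by):
    """Return every variable that depends on `changed`, directly or
    indirectly, in breadth-first order."""
    seen = {changed}
    queue = deque([changed])
    result = []
    while queue:
        current = queue.popleft()
        for child in used_by.get(current, []):
            if child not in seen:
                seen.add(child)
                result.append(child)
                queue.append(child)
    return result


# Variables to re-run after the correction, skipping human-reviewed labels.
to_regenerate = [
    name
    for name in downstream("tumor_size", used_by)
    if not labels[name]["human_reviewed"]
]

print(to_regenerate)  # ['stage_group', 'eligible']
```

The corrected `tumor_size` label is never in the regeneration list, which mirrors the idea of a generation that does not overwrite human-reviewed labels.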