Treatment Effect Calculation
1. What is Treatment Effect Calculation?
The Treatment Effect module finds the differences in outcomes between a treatment group and a control group. These differences can be assessed with a variety of metrics, such as odds_ratio(), absolute_risk_reduction(), or hedges_g().
The example dataset used in the following sections is described below.
Example Dataset
An example dataset for the following demonstrations was generated with the from_advanced_example_dataset() method of the MedRecord class.
medrecord = MedRecord().from_advanced_example_dataset()
This example dataset includes a set of patients, drugs, diagnoses and procedures. For this section, we will use the patients, diagnoses, drugs, and the edges that connect the patients’ nodes with the other two groups.
patients.head(5)
shape: (5, 3)
┌────────────┬─────┬────────┐
│ patient_id ┆ age ┆ gender │
│ ---        ┆ --- ┆ ---    │
│ str        ┆ i64 ┆ str    │
╞════════════╪═════╪════════╡
│ pat_77     ┆ 6   ┆ M      │
│ pat_348    ┆ 51  ┆ M      │
│ pat_134    ┆ 27  ┆ F      │
│ pat_301    ┆ 85  ┆ M      │
│ pat_423    ┆ 29  ┆ F      │
└────────────┴─────┴────────┘
diagnoses.head(5)
shape: (5, 2)
┌──────────────┬─────────────────────────────────┐
│ diagnosis_id ┆ description                     │
│ ---          ┆ ---                             │
│ str          ┆ str                             │
╞══════════════╪═════════════════════════════════╡
│ 127013003    ┆ Disorder of kidney due to diab… │
│ 236077008    ┆ Protracted diarrhea (finding)   │
│ 192127007    ┆ Child attention deficit disord… │
│ 278588009    ┆ Fractured dental filling (find… │
│ 399211009    ┆ History of myocardial infarcti… │
└──────────────┴─────────────────────────────────┘
patients_diagnoses_edges.head(5)
shape: (5, 4)
┌─────────┬───────────┬───────────────┬─────────────────────┐
│ source  ┆ target    ┆ duration_days ┆ time                │
│ ---     ┆ ---       ┆ ---           ┆ ---                 │
│ str     ┆ str       ┆ f64           ┆ datetime[μs]        │
╞═════════╪═══════════╪═══════════════╪═════════════════════╡
│ pat_140 ┆ 414545008 ┆ 0.0           ┆ 1991-01-03 00:00:00 │
│ pat_151 ┆ 431856006 ┆ 0.0           ┆ 1994-08-01 00:00:00 │
│ pat_586 ┆ 66383009  ┆ 14.0          ┆ 2019-07-30 00:00:00 │
│ pat_415 ┆ 314529007 ┆ 63.0          ┆ 2018-10-01 00:00:00 │
│ pat_247 ┆ 160903007 ┆ 742.0         ┆ 2013-03-17 00:00:00 │
└─────────┴───────────┴───────────────┴─────────────────────┘
drugs.head(5)
shape: (5, 2)
┌─────────┬─────────────────────────────────┐
│ drug_id ┆ description                     │
│ ---     ┆ ---                             │
│ str     ┆ str                             │
╞═════════╪═════════════════════════════════╡
│ 209387  ┆ Acetaminophen 325 MG Oral Tabl… │
│ 206905  ┆ Ibuprofen 400 MG Oral Tablet [… │
│ 858817  ┆ enalapril maleate 10 MG Oral T… │
│ 1665060 ┆ cefazolin 2000 MG Injection     │
│ 313988  ┆ Furosemide 40 MG Oral Tablet    │
└─────────┴─────────────────────────────────┘
patients_drugs_edges.head(5)
shape: (5, 5)
┌─────────┬────────┬─────────────────────┬────────┬──────────┐
│ source  ┆ target ┆ time                ┆ cost   ┆ quantity │
│ ---     ┆ ---    ┆ ---                 ┆ ---    ┆ ---      │
│ str     ┆ str    ┆ datetime[μs]        ┆ f64    ┆ i64      │
╞═════════╪════════╪═════════════════════╪════════╪══════════╡
│ pat_223 ┆ 205923 ┆ 2017-08-08 00:46:43 ┆ 29.21  ┆ 1        │
│ pat_374 ┆ 314076 ┆ 2022-12-18 00:08:47 ┆ 0.91   ┆ 1        │
│ pat_114 ┆ 314076 ┆ 2016-02-11 04:49:32 ┆ 0.91   ┆ 1        │
│ pat_285 ┆ 314076 ┆ 2024-12-28 07:44:22 ┆ 0.91   ┆ 1        │
│ pat_191 ┆ 106892 ┆ 2003-02-10 04:29:29 ┆ 126.89 ┆ 1        │
└─────────┴────────┴─────────────────────┴────────┴──────────┘
2. Building a Treatment Effect Instance
As with other modules in MedModels, the TreatmentEffect class is meant to be instantiated with a builder pattern, through its builder() method.
This instantiation requires a minimum of two arguments: a treatment and an outcome. These have to be the names of the MedRecord’s Groups that contain the respective nodes. The patient group also needs to be specified if it is not the default, patient. Here, we create these groups with the Query Engine; see the Query Engine section for more information on this tool.
In this example case study, we use “Alendronic acid”, a primary treatment for osteoporosis, as the treatment group and “Fractures” as the outcome. We expect the treated patients to have fewer fractures than the controls.
def find_alendronic_drugs(node: NodeOperand) -> NodeIndicesOperand:
    node.in_group("drug")
    description = node.attribute("description")
    description.lowercase()
    description.contains("alendronic")
    return node.index()

def find_fracture_diagnoses(node: NodeOperand) -> NodeIndicesOperand:
    node.in_group("diagnosis")
    description = node.attribute("description")
    description.lowercase()
    description.contains("fracture")
    return node.index()

medrecord.unfreeze_schema()
medrecord.add_group("alendronic", find_alendronic_drugs)
medrecord.add_group("fracture", find_fracture_diagnoses)
Methods used in the snippet
in_group(): Query nodes that belong to that group.
attribute(): Returns a NodeMultipleValuesWithIndexOperand representing the values of that attribute for the nodes.
lowercase(): Convert the multiple values to lowercase.
contains(): Query which multiple values contain a value.
index(): Returns a NodeIndicesOperand representing the indices of the nodes queried.
unfreeze_schema(): Unfreezes the schema. Changes in the schema are automatically inferred.
add_group(): Adds a group to the MedRecord, optionally with node and edge indices.
Note
Since the MedRecord we are using has a Provided schema, we need to use unfreeze_schema() in order to add new groups to the MedRecord.
Once we have the required treatment and outcome groups (alendronic and fracture), we can go forward and create a treatment effect instance.
treatment_effect_basic = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_patients_group("patient")
    .build()
)
Methods used in the snippet
TreatmentEffect: The TreatmentEffect class for analyzing treatment effects in medical records.
builder(): Creates a TreatmentEffectBuilder instance for the TreatmentEffect class.
with_treatment(): Sets the treatment group for the treatment effect estimation.
with_outcome(): Sets the outcome group for the treatment effect estimation.
build(): Builds the treatment effect with all the provided configurations.
With this TreatmentEffect instance, we can check how the patients of our MedRecord are divided into groups by means of a ContingencyTable. This contingency table contains the number of patients in each of the four subgroups that matter most for the treatment effect:
Treated with outcome: Patients who received the treatment and experienced the outcome.
Treated with no outcome: Patients who received the treatment but did not experience the outcome.
Control with outcome: Patients who did not receive the treatment but experienced the outcome.
Control with no outcome: Patients who neither received the treatment nor experienced the outcome.
treatment_effect_basic.estimate.subject_counts(medrecord)
-----------------------------------
                Outcome
Group           True      False
-----------------------------------
Treated         3         15
Control         55        527
-----------------------------------
Methods used in the snippet
subject_counts: Overview of how many subjects are in which group from the contingency table.
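The four subgroups listed above can be illustrated with plain Python sets. This is only a minimal sketch of the counting logic, not the library's implementation; the function name subject_counts_sketch and the toy patient IDs are hypothetical.

```python
def subject_counts_sketch(patients, treated, with_outcome):
    """Split a cohort into the four cells of a 2x2 contingency table."""
    patients = set(patients)
    treated = set(treated) & patients
    with_outcome = set(with_outcome) & patients
    # Everyone who is not treated is considered a control.
    control = patients - treated
    return {
        "treated_outcome_true": len(treated & with_outcome),
        "treated_outcome_false": len(treated - with_outcome),
        "control_outcome_true": len(control & with_outcome),
        "control_outcome_false": len(control - with_outcome),
    }


# Hypothetical toy cohort, not taken from the example dataset.
patients = {"pat_1", "pat_2", "pat_3", "pat_4", "pat_5", "pat_6"}
treated = {"pat_1", "pat_2"}
with_outcome = {"pat_2", "pat_3", "pat_4"}

counts = subject_counts_sketch(patients, treated, with_outcome)
print(counts)
```

Every patient lands in exactly one cell, so the four counts always add up to the cohort size.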
2.1. Adding a Time Component
In the previous section, we saw how to instantiate a TreatmentEffect class with default settings. Using the builder pattern, however, we can customize key properties that influence how treatment effect metrics are calculated.
One important property is time. By default, treatment and outcome groups are formed without considering the time at which each event occurred. But when a time attribute is provided, the logic changes: the Treated groups (with or without outcome) are now determined based on whether the outcome happened after the treatment. This allows for a more causal interpretation by ensuring that only post-treatment outcomes are considered in the analysis for treated individuals.
treatment_effect_with_time = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_time_attribute("time")
    .build()
)

treatment_effect_with_time.estimate.subject_counts(medrecord)
-----------------------------------
                Outcome
Group           True      False
-----------------------------------
Treated         0         18
Control         55        527
-----------------------------------
Methods used in the snippet
TreatmentEffect: The TreatmentEffect class for analyzing treatment effects in medical records.
builder(): Creates a TreatmentEffectBuilder instance for the TreatmentEffect class.
with_treatment(): Sets the treatment group for the treatment effect estimation.
with_outcome(): Sets the outcome group for the treatment effect estimation.
with_time_attribute(): Sets the time attribute to be used in the treatment effect estimation.
build(): Builds the treatment effect with all the provided configurations.
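The time-aware rule described in this section can be sketched in a few lines of standard-library Python. This is an illustration only: the helper outcome_after_treatment and the dates are hypothetical, and whether the library uses a strict or an inclusive comparison is an assumption here.

```python
from datetime import datetime


def outcome_after_treatment(treatment_time, outcome_times):
    """True if at least one outcome happened strictly after the treatment."""
    # Assumption: a strict "after" comparison is used.
    return any(t > treatment_time for t in outcome_times)


# Hypothetical dates, not taken from the example dataset.
treatment_time = datetime(2015, 6, 1)
only_before = [datetime(2012, 3, 10)]
before_and_after = [datetime(2012, 3, 10), datetime(2016, 1, 5)]

counts_without_time = bool(only_before)  # any outcome at all counts
counts_with_time = outcome_after_treatment(treatment_time, only_before)
still_counts = outcome_after_treatment(treatment_time, before_and_after)
print(counts_without_time, counts_with_time, still_counts)
```

A treated patient whose only fracture predates the treatment moves from "Treated, outcome True" to "Treated, outcome False" once time is considered, which is the direction of the change seen in the counts above.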
You can further customize the time-related properties, such as the grace period, the follow-up period, or whether to exclude patients who had an outcome before the treatment. These can only be used once a time attribute is set.
treatment_effect_customized = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_time_attribute("time")
    .with_grace_period(days=30)
    .with_follow_up_period(days=365)
    .with_outcome_before_treatment_exclusion(days=15)
    .build()
)
Methods used in the snippet
TreatmentEffect: The TreatmentEffect class for analyzing treatment effects in medical records.
builder(): Creates a TreatmentEffectBuilder instance for the TreatmentEffect class.
with_treatment(): Sets the treatment group for the treatment effect estimation.
with_outcome(): Sets the outcome group for the treatment effect estimation.
with_grace_period(): Sets the grace period for the treatment effect estimation.
with_follow_up_period(): Sets the follow-up period for the treatment effect estimation.
with_outcome_before_treatment_exclusion(): Define whether we allow the outcome to exist before the treatment or not.
build(): Builds the treatment effect with all the provided configurations.
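One plausible reading of how these three settings interact is sketched below for a single treated patient. The exact semantics are assumptions, not taken from the library: outcomes are assumed to count only between the end of the grace period and the end of the follow-up period, and a patient is assumed to be dropped when an outcome falls within the exclusion window before treatment. The function classify_treated_patient and the dates are hypothetical.

```python
from datetime import datetime, timedelta


def classify_treated_patient(
    treatment_time,
    outcome_times,
    grace_days=0,
    follow_up_days=365,
    exclusion_days=None,
):
    """Return 'excluded', 'outcome' or 'no_outcome' for one treated patient."""
    # Assumption: an outcome shortly before treatment removes the patient.
    if exclusion_days is not None:
        exclusion_start = treatment_time - timedelta(days=exclusion_days)
        if any(exclusion_start <= t < treatment_time for t in outcome_times):
            return "excluded"

    # Assumption: outcomes count only between the end of the grace period
    # and the end of the follow-up period, both measured from treatment.
    window_start = treatment_time + timedelta(days=grace_days)
    window_end = treatment_time + timedelta(days=follow_up_days)
    if any(window_start <= t <= window_end for t in outcome_times):
        return "outcome"
    return "no_outcome"


# Hypothetical dates, not taken from the example dataset.
treatment = datetime(2020, 1, 1)
settings = {"grace_days": 30, "follow_up_days": 365, "exclusion_days": 15}

during_grace = classify_treated_patient(treatment, [datetime(2020, 1, 10)], **settings)
in_window = classify_treated_patient(treatment, [datetime(2020, 3, 1)], **settings)
too_late = classify_treated_patient(treatment, [datetime(2021, 6, 1)], **settings)
just_before = classify_treated_patient(treatment, [datetime(2019, 12, 25)], **settings)
print(during_grace, in_window, too_late, just_before)
```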
Washout periods for specific drugs that may impact the test results can also be included:
def find_corticosteroids(node: NodeOperand) -> NodeIndicesOperand:
    node.in_group("drug")
    description = node.attribute("description")
    description.lowercase()
    description.contains("sone")
    return node.index()

medrecord.add_group("corticosteroids", find_corticosteroids)

treatment_effect_with_washout = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_washout_period({"corticosteroids": 15})
    .build()
)
Methods used in the snippet
TreatmentEffect: The TreatmentEffect class for analyzing treatment effects in medical records.
builder(): Creates a TreatmentEffectBuilder instance for the TreatmentEffect class.
with_treatment(): Sets the treatment group for the treatment effect estimation.
with_outcome(): Sets the outcome group for the treatment effect estimation.
with_washout_period(): Sets the washout period for the treatment effect estimation.
build(): Builds the treatment effect with all the provided configurations.
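The idea behind a washout period can be sketched as a simple date check. This is an assumption about the intended behaviour rather than the library's code: a treated patient is assumed to be dropped if a drug from the washout group was recorded within the given number of days before the treatment. The helper passes_washout and the dates are hypothetical.

```python
from datetime import datetime, timedelta


def passes_washout(treatment_time, washout_drug_times, washout_days):
    """True if no washout drug was recorded in the window before treatment."""
    window_start = treatment_time - timedelta(days=washout_days)
    return not any(
        window_start <= t < treatment_time for t in washout_drug_times
    )


# Hypothetical dates, not taken from the example dataset.
treatment_time = datetime(2020, 1, 16)
recent_corticosteroid = [datetime(2020, 1, 10)]  # 6 days before treatment
old_corticosteroid = [datetime(2019, 12, 1)]  # 46 days before treatment

kept_recent = passes_washout(treatment_time, recent_corticosteroid, washout_days=15)
kept_old = passes_washout(treatment_time, old_corticosteroid, washout_days=15)
print(kept_recent, kept_old)
```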
2.2. Implement Control Group Matching
We have also integrated matching algorithms, such as nearest neighbors or propensity matching, to build control groups that closely resemble the treated population. Matching can use variables such as age or gender.
treatment_effect_matching = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_nearest_neighbors_matching(
        essential_covariates=["age", "gender"], number_of_neighbors=2
    )
    .build()
)

treatment_effect_matching.estimate.subject_counts(medrecord)
-----------------------------------
                Outcome
Group           True      False
-----------------------------------
Treated         3         15
Control         7         29
-----------------------------------
Methods used in the snippet
TreatmentEffect: The TreatmentEffect class for analyzing treatment effects in medical records.
builder(): Creates a TreatmentEffectBuilder instance for the TreatmentEffect class.
with_treatment(): Sets the treatment group for the treatment effect estimation.
with_outcome(): Sets the outcome group for the treatment effect estimation.
with_nearest_neighbors_matching(): Adjust the treatment effect estimate using nearest neighbors matching.
build(): Builds the treatment effect with all the provided configurations.
subject_counts: Overview of how many subjects are in which group from the contingency table.
As we can see, the distribution of the groups makes much more sense when the controls are matched to the treated patients than in the basic analysis. That is because the treatment is normally prescribed to patients with a high risk of fracture, and the control group in the previous instances was not representative of the treated one.
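To make the matching idea concrete, the sketch below implements a greedy nearest-neighbor matching without replacement on two covariates. It is a simplified illustration, not the algorithm used by the library: the function match_controls, the Euclidean distance on raw values, the 0/1 encoding of gender, and the toy data are all assumptions.

```python
import math


def match_controls(treated, controls, covariates, number_of_neighbors=1):
    """Greedy nearest-neighbor matching without replacement."""
    available = dict(controls)
    matched = {}
    for treated_id, treated_row in treated.items():

        def distance(control_id):
            row = available[control_id]
            return math.sqrt(
                sum((treated_row[c] - row[c]) ** 2 for c in covariates)
            )

        # Pick the closest remaining controls for this treated patient.
        nearest = sorted(available, key=distance)[:number_of_neighbors]
        matched[treated_id] = nearest
        # Matched controls cannot be reused for another treated patient.
        for control_id in nearest:
            del available[control_id]
    return matched


# Hypothetical toy data; gender is encoded as 0/1 for the distance.
treated = {
    "t1": {"age": 72, "gender": 1},
    "t2": {"age": 65, "gender": 0},
}
controls = {
    "c1": {"age": 30, "gender": 1},
    "c2": {"age": 70, "gender": 1},
    "c3": {"age": 74, "gender": 0},
    "c4": {"age": 66, "gender": 0},
    "c5": {"age": 60, "gender": 0},
    "c6": {"age": 25, "gender": 0},
}

matched = match_controls(treated, controls, ["age", "gender"], number_of_neighbors=2)
print(matched)
```

With two neighbors per treated patient and no reuse, the matched control group has twice as many members as the treated group, which is consistent with the 36 controls for 18 treated patients in the counts above.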
2.3. Using Queries to Filter Controls
We can also use the aforementioned Query Engine to filter which patients we want to include in the control groups:
def query_patients_over_70(node: NodeOperand) -> NodeIndicesOperand:
    node.in_group("patient")
    node.attribute("age").greater_than(70)
    return node.index()

treatment_effect_over_70 = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .filter_controls(query_patients_over_70)
    .build()
)

treatment_effect_over_70.estimate.subject_counts(medrecord)
-----------------------------------
                Outcome
Group           True      False
-----------------------------------
Treated         3         15
Control         15        41
-----------------------------------
Methods used in the snippet
in_group(): Query nodes that belong to that group.
attribute(): Returns a NodeMultipleValuesWithIndexOperand to query on the values of the nodes for that attribute.
greater_than(): Query values that are greater than that value.
index(): Returns a NodeIndicesOperand representing the indices of the nodes queried.
TreatmentEffect: The TreatmentEffect class for analyzing treatment effects in medical records.
builder(): Creates a TreatmentEffectBuilder instance for the TreatmentEffect class.
with_treatment(): Sets the treatment group for the treatment effect estimation.
with_outcome(): Sets the outcome group for the treatment effect estimation.
filter_controls(): Filter the control group based on the provided query.
build(): Builds the treatment effect with all the provided configurations.
subject_counts: Overview of how many subjects are in which group from the contingency table.
3. Estimating metrics
Once we have instantiated the TreatmentEffect class with the desired properties, we can estimate a range of metrics, such as:
Odds ratio.
treatment_effect_over_70.estimate.odds_ratio(medrecord)
0.5466666666666667
Methods used in the snippet
odds_ratio: Calculates the odds ratio (OR).
Relative risk.
treatment_effect_over_70.estimate.relative_risk(medrecord)
0.6222222222222222
Methods used in the snippet
relative_risk: Calculates the relative risk (RR).
Average treatment effect, where we calculate the difference between the outcome means of the treated and control sets for an outcome variable (e.g., duration_days).
treatment_effect_over_70.estimate.average_treatment_effect(medrecord, "duration_days")
9.382352941176471
Methods used in the snippet
average_treatment_effect: Calculates the Average Treatment Effect (ATE).
Disclaimer: the values of the outcome variables used in this report are randomly sampled and unexpected results can be obtained.
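The odds ratio and relative risk can be checked by hand from the contingency table of section 2.3 (Treated: 3 with outcome, 15 without; Control: 15 with outcome, 41 without). The sketch below uses the textbook formulas; the function names are hypothetical and this is not the library's implementation.

```python
def odds_ratio(treated_true, treated_false, control_true, control_false):
    """Odds of the outcome among treated divided by odds among controls."""
    return (treated_true / treated_false) / (control_true / control_false)


def relative_risk(treated_true, treated_false, control_true, control_false):
    """Risk of the outcome among treated divided by risk among controls."""
    treated_risk = treated_true / (treated_true + treated_false)
    control_risk = control_true / (control_true + control_false)
    return treated_risk / control_risk


# Counts from the over-70 contingency table shown in section 2.3.
table = (3, 15, 15, 41)

or_value = odds_ratio(*table)
rr_value = relative_risk(*table)
print(round(or_value, 4), round(rr_value, 4))
```

Both values agree with the outputs above (about 0.5467 and 0.6222). Values below 1 indicate that the outcome was less frequent among treated patients than among the filtered controls.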
4. Generating a full metrics report
You can also create a report with all the possible metrics in the treatment effect class:
treatment_effect_over_70.report.full_report(medrecord)
{'relative_risk': 0.6222222222222222, 'odds_ratio': 0.5466666666666667, 'confounding_bias': 1.042531272994849, 'absolute_risk_reduction': 0.10119047619047619, 'number_needed_to_treat': 9.882352941176471, 'hazard_ratio': 0.6222222222222222}
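Two of the reported values, the absolute risk reduction and the number needed to treat, follow directly from the contingency table of section 2.3. The sketch below uses the textbook definitions (control risk minus treated risk, and its reciprocal); the function names are hypothetical and the library's implementation may differ in details such as sign conventions.

```python
def absolute_risk_reduction(treated_true, treated_false, control_true, control_false):
    """Control risk minus treated risk."""
    treated_risk = treated_true / (treated_true + treated_false)
    control_risk = control_true / (control_true + control_false)
    return control_risk - treated_risk


def number_needed_to_treat(treated_true, treated_false, control_true, control_false):
    """Reciprocal of the absolute risk reduction."""
    arr = absolute_risk_reduction(
        treated_true, treated_false, control_true, control_false
    )
    return 1 / arr


# Counts from the over-70 contingency table shown in section 2.3.
table = (3, 15, 15, 41)

arr_value = absolute_risk_reduction(*table)
nnt_value = number_needed_to_treat(*table)
print(round(arr_value, 4), round(nnt_value, 4))
```

These reproduce the report entries of about 0.1012 and 9.8824.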
A second report covers all the continuous estimators:
treatment_effect_over_70.report.continuous_estimators_report(medrecord, "duration_days")
Small sample size detected. Consider using Hedges' g for an unbiased effect size estimate.
{'average_treatment_effect': 9.382352941176471, 'cohens_d': 0.19095566663705052, 'hedges_g': 0.1833174399715685}
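The three continuous estimators are related: the average treatment effect is a difference of means, Cohen's d divides that difference by a pooled standard deviation, and Hedges' g shrinks d with a small-sample correction. The sketch below uses common textbook formulas on hypothetical values; it is not the library's implementation, and the exact correction factor used there may differ from the approximation assumed here.

```python
import math
from statistics import mean, variance


def average_treatment_effect(treated, control):
    """Difference between the outcome means of treated and control."""
    return mean(treated) - mean(control)


def cohens_d(treated, control):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    pooled_variance = (
        (n1 - 1) * variance(treated) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return average_treatment_effect(treated, control) / math.sqrt(pooled_variance)


def hedges_g(treated, control):
    """Cohen's d with an approximate small-sample correction (assumed form)."""
    n = len(treated) + len(control)
    correction = 1 - 3 / (4 * n - 9)
    return cohens_d(treated, control) * correction


# Hypothetical outcome values (e.g. durations in days), not from the dataset.
treated_values = [10.0, 14.0, 12.0, 16.0]
control_values = [8.0, 9.0, 11.0, 12.0]

ate = average_treatment_effect(treated_values, control_values)
d = cohens_d(treated_values, control_values)
g = hedges_g(treated_values, control_values)
print(ate, round(d, 4), round(g, 4))
```

Because the correction factor is below 1, Hedges' g is always slightly smaller in magnitude than Cohen's d, which matches the relationship between the two values in the report above.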
5. Full example Code
The full code examples for this chapter can be found here:
from medmodels import MedRecord
from medmodels.medrecord.querying import (
    NodeIndicesOperand,
    NodeOperand,
)
from medmodels.treatment_effect import TreatmentEffect

medrecord = MedRecord().from_advanced_example_dataset()


def find_alendronic_drugs(node: NodeOperand) -> NodeIndicesOperand:
    node.in_group("drug")
    description = node.attribute("description")
    description.lowercase()
    description.contains("alendronic")
    return node.index()


def find_fracture_diagnoses(node: NodeOperand) -> NodeIndicesOperand:
    node.in_group("diagnosis")
    description = node.attribute("description")
    description.lowercase()
    description.contains("fracture")
    return node.index()


medrecord.unfreeze_schema()
medrecord.add_group("alendronic", find_alendronic_drugs)
medrecord.add_group("fracture", find_fracture_diagnoses)

treatment_effect_basic = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_patients_group("patient")
    .build()
)
treatment_effect_basic.estimate.subject_counts(medrecord)

# Adding time attribute to the treatment effect
treatment_effect_with_time = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_time_attribute("time")
    .build()
)
treatment_effect_with_time.estimate.subject_counts(medrecord)

# Highly customized treatment effect instance
treatment_effect_customized = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_time_attribute("time")
    .with_grace_period(days=30)
    .with_follow_up_period(days=365)
    .with_outcome_before_treatment_exclusion(days=15)
    .build()
)


# Using washout drugs (drugs that should not be taken before treatment)
def find_corticosteroids(node: NodeOperand) -> NodeIndicesOperand:
    node.in_group("drug")
    description = node.attribute("description")
    description.lowercase()
    description.contains("sone")
    return node.index()


medrecord.add_group("corticosteroids", find_corticosteroids)
treatment_effect_with_washout = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_washout_period({"corticosteroids": 15})
    .build()
)

# Using matching algorithms
treatment_effect_matching = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .with_nearest_neighbors_matching(
        essential_covariates=["age", "gender"], number_of_neighbors=2
    )
    .build()
)
treatment_effect_matching.estimate.subject_counts(medrecord)


# Using queries to filter controls
def query_patients_over_70(node: NodeOperand) -> NodeIndicesOperand:
    node.in_group("patient")
    node.attribute("age").greater_than(70)
    return node.index()


treatment_effect_over_70 = (
    TreatmentEffect.builder()
    .with_treatment("alendronic")
    .with_outcome("fracture")
    .filter_controls(query_patients_over_70)
    .build()
)
treatment_effect_over_70.estimate.subject_counts(medrecord)

# Estimating odds ratio
treatment_effect_over_70.estimate.odds_ratio(medrecord)

# Relative risk estimation
treatment_effect_over_70.estimate.relative_risk(medrecord)

# Average Treatment Effect
treatment_effect_over_70.estimate.average_treatment_effect(medrecord, "duration_days")

# Full report of treatment effect estimation
treatment_effect_over_70.report.full_report(medrecord)

# Continuous treatment effect estimators report
treatment_effect_over_70.report.continuous_estimators_report(medrecord, "duration_days")