Targeted Annotation
This example creates a tiny target list, generates expected adduct candidates, and then annotates a peak table against those adduct hypotheses.
The workflow has four steps:
- Create a target table with molecular formulas and retention times.
- Select adduct definitions from emzed.adducts.
- Build a candidate space with emzed.targeted.solution_space.
- Annotate peaks with emzed.annotate.annotate_adducts.
"""Create a targeted adduct search space and annotate matching peaks."""

import emzed


def main():
    targets = emzed.Table.create_table(
        ["mf", "rt"],
        [str, emzed.RtType],
        rows=[["H2O", 10.0]],
    )
    targets.add_enumeration()

    adducts = emzed.Table.stack_tables(
        [
            emzed.adducts.M_plus_Br,
            emzed.adducts.Two_M_plus_H,
            emzed.adducts.M_plus_ACN_plus_H,
            emzed.adducts.M_minus_H,
        ]
    )
    adducts.drop_columns("id")
    adducts.add_enumeration()

    candidates = emzed.targeted.solution_space(targets, adducts, 0.99)
    peaks = candidates.extract_columns("target_id", "rt", "mz", "adduct_name")
    peaks.rename_columns(adduct_name="original_adduct_name")

    annotated = emzed.annotate.annotate_adducts(
        peaks,
        adducts,
        mz_tol=2e-5,
        rt_tol=5.0,
        explained_abundance=0.99,
    ).sort_by("adduct_cluster_id")

    print(annotated.extract_columns("mz", "rt", "adduct_name", "adduct_cluster_id"))


if __name__ == "__main__":
    main()
found 0 gaps > rt_tol in rt values
process 1 out of 1
process 8 peaks in rt range 0.0..11.0
build up lookup table
look for matches
found matches
mz           rt        adduct_name  adduct_cluster_id
MzType       RtType    str          int
-----------  --------  -----------  -----------------
  96.929451    0.17 m  M+Br         0
  98.927404    0.17 m  M+Br         0
  37.028407    0.17 m  2M+H         0
  60.044391    0.17 m  M+ACN+H      0
  61.047746    0.17 m  M+ACN+H      0
  61.041426    0.17 m  M+ACN+H      0
  17.003289    0.17 m  M-H          0
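The m/z values in the table can be checked by hand. The sketch below does not use emzed; it recomputes the monoisotopic m/z of each singly charged adduct of H2O from standard reference masses. Several things in it are assumptions rather than anything taken from the library: the mass constants, the helper name formula_mass, and the reading of the two M+Br rows as the 79Br and 81Br isotopologues. The last line also assumes that RtType stores seconds and displays minutes, which would explain why the target retention time of 10.0 appears as 0.17 m.

```python
# Monoisotopic masses in unified atomic mass units.
# Assumption: standard reference values, rounded; not read from emzed.
MASS = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.00307400,
    "O": 15.99491462,
    "Br79": 78.91833760,
    "Br81": 80.91628970,
}
PROTON = 1.00727647
ELECTRON = 0.00054858


def formula_mass(counts):
    """Sum the monoisotopic masses for a dict of element -> count (hypothetical helper)."""
    return sum(MASS[element] * n for element, n in counts.items())


water = formula_mass({"H": 2, "O": 1})
acetonitrile = formula_mass({"C": 2, "H": 3, "N": 1})

# All four adducts carry a single charge, so m/z equals the ion mass.
computed = {
    "M+Br (79Br)": water + MASS["Br79"] + ELECTRON,  # bromide anion: Br plus one electron
    "M+Br (81Br)": water + MASS["Br81"] + ELECTRON,
    "2M+H": 2 * water + PROTON,
    "M+ACN+H": water + acetonitrile + PROTON,
    "M-H": water - PROTON,
}

# Values copied from the printed table above.
reported = {
    "M+Br (79Br)": 96.929451,
    "M+Br (81Br)": 98.927404,
    "2M+H": 37.028407,
    "M+ACN+H": 60.044391,
    "M-H": 17.003289,
}

for name, mz in computed.items():
    diff = mz - reported[name]
    print(f"{name:12s} computed {mz:.6f}  reported {reported[name]:.6f}  diff {diff:+.1e}")

# Assumed unit handling: seconds stored, minutes displayed.
rt_label = f"{10.0 / 60:.2f} m"
print("rt:", rt_label)
```

The computed values should agree with the table to within a few micro-daltons; small differences in the last digit are expected because the constants here are rounded and may not be identical to the ones emzed uses. The two remaining M+ACN+H rows (61.047746 and 61.041426) lie about 1.0034 and 0.9970 above the monoisotopic peak, which is consistent with 13C and 15N isotopologues.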