Targeted Annotation
This example creates a tiny target list, generates the expected adduct and isotope candidates for it, and then runs adduct annotation on that candidate table, which stands in for a measured peak table.
The important workflow is:
- Create a target table with molecular formulas and retention times.
- Select adduct definitions from emzed.adducts.
- Build a candidate space with emzed.targeted.solution_space.
- Annotate peaks with emzed.annotate.annotate_adducts.
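To make the candidate-space step more concrete, the following sketch computes the neutral monoisotopic mass of each adduct species by hand, without emzed. It is only an illustration: the element masses are standard monoisotopic values rounded to six decimals, and the names `MONO`, `mono_mass` and `ADDUCTS` are made up here and are not part of the emzed API.

```python
# Monoisotopic masses (u) of the lightest stable isotopes, rounded.
MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915, "Br": 78.918338}


def mono_mass(composition):
    """Sum monoisotopic masses for a dict such as {"C": 6, "H": 12, "O": 6}."""
    return sum(MONO[element] * count for element, count in composition.items())


# Hypothetical adduct descriptions: name -> (number of M units, atoms added).
ADDUCTS = {
    "M+Br": (1, {"Br": 1}),
    "2M+H": (2, {"H": 1}),
    "M+ACN+H": (1, {"C": 2, "H": 4, "N": 1}),  # acetonitrile C2H3N plus H
}

glucose = {"C": 6, "H": 12, "O": 6}
m_target = mono_mass(glucose)

# Mass of the full adduct formula: n * M plus the added atoms.
adduct_masses = {
    name: multiplier * m_target + mono_mass(added)
    for name, (multiplier, added) in ADDUCTS.items()
}

for name, mass in adduct_masses.items():
    print(f"{name:8s} {mass:.4f}")
```

With these rounded inputs the three masses agree to about four decimals with the m0 values printed for the monoisotopic rows in the output of the example.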
"""Create a targeted adduct search space and annotate matching peaks."""

import emzed

# One target: glucose at rt = 20.0 s. RtType columns are printed in minutes,
# so the value appears as "0.33 m" in the output.
targets = emzed.Table.create_table(
    ["mf", "rt"],
    [str, emzed.RtType],
    rows=[["C6H12O6", 20.0]],
)
targets.add_enumeration()

adducts = emzed.Table.stack_tables(
    [
        emzed.adducts.M_plus_Br,
        emzed.adducts.Two_M_plus_H,
        emzed.adducts.M_plus_ACN_plus_H,
    ]
)
# Replace the id column of the stacked table with a fresh enumeration.
adducts.drop_columns("id")
adducts.add_enumeration()

# solution_space expands each target formula/RT pair into expected adduct and
# isotope hypotheses (candidate peaks) with predicted masses and abundances.
candidates = emzed.targeted.solution_space(targets, adducts, 0.99)

print("solution-space preview:")
print(candidates[:5])
print()

# Free the column name adduct_name; the annotated table printed at the end
# carries its own adduct_name column.
candidates.rename_columns(adduct_name="original_adduct_name")

# annotate_adducts links compatible hypotheses across adduct types using m/z
# and RT tolerances, then assigns cluster IDs for consistent explanations.
annotated = emzed.annotate.annotate_adducts(
    candidates,
    adducts,
    mz_tol=2e-5,
    rt_tol=5.0,
    explained_abundance=0.95,
).sort_by("adduct_name")

print(
    annotated.extract_columns(
        "mf",
        "rt",
        "adduct_name",
        "isotope_decomposition",
        "adduct_isotopes",
        "m0",
        "abundance",
        "adduct_cluster_id",
    )
)
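The linking idea described in the comments of the script can be illustrated without emzed: two peaks are compatible under a pair of adduct hypotheses when they elute within `rt_tol` of each other and both imply the same neutral mass M. The sketch below is a simplified stand-in and not emzed's implementation. The peak list, the shift values and the helper names are assumptions for illustration, the tolerance is treated as an absolute mass difference, and the electron mass is ignored for brevity.

```python
from itertools import combinations

# Hypothetical adduct models: name -> (number of M units, mass added to n * M).
# Shifts are rounded monoisotopic masses of Br, H and C2H4N.
ADDUCT_SHIFT = {
    "M+Br": (1, 78.918338),
    "2M+H": (2, 1.007825),
    "M+ACN+H": (1, 42.034374),
}

# Made-up peaks: (peak_id, mass in u, rt in seconds).
peaks = [
    (0, 258.981727, 20.0),
    (1, 361.134606, 20.5),
    (2, 222.097765, 19.8),
    (3, 222.097765, 90.0),  # same mass as peak 2, but elutes much later
]


def neutral_masses(mass):
    """Neutral mass M implied by each adduct hypothesis for one peak."""
    return {name: (mass - shift) / n for name, (n, shift) in ADDUCT_SHIFT.items()}


def link_peaks(peaks, mass_tol=1e-4, rt_tol=5.0):
    """Return (id_a, adduct_a, id_b, adduct_b) for every compatible peak pair."""
    links = []
    for (id_a, mass_a, rt_a), (id_b, mass_b, rt_b) in combinations(peaks, 2):
        if abs(rt_a - rt_b) > rt_tol:
            continue  # not co-eluting
        for name_a, neutral_a in neutral_masses(mass_a).items():
            for name_b, neutral_b in neutral_masses(mass_b).items():
                # Different adduct types that point to the same neutral mass.
                if name_a != name_b and abs(neutral_a - neutral_b) <= mass_tol:
                    links.append((id_a, name_a, id_b, name_b))
    return links


links = link_peaks(peaks)
for link in links:
    print(link)
```

Peaks 0, 1 and 2 end up linked pairwise, while peak 3 is left out because of its retention time. A connected-components pass over such links would be one way to derive cluster IDs; how emzed assigns them internally is not covered by this sketch.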
solution-space preview:
id   target_id  mf       rt        adduct_id  adduct_name  m_multiplier  adduct_add  adduct_sub  z    sign_z  full_mf    isotope_id  isotope_decomposition              m0           abundance  mz
int  int        str      RtType    int        str          int           str         str         int  int     str        int         str                                MzType       float      MzType
---  ---------  -------  --------  ---------  -----------  ------------  ----------  ----------  ---  ------  ---------  ----------  ---------------------------------  -----------  ---------  -----------
0    0          C6H12O6  0.33 m    0          M+Br         1             Br                      1    -1      C6H12O6Br  0           [12]C6 [1]H12 [16]O6 [79]Br        258.981727   0.468      258.982276
1    0          C6H12O6  0.33 m    0          M+Br         1             Br                      1    -1      C6H12O6Br  1           [12]C6 [1]H12 [16]O6 [81]Br        260.979681   0.455      260.980230
2    0          C6H12O6  0.33 m    0          M+Br         1             Br                      1    -1      C6H12O6Br  2           [12]C5 [13]C [1]H12 [16]O6 [79]Br  259.985082   0.030      259.985631
3    0          C6H12O6  0.33 m    0          M+Br         1             Br                      1    -1      C6H12O6Br  3           [12]C5 [13]C [1]H12 [16]O6 [81]Br  261.983036   0.030      261.983585
4    0          C6H12O6  0.33 m    0          M+Br         1             Br                      1    -1      C6H12O6Br  4           [12]C6 [1]H12 [16]O5 [18]O [79]Br  260.985981   0.006      260.986530
found 0 gaps > rt_tol in rt values
process 1 out of 1
process 16 peaks in rt range 0.0..21.0
build up lookup table
look for matches
found matches
mf       rt        adduct_name  isotope_decomposition              adduct_isotopes      m0           abundance  adduct_cluster_id
str      RtType    str          str                                str                  MzType       float      int
-------  --------  -----------  ---------------------------------  -------------------  -----------  ---------  -----------------
C6H12O6  0.33 m    2M+H         [12]C12 [1]H25 [16]O12             +[1]H                361.134606   0.851      0
C6H12O6  0.33 m    2M+H         [12]C10 [13]C2 [1]H25 [16]O12      +[1]H                363.141316   0.007      1
C6H12O6  0.33 m    M+ACN+H      [12]C8 [1]H16 [14]N [16]O6         +[12]C2 [1]H4 [14]N  222.097765   0.899      0
C6H12O6  0.33 m    M+ACN+H      [12]C7 [13]C [1]H16 [14]N [16]O6   +[12]C2 [1]H4 [14]N  223.101120   0.078      1
C6H12O6  0.33 m    M+ACN+H      [12]C8 [1]H16 [14]N [16]O5 [18]O   +[12]C2 [1]H4 [14]N  224.102019   0.011      2
C6H12O6  0.33 m    M+Br         [12]C6 [1]H12 [16]O6 [79]Br        +[79]Br              258.981727   0.468      0
C6H12O6  0.33 m    M+Br         [12]C6 [1]H12 [16]O6 [81]Br        +[81]Br              260.979681   0.455      0
C6H12O6  0.33 m    M+Br         [12]C5 [13]C [1]H12 [16]O6 [79]Br  +[79]Br              259.985082   0.030      1
C6H12O6  0.33 m    M+Br         [12]C5 [13]C [1]H12 [16]O6 [81]Br  +[81]Br              261.983036   0.030      1
C6H12O6  0.33 m    M+Br         [12]C6 [1]H12 [16]O5 [18]O [79]Br  +[79]Br              260.985981   0.006      2
C6H12O6  0.33 m    M+Br         [12]C6 [1]H12 [16]O5 [18]O [81]Br  +[81]Br              262.983935   0.006      2
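In the solution-space preview, the mz column exceeds m0 by about 0.000549 in every row. That difference matches the electron mass: the M+Br rows have z = 1 and sign_z = -1, so the ion carries one extra electron. The short check below tests that reading against the printed values. The formula is inferred from these numbers rather than taken from the emzed source, and `mz_from_m0` is a hypothetical helper.

```python
ELECTRON_MASS = 0.000548580  # u


def mz_from_m0(m0, z, sign_z):
    """m/z of an ion, given the mass m0 of its neutral formula.

    z is the absolute charge; a negative ion (sign_z = -1) gains z electrons,
    a positive ion (sign_z = +1) loses them.
    """
    return (m0 - sign_z * z * ELECTRON_MASS) / z


# (m0, mz) pairs copied from the preview rows (M+Br, z = 1, sign_z = -1).
preview = [
    (258.981727, 258.982276),
    (260.979681, 260.980230),
    (259.985082, 259.985631),
    (261.983036, 261.983585),
    (260.985981, 260.986530),
]

max_deviation = max(abs(mz_from_m0(m0, 1, -1) - mz) for m0, mz in preview)
print(f"largest deviation from the printed mz values: {max_deviation:.1e}")
```

The deviation stays below the last printed decimal, which is consistent with the electron-mass interpretation for these rows.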