Targeted Annotation

This example builds a one-row target list, expands it into the expected adduct and isotope candidates, and then runs adduct annotation on that candidate table, which stands in for a measured peak table.

The workflow has four steps:

1. Create a target table with molecular formulas and retention times.
2. Select adduct definitions from emzed.adducts.
3. Build a candidate space with emzed.targeted.solution_space.
4. Annotate peaks with emzed.annotate.annotate_adducts.
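Before the full example, here is a small pure-Python sketch of what step 3 produces for a single target/adduct pair: the ion formula is the target formula times a multiplier, plus whatever the adduct adds, minus whatever it removes. It does not use emzed. The helpers parse_formula and ion_formula are hypothetical names introduced for illustration, and the "C2H4N" string for M+ACN+H is taken from the adduct_isotopes column of the output at the end of this page rather than from the emzed adduct tables.

```python
import re
from collections import Counter


def parse_formula(mf):
    """Parse a flat formula such as 'C6H12O6' into element counts.

    Brackets and isotope labels are not supported in this sketch.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", mf):
        counts[element] += int(number) if number else 1
    return counts


def ion_formula(mf, m_multiplier, adduct_add="", adduct_sub=""):
    """Combine a target formula with an adduct (hypothetical helper).

    Result = m_multiplier * mf + adduct_add - adduct_sub, written with
    the target's elements first and any new adduct elements appended.
    """
    counts = Counter()
    for element, n in parse_formula(mf).items():
        counts[element] = n * m_multiplier
    counts.update(parse_formula(adduct_add))
    counts.subtract(parse_formula(adduct_sub))
    return "".join(
        element + (str(n) if n > 1 else "")
        for element, n in counts.items()
        if n > 0
    )


print(ion_formula("C6H12O6", 1, "Br"))     # M+Br
print(ion_formula("C6H12O6", 2, "H"))      # 2M+H
print(ion_formula("C6H12O6", 1, "C2H4N"))  # M+ACN+H
```

The first two results, C6H12O6Br and C12H25O12, agree with the full_mf and isotope_decomposition columns in the output of the full example.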

"""Create a targeted adduct search space and annotate matching peaks."""

import emzed

# rt is given in seconds; the printed tables show it in minutes (20.0 s -> 0.33 m).
targets = emzed.Table.create_table(
    ["mf", "rt"],
    [str, emzed.RtType],
    rows=[["C6H12O6", 20.0]],
)
targets.add_enumeration()

adducts = emzed.Table.stack_tables(
    [
        emzed.adducts.M_plus_Br,
        emzed.adducts.Two_M_plus_H,
        emzed.adducts.M_plus_ACN_plus_H,
    ]
)
adducts.drop_columns("id")
adducts.add_enumeration()

# solution_space expands each target formula/RT pair into expected adduct and
# isotope hypotheses (candidate peaks) with predicted masses and abundances.
candidates = emzed.targeted.solution_space(targets, adducts, 0.99)
print("solution-space preview:")
print(candidates[:5])
print()

candidates.rename_columns(adduct_name="original_adduct_name")

# annotate_adducts links compatible hypotheses across adduct types using m/z
# and RT tolerances, then assigns cluster IDs for consistent explanations.
annotated = emzed.annotate.annotate_adducts(
    candidates,
    adducts,
    mz_tol=2e-5,
    rt_tol=5.0,
    explained_abundance=0.95,
).sort_by("adduct_name")

print(
    annotated.extract_columns(
        "mf",
        "rt",
        "adduct_name",
        "isotope_decomposition",
        "adduct_isotopes",
        "m0",
        "abundance",
        "adduct_cluster_id",
    )
)

solution-space preview:
id   target_id  mf       rt        adduct_id  adduct_name  m_multiplier  adduct_add  adduct_sub  z    sign_z  full_mf    isotope_id  isotope_decomposition              m0           abundance  mz         
int  int        str      RtType    int        str          int           str         str         int  int     str        int         str                                MzType       float      MzType     
---  ---------  -------  --------  ---------  -----------  ------------  ----------  ----------  ---  ------  ---------  ----------  ---------------------------------  -----------  ---------  -----------
  0          0  C6H12O6    0.33 m          0  M+Br                    1  Br                        1      -1  C6H12O6Br           0  [12]C6 [1]H12 [16]O6 [79]Br         258.981727      0.468   258.982276
  1          0  C6H12O6    0.33 m          0  M+Br                    1  Br                        1      -1  C6H12O6Br           1  [12]C6 [1]H12 [16]O6 [81]Br         260.979681      0.455   260.980230
  2          0  C6H12O6    0.33 m          0  M+Br                    1  Br                        1      -1  C6H12O6Br           2  [12]C5 [13]C [1]H12 [16]O6 [79]Br   259.985082      0.030   259.985631
  3          0  C6H12O6    0.33 m          0  M+Br                    1  Br                        1      -1  C6H12O6Br           3  [12]C5 [13]C [1]H12 [16]O6 [81]Br   261.983036      0.030   261.983585
  4          0  C6H12O6    0.33 m          0  M+Br                    1  Br                        1      -1  C6H12O6Br           4  [12]C6 [1]H12 [16]O5 [18]O [79]Br   260.985981      0.006   260.986530


found 0 gaps > rt_tol in rt values

process 1 out of 1
    process 16 peaks in rt range 0.0..21.0
    build up lookup table
    look for matches
    found matches

mf       rt        adduct_name  isotope_decomposition              adduct_isotopes      m0           abundance  adduct_cluster_id
str      RtType    str          str                                str                  MzType       float      int              
-------  --------  -----------  ---------------------------------  -------------------  -----------  ---------  -----------------
C6H12O6    0.33 m  2M+H         [12]C12 [1]H25 [16]O12             +[1]H                 361.134606      0.851                  0
C6H12O6    0.33 m  2M+H         [12]C10 [13]C2 [1]H25 [16]O12      +[1]H                 363.141316      0.007                  1
C6H12O6    0.33 m  M+ACN+H      [12]C8 [1]H16 [14]N [16]O6         +[12]C2 [1]H4 [14]N   222.097765      0.899                  0
C6H12O6    0.33 m  M+ACN+H      [12]C7 [13]C [1]H16 [14]N [16]O6   +[12]C2 [1]H4 [14]N   223.101120      0.078                  1
C6H12O6    0.33 m  M+ACN+H      [12]C8 [1]H16 [14]N [16]O5 [18]O   +[12]C2 [1]H4 [14]N   224.102019      0.011                  2
C6H12O6    0.33 m  M+Br         [12]C6 [1]H12 [16]O6 [79]Br        +[79]Br               258.981727      0.468                  0
C6H12O6    0.33 m  M+Br         [12]C6 [1]H12 [16]O6 [81]Br        +[81]Br               260.979681      0.455                  0
C6H12O6    0.33 m  M+Br         [12]C5 [13]C [1]H12 [16]O6 [79]Br  +[79]Br               259.985082      0.030                  1
C6H12O6    0.33 m  M+Br         [12]C5 [13]C [1]H12 [16]O6 [81]Br  +[81]Br               261.983036      0.030                  1
C6H12O6    0.33 m  M+Br         [12]C6 [1]H12 [16]O5 [18]O [79]Br  +[79]Br               260.985981      0.006                  2
C6H12O6    0.33 m  M+Br         [12]C6 [1]H12 [16]O5 [18]O [81]Br  +[81]Br               262.983935      0.006                  2
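The m0 and mz columns in the solution-space preview differ by about 0.000549, which is the mass of one electron. The sketch below checks that reading against the five preview rows. It is inferred from the printed numbers, not taken from the emzed source. The function ion_mz is a hypothetical helper, and because the preview only shows the negative M+Br ion, the positive-ion branch of the formula is an assumption based on the usual convention.

```python
ELECTRON_MASS = 0.00054858  # Da

# (m0, z, sign_z, mz) copied from the solution-space preview above.
PREVIEW_ROWS = [
    (258.981727, 1, -1, 258.982276),
    (260.979681, 1, -1, 260.980230),
    (259.985082, 1, -1, 259.985631),
    (261.983036, 1, -1, 261.983585),
    (260.985981, 1, -1, 260.986530),
]


def ion_mz(m0, z, sign_z):
    """m/z for an ion whose formula mass is m0 (hypothetical helper).

    A negative ion carries z extra electrons and a positive ion lacks
    z electrons, so the electron mass is added or subtracted before
    dividing by the charge magnitude z.
    """
    return (m0 - sign_z * z * ELECTRON_MASS) / z


# Largest deviation between the sketch and the printed mz column.
max_error = max(
    abs(ion_mz(m0, z, sign_z) - mz) for m0, z, sign_z, mz in PREVIEW_ROWS
)
print(f"largest deviation from the preview: {max_error:.1e}")
```

The deviation stays within the rounding of the six printed decimals, so for these rows mz is m0 plus one electron mass.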