Targeted Analysis
This page highlights the most frequently used API functions and is not exhaustive.
It covers candidate generation with solution_space and adduct annotation
with annotate_adducts.
For complete coverage, see the API Reference.
solution_space(targets, adducts, explained_abundance=0.999, resolution=None)
Expand target formulas into adduct and isotope hypotheses.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| targets | | table containing at least | required |
| adducts | | adduct-definition table, typically from | required |
| explained_abundance | | cumulative isotope abundance to explain. | 0.999 |
| resolution | | optional resolving power used for measured centroids. | None |
Returns:
| Type | Description |
|---|---|
| | table describing target/adduct/isotope combinations and expected m/z. |
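
One way to read explained_abundance is as a cumulative cutoff: isotopologues are taken in order of decreasing abundance until their summed abundance reaches the requested share. The sketch below illustrates that reading in plain Python. It is an assumption about the semantics, not emzed's implementation; the helper `select_isotopologues` and the abundance values are hypothetical.

```python
def select_isotopologues(abundances, explained_abundance):
    """Keep the most abundant isotopologues until the cumulative share is reached."""
    ranked = sorted(abundances.items(), key=lambda item: item[1], reverse=True)
    selected, total = [], 0.0
    for label, abundance in ranked:
        if total >= explained_abundance:
            break  # requested share already explained
        selected.append(label)
        total += abundance
    return selected, total


# Illustrative abundances only (assumption), not values computed by emzed.
pattern = {"M": 0.90, "M+1": 0.08, "M+2": 0.015, "M+3": 0.005}
selected, total = select_isotopologues(pattern, 0.99)
print(selected, total)
```

Under this reading, a higher explained_abundance yields more isotope rows per target/adduct combination, and a lower value keeps only the dominant peaks.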
Example:
import emzed
targets = emzed.Table.create_table(
    ["id", "mf", "rt"],
    [int, str, emzed.RtType],
    rows=[[0, "C6H12O6", 20.0]],
)
adducts = emzed.Table.stack_tables(
    [
        emzed.adducts.M_plus_Br,
        emzed.adducts.Two_M_plus_H,
        emzed.adducts.M_plus_ACN_plus_H,
    ]
)
candidates = emzed.targeted.solution_space(targets, adducts, 0.99)
print(candidates.sort_by("abundance", ascending=False)[:5])
id target_id mf rt adduct_id adduct_name m_multiplier adduct_add adduct_sub z sign_z full_mf isotope_id isotope_decomposition m0 abundance mz
int int str RtType int str int str str int int str int str MzType float MzType
--- --------- ------- -------- --------- ----------- ------------ ---------- ---------- --- ------ --------- ---------- ---------------------------- ----------- --------- -----------
6 0 C6H12O6 0.33 m 26 M+ACN+H 1 C2H3NH 1 1 C8H16NO6 0 [12]C8 [1]H16 [14]N [16]O6 222.097765 0.899 222.097216
11 0 C6H12O6 0.33 m 36 2M+H 2 H 1 1 C12H25O12 0 [12]C12 [1]H25 [16]O12 361.134606 0.851 361.134057
0 0 C6H12O6 0.33 m 12 M+Br 1 Br 1 -1 C6H12O6Br 0 [12]C6 [1]H12 [16]O6 [79]Br 258.981727 0.468 258.982276
1 0 C6H12O6 0.33 m 12 M+Br 1 Br 1 -1 C6H12O6Br 1 [12]C6 [1]H12 [16]O6 [81]Br 260.979681 0.455 260.980230
12 0 C6H12O6 0.33 m 36 2M+H 2 H 1 1 C12H25O12 1 [12]C11 [13]C [1]H25 [16]O12 362.137961 0.110 362.137412
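
The m0 and mz columns in the output above differ by the mass of the electrons removed or added on ionization. The following sketch reproduces that relationship for two of the rows. It assumes the usual convention that m0 is the mass of the neutral-atom formula and that z and sign_z give the charge count and sign; the helper `expected_mz` is hypothetical and not part of emzed.

```python
ELECTRON_MASS = 0.000548579909  # electron rest mass in unified atomic mass units


def expected_mz(m0, z, sign_z):
    """m/z of an ion whose neutral-atom formula has mass m0.

    A positive ion (sign_z = +1) has lost z electrons, a negative ion
    (sign_z = -1) has gained z electrons.
    """
    return (m0 - sign_z * z * ELECTRON_MASS) / z


# m0, z and sign_z copied from the M+ACN+H and M+Br rows shown above.
mz_acn = expected_mz(222.097765, z=1, sign_z=1)
mz_br = expected_mz(258.981727, z=1, sign_z=-1)
print(round(mz_acn, 6), round(mz_br, 6))
```

For these two rows the result agrees with the listed mz values to within rounding.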
annotate_adducts(peaks, adducts, mz_tol, rt_tol, explained_abundance=0.2)
Annotate peaks with adduct hypotheses that are mutually consistent.
The algorithm generates adduct and adduct-isotope hypotheses for each input peak, converts each hypothesis into an inferred neutral mass, and then links hypotheses that agree in both retention time and inferred neutral mass. Connected components of that hypothesis graph are reported as adduct clusters.
Input rows must provide mz and rt. Other columns are preserved.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| peaks | | input peak table containing at least mz and rt | required |
| adducts | | adduct-definition table, typically from | required |
| mz_tol | | tolerance used when comparing inferred neutral masses ( | required |
| rt_tol | | tolerance used when comparing retention times and for partitioning the peak table into RT windows. | required |
| explained_abundance | | cumulative theoretical isotope abundance to include when generating centroids for the adduct addition/subtraction formulas. For example, | 0.2 |
Returns:
| Type | Description |
|---|---|
| | table with the original peak rows plus annotation columns: |
Notes:
- Peaks are compared in inferred neutral-mass space, not by direct m/z matching alone.
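
The linking step described for annotate_adducts can be sketched in plain Python: build one hypothesis per peak and adduct, convert it to an inferred neutral mass, link hypotheses from different peaks that agree in retention time and neutral mass, and report connected components. This is a simplified illustration, not the library's code. The adduct definitions, the helpers `neutral_mass` and `cluster_hypotheses`, and the use of an absolute mass tolerance are all assumptions; isotope hypotheses and RT-window partitioning are left out.

```python
from itertools import combinations

ELECTRON_MASS = 0.000548579909  # in unified atomic mass units

# Hypothetical, simplified adduct definitions (not the emzed.adducts tables):
# name -> (m_multiplier, mass added to the neutral molecule(s), z, sign_z)
ADDUCTS = {
    "M+H": (1, 1.00782503, 1, +1),
    "M+Na": (1, 22.98976928, 1, +1),
}


def neutral_mass(mz, m_multiplier, added_mass, z, sign_z):
    """Invert an adduct hypothesis: ion m/z -> mass of the neutral molecule M."""
    ion_mass = mz * z + sign_z * z * ELECTRON_MASS  # undo the electron correction
    return (ion_mass - added_mass) / m_multiplier


def cluster_hypotheses(peaks, mz_tol, rt_tol):
    # One hypothesis per (peak, adduct) pair.
    hyps = [
        (peak_id, name, rt, neutral_mass(mz, *definition))
        for peak_id, (mz, rt) in enumerate(peaks)
        for name, definition in ADDUCTS.items()
    ]
    parent = list(range(len(hyps)))  # union-find forest over hypotheses

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link hypotheses of different peaks that agree in rt and neutral mass.
    for i, j in combinations(range(len(hyps)), 2):
        (pi, _, rti, mi), (pj, _, rtj, mj) = hyps[i], hyps[j]
        if pi != pj and abs(rti - rtj) <= rt_tol and abs(mi - mj) <= mz_tol:
            parent[find(i)] = find(j)

    # Connected components with more than one member are adduct clusters.
    clusters = {}
    for i, (peak_id, name, _, _) in enumerate(hyps):
        clusters.setdefault(find(i), []).append((peak_id, name))
    return [sorted(members) for members in clusters.values() if len(members) > 1]


# (mz, rt in seconds): two co-eluting ions of one compound, one unrelated
# peak, and one peak with a matching m/z but a distant retention time.
peaks = [(181.070665, 20.0), (203.052609, 20.5), (250.0, 20.2), (181.070665, 60.0)]
clusters = cluster_hypotheses(peaks, mz_tol=0.001, rt_tol=5.0)
print(clusters)
```

In this toy input the peaks at 20.0 s and 20.5 s end up in one cluster as M+H and M+Na of the same neutral mass. The peak at 60 s yields the same neutral mass but stays unlinked because its retention time differs by more than rt_tol.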
Example:
import emzed
targets = emzed.Table.create_table(
    ["id", "mf", "rt"],
    [int, str, emzed.RtType],
    rows=[[0, "C6H12O6", 20.0]],
)
adducts = emzed.Table.stack_tables(
    [
        emzed.adducts.M_plus_Br,
        emzed.adducts.Two_M_plus_H,
        emzed.adducts.M_plus_ACN_plus_H,
    ]
)
candidates = emzed.targeted.solution_space(targets, adducts, 0.99)
candidates.rename_columns(adduct_name="original_adduct_name")
annotated = emzed.annotate.annotate_adducts(
    candidates,
    adducts,
    mz_tol=2e-5,
    rt_tol=5.0,
    explained_abundance=0.95,
).sort_by("adduct_name")
print(
    annotated.extract_columns(
        "mf",
        "rt",
        "adduct_name",
        "isotope_decomposition",
        "adduct_isotopes",
        "m0",
        "abundance",
        "adduct_cluster_id",
    )[:5]
)
found 0 gaps > rt_tol in rt values
process 1 out of 1
process 16 peaks in rt range 0.0..21.0
build up lookup table
look for matches
found matches
mf rt adduct_name isotope_decomposition adduct_isotopes m0 abundance adduct_cluster_id
str RtType str str str MzType float int
------- -------- ----------- -------------------------------- ------------------- ----------- --------- -----------------
C6H12O6 0.33 m 2M+H [12]C12 [1]H25 [16]O12 +[1]H 361.134606 0.851 0
C6H12O6 0.33 m 2M+H [12]C10 [13]C2 [1]H25 [16]O12 +[1]H 363.141316 0.007 1
C6H12O6 0.33 m M+ACN+H [12]C8 [1]H16 [14]N [16]O6 +[12]C2 [1]H4 [14]N 222.097765 0.899 0
C6H12O6 0.33 m M+ACN+H [12]C7 [13]C [1]H16 [14]N [16]O6 +[12]C2 [1]H4 [14]N 223.101120 0.078 1
C6H12O6 0.33 m M+ACN+H [12]C8 [1]H16 [14]N [16]O5 [18]O +[12]C2 [1]H4 [14]N 224.102019 0.011 2
Candidate Chromatogram Extraction
Use the most abundant solution_space candidates to build extraction windows, then
extract chromatograms from a peak map.
from pathlib import Path
import emzed
targets = emzed.Table.create_table(
    ["id", "mf", "rt"],
    [int, str, emzed.RtType],
    rows=[[0, "C6H12O6", 20.0]],
)
adducts = emzed.Table.stack_tables(
    [
        emzed.adducts.M_plus_Br,
        emzed.adducts.Two_M_plus_H,
        emzed.adducts.M_plus_ACN_plus_H,
    ]
)
candidates = emzed.targeted.solution_space(targets, adducts, 0.99)
pm = emzed.io.load_peak_map(Path("tests/data/test_smaller.mzXML"))
# Keep top hypotheses and create the columns required by extract_chromatograms:
# mzmin, mzmax, rtmin, rtmax, peakmap
peaks = candidates.sort_by("abundance", ascending=False)[:5].consolidate()
peaks.add_column("mzmin", peaks.mz - 0.01, emzed.MzType)
peaks.add_column("mzmax", peaks.mz + 0.01, emzed.MzType)
peaks.add_column("rtmin", peaks.rt - 5.0, emzed.RtType)
peaks.add_column("rtmax", peaks.rt + 5.0, emzed.RtType)
peaks.add_column_with_constant_value("peakmap", pm, emzed.PeakMap)
chrom = emzed.extract_chromatograms(peaks, ms_level=1)
chrom.set_col_format("peakmap", None)
chrom.set_col_format("chromatogram", None)
print(
    chrom.extract_columns(
        "adduct_name",
        "mz",
        "rt",
        "rtmin_chromatogram",
        "rtmax_chromatogram",
    )[:5]
)
needed 0.0 seconds
adduct_name mz rt rtmin_chromatogram rtmax_chromatogram
str MzType RtType RtType RtType
----------- ----------- -------- ------------------ ------------------
M+ACN+H 222.097216 0.33 m 0.25 m 0.42 m
2M+H 361.134057 0.33 m 0.25 m 0.42 m
M+Br 258.982276 0.33 m 0.25 m 0.42 m
M+Br 260.980230 0.33 m 0.25 m 0.42 m
2M+H 362.137412 0.33 m 0.25 m 0.42 m
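
The example above uses a fixed window of ±0.01 in m/z and ±5.0 s in retention time. A window that scales with m/z is a common alternative. The sketch below computes such a window in plain Python; the helper `extraction_window` and its ppm and rt_half_width defaults are hypothetical choices for illustration, not emzed functions or recommended settings.

```python
def extraction_window(mz, rt, ppm=10.0, rt_half_width=5.0):
    """Return window bounds around an expected m/z and retention time (seconds)."""
    delta = mz * ppm * 1e-6  # relative tolerance converted to an absolute m/z offset
    return {
        "mzmin": mz - delta,
        "mzmax": mz + delta,
        "rtmin": rt - rt_half_width,
        "rtmax": rt + rt_half_width,
    }


# Expected m/z of the M+ACN+H candidate and the target rt of 20.0 s from above.
window = extraction_window(222.097216, 20.0)
print(window)
```

The resulting bounds could be written to the mzmin, mzmax, rtmin and rtmax columns in the same way as the fixed offsets in the example. The rt bounds of 15.0 s and 25.0 s correspond to the 0.25 m and 0.42 m shown in the output table.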