
emzed.annotate

Annotation helpers for enriching peak tables with derived metadata.

annotate_adducts(peaks, adducts, mz_tol, rt_tol, explained_abundance=0.2)

Annotate peaks with adduct hypotheses that are mutually consistent.

The algorithm generates adduct and adduct-isotope hypotheses for each input peak, converts each hypothesis into an inferred neutral mass, and then links hypotheses that agree in both retention time and inferred neutral mass. Connected components of that hypothesis graph are reported as adduct clusters.
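The linking step described above can be illustrated with a minimal sketch. It is not the emzed implementation: the adduct list, the mass shifts (approximate), the tuple layout, and the helper names `neutral_mass` and `cluster_hypotheses` are all assumptions made for illustration, and isotope hypotheses are left out for brevity.

```python
from itertools import combinations

# Hypothetical adduct definitions: (name, mass shift in Da, charge).
# The shifts are approximate and only for illustration.
ADDUCTS = [("M+H", 1.007276, 1), ("M+Na", 22.989218, 1)]


def neutral_mass(mz, shift, charge):
    """Infer the neutral mass implied by one adduct hypothesis."""
    return mz * abs(charge) - shift


def cluster_hypotheses(peaks, adducts, mz_tol, rt_tol):
    """peaks: list of (peak_id, mz, rt). Returns a list of clusters."""
    # One hypothesis per (peak, adduct) pair: (peak_id, name, m0, rt).
    hyps = [
        (pid, name, neutral_mass(mz, shift, z), rt)
        for pid, mz, rt in peaks
        for name, shift, z in adducts
    ]
    parent = list(range(len(hyps)))  # union-find forest over hypotheses

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link hypotheses that agree in inferred neutral mass and retention time.
    for i, j in combinations(range(len(hyps)), 2):
        a, b = hyps[i], hyps[j]
        if abs(a[2] - b[2]) <= mz_tol and abs(a[3] - b[3]) <= rt_tol:
            parent[find(i)] = find(j)

    # Collect connected components; drop those supported by a single peak.
    clusters = {}
    for i, h in enumerate(hyps):
        clusters.setdefault(find(i), []).append(h)
    return [c for c in clusters.values() if len({h[0] for h in c}) >= 2]


peaks = [(0, 181.070664, 60.0), (1, 203.052606, 60.5), (2, 500.0, 300.0)]
clusters = cluster_hypotheses(peaks, ADDUCTS, mz_tol=0.005, rt_tol=2.0)
```

In this made-up example, peak 0 read as M+H and peak 1 read as M+Na imply the same neutral mass at similar retention times, so they form one cluster; peak 2 has no compatible partner and is not clustered.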

The input table must contain the columns mz and rt. All other columns are preserved.

Parameters:

peaks (required): input peak table containing at least mz and rt.

adducts (required): adduct-definition table, typically from emzed.adducts.

mz_tol (required): tolerance used when comparing inferred neutral masses (adduct_m0) between hypotheses.

rt_tol (required): tolerance used when comparing retention times and for partitioning the peak table into RT windows.

explained_abundance (default 0.2): cumulative theoretical isotope abundance to include when generating centroids for the adduct addition/subtraction formulas. For example, 0.99 includes isotope centroids until 99% of the theoretical abundance is covered.
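The effect of explained_abundance can be sketched as a cumulative cutoff over a theoretical isotope pattern. This is an illustration only: the helper `select_centroids`, the ordering by descending abundance, and the example pattern are assumptions, not taken from emzed.

```python
def select_centroids(centroids, explained_abundance):
    """Keep the most abundant centroids until the threshold is covered.

    centroids: list of (mass, relative_abundance), abundances summing to ~1.
    """
    selected = []
    covered = 0.0
    # Assumption: centroids are taken from most to least abundant.
    for mass, abundance in sorted(centroids, key=lambda c: c[1], reverse=True):
        if covered >= explained_abundance:
            break
        selected.append((mass, abundance))
        covered += abundance
    return selected


# Made-up three-centroid pattern, not a real isotope distribution.
pattern = [(1.0, 0.90), (2.0, 0.08), (3.0, 0.02)]
low = select_centroids(pattern, 0.2)    # only the dominant centroid
high = select_centroids(pattern, 0.99)  # all three centroids
```

Under these assumptions, a low threshold such as the default 0.2 keeps only the dominant centroid, while 0.99 also generates hypotheses for the minor isotope centroids.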

Returns:

Table with the original peak rows plus the annotation columns adduct_name, adduct_isotopes, adduct_isotopes_abundance, adduct_m0, and adduct_cluster_id.

Notes:

- Peaks are compared in inferred neutral-mass space, not by direct m/z matching alone.
- adduct_cluster_id identifies a connected cluster of compatible hypotheses, not a unique best assignment.
- A single input peak can appear in multiple output rows if several adduct hypotheses remain compatible.
- Clusters supported only by multiple hypotheses of the same original peak are discarded.
- Peaks are pre-partitioned into RT windows separated by gaps larger than rt_tol; only peaks within the same window are compared.
- explained_abundance is not derived from observed peak intensities; it only controls hypothesis generation from theoretical isotope patterns.
- Rows with missing mz or rt are preserved, but their annotation columns are set to None.
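The RT pre-partitioning mentioned in the notes can be pictured with a short sketch. The helper `partition_by_rt` and the example values are hypothetical; the real function works on peak tables rather than plain lists of retention times.

```python
def partition_by_rt(rts, rt_tol):
    """Split retention times into windows separated by gaps > rt_tol."""
    windows = []
    for rt in sorted(rts):
        # Extend the current window while the gap to its last RT is small.
        if windows and rt - windows[-1][-1] <= rt_tol:
            windows[-1].append(rt)
        else:
            windows.append([rt])
    return windows


rts = [10.0, 10.5, 11.2, 20.0, 20.3, 35.0]
windows = partition_by_rt(rts, rt_tol=1.0)
```

Note that a window can span more than rt_tol when consecutive peaks chain together (10.0 and 11.2 above), which is why hypotheses inside a window are still compared pairwise against rt_tol.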