API Introduction

This page is a curated entry point into the emzed API.

Use the API Reference for complete autodoc of every subpackage.

Start Here

These are the most central entry points for day-to-day work with emzed.

Load And Save Data

The test suite leans heavily on the convenience functions in emzed.io, especially for .table and mass spectrometry files.

load_table

Load table in emzed format.

Parameters:

Name Type Description Default
path

Path to the file to load. If not specified and if emzed-gui is installed a user dialog will pop up and ask for the file.

None

Returns:

Type Description

emzed.Table

save_table

Save table in emzed format.

Parameters:

Name Type Description Default
table

Instance of emzed.Table.

required
path

Target file location. If not specified and if emzed-gui is installed a user dialog will pop up asking for the destination.

None
overwrite

Enforce overwriting if file already exists.

False

load_peak_map

Load peak-map.

Parameters:

Name Type Description Default
path

Path to the file to load. If not specified and if emzed-gui is installed a user dialog will pop up and ask for the file.

None

Returns:

Type Description

emzed.PeakMap

load_csv

Load CSV file.

Parameters:

Name Type Description Default
path

Path to the file to load. If not specified and if emzed-gui is installed a user dialog will pop up and ask for the file.

None
delimiter

CSV field delimiter

';'
dash_is_none

If set to True, the value "-" is interpreted as None (missing value).

True

Returns:

Type Description

emzed.Table
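
The effect of the delimiter and dash_is_none defaults can be illustrated with the standard library alone. This is a minimal sketch, not emzed's implementation: parse_csv_sketch is a hypothetical helper that returns plain row dicts instead of an emzed.Table and performs no type inference.

```python
import csv
import io


def parse_csv_sketch(text, delimiter=";", dash_is_none=True):
    """Parse CSV text into row dicts (hypothetical helper, not emzed.io.load_csv)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for record in reader:
        if dash_is_none:
            # Treat a lone "-" as a missing value, mirroring the documented default.
            record = {key: (None if value == "-" else value) for key, value in record.items()}
        rows.append(dict(record))
    return rows


rows = parse_csv_sketch("id;mf\n1;C6H12O6\n2;-\n")
```

With dash_is_none=False the second row would keep the literal string "-" in its mf field.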

load_excel

Load Excel file.

Parameters:

Name Type Description Default
path

Path to the file to load. If not specified and if emzed-gui is installed a user dialog will pop up and ask for the file.

None

Returns:

Type Description

emzed.Table

Chromatograms And Peak Data

These APIs show up repeatedly in integration tests and feature extraction scenarios.

run_feature_finder_metabo

RunFeatureFinderMetabo

__call__(peak_map, ms_level=None, verbose=True, run_feature_grouper=True, split_by_precursors_mz_tol=0.0, **parameters)

Runs the OpenMS feature finder on a peak map.

Parameters:

Name Type Description Default
peak_map

emzed PeakMap object

required
ms_level

MS level to pick peaks from. By default peaks are picked from MS1.

None
verbose

Set to False to suppress output.

True
run_feature_grouper

Also run the feature grouper from OpenMS.

True
split_by_precursors_mz_tol

MS2 peak maps are first split by precursor. This is the tolerance used for PeakMap.split_by_precursors. Set to None to disable splitting.

0.0

PeakMap

Spectrum

extract_chromatograms

extract_chromatograms(peak_table, ms_level=None, post_fixes=None, path=None, overwrite=False)

Extract chromatograms from table with peak map and peak limits.

Parameters:

Name Type Description Default
peak_table

Table with columns rtmin*, rtmax*, mzmin*, mzmax* and peakmap* for given post_fixes.

required
ms_level

optional MS level to consider.

None
post_fixes

optional post_fixes to consider.

None
path

optional path for out-of-memory table.

None
overwrite

allow overwriting existing out-of-memory table.

False

Returns:

Type Description

new table with chromatograms and chromatogram boundaries.
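
To make the role of the rtmin, rtmax, mzmin and mzmax columns concrete, the sketch below extracts one chromatogram from a list of spectra. It rests on assumptions: the spectra are plain (rt, peaks) tuples standing in for a peak map, intensities inside the m/z window are summed per spectrum, and extract_chromatogram_sketch is a hypothetical name. It does not claim to reproduce emzed's extraction.

```python
def extract_chromatogram_sketch(spectra, rtmin, rtmax, mzmin, mzmax):
    """Return (rts, intensities) for one peak window.

    spectra: iterable of (rt, [(mz, intensity), ...]) tuples, a stand-in for a peak map.
    """
    rts, intensities = [], []
    for rt, peaks in spectra:
        if not (rtmin <= rt <= rtmax):
            continue  # spectrum lies outside the retention time limits
        # Assumption: intensities inside the m/z window are summed per spectrum.
        total = sum(intensity for mz, intensity in peaks if mzmin <= mz <= mzmax)
        rts.append(rt)
        intensities.append(total)
    return rts, intensities


spectra = [
    (10.0, [(100.00, 5.0), (200.00, 1.0)]),
    (11.0, [(100.01, 7.0), (100.02, 2.0)]),
    (30.0, [(100.00, 9.0)]),
]
rts, intensities = extract_chromatogram_sketch(spectra, 9.0, 12.0, 99.9, 100.1)
```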

extract_ms_chromatograms

Table

create_table(col_names, col_types, col_formats=None, rows=None, title=None, meta_data=None, path=None) staticmethod

creates a table.

Parameters:

Name Type Description Default
col_names

list or tuple of strings.

required
col_types

list of types.

required
col_formats

list of formats using format specifiers like "%.2f". If not specified, emzed tries to guess appropriate formats based on column type and column name.

None
rows

list of lists.

None
title

table title as string.

None
meta_data

dictionary to manage user defined meta data.

None
path

path for the db backend, default is None to use the in-memory db backend.

None

Returns:

Type Description

emzed.Table.

load(path) classmethod

loads table from disk into memory.

Parameters:

Name Type Description Default
path

path to file.

required

Returns:

Type Description

emzed.Table.

open(path) classmethod

opens table on disk without loading data into memory.

Parameters:

Name Type Description Default
path

path to file.

required

Returns:

Type Description

emzed.Table.

save(path, *, overwrite=False)

save table to a file.

Parameters:

Name Type Description Default
path

path describing target location.

required
overwrite

If set to True an existing file will be overwritten, else an exception will be thrown.

False

filter(condition)

creates a new table by filtering rows fulfilling the given condition. Usage is similar to pandas query.

Parameters:

Name Type Description Default
condition

expression like t.a < 0 or t.a <= t.b.

required

Returns:

Type Description

emzed.Table with filtered rows.
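
A condition such as t.a < 0 can only be passed to filter if the comparison produces a deferred expression rather than an immediate bool. The sketch below shows one common way to build such expressions with operator overloading. Expression and TableSketch are hypothetical classes written for illustration and are not emzed's internals.

```python
import operator


class Expression:
    """A deferred computation that is evaluated once per row."""

    def __init__(self, evaluate):
        self.evaluate = evaluate  # callable: row dict -> value

    def _compare(self, other, op):
        # Wrap plain values so both sides can be evaluated against a row.
        other_eval = other.evaluate if isinstance(other, Expression) else (lambda row: other)
        return Expression(lambda row: op(self.evaluate(row), other_eval(row)))

    def __lt__(self, other):
        return self._compare(other, operator.lt)

    def __le__(self, other):
        return self._compare(other, operator.le)


class TableSketch:
    def __init__(self, rows):
        self.rows = rows  # list of dicts, one per row

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        # Column access such as t.a yields an expression, not data.
        return Expression(lambda row: row[name])

    def filter(self, condition):
        return TableSketch([row for row in self.rows if condition.evaluate(row)])


t = TableSketch([{"a": -1, "b": 0}, {"a": 2, "b": 1}, {"a": 0, "b": 5}])
negative = t.filter(t.a < 0)
ordered = t.filter(t.a <= t.b)
```

In this sketch filter returns a new object and leaves the original rows untouched, matching the documented behaviour of creating a new table.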

extract_columns(*col_names)

returns new Table with selected columns col_names.

Parameters:

Name Type Description Default
col_names

list or tuple with selected, existing column names.

()

add_column(name, what, type_, format_=not_specified, insert_before=None, insert_after=None)

adds a new column with name in place.

Parameters:

Name Type Description Default
name

the name of the new column.

required
what

either a list with the same length as table or an expression.

required
type_

supported column types are int, float, bool, MzType, RtType, str, PeakMap, Table, object. In case you want to use Python objects like lists or dicts, use column type 'object' instead.

required
format_

is a format string such as "%d" or an executable string with Python code. To suppress visibility set format_ = None. By default (not_specified) the method tries to determine a default format for the type.

not_specified
insert_before

to add column name at a defined position, one can specify its position left-wise to column insert_before via the name of an existing column, or an integer index (negative values allowed).

None
insert_after

to add column name at a defined position, one can specify its position right-wise to column insert_after.

None

stack_tables(tables, path=None, overwrite=False) staticmethod

builds a single Table from list or tuple of Tables.

Parameters:

Name Type Description Default
tables

list or tuple of Tables. All tables must have the same column names with the same types and formats.

required
path

If specified the result will be a Table with a db file backend, else the result will be managed in memory.

None
overwrite

Indicate if an already existing database file should be overwritten.

False

Returns:

Type Description

emzed.Table.
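
The requirement that all stacked tables share column names, types and formats can be pictured as a schema check followed by row concatenation. The following sketch uses plain dicts as stand-ins for tables. The dict layout, the stack_tables_sketch name and the choice of ValueError are assumptions for illustration only.

```python
def stack_tables_sketch(tables):
    """Concatenate tables given as dicts with col_names, col_types, col_formats and rows."""
    if not tables:
        raise ValueError("need at least one table")
    first = tables[0]
    schema_keys = ("col_names", "col_types", "col_formats")
    for index, table in enumerate(tables[1:], start=1):
        for key in schema_keys:
            if table[key] != first[key]:
                raise ValueError(f"table {index} differs from table 0 in {key}")
    # Copy the shared schema and append all rows in input order.
    stacked = {key: list(first[key]) for key in schema_keys}
    stacked["rows"] = [row for table in tables for row in table["rows"]]
    return stacked


t1 = {"col_names": ["id"], "col_types": [int], "col_formats": ["%d"], "rows": [[1], [2]]}
t2 = {"col_names": ["id"], "col_types": [int], "col_formats": ["%d"], "rows": [[3]]}
stacked = stack_tables_sketch([t1, t2])
```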

to_table

generates a one-column Table from an iterable, e.g. from a list.

Parameters:

Name Type Description Default
name

name of the column.

required
values

iterable with column values.

required
type_

supported column types are int, float, bool, MzType, RtType, str, PeakMap, Table, object. In case you want to use Python objects like lists or dicts, use column type 'object' instead.

required
format_

is a format string such as "%d". To suppress visibility set format_ = None. By default (not_specified) the method tries to determine a default format for the type.

None
title

Table title as string.

None
meta_data

Python dictionary to assign meta data to the table.

None
path

Path for the db backend, use None for an in memory db backend.

None

Returns:

Type Description

emzed.Table

MzType

Bases: float

Represents Mass-to-Charge ratio (m/z). Inherits from float and provides high-precision formatting (6 decimal places) in tables.

RtType

Bases: float

Represents Retention Time in seconds. Inherits from float and provides specialized formatting in tables (e.g., '12.34 m').
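
The idea behind MzType and RtType, values that behave as ordinary floats but carry their own display rule, can be sketched with two small float subclasses. MzSketch, RtSketch and the render method are hypothetical names. The conversion from seconds to minutes is an assumption suggested by the "m" suffix in the example above.

```python
class MzSketch(float):
    """m/z value: still a float, but rendered with six decimal places."""

    def render(self):
        return f"{float(self):.6f}"


class RtSketch(float):
    """Retention time stored in seconds."""

    def render(self):
        # Assumption: the display converts seconds to minutes, as the "m" suffix suggests.
        return f"{float(self) / 60.0:.2f} m"


mz = MzSketch(181.0706643)
rt = RtSketch(740.4)
mz_text = mz.render()
rt_text = rt.render()
shifted = mz + 1.0  # arithmetic falls back to a plain float
```

Because both classes inherit from float, comparisons and arithmetic work unchanged. Only the rendering differs.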

Chemistry And Adduct Workflows

Exact masses (mass), isotope abundances (abundance), molecular formula parsing, and adduct tables are the most visible chemistry-facing APIs in the tests.

mf

Represent a molecular formula as both string and element-count mapping.

as_dict()

Return the molecular formula as a plain dict mapping atoms to counts.

as_string()

Return the normalized molecular-formula string or None if invalid.

mass(**specialisations)

Calculate the exact mass of the formula.

Parameters:

Name Type Description Default
specialisations

optional isotope overrides such as C=12.0 or C=mass.C12 for unresolved elements.

{}

Returns:

Type Description

exact mass as float or None if an element/isotope is unknown.
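
The relation between the element-count mapping and the exact mass can be sketched in a few lines. This is not emzed's parser: it handles only flat formulas without brackets or charges, the mass table is a small hand-typed excerpt with approximate monoisotopic masses, and treating a keyword argument as a replacement mass for that element is an assumption about how specialisations behave.

```python
import re

# Approximate monoisotopic masses in u; a hypothetical excerpt, not emzed's element table.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def formula_to_dict(formula):
    """Map element symbols to counts for a flat formula such as C6H12O6."""
    counts = {}
    position = 0
    for match in _TOKEN.finditer(formula):
        if match.start() != position:
            raise ValueError(f"cannot parse {formula!r}")
        symbol, digits = match.groups()
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
        position = match.end()
    if position != len(formula):
        raise ValueError(f"cannot parse {formula!r}")
    return counts


def exact_mass(formula, **specialisations):
    """Sum element masses; keyword arguments override the default mass of an element."""
    total = 0.0
    for symbol, count in formula_to_dict(formula).items():
        mass = specialisations.get(symbol, MONOISOTOPIC.get(symbol))
        if mass is None:
            return None  # unknown element, mirroring the documented None result
        total += count * mass
    return total


glucose = exact_mass("C6H12O6")
labelled_methane = exact_mass("CH4", C=13.0033548378)  # 13C instead of 12C
```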

mass

Convenience access to exact masses and common particle masses.

The module exposes a small set of particle masses directly (e, p, n) and lazily forwards element and isotope masses via __getattr__ so that expressions such as emzed.mass.C12 work without preloading the full elements table.
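
Lazy forwarding through a module-level __getattr__ (PEP 562) can be demonstrated without emzed. In the sketch below, mass_sketch is a hypothetical module object built at runtime, the isotope table is a tiny hand-typed excerpt with approximate values, and the loads list exists only to show when a lookup is actually triggered.

```python
import types

# Hypothetical excerpt of isotope masses in u (approximate values).
_ISOTOPE_MASSES = {"C12": 12.0, "H1": 1.00782503207, "O16": 15.99491461956}

loads = []  # records which names were resolved lazily


def _lazy_getattr(name):
    # Called only when normal attribute lookup on the module fails.
    try:
        value = _ISOTOPE_MASSES[name]
    except KeyError:
        raise AttributeError(name) from None
    loads.append(name)
    return value


def _lazy_dir():
    # Advertise lazily resolved names so that autocompletion can list them.
    return sorted(["e", *_ISOTOPE_MASSES])


mass_sketch = types.ModuleType("mass_sketch")
mass_sketch.e = 5.48579909e-4  # electron mass in u (approximate), defined eagerly
mass_sketch.__getattr__ = _lazy_getattr
mass_sketch.__dir__ = _lazy_dir

carbon = mass_sketch.C12  # resolved on first access via __getattr__
```

Accessing mass_sketch.e does not go through _lazy_getattr because the attribute already exists on the module. Only unknown names reach the fallback.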

__dir__()

forward attributes for autocompletion

of(mf, **specialisation)

Calculate the exact mass for a molecular formula.

Parameters:

Name Type Description Default
mf

molecular formula string.

required
specialisation

optional isotope specialisations forwarded to emzed.chemistry.MolecularFormula.mass.

{}

Returns:

Type Description

exact mass as float.

abundance

Lazy access to natural isotope abundances.

The module forwards isotope abundances such as C12 and element abundance maps such as C via __getattr__ without loading the full element table up front.

__dir__()

forward attributes for autocompletion

adducts

Predefined adduct tables and convenience subsets for targeted annotation.

The module exposes:

  • all: every predefined adduct
  • charge-based subsets such as positive and negative
  • single-adduct tables addressable as Python identifiers such as M_plus_H

Targeted Annotation Workflow

These functions form the main higher-level workflow for generating candidate spaces and annotating peaks with adduct hypotheses.

solution_space

Targeted annotation helpers for combining targets with adduct and isotope space.

solution_space(targets, adducts, explained_abundance=0.999, resolution=None)

Expand target formulas into adduct and isotope hypotheses.

Parameters:

Name Type Description Default
targets

table containing at least id and mf columns.

required
adducts

adduct-definition table, typically from emzed.chemistry.adducts.

required
explained_abundance

cumulative isotope abundance to explain.

0.999
resolution

optional resolving power used for measured centroids.

None

Returns:

Type Description

table describing target/adduct/isotope combinations and expected m/z.
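
The explained_abundance parameter can be read as a cumulative cut-off on the theoretical isotope pattern. The sketch below illustrates that reading for carbon only, using a binomial model with a 13C abundance of about 1.07 %. The function names are hypothetical and the selection rule (most abundant first, stop once the threshold is covered) is an assumption about the general idea, not a description of emzed's code.

```python
from math import comb


def carbon_isotopologues(n_carbon, p13=0.0107):
    """Abundance of having k 13C atoms among n_carbon carbons (binomial model)."""
    return [
        (k, comb(n_carbon, k) * p13**k * (1 - p13) ** (n_carbon - k))
        for k in range(n_carbon + 1)
    ]


def select_by_explained_abundance(isotopologues, explained_abundance):
    """Keep the most abundant isotopologues until the cumulative abundance is covered."""
    selected, covered = [], 0.0
    for k, abundance in sorted(isotopologues, key=lambda item: item[1], reverse=True):
        if covered >= explained_abundance:
            break
        selected.append((k, abundance))
        covered += abundance
    return selected


pattern = carbon_isotopologues(6)
wide = select_by_explained_abundance(pattern, 0.999)
narrow = select_by_explained_abundance(pattern, 0.2)
```

For six carbons the monoisotopic species alone covers roughly 94 %, so a threshold of 0.2 keeps one entry while 0.999 also pulls in the species with one and two 13C atoms.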

annotate_adducts

annotate_adducts(peaks, adducts, mz_tol, rt_tol, explained_abundance=0.2)

Annotate peaks with adduct hypotheses that are mutually consistent.

The algorithm generates adduct and adduct-isotope hypotheses for each input peak, converts each hypothesis into an inferred neutral mass, and then links hypotheses that agree in both retention time and inferred neutral mass. Connected components of that hypothesis graph are reported as adduct clusters.

Input rows must provide mz and rt. Other columns are preserved.
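
The hypothesis-graph idea can be sketched with a small union-find. The code below is an illustration under stated assumptions, not emzed's algorithm: it considers only two singly charged adducts with approximate mass shifts, ignores isotopes, links only hypotheses that belong to different peaks, and uses the hypothetical name cluster_hypotheses.

```python
from itertools import combinations

# Approximate mass shifts in u for two singly charged adducts (illustrative values).
ADDUCT_SHIFT = {"M+H": 1.007276, "M+Na": 22.989221}


def cluster_hypotheses(peaks, mz_tol, rt_tol):
    """peaks: list of (peak_id, mz, rt). Returns clusters of (peak_id, adduct, m0)."""
    # One hypothesis per peak and adduct: which neutral mass would explain this m/z?
    hypotheses = [
        (peak_id, rt, adduct, mz - shift)
        for peak_id, mz, rt in peaks
        for adduct, shift in ADDUCT_SHIFT.items()
    ]

    parent = list(range(len(hypotheses)))  # union-find forest over hypothesis indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link hypotheses of different peaks that agree in RT and inferred neutral mass.
    for i, j in combinations(range(len(hypotheses)), 2):
        id_i, rt_i, _, m0_i = hypotheses[i]
        id_j, rt_j, _, m0_j = hypotheses[j]
        if id_i != id_j and abs(rt_i - rt_j) <= rt_tol and abs(m0_i - m0_j) <= mz_tol:
            parent[find(i)] = find(j)

    components = {}
    for index, (peak_id, _, adduct, m0) in enumerate(hypotheses):
        components.setdefault(find(index), []).append((peak_id, adduct, m0))

    # Keep only clusters supported by more than one original peak.
    return [c for c in components.values() if len({pid for pid, _, _ in c}) > 1]


peaks = [(0, 181.070664, 100.0), (1, 203.052609, 101.0), (2, 500.0, 400.0)]
clusters = cluster_hypotheses(peaks, mz_tol=0.001, rt_tol=5.0)
```

Peaks 0 and 1 end up in one cluster because their M+H and M+Na hypotheses point to the same neutral mass near 180.0634, while peak 2 has no compatible partner and is dropped.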

Parameters:

Name Type Description Default
peaks

input peak table containing at least mz and rt.

required
adducts

adduct-definition table, typically from emzed.adducts.

required
mz_tol

tolerance used when comparing inferred neutral masses (adduct_m0) between hypotheses.

required
rt_tol

tolerance used when comparing retention times and for partitioning the peak table into RT windows.

required
explained_abundance

cumulative theoretical isotope abundance to include when generating centroids for the adduct addition/subtraction formulas. For example, 0.99 includes isotope centroids until 99% of the theoretical abundance is covered.

0.2

Returns:

Type Description

table with the original peak rows plus annotation columns: adduct_name, adduct_isotopes, adduct_isotopes_abundance, adduct_m0, and adduct_cluster_id.

Notes:

  • Peaks are compared in inferred neutral-mass space, not by direct m/z matching alone.
  • adduct_cluster_id identifies a connected cluster of compatible hypotheses, not a unique best assignment.
  • A single input peak can appear in multiple output rows if several adduct hypotheses remain compatible.
  • Clusters supported only by multiple hypotheses of the same original peak are discarded.
  • Peaks are pre-partitioned into RT windows separated by gaps larger than rt_tol; only peaks within the same window are compared.
  • explained_abundance is not derived from observed peak intensities; it only controls hypothesis generation from theoretical isotope patterns.
  • Rows with missing mz or rt are preserved, but their annotation columns are set to None.
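
The RT pre-partitioning mentioned in the notes amounts to splitting sorted retention times wherever the gap to the previous value exceeds rt_tol. A minimal sketch follows, with partition_by_rt as a hypothetical name. It assumes all values are present and does not handle missing retention times.

```python
def partition_by_rt(rts, rt_tol):
    """Split retention times into windows separated by gaps larger than rt_tol."""
    windows = []
    for rt in sorted(rts):
        if windows and rt - windows[-1][-1] <= rt_tol:
            windows[-1].append(rt)  # gap small enough: extend the current window
        else:
            windows.append([rt])  # gap too large (or first value): open a new window
    return windows


windows = partition_by_rt([12.0, 10.0, 11.5, 30.0, 31.0, 60.0], rt_tol=2.0)
```

Only values inside the same window would then be compared pairwise, which keeps the number of comparisons small for long runs.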