Peak and Chromatogram Integration

This page covers the most commonly used functions for peak and chromatogram integration in emzed.quantification; it is not exhaustive. For complete coverage, see the API Reference.

Peak Integration

Integrate peak windows from a peak table.

Supported peak-shape models:

  • linear: integrates the piecewise-linear signal (equivalent to trapezoidal integration).
  • asym_gauss: asymmetric Gaussian peak-shape fit; useful for skewed peaks.
  • emg: exponentially modified Gaussian fit; useful for tailed chromatographic peaks.
  • sgolay: Savitzky-Golay based smoothing/integration workflow.
  • no_integration: creates the same result columns as other models, but does not perform peak integration.
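
To make the note on the linear model concrete, the following is a minimal sketch of trapezoidal integration over a piecewise-linear signal, written in plain Python. It is not emzed's implementation; the function name trapezoid_area and the sample values are hypothetical and chosen only for illustration.

```python
def trapezoid_area(rts, intensities):
    """Area under the piecewise-linear curve through (rt, intensity) points.

    Illustrative helper only; not part of emzed.
    """
    if len(rts) != len(intensities):
        raise ValueError("rts and intensities must have the same length")
    area = 0.0
    for i in range(1, len(rts)):
        width = rts[i] - rts[i - 1]
        mean_height = 0.5 * (intensities[i] + intensities[i - 1])
        area += width * mean_height  # area of one trapezoid
    return area


# Hypothetical trace: retention times and intensities in arbitrary units.
rts = [0.0, 1.0, 2.0, 3.0, 4.0]
intensities = [0.0, 10.0, 30.0, 10.0, 0.0]
print(trapezoid_area(rts, intensities))  # 5 + 20 + 20 + 5 = 50.0
```

Because the curve is linear between neighbouring points, summing the trapezoids gives the exact area under the interpolated signal; no peak shape is fitted.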

integrate(peak_table, peak_shape_model, ms_level=None, show_progress=True, n_cores=1, min_size_for_parallel_execution=MIN_SIZE_DEFAULT, post_fixes=None, max_cores=8, in_place=False, path=None, overwrite=False, **model_extra_args)

Integrates the peaks of peak_table.

Parameters:

Name Type Description Default
peak_table

emzed.Table with required columns id, mzmin, mzmax, rtmin, rtmax, peakmap.

required
peak_shape_model

String of model name applied to determine peak area. Available models are: asym_gauss, linear, no_integration, sgolay, emg.

required
ms_level

MS level used for peak integration. Needs to be specified only if the peakmap contains more than one MS level. Default = None.

None
show_progress

Boolean value to activate progress bar. Default = True.

True
n_cores

Number of cores used for multicore processing. If n_cores exceeds the number of available cores, a warning is displayed. Default = 1. On Windows, prefer available_cores - 1 (at least 1) for interactive work.

1
min_size_for_parallel_execution

Minimum number of table rows required for multicore processing. Default = 100.

MIN_SIZE_DEFAULT
post_fixes

Selects a subset of peaks via postfixes, e.g. ['__0', '__1']. By default, all peaks in a table row are integrated. Default = None.

None
max_cores

Maximum number of cores used for multicore processing. If max_cores exceeds the number of available cores, a warning is displayed and n_cores is set to max_cores. Default = 8.

8
in_place

If True, the table is modified in place. Note: with in_place=True, multicore processing is not possible and n_cores is set to 1. In-place integration has performance benefits. Default = False.

False
path

If specified, the result is a Table with a database file backend; otherwise the result is held in memory.

None
overwrite

Indicates whether an existing database file should be overwritten.

False

Returns:

Type Description

emzed.Table by default; None if in_place is True.

Example:

t1 = emzed.quantification.integrate(t, "linear")

The result keeps the original peak columns and adds: peak_shape_model, area, rmse, and valid_model.

Example:

from pathlib import Path

import emzed

peaks = emzed.io.load_table(Path("tests/data/peaks.table"))

print(sorted(emzed.quantification.available_peak_shape_models.keys()))

integrated = emzed.quantification.integrate(
    peaks[:3],
    "linear",
    show_progress=False,
)

integrated.set_col_format("peakmap", None)
integrated.set_col_format("model", None)
print(
    integrated.extract_columns(
        "mz",
        "rt",
        "peak_shape_model",
        "area",
        "rmse",
        "valid_model",
    )
)

Output:

['asym_gauss', 'emg', 'linear', 'no_integration', 'sgolay']
mz           rt        peak_shape_model  area      rmse      valid_model
MzType       RtType    str               float     float     bool
-----------  --------  ----------------  --------  --------  -----------
 219.174787    0.13 m  linear            1.71e+06  0.00e+00  True
 256.159512    0.14 m  linear            2.35e+06  0.00e+00  True
 258.156445    0.14 m  linear            1.48e+06  0.00e+00  True

Chromatogram Integration

Integrate pre-extracted chromatograms.

integrate_chromatograms(chromatogram_table, peak_shape_model, ms_level=None, show_progress=True, n_cores=1, min_size_for_parallel_execution=MIN_SIZE_DEFAULT, post_fixes=None, max_cores=8, in_place=False, path=None, overwrite=False, **model_extra_args)

Integrates the peaks of chromatogram_table.

Parameters:

Name Type Description Default
chromatogram_table

emzed.Table with required columns id, mzmin, mzmax, rtmin, rtmax, peakmap.

required
peak_shape_model

String of model name applied to determine peak area. Available models are: asym_gauss, linear, no_integration, sgolay, emg.

required
ms_level

MS level used for peak integration. Needs to be specified only if the peakmap contains more than one MS level. Default = None.

None
show_progress

Boolean value to activate progress bar. Default = True.

True
n_cores

Number of cores used for multicore processing. If n_cores exceeds the number of available cores, a warning is displayed. Default = 1. On Windows, prefer available_cores - 1 (at least 1) for interactive work.

1
min_size_for_parallel_execution

Minimum number of table rows required for multicore processing. Default = 100.

MIN_SIZE_DEFAULT
post_fixes

Selects a subset of peaks via postfixes, e.g. ['__0', '__1']. By default, all peaks in a table row are integrated. Default = None.

None
max_cores

Maximum number of cores used for multicore processing. If max_cores exceeds the number of available cores, a warning is displayed and n_cores is set to max_cores. Default = 8.

8
in_place

If True, the table is modified in place. Note: with in_place=True, multicore processing is not possible and n_cores is set to 1. In-place integration has performance benefits. Default = False.

False
path

If specified, the result is a Table with a database file backend; otherwise the result is held in memory.

None
overwrite

Indicates whether an existing database file should be overwritten.

False

Returns:

Type Description

emzed.Table by default; None if in_place is True.

Example:

from pathlib import Path

import emzed

peaks = emzed.io.load_table(Path("tests/data/peaks.table"))
chrom = emzed.extract_chromatograms(peaks[:3])

integrated = emzed.quantification.integrate_chromatograms(
    chrom,
    "linear",
    show_progress=False,
)

integrated.set_col_format("peakmap", None)
integrated.set_col_format("chromatogram", None)
integrated.set_col_format("model_chromatogram", None)
print(
    integrated.extract_columns(
        "mz",
        "rt",
        "peak_shape_model_chromatogram",
        "area_chromatogram",
        "rmse_chromatogram",
        "valid_model_chromatogram",
    )
)

Output:

needed 0.0 seconds
mz           rt        peak_shape_model_chromatogram  area_chromatogram  rmse_chromatogram  valid_model_chromatogram
MzType       RtType    str                            float              float              bool
-----------  --------  -----------------------------  -----------------  -----------------  ------------------------
 219.174787    0.13 m  linear                                  1.71e+06           0.00e+00  True
 256.159512    0.14 m  linear                                  2.35e+06           0.00e+00  True
 258.156445    0.14 m  linear                                  1.48e+06           0.00e+00  True

Using multiple cores

Both integrate(...) and integrate_chromatograms(...) share the same runtime controls:

  • in_place=False: when True, multicore execution is disabled.
  • n_cores=1: default single-core execution.
  • min_size_for_parallel_execution=100: parallel execution starts only above this row count.
  • max_cores=8: upper bound for multicore execution.
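
How these controls interact can be sketched as a small decision function. This is a hypothetical model of the rules described on this page, not emzed's actual code: the name effective_cores, the available argument, and the treatment of a table with exactly min_size rows as large enough are all assumptions made for illustration.

```python
import os


def effective_cores(n_rows, n_cores=1, max_cores=8, min_size=100,
                    in_place=False, available=None):
    """Hypothetical helper modelling the documented rules (not emzed code)."""
    if available is None:
        available = os.cpu_count() or 1  # os.cpu_count() may return None
    if in_place:
        return 1  # documented: in-place integration disables multicore execution
    if n_rows < min_size:
        return 1  # assumption: tables below the threshold run on a single core
    # Assumption: the request is capped by max_cores and by the machine.
    return max(1, min(n_cores, max_cores, available))


print(effective_cores(1000, n_cores=4, available=16))                 # 4
print(effective_cores(1000, n_cores=4, in_place=True, available=16))  # 1
print(effective_cores(50, n_cores=4, available=16))                   # 1
```

The sketch is only meant to show why a large n_cores value does not always lead to parallel execution; consult the API Reference for the exact behaviour.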

Gotchas:

  • in_place=True and multicore processing are mutually exclusive.
  • On Windows, prefer n_cores = max(1, available_cores - 1) to reduce the risk of UI freezes during interactive work.
  • Smaller tables are often faster with single-core execution, especially on Windows.
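
The Windows recommendation can be computed with the standard library as follows. This is a minimal sketch: interactive_n_cores is a hypothetical helper name, and the commented-out call at the end assumes a peak table named peaks that is not defined here.

```python
import os


def interactive_n_cores(available_cores):
    """Leave one core free for the UI, but never go below one core."""
    return max(1, available_cores - 1)


available_cores = os.cpu_count() or 1  # os.cpu_count() may return None
n_cores = interactive_n_cores(available_cores)
print(n_cores)

# The value could then be passed on, for example:
# integrated = emzed.quantification.integrate(peaks, "linear", n_cores=n_cores)
```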