Peak and Chromatogram Integration

This page covers the most commonly used functions for peak and chromatogram integration in emzed.quantification. It is not exhaustive; for complete coverage, see the API Reference.

Peak Integration

Integrate peak windows from a peak table.

Supported peak-shape models:

linear: integrates the piecewise-linear signal (equivalent to trapezoidal integration).
asym_gauss: asymmetric Gaussian peak-shape fit; useful for skewed peaks.
emg: exponentially modified Gaussian fit; useful for tailed chromatographic peaks.
sgolay: Savitzky-Golay based smoothing/integration workflow.
no_integration: creates the same result columns as other models, but does not perform peak integration.
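
To make the difference between the `linear` and `asym_gauss` descriptions above concrete, the following standalone sketch integrates a synthetic skewed peak with the trapezoidal rule and compares the result to the closed-form area of an asymmetric Gaussian. It does not call emzed; the function names, the parametrisation (one width on each side of the apex), and all numeric values are assumptions chosen for illustration, not emzed's internal implementation.

```python
import math


def trapezoid_area(rts, intensities):
    """Area under the piecewise-linear curve through (rt, intensity) points."""
    if len(rts) != len(intensities):
        raise ValueError("rts and intensities must have the same length")
    area = 0.0
    for i in range(1, len(rts)):
        width = rts[i] - rts[i - 1]
        area += 0.5 * width * (intensities[i] + intensities[i - 1])
    return area


def asym_gauss(rt, height, center, sigma_left, sigma_right):
    """Gaussian with different widths left and right of its apex (assumed form)."""
    sigma = sigma_left if rt < center else sigma_right
    return height * math.exp(-((rt - center) ** 2) / (2.0 * sigma**2))


# Synthetic, skewed peak sampled every 0.1 s (illustrative values only).
rts = [i * 0.1 for i in range(401)]
intensities = [asym_gauss(rt, 1000.0, 15.0, 1.0, 2.5) for rt in rts]

numeric_area = trapezoid_area(rts, intensities)
# Closed-form area of the two half-Gaussians: height * sqrt(pi / 2) * (s_left + s_right).
analytic_area = 1000.0 * math.sqrt(math.pi / 2.0) * (1.0 + 2.5)
print(f"trapezoid: {numeric_area:.1f}  analytic: {analytic_area:.1f}")
```

On densely sampled, noise-free data the two areas agree closely. A fitted model mainly pays off when the signal is noisy or sparsely sampled.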

`integrate(peak_table, peak_shape_model, ms_level=None, show_progress=True, n_cores=1, min_size_for_parallel_execution=MIN_SIZE_DEFAULT, post_fixes=None, max_cores=8, in_place=False, path=None, overwrite=False, **model_extra_args)`

Integrates the peaks of peak_table.

Parameters:

Name	Description	Default
`peak_table`	`emzed.Table` with required columns `id, mzmin, mzmax, rtmin, rtmax, peakmap`.	required
`peak_shape_model`	String of model name applied to determine peak area. Available models are: `asym_gauss`, `linear`, `no_integration`, `sgolay`, `emg`.	required
`ms_level`	MS level used for peak integration. Only needs to be specified if the peakmap contains more than one MS level. `Default = None`.	`None`
`show_progress`	Boolean value to activate progress bar. `Default = True`.	`True`
`n_cores`	Number of cores used for multicore processing. If `n_cores` exceeds the number of available cores, a warning is displayed. `Default = 1`. On Windows, prefer `available_cores - 1` (at least 1) for interactive work.	`1`
`min_size_for_parallel_execution`	Minimum number of table rows required to execute multicore processing. `Default = 100`.	`MIN_SIZE_DEFAULT`
`post_fixes`	Selects a subset of peaks via postfixes, e.g. `['__0', '__1']`. By default, all peaks in a table row are integrated. `Default = None`.	`None`
`max_cores`	Maximum number of cores used for multicore processing. If `max_cores` exceeds the number of available cores, a warning is displayed and `n_cores` is set to `max_cores`. `Default = 8`.	`8`
`in_place`	If `True`, the table is modified in place. Note: with `in_place=True`, multicore processing is not possible and `n_cores` is set to 1. In-place integration has performance benefits. `Default = False`.	`False`
`path`	If specified, the result is a Table with a database file backend; otherwise the result is held in memory.	`None`
`overwrite`	Indicates whether an existing database file should be overwritten.	`False`

Returns:

Type	Description
`emzed.Table` or `None`	`emzed.Table` by default; `None` if `in_place` is `True`. The result keeps the original peak columns and adds `peak_shape_model`, `area`, `rmse`, and `valid_model`. Example: `t1 = emzed.quantification.integrate(t, "linear")`.

Example:

from pathlib import Path

import emzed

peaks = emzed.io.load_table(Path("tests/data/peaks.table"))

print(sorted(emzed.quantification.available_peak_shape_models.keys()))

integrated = emzed.quantification.integrate(
    peaks[:3],
    "linear",
    show_progress=False,
)

integrated.set_col_format("peakmap", None)
integrated.set_col_format("model", None)
print(
    integrated.extract_columns(
        "mz",
        "rt",
        "peak_shape_model",
        "area",
        "rmse",
        "valid_model",
    )
)

['asym_gauss', 'emg', 'linear', 'no_integration', 'sgolay']
mz           rt        peak_shape_model  area      rmse      valid_model
MzType       RtType    str               float     float     bool
-----------  --------  ----------------  --------  --------  -----------
 219.174787    0.13 m  linear            1.71e+06  0.00e+00  True
 256.159512    0.14 m  linear            2.35e+06  0.00e+00  True
 258.156445    0.14 m  linear            1.48e+06  0.00e+00  True
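
The `post_fixes` parameter described above selects peaks by column-name postfix. As a rough illustration of what that means, the sketch below scans a plain list of column names and reports the postfixes for which a full set of peak-window columns is present. It is a hypothetical helper that does not use emzed; the helper name `find_post_fixes`, the column layout, and the choice of required base names are assumptions made for this example.

```python
# Base names of the peak-window columns (assumed layout for this sketch).
REQUIRED = ("mzmin", "mzmax", "rtmin", "rtmax", "peakmap")


def find_post_fixes(col_names, required=REQUIRED):
    """Return the postfixes for which every required column is present."""
    names = set(col_names)
    first = required[0]
    # Every column starting with the first base name proposes a candidate postfix.
    candidates = {name[len(first):] for name in names if name.startswith(first)}
    return sorted(
        p for p in candidates if all(base + p in names for base in required)
    )


cols = [
    "id",
    "mzmin__0", "mzmax__0", "rtmin__0", "rtmax__0", "peakmap__0",
    "mzmin__1", "mzmax__1", "rtmin__1", "rtmax__1", "peakmap__1",
    "mzmin__2", "mzmax__2",  # incomplete set, so "__2" is not reported
]
print(find_post_fixes(cols))  # prints ['__0', '__1']
```

A list such as `['__0', '__1']` is the shape of value that `post_fixes` expects according to the parameter table.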

Chromatogram Integration

Integrate pre-extracted chromatograms.

`integrate_chromatograms(chromatogram_table, peak_shape_model, ms_level=None, show_progress=True, n_cores=1, min_size_for_parallel_execution=MIN_SIZE_DEFAULT, post_fixes=None, max_cores=8, in_place=False, path=None, overwrite=False, **model_extra_args)`

Integrates the peaks of chromatogram_table.

Parameters:

Name	Description	Default
`chromatogram_table`	`emzed.Table` with required columns `id, mzmin, mzmax, rtmin, rtmax, peakmap`.	required
`peak_shape_model`	String of model name applied to determine peak area. Available models are: `asym_gauss`, `linear`, `no_integration`, `sgolay`, `emg`.	required
`ms_level`	MS level used for peak integration. Only needs to be specified if the peakmap contains more than one MS level. `Default = None`.	`None`
`show_progress`	Boolean value to activate progress bar. `Default = True`.	`True`
`n_cores`	Number of cores used for multicore processing. If `n_cores` exceeds the number of available cores, a warning is displayed. `Default = 1`. On Windows, prefer `available_cores - 1` (at least 1) for interactive work.	`1`
`min_size_for_parallel_execution`	Minimum number of table rows required to execute multicore processing. `Default = 100`.	`MIN_SIZE_DEFAULT`
`post_fixes`	Selects a subset of peaks via postfixes, e.g. `['__0', '__1']`. By default, all peaks in a table row are integrated. `Default = None`.	`None`
`max_cores`	Maximum number of cores used for multicore processing. If `max_cores` exceeds the number of available cores, a warning is displayed and `n_cores` is set to `max_cores`. `Default = 8`.	`8`
`in_place`	If `True`, the table is modified in place. Note: with `in_place=True`, multicore processing is not possible and `n_cores` is set to 1. In-place integration has performance benefits. `Default = False`.	`False`
`path`	If specified, the result is a Table with a database file backend; otherwise the result is held in memory.	`None`
`overwrite`	Indicates whether an existing database file should be overwritten.	`False`

Returns:

Type	Description
`emzed.Table` or `None`	`emzed.Table` by default; `None` if `in_place` is `True`.

Example:

from pathlib import Path

import emzed

peaks = emzed.io.load_table(Path("tests/data/peaks.table"))
chrom = emzed.extract_chromatograms(peaks[:3])

integrated = emzed.quantification.integrate_chromatograms(
    chrom,
    "linear",
    show_progress=False,
)

integrated.set_col_format("peakmap", None)
integrated.set_col_format("chromatogram", None)
integrated.set_col_format("model_chromatogram", None)
print(
    integrated.extract_columns(
        "mz",
        "rt",
        "peak_shape_model_chromatogram",
        "area_chromatogram",
        "rmse_chromatogram",
        "valid_model_chromatogram",
    )
)

needed 0.0 seconds
mz           rt        peak_shape_model_chromatogram  area_chromatogram  rmse_chromatogram  valid_model_chromatogram
MzType       RtType    str                            float              float              bool
-----------  --------  -----------------------------  -----------------  -----------------  ------------------------
 219.174787    0.13 m  linear                                  1.71e+06           0.00e+00  True
 256.159512    0.14 m  linear                                  2.35e+06           0.00e+00  True
 258.156445    0.14 m  linear                                  1.48e+06           0.00e+00  True

Using multiple cores

Both integrate(...) and integrate_chromatograms(...) share the same runtime controls:

in_place=False: when True, multicore execution is disabled.
n_cores=1: default single-core execution.
min_size_for_parallel_execution=100: parallel execution starts only above this row count.
max_cores=8: upper bound for multicore execution.

Gotchas:

in_place=True and multicore processing are mutually exclusive.
On Windows, prefer n_cores = max(1, available_cores - 1) to reduce the risk of UI freezes during interactive work.
Smaller tables are often faster with single-core execution, especially on Windows.
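
The rules above can be condensed into a small helper that picks a value for `n_cores` before calling either function. This is a sketch using only the standard library. The helper name `choose_n_cores` and its arguments are hypothetical, the thresholds 100 and 8 mirror the defaults documented on this page, and treating the row threshold as inclusive is an assumption.

```python
import os


def choose_n_cores(n_rows, in_place=False, min_rows=100, max_cores=8,
                   reserve=1, available=None):
    """Suggest an n_cores value following the guidance on this page.

    reserve: cores left free for the UI (the "available_cores - 1" advice).
    available: override for the detected core count, mainly for testing.
    """
    if available is None:
        available = os.cpu_count() or 1  # cpu_count() may return None
    # In-place integration and small tables both run single-core.
    if in_place or n_rows < min_rows:
        return 1
    return max(1, min(available - reserve, max_cores))


print(choose_n_cores(5000))  # depends on the machine this runs on
```

The result could then be passed as `n_cores=choose_n_cores(len(table))` to `integrate(...)` or `integrate_chromatograms(...)`.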