Peak and Chromatogram Integration
This page covers the most commonly used functions for peak and chromatogram integration in emzed.quantification.
It is not exhaustive. For complete coverage, see the API Reference.
Peak Integration
Integrate peak windows from a peak table.
Supported peak-shape models:
- linear: integrates the piecewise-linear signal (equivalent to trapezoidal integration).
- asym_gauss: asymmetric Gaussian peak-shape fit; useful for skewed peaks.
- emg: exponentially modified Gaussian fit; useful for tailed chromatographic peaks.
- sgolay: Savitzky-Golay based smoothing/integration workflow.
- no_integration: creates the same result columns as other models, but does not perform peak integration.
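The equivalence between the linear model and trapezoidal integration can be illustrated without emzed. The following is a minimal sketch, not emzed's implementation: trapezoid_area and emg_density are hypothetical helper names, and the EMG formula is the textbook unit-area density, whose parametrization (mu, sigma, tau) may differ from the one emzed fits.

```python
import math


def trapezoid_area(rts, intensities):
    """Area under the piecewise-linear curve through the (rt, intensity) points."""
    if len(rts) != len(intensities):
        raise ValueError("rts and intensities must have the same length")
    area = 0.0
    for i in range(1, len(rts)):
        width = rts[i] - rts[i - 1]
        # Each segment contributes the area of one trapezoid.
        area += 0.5 * width * (intensities[i] + intensities[i - 1])
    return area


def emg_density(x, mu, sigma, tau):
    """Textbook EMG density: a Gaussian convolved with an exponential decay.

    Assumption: this is the standard unit-area form, not necessarily the
    parametrization used by emzed's emg model.
    """
    lam = 1.0 / tau
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma**2 - 2.0 * x)
    z = (mu + lam * sigma**2 - x) / (math.sqrt(2.0) * sigma)
    return 0.5 * lam * math.exp(exponent) * math.erfc(z)


# Triangle with base 2 and height 1: the trapezoid rule is exact here.
triangle_area = trapezoid_area([0.0, 1.0, 2.0], [0.0, 1.0, 0.0])

# Sample a tailed peak on a fine grid and integrate the samples piecewise-linearly.
rts = [i * 0.05 for i in range(801)]  # 0.0 .. 40.0
intensities = [emg_density(rt, mu=10.0, sigma=1.0, tau=2.0) for rt in rts]
emg_area = trapezoid_area(rts, intensities)

print(f"triangle area: {triangle_area:.3f}")
print(f"EMG area:      {emg_area:.3f}")
```

Because the sampled EMG curve has unit area by construction, the trapezoidal result should come out close to 1 when the grid is fine and the window covers the tail.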
integrate(peak_table, peak_shape_model, ms_level=None, show_progress=True, n_cores=1, min_size_for_parallel_execution=MIN_SIZE_DEFAULT, post_fixes=None, max_cores=8, in_place=False, path=None, overwrite=False, **model_extra_args)
Integrates the peaks of peak_table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| peak_table | | | required |
| peak_shape_model | | Name of the model used to determine the peak area. The available models are listed under "Supported peak-shape models" above. | required |
| ms_level | | MS level used for peak integration. Needs to be specified only if the peakmap has more than one MS level. | None |
| show_progress | | Set to True to show a progress bar. | True |
| n_cores | | Number of cores used for multicore processing. If … | 1 |
| min_size_for_parallel_execution | | Minimum number of table rows required to trigger multicore processing. | MIN_SIZE_DEFAULT |
| post_fixes | | Defines a subset of peaks via postfixes, e.g. ['__0', '__1']. By default, all peaks in a table row are integrated. | None |
| max_cores | | Maximum number of cores used for multicore processing. If … | 8 |
| in_place | | Allows operation in place if True. Note: in-place operation disables multicore execution (see "Using multiple cores"). | False |
| path | | If specified, the result will be a Table with a db file backend; otherwise the result is managed in memory. | None |
| overwrite | | Indicates whether an already existing database file should be overwritten. | False |
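Returns:
| Type | Description |
|---|---|
| | The result keeps the original peak columns and adds the integration result columns, such as peak_shape_model, area, rmse, model and valid_model (all used in the example below). |
Example: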
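from pathlib import Path
import emzed
# Load the example peak table from the test data.
peaks = emzed.io.load_table(Path("tests/data/peaks.table"))
# List the names of the available peak-shape models.
print(sorted(emzed.quantification.available_peak_shape_models.keys()))
# Integrate the first three peaks with the piecewise-linear model.
integrated = emzed.quantification.integrate(
    peaks[:3],
    "linear",
    show_progress=False,
)
# A format of None hides these object columns when the table is printed.
integrated.set_col_format("peakmap", None)
integrated.set_col_format("model", None)
# Print only the peak position and the integration result columns.
print(
    integrated.extract_columns(
        "mz",
        "rt",
        "peak_shape_model",
        "area",
        "rmse",
        "valid_model",
    )
)
Output: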
['asym_gauss', 'emg', 'linear', 'no_integration', 'sgolay']
mz rt peak_shape_model area rmse valid_model
MzType RtType str float float bool
----------- -------- ---------------- -------- -------- -----------
219.174787 0.13 m linear 1.71e+06 0.00e+00 True
256.159512 0.14 m linear 2.35e+06 0.00e+00 True
258.156445 0.14 m linear 1.48e+06 0.00e+00 True
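The post_fixes parameter selects peaks by column-name postfix. As a rough illustration of that idea, the sketch below groups column names by postfix in plain Python. It assumes that all columns describing one peak share a postfix such as __0. The column names and the helper columns_for_postfix are hypothetical and not part of emzed.

```python
def columns_for_postfix(column_names, postfix):
    """Return the column names that end with the given postfix."""
    return [name for name in column_names if name.endswith(postfix)]


# Hypothetical column layout of a table that holds two peaks per row.
column_names = [
    "id",
    "mzmin__0", "mzmax__0", "rtmin__0", "rtmax__0",
    "mzmin__1", "mzmax__1", "rtmin__1", "rtmax__1",
]

# Group the columns per requested postfix, mirroring post_fixes=['__0', '__1'].
selected = {pf: columns_for_postfix(column_names, pf) for pf in ["__0", "__1"]}

for postfix, names in selected.items():
    print(postfix, names)
```

Passing only one postfix, for example ['__0'], would restrict the selection to the first group of columns in this sketch.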
Chromatogram Integration
Integrate pre-extracted chromatograms.
integrate_chromatograms(chromatogram_table, peak_shape_model, ms_level=None, show_progress=True, n_cores=1, min_size_for_parallel_execution=MIN_SIZE_DEFAULT, post_fixes=None, max_cores=8, in_place=False, path=None, overwrite=False, **model_extra_args)
Integrates the peaks of chromatogram_table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chromatogram_table | | | required |
| peak_shape_model | | Name of the model used to determine the peak area. The available models are listed under "Supported peak-shape models" in the Peak Integration section. | required |
| ms_level | | MS level used for peak integration. Needs to be specified only if the peakmap has more than one MS level. | None |
| show_progress | | Set to True to show a progress bar. | True |
| n_cores | | Number of cores used for multicore processing. If … | 1 |
| min_size_for_parallel_execution | | Minimum number of table rows required to trigger multicore processing. | MIN_SIZE_DEFAULT |
| post_fixes | | Defines a subset of peaks via postfixes, e.g. ['__0', '__1']. By default, all peaks in a table row are integrated. | None |
| max_cores | | Maximum number of cores used for multicore processing. If … | 8 |
| in_place | | Allows operation in place if True. Note: in-place operation disables multicore execution (see "Using multiple cores"). | False |
| path | | If specified, the result will be a Table with a db file backend; otherwise the result is managed in memory. | None |
| overwrite | | Indicates whether an already existing database file should be overwritten. | False |
Returns:
| Type | Description |
|---|---|
| | |
Example:
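from pathlib import Path
import emzed
# Load the example peak table from the test data.
peaks = emzed.io.load_table(Path("tests/data/peaks.table"))
# Extract chromatograms for the first three peaks.
chrom = emzed.extract_chromatograms(peaks[:3])
# Integrate the extracted chromatograms with the piecewise-linear model.
integrated = emzed.quantification.integrate_chromatograms(
    chrom,
    "linear",
    show_progress=False,
)
# A format of None hides these object columns when the table is printed.
integrated.set_col_format("peakmap", None)
integrated.set_col_format("chromatogram", None)
integrated.set_col_format("model_chromatogram", None)
# Print only the peak position and the integration result columns.
print(
    integrated.extract_columns(
        "mz",
        "rt",
        "peak_shape_model_chromatogram",
        "area_chromatogram",
        "rmse_chromatogram",
        "valid_model_chromatogram",
    )
)
Output: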
needed 0.0 seconds
mz rt peak_shape_model_chromatogram area_chromatogram rmse_chromatogram valid_model_chromatogram
MzType RtType str float float bool
----------- -------- ----------------------------- ----------------- ----------------- ------------------------
219.174787 0.13 m linear 1.71e+06 0.00e+00 True
256.159512 0.14 m linear 2.35e+06 0.00e+00 True
258.156445 0.14 m linear 1.48e+06 0.00e+00 True
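Both example outputs report an rmse of 0.00e+00 for the linear model. This page does not define the column, but the name suggests a root-mean-square error between measured intensities and model values. Under that assumption, a model that passes through every data point yields exactly zero, while a smoothing or fitted model generally does not. The sketch below only illustrates that reasoning. The helper rmse and the sample values are hypothetical and do not reproduce emzed's computation.

```python
import math


def rmse(observed, modelled):
    """Root-mean-square error between two equally long sequences."""
    if len(observed) != len(modelled) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    return math.sqrt(squared / len(observed))


# Made-up intensities of a small peak.
observed = [0.0, 4.0, 10.0, 5.0, 1.0]

# A piecewise-linear model through the data reproduces every point.
exact_model = list(observed)

# A smoothed model deviates from the measured points.
smooth_model = [1.0, 4.0, 8.0, 5.0, 2.0]

print(rmse(observed, exact_model))
print(rmse(observed, smooth_model))
```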
Using multiple cores
Both integrate(...) and integrate_chromatograms(...) share the same
runtime controls:
- in_place=False: when True, multicore execution is disabled.
- n_cores=1: default single-core execution.
- min_size_for_parallel_execution=100: parallel execution starts only above this row count.
- max_cores=8: upper bound for multicore execution.
Gotchas:
- in_place=True and multicore processing are mutually exclusive.
- On Windows, prefer n_cores = max(1, available_cores - 1) to reduce the risk of UI freezes during interactive work.
- Smaller tables are often faster with single-core execution, especially on Windows.
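The interplay of these controls can be summarized in a few lines of plain Python. This is a sketch of the rules as described above, not emzed's internal logic. The helpers effective_cores and suggested_n_cores are hypothetical, and treating a table with exactly min_size rows as large enough for parallel execution is an assumption.

```python
import os


def effective_cores(n_rows, n_cores=1, max_cores=8, min_size=100, in_place=False):
    """Estimate how many cores would be used under the documented controls."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    # In-place operation and small tables fall back to single-core execution.
    if in_place or n_rows < min_size:
        return 1
    # Never exceed the requested count, the upper bound, or the hardware.
    return max(1, min(n_cores, max_cores, available))


def suggested_n_cores():
    """Leave one core free, as recommended above for interactive work on Windows."""
    available_cores = os.cpu_count() or 1
    return max(1, available_cores - 1)


print(effective_cores(n_rows=50, n_cores=4))
print(effective_cores(n_rows=5000, n_cores=4, in_place=True))
print(effective_cores(n_rows=5000, n_cores=suggested_n_cores()))
```

The value returned by suggested_n_cores() could then be passed as n_cores to integrate(...) or integrate_chromatograms(...).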