[1]:
import emzed
import os
# path to files
data_folder = os.path.join(os.getcwd(), "tutorial_data")
path = os.path.join(data_folder, "CoA_ester_ms_ms2.mzXML")
coa = emzed.io.load_peak_map(path)
start remote ip in /root/.emzed3/pyopenms_venv
pyopenms client started.
connected to pyopenms client.
In Chapter 2 we started targeted extraction of amino acids with a table containing the corresponding m/z values of the protonated monoisotopic masses. The emzed module ``targeted`` provides the function ``solution_space``, which supports a more advanced approach to targeted peak extraction that includes possible isotopologues and adducts:

~~~
solution_space(targets, adducts, explained_abundance=0.999, resolution=None)
~~~
Arguments:

targets
: a table with the two mandatory columns ``id`` and ``mf``, where column ``mf`` contains the molecular formulas of interest

adducts
: a table from the emzed module ``emzed.adducts`` with the selected adducts

explained_abundance
: determines the number of isotopologue peaks taken into account

resolution
: the mass resolution of the MS instrument

The function returns the m/z solution space of the molecular formulas for the selected adducts, including isotopologues.
The number of included isotopologues depends on the parameter ``explained_abundance``. Note that for the same value, the larger a molecule, the more isotopologues are included. As an example, we build the m/z solution space of the acetyl-CoA M+H ion:
[2]:
# 1. we create the target table
target = emzed.to_table("mf", ["C23H38N7O17P3S"], str)
target.add_enumeration()
# 2. we select the adduct of interest
adducts = emzed.adducts.M_plus_H
# 3. we build the table
sp = emzed.targeted.solution_space
t = sp(target, adducts, explained_abundance=0.95, resolution=6e4)
t
[2]:
id | target_id | mf | adduct_id | adduct_name | m_multiplier | adduct_add | adduct_sub | z | sign_z | full_mf | isotope_id | isotope_decomposition | m0 | abundance | mz |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
int | int | str | int | str | int | str | str | int | int | str | int | str | MzType | float | MzType |
0 | 0 | C23H38N7O17P3S | 20 | M+H | 1 | H |  | 1 | 1 | C23H39N7O17P3S | 0 | - | 810.133604 | 0.745 | 810.133056 |
1 | 0 | C23H38N7O17P3S | 20 | M+H | 1 | H |  | 1 | 1 | C23H39N7O17P3S | 1 | - | 811.136597 | 0.196 | 811.136049 |
2 | 0 | C23H38N7O17P3S | 20 | M+H | 1 | H |  | 1 | 1 | C23H39N7O17P3S | 2 | - | 812.136114 | 0.059 | 812.135565 |
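How ``explained_abundance`` limits the number of isotopologues can be illustrated with a deliberately simplified model. The sketch below is not emzed's implementation: it assumes a carbon-only molecule with a binomial 13C distribution (natural abundance of roughly 1.07 %), and the helper names ``carbon_isotopologue_abundances`` and ``peaks_needed`` are hypothetical. It keeps the most abundant peaks until their cumulative abundance reaches the threshold.

```python
from math import comb

# Approximate natural abundance of 13C (assumption for this sketch).
P_13C = 0.0107


def carbon_isotopologue_abundances(n_carbons):
    """Binomial abundances of M0, M1, ... for a carbon-only model."""
    return [
        comb(n_carbons, k) * P_13C**k * (1 - P_13C) ** (n_carbons - k)
        for k in range(n_carbons + 1)
    ]


def peaks_needed(abundances, explained_abundance):
    """Count the most abundant peaks needed to reach the cumulative threshold."""
    total = 0.0
    for count, abundance in enumerate(sorted(abundances, reverse=True), start=1):
        total += abundance
        if total >= explained_abundance:
            return count
    return len(abundances)


# Same threshold, two hypothetical molecule sizes.
small = peaks_needed(carbon_isotopologue_abundances(3), 0.95)
large = peaks_needed(carbon_isotopologue_abundances(50), 0.95)
print("peaks needed (3 carbons): ", small)
print("peaks needed (50 carbons):", large)
```

In this model the 50-carbon molecule needs more peaks than the 3-carbon one for the same threshold. Real formulas such as acetyl-CoA also contain N, O, P and S, so the number of peaks emzed reports can differ from this carbon-only estimate.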
Once we have determined the solution space, we can continue as explained in Chapter 2:
[3]:
# we initially use the whole RT range of the peakmap
rtmin, rtmax = coa.rt_range()
mztol = 0.003
def add_peak_columns(t, rtmin, rtmax, peakmap):
    t.add_column("mzmin", t.mz - mztol, emzed.MzType)
    t.add_column("mzmax", t.mz + mztol, emzed.MzType)
    t.add_column_with_constant_value("rtmin", rtmin, emzed.RtType)
    t.add_column_with_constant_value("rtmax", rtmax, emzed.RtType)
    t.add_column_with_constant_value(
        "peakmap", peakmap, emzed.PeakMap
    )


add_peak_columns(t, rtmin, rtmax, coa)
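The tolerance ``mztol = 0.003`` is an absolute m/z value, so its relative size depends on the mass. The following sketch uses plain Python (no emzed calls) to compute the extraction window for the M0 peak and to express the tolerance in ppm. The helper names ``mz_window`` and ``tol_in_ppm`` are hypothetical; the m/z value is copied from the solution space table.

```python
def mz_window(mz, mztol):
    """Return the symmetric m/z window (mzmin, mzmax) around mz."""
    return mz - mztol, mz + mztol


def tol_in_ppm(mz, mztol):
    """Express an absolute m/z tolerance relative to mz, in parts per million."""
    return mztol / mz * 1e6


mz_m0 = 810.133056  # m/z of the acetyl-CoA M+H monoisotopic peak (from the table)
mztol = 0.003

mzmin, mzmax = mz_window(mz_m0, mztol)
ppm = tol_in_ppm(mz_m0, mztol)
print(f"window: {mzmin:.6f} to {mzmax:.6f}")
print(f"tolerance: {ppm:.2f} ppm")
```

At this mass the absolute tolerance corresponds to a few ppm; for much smaller ions the same ``mztol`` would be a considerably wider relative window.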
[4]:
t = emzed.quantification.integrate(
    t, "linear", ms_level=1, show_progress=False
)
t
[4]:
id | target_id | mf | adduct_id | adduct_name | m_multiplier | adduct_add | adduct_sub | z | sign_z | full_mf | isotope_id | isotope_decomposition | m0 | abundance | mz | mzmin | mzmax | rtmin | rtmax | peak_shape_model | area | rmse | valid_model |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
int | int | str | int | str | int | str | str | int | int | str | int | str | MzType | float | MzType | MzType | MzType | RtType | RtType | str | float | float | bool |
0 | 0 | C23H38N7O17P3S | 20 | M+H | 1 | H |  | 1 | 1 | C23H39N7O17P3S | 0 | - | 810.133604 | 0.745 | 810.133056 | 810.130056 | 810.136056 | 5.10 m | 32.01 m | linear | 2.67e+07 | 0.00e+00 | True |
1 | 0 | C23H38N7O17P3S | 20 | M+H | 1 | H |  | 1 | 1 | C23H39N7O17P3S | 1 | - | 811.136597 | 0.196 | 811.136049 | 811.133049 | 811.139049 | 5.10 m | 32.01 m | linear | 7.38e+06 | 0.00e+00 | True |
2 | 0 | C23H38N7O17P3S | 20 | M+H | 1 | H |  | 1 | 1 | C23H39N7O17P3S | 2 | - | 812.136114 | 0.059 | 812.135565 | 812.132565 | 812.138565 | 5.10 m | 32.01 m | linear | 2.10e+06 | 0.00e+00 | True |
The result is a peak table containing the isotopologues M0, M1, and M2 of the M+H ion.
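As a quick plausibility check, the integrated areas can be compared with the theoretical abundances. The sketch below copies the rounded ``area`` and ``abundance`` values from the table above into plain Python lists (no emzed calls) and normalizes the areas to fractions; the variable names are illustrative only.

```python
# Rounded values copied from the integrated table (M0, M1, M2 of the M+H ion).
theoretical = [0.745, 0.196, 0.059]
areas = [2.67e7, 7.38e6, 2.10e6]

# Normalize the measured areas so that they sum to 1, like the abundances.
total_area = sum(areas)
measured = [area / total_area for area in areas]

for label, theo, meas in zip(["M0", "M1", "M2"], theoretical, measured):
    print(f"{label}: theoretical={theo:.3f} measured={meas:.3f} diff={meas - theo:+.3f}")

# Largest absolute deviation between measured and theoretical fractions.
max_abs_diff = max(abs(m - t) for m, t in zip(measured, theoretical))
```

With these numbers the measured fractions deviate from the theoretical ones by less than 0.01, which is consistent with the three peaks belonging to the same isotope pattern.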
The function ``solution_space`` is also very useful for extracting and grouping all LC-MS peaks originating from a known compound. In our example, we know that acetyl-CoA forms the major ion M+H, eluting between rtmin = 890 s and rtmax = 920 s. We will now search the m/z solution space of acetyl-CoA for all other peaks that may originate from the compound in positive ESI mode.
[5]:
# use the adduct table with all common positive adducts
adducts = emzed.adducts.positive
# build the solution space
t = sp(target, adducts, explained_abundance=0.95, resolution=6e4)
add_peak_columns(t, 890, 920, coa)
# extract peaks
t = emzed.quantification.integrate(t, "linear", show_progress=False)
t = t.filter(t.area > 1e3) # keep only features with significant area
print("number of peaks:", len(t))
number of peaks: 16
In this example, we find 16 peaks originating from the single compound acetyl-CoA.
© Copyright 2012-2024 ETH Zurich