[1]:
import emzed
import os

# path to files
data_folder = os.path.join(os.getcwd(), "tutorial_data")
path = os.path.join(data_folder, "CoA_ester_ms_ms2.mzXML")
coa = emzed.io.load_peak_map(path)
start remote ip in /root/.emzed3/pyopenms_venv
pyopenms client started.
connected to pyopenms client.

7 Targeted sample analysisΒΆ

In Chapter 2 we started targeted extraction of amino acids with a table containing the corresponding m/z values of protonized monoisotopic masses. The emzed module ``targeted`` provides the method ``solution_space`` supporting a more advanced approach of targeted peak extraction that includes possible isotopologues and adducts: ~~~ solution_space(targets, adducts, explained_abundance=0.999, resolution=None) ~~~

Arguments:

  • targets: a table with the 2 mandatory columns id and mf, with column mf containing the molecular formulas of interest

  • adducts: a table of the emzed module emzed.adducts with selected adducts

  • explained_abundance: determines the number of isotopologue peaks taken into account

  • resolution: the mass resolution of the MS instrument \(\frac{mz}{FWHM}\)

The function returns the mz solution space of molecular formulas of selected adducts including isotopologues.

The number of included isotopologues depends on the parameter explained_abundance. Note, the larger a molecule, the more isotopologues will be included with the same value. As an example, we build the m/z solution space of the acetyl-CoA M+H ion:

[2]:
# 1. we create the target table
target = emzed.to_table("mf", ["C23H38N7O17P3S"], str)
target.add_enumeration()
# 2. we select the adduct of interest
adducts = emzed.adducts.M_plus_H
# 3. we build the table
sp = emzed.targeted.solution_space
t = sp(target, adducts, explained_abundance=0.95, resolution=6e4)
t
[2]:
id target_id mf adduct_id adduct_name m_multiplier adduct_add adduct_sub z sign_z full_mf isotope_id isotope_decomposition m0 abundance mz
int int str int str int str str int int str int str MzType float MzType
0 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 0 -  810.133604 0.745  810.133056
1 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 1 -  811.136597 0.196  811.136049
2 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 2 -  812.136114 0.059  812.135565

Once we determined the solution space, we can continue as explained in chapter 2:

[3]:
# we initially use the whole RT range of the peakmap

rtmin, rtmax = coa.rt_range()
mztol = 0.003


def add_peak_columns(t, rtmin, rtmax, peakmap):
    t.add_column("mzmin", t.mz - mztol, emzed.MzType)
    t.add_column("mzmax", t.mz + mztol, emzed.MzType)
    t.add_column_with_constant_value("rtmin", rtmin, emzed.RtType)
    t.add_column_with_constant_value("rtmax", rtmax, emzed.RtType)
    t.add_column_with_constant_value(
        "peakmap", peakmap, emzed.PeakMap
    )


add_peak_columns(t, rtmin, rtmax, coa)
[4]:
t = emzed.quantification.integrate(
    t, "linear", ms_level=1, show_progress=False
)
t
[4]:
id target_id mf adduct_id adduct_name m_multiplier adduct_add adduct_sub z sign_z full_mf isotope_id isotope_decomposition m0 abundance mz mzmin mzmax rtmin rtmax peak_shape_model area rmse valid_model
int int str int str int str str int int str int str MzType float MzType MzType MzType RtType RtType str float float bool
0 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 0 -  810.133604 0.745  810.133056  810.130056  810.136056   5.10 m  32.01 m linear 2.67e+07 0.00e+00 True
1 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 1 -  811.136597 0.196  811.136049  811.133049  811.139049   5.10 m  32.01 m linear 7.38e+06 0.00e+00 True
2 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 2 -  812.136114 0.059  812.135565  812.132565  812.138565   5.10 m  32.01 m linear 2.10e+06 0.00e+00 True

The resulting a peaks table containing the isotopologues M0, M1, and M2 of M+H ion.

coa solution space

The method ``solution_space`` is also extremely useful to extract and group all LC-MS peaks originating from a known compound. In our example, we know acetyl-CoA forms the major ion M+H eluting between rtmin = 890s and rtmax = 920s. We will now search the m/z solution space of acetyl-CoA for all other possible peaks originating from the compound in the positive ESI mode.

[5]:
# use the adduct table with all common positive adducts
adducts = emzed.adducts.positive
# build the solution space
t = sp(target, adducts, explained_abundance=0.95, resolution=6e4)
add_peak_columns(t, 890, 920, coa)
# extract peaks
t = emzed.quantification.integrate(t, "linear", show_progress=False)
t = t.filter(t.area > 1e3)  # keep only features with significant area
print("number of peaks:", len(t))
number of peaks: 16
coa_complete

In given example, we can find 16 peaks origination from the single compound acetyl-CoA.

Back to top

© Copyright 2012-2024 ETH Zurich
Last build 2024-03-25 10:41:42.995953.
Created using Sphinx 7.2.6.