[1]:
import emzed
import os

# path to files
data_folder = os.path.join(os.getcwd(), "tutorial_data")
path = os.path.join(data_folder, "CoA_ester_ms_ms2.mzXML")
coa = emzed.io.load_peak_map(path)
start remote ip in /root/.emzed3/pyopenms_venv
pyopenms client started.
connected to pyopenms client.

7 Targeted sample analysis

In Chapter 2 we started the targeted extraction of amino acids from a table containing the m/z values of the protonated monoisotopic masses. The emzed module ``targeted`` provides the function ``solution_space``, which supports a more advanced approach to targeted peak extraction that also includes possible isotopologues and adducts:

~~~
solution_space(targets, adducts, explained_abundance=0.999, resolution=None)
~~~

Arguments:

  • targets: a table with the 2 mandatory columns id and mf, with column mf containing the molecular formulas of interest

  • adducts: a table of selected adducts taken from the module emzed.adducts

  • explained_abundance: determines the number of isotopologue peaks taken into account

  • resolution: the mass resolution of the MS instrument (mz / FWHM)

The function returns a table with the m/z solution space: the m/z values of the target molecular formulas for each selected adduct, including their isotopologues.
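
To make the resolution argument more concrete, the following minimal sketch assumes the common definition of mass resolution as m/z divided by the peak width at half maximum (FWHM) and computes the peak width expected at a given m/z. The helper name `fwhm_at` is hypothetical and not part of emzed.

```python
def fwhm_at(mz, resolution):
    """Expected peak width (FWHM) at `mz`, assuming resolution = mz / FWHM."""
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return mz / resolution


# m/z of the acetyl-CoA M+H monoisotopic peak, as used later in this chapter
width = fwhm_at(810.133056, 6e4)
print(f"FWHM at m/z 810.13 and resolution 60000: {width:.4f}")
```

Under this assumption, signals that lie much closer together than this width cannot be separated by the instrument, which is presumably why the resolution is an input when the solution space is built.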

The number of included isotopologues depends on the parameter explained_abundance. Note that for the same value, larger molecules yield more isotopologues. As an example, we build the m/z solution space of the acetyl-CoA M+H ion:

[2]:
# 1. we create the target table
target = emzed.to_table("mf", ["C23H38N7O17P3S"], str)
target.add_enumeration()
# 2. we select the adduct of interest
adducts = emzed.adducts.M_plus_H
# 3. we build the table
sp = emzed.targeted.solution_space
t = sp(target, adducts, explained_abundance=0.95, resolution=6e4)
t
[2]:
id target_id mf adduct_id adduct_name m_multiplier adduct_add adduct_sub z sign_z full_mf isotope_id isotope_decomposition m0 abundance mz
int int str int str int str str int int str int str MzType float MzType
0 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 0 -  810.133604 0.745  810.133056
1 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 1 -  811.136597 0.196  811.136049
2 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 2 -  812.136114 0.059  812.135565
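
One way to picture the effect of explained_abundance is a cumulative cutoff: isotopologues are added in order of decreasing abundance until their summed abundance reaches the requested value. The sketch below illustrates this idea only. The function `select_isotopologues` and the abundance values are hypothetical, and the rule emzed actually applies may differ.

```python
def select_isotopologues(abundances, explained_abundance):
    """Keep the most abundant isotopologues until their summed
    abundance reaches `explained_abundance` (cumulative cutoff)."""
    kept, total = [], 0.0
    for abundance in sorted(abundances, reverse=True):
        if total >= explained_abundance:
            break
        kept.append(abundance)
        total += abundance
    return kept


# hypothetical relative abundances of M0..M4, for illustration only
pattern = [0.70, 0.18, 0.08, 0.03, 0.01]
print(len(select_isotopologues(pattern, 0.95)))
print(len(select_isotopologues(pattern, 0.999)))
```

With these illustrative numbers, raising the cutoff from 0.95 to 0.999 increases the number of retained isotopologues from three to five. Larger molecules spread their abundance over more isotopologues, so more peaks are needed to reach the same cutoff.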

Once we have determined the solution space, we can continue as explained in Chapter 2:

[3]:
# we initially use the whole RT range of the peakmap

rtmin, rtmax = coa.rt_range()
mztol = 0.003


def add_peak_columns(t, rtmin, rtmax, peakmap):
    t.add_column("mzmin", t.mz - mztol, emzed.MzType)
    t.add_column("mzmax", t.mz + mztol, emzed.MzType)
    t.add_column_with_constant_value("rtmin", rtmin, emzed.RtType)
    t.add_column_with_constant_value("rtmax", rtmax, emzed.RtType)
    t.add_column_with_constant_value(
        "peakmap", peakmap, emzed.PeakMap
    )


add_peak_columns(t, rtmin, rtmax, coa)
[4]:
t = emzed.quantification.integrate(
    t, "linear", ms_level=1, show_progress=False
)
t
[4]:
id target_id mf adduct_id adduct_name m_multiplier adduct_add adduct_sub z sign_z full_mf isotope_id isotope_decomposition m0 abundance mz mzmin mzmax rtmin rtmax peak_shape_model area rmse valid_model
int int str int str int str str int int str int str MzType float MzType MzType MzType RtType RtType str float float bool
0 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 0 -  810.133604 0.745  810.133056  810.130056  810.136056   5.10 m  32.01 m linear 2.67e+07 0.00e+00 True
1 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 1 -  811.136597 0.196  811.136049  811.133049  811.139049   5.10 m  32.01 m linear 7.38e+06 0.00e+00 True
2 0 C23H38N7O17P3S 20 M+H 1 H 1 1 C23H39N7O17P3S 2 -  812.136114 0.059  812.135565  812.132565  812.138565   5.10 m  32.01 m linear 2.10e+06 0.00e+00 True

The result is a peaks table containing the isotopologues M0, M1, and M2 of the M+H ion.
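
As a quick plausibility check, the integrated areas can be compared with the theoretical abundances. The sketch below copies the values of the columns abundance and area from the tables above and normalizes the areas to their sum. It is only an illustration: the areas were integrated over the whole RT range, so they may contain contributions from other signals.

```python
# values copied from the tables above
theoretical = [0.745, 0.196, 0.059]  # column "abundance"
areas = [2.67e7, 7.38e6, 2.10e6]  # column "area"

# normalize the areas so that they sum to 1, like the abundances
total_area = sum(areas)
measured = [area / total_area for area in areas]

for label, theo, meas in zip(("M0", "M1", "M2"), theoretical, measured):
    print(
        f"{label}: theoretical {theo:.3f}, measured {meas:.3f}, "
        f"difference {meas - theo:+.3f}"
    )
```

For these numbers the measured fractions deviate from the theoretical abundances by less than 0.01, which is consistent with the three peaks belonging to the same isotope pattern.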

[Figure: coa solution space]

The method ``solution_space`` is also useful for extracting and grouping all LC-MS peaks that originate from a known compound. In our example, we know that acetyl-CoA forms the major ion M+H, which elutes between rtmin = 890 s and rtmax = 920 s. We now search the m/z solution space of acetyl-CoA for all other peaks that may originate from this compound in positive ESI mode.
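
The retention times above are given in seconds, whereas the table output earlier in this chapter displays rtmin and rtmax in minutes (for example 5.10 m). The sketch below assumes that retention times are stored in seconds and only formatted as minutes for display. The helper `format_rt` is hypothetical and merely mimics that display format.

```python
def format_rt(seconds):
    """Format a retention time given in seconds as minutes, e.g. '14.83 m'."""
    return f"{seconds / 60:.2f} m"


rtmin, rtmax = 890, 920  # seconds, as stated in the text
print(format_rt(rtmin), format_rt(rtmax))
```

Under this assumption, the window of 890 s to 920 s corresponds to roughly 14.83 to 15.33 minutes, well inside the RT range of the peak map shown above.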

[5]:
# use the adduct table with all common positive adducts
adducts = emzed.adducts.positive
# build the solution space
t = sp(target, adducts, explained_abundance=0.95, resolution=6e4)
add_peak_columns(t, 890, 920, coa)
# extract peaks
t = emzed.quantification.integrate(t, "linear", show_progress=False)
t = t.filter(t.area > 1e3)  # keep only features with significant area
print("number of peaks:", len(t))
number of peaks: 16
[Figure: coa_complete]

In this example, we find 16 peaks originating from the single compound acetyl-CoA.


© Copyright 2012-2024 ETH Zurich
Last build 2024-03-25 10:41:42.995953.