Tables¶

This page highlights the most frequently used parts of the API; it is not a complete reference. It covers the Table container and the core value types (MzType, RtType) used in table columns. Table expression syntax is documented on the dedicated Table Column Expressions page. For complete coverage, see the API Reference.

`Table`¶

Primary tabular container used across emzed.

Table Model¶

Each table column has three parts:

a name (col_names)
a type (col_types)
a display format (col_formats)

Supported column types include standard Python scalars (int, float, str, bool) and emzed domain types such as PeakMap or nested Table.

For arbitrary Python data (for example dict, tuple, list, or custom objects), use column type object. In object columns, all pickleable Python data types are supported.

col_formats controls how values are displayed (for example in print(table) or GUI inspectors). It does not change the stored raw values.

Typical format values:

format strings such as %.5f, %.2f, %d
callable formatters; for example, RtType uses lambda v: f"{v/60:.2f} m"
None to hide a column in printed/GUI views while keeping the data

Formats can be changed later with set_col_format and changed back if needed.
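
The three kinds of format values listed above can be illustrated without emzed. The following is a minimal sketch in plain Python; the helper `render` is hypothetical and only mimics the described behaviour (format string, callable, or `None` to hide), it is not emzed's implementation.

```python
def render(value, fmt):
    """Return the display text for one cell, or None if the column is hidden."""
    if fmt is None:
        return None  # hidden in printed/GUI views, raw value untouched
    if callable(fmt):
        return fmt(value)  # callable formatter
    return fmt % value  # printf-style format string such as "%.2f"


mz = 181.0707123
rt_seconds = 740.4

shown_mz = render(mz, "%.2f")
shown_rt = render(rt_seconds, lambda v: f"{v/60:.2f} m")
hidden = render(mz, None)

print(shown_mz, shown_rt, hidden)
```

Note that `mz` itself is unchanged after rendering, which mirrors the statement that formats do not alter the stored raw values.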

Usage pattern (class-level constructors vs instance methods):¶

import emzed

# class-level constructor (staticmethod)
peaks = emzed.Table.create_table(
    ["name", "sum_formula"],
    [str, str],
    rows=[["glucose", "C6H12O6"], ["caffeine", "C8H10N4O2"]],
)

# instance methods
filtered = peaks.filter(peaks.name.contains("gluc"))
peaks.save("peaks.table", overwrite=True)

# class-level method (reads the file saved above)
loaded = emzed.Table.load("peaks.table")

In-memory vs on-disk¶

emzed tables can live in RAM or on disk, backed by a SQLite database file.

Table.load(path) reads the whole file into memory.
Table.open(path) returns a lightweight handle to the on-disk file; no data is loaded until accessed. This lets you work with tables larger than your available RAM, for example in workflows that process many samples.

Use t.is_in_memory() to check which mode a table is in, and t.close() to release the file handle when done.

Several combining and consolidating operations accept a path= argument to write their result directly to disk instead of returning an in-memory table, for example Table.stack_tables(..., path=...) or view.consolidate(path=...).

See the out-of-memory processing example for a runnable workflow.
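
Since tables are described above as being backed by SQLite, the difference between the two modes can be pictured with Python's standard `sqlite3` module. This is only an analogy under stated assumptions: the schema, the file name `peaks.sqlite`, and the helper `make_db` are hypothetical, and nothing here reflects emzed's actual file layout.

```python
import os
import sqlite3
import tempfile


def make_db(target):
    """Create a tiny database at `target` (":memory:" or a file path)."""
    conn = sqlite3.connect(target)
    conn.execute("CREATE TABLE peaks (name TEXT, mz REAL)")
    conn.execute("INSERT INTO peaks VALUES ('glucose', 181.071)")
    conn.commit()
    return conn


# In-memory database: lives in RAM and disappears when closed.
in_memory = make_db(":memory:")
in_memory.close()

# File-backed database: data stays on disk and is read on demand.
db_path = os.path.join(tempfile.mkdtemp(), "peaks.sqlite")
make_db(db_path).close()

reopened = sqlite3.connect(db_path)
rows = reopened.execute("SELECT name, mz FROM peaks").fetchall()
reopened.close()  # release the file handle when done

print(rows)
```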

Table Attributes And Column Access¶

meta_data: table metadata dictionary wrapper
col_names: tuple of column names
col_types: tuple of Python/emzed column types
col_formats: tuple of column format definitions
dynamic column access: t.col_name is equivalent to t["col_name"]; the bracket form is often simpler for programmatic access
t.mz and t["mz"] are equivalent column expression objects
t.mz[0] (or t["mz"][0]) returns the first value in that column

Example:

import emzed

t = emzed.Table.create_table(
    ["name", "mz"],
    [str, float],
    rows=[["glucose", 181.071]],
)
print(t.col_names)
print(t.mz[0], t["mz"][0])
col_name = "mz"
print(t[col_name][0])  # useful for programmatic access
t.set_col_format("mz", "%.3f")
print(t)
t.set_col_format("mz", None)  # hidden in print/gui views, data is still there

Example output:

('name', 'mz')
181.071 181.071
181.071
name     mz
str      float
-------  -----
glucose  181.071

Table Methods¶

Class level constructors¶

`create_table(col_names, col_types, col_formats=None, rows=None, title=None, meta_data=None, path=None)` `staticmethod` ¶

creates a table.

Parameters:

Name	Description	Default
`col_names`	list or tuple of strings.	required
`col_types`	list of types.	required
`col_formats`	list of formats using format specifiers like "%.2f". If not specified, emzed tries to guess appropriate formats based on column type and column name.	`None`
`rows`	list of lists.	`None`
`title`	table title as string.	`None`
`meta_data`	dictionary to manage user defined meta data.	`None`
`path`	path for the db backend, default is `None` to use the in-memory db backend.	`None`

Returns:

Type	Description
	`emzed.Table`.

`to_table(name, values, type_, format_=None, title=None, meta_data=None, path=None)` ¶

generates a one-column Table from an iterable, e.g. from a list.

Parameters:

Name	Description	Default
`name`	name of the column.	required
`values`	iterable with column values.	required
`type_`	supported column types are int, float, bool, MzType, RtType, str, PeakMap, Table, object. In case you want to use Python objects like lists or dicts, use column type 'object' instead.	required
`format_`	is a format string such as `"%d"`. To suppress visibility set `format_ = None`.	`None`
`title`	Table title as string.	`None`
`meta_data`	Python dictionary to assign meta data to the table.	`None`
`path`	Path for the db backend, use `None` for an in memory db backend.	`None`

Returns:

Type	Description
	`emzed.Table`

`load(path)` `classmethod` ¶

loads table from disk into memory.

Parameters:

Name	Type	Description	Default
`path`		path to file.	required

Returns:

Type	Description
	`emzed.Table`.

`open(path)` `classmethod` ¶

opens table on disk without loading data into memory.

Parameters:

Name	Type	Description	Default
`path`		path to file.	required

Returns:

Type	Description
	`emzed.Table`.

Column Access, Mutation, And Layout¶

`get_column(name)` ¶

returns column expression object for column name.

Parameters:

Name	Type	Description	Default
`name`		existing column name or `"_index"`. You can use `t[name]` or `t.name` instead.	required

`add_column(name, what, type_, format_=not_specified, insert_before=None, insert_after=None)` ¶

adds a new column with name in place.

Parameters:

Name	Description	Default
`name`	the name of the new column.	required
`what`	either a `list` with the same length as table or an `expression`.	required
`type_`	supported column types are int, float, bool, MzType, RtType, str, PeakMap, Table, object. In case you want to use Python objects like lists or dicts, use column type 'object' instead.	required
`format_`	is a format string such as "%d" or an executable string with python code. To suppress visibility set `format_ = None`. The default `not_specified` is a sentinel meaning "argument not provided" and is different from `None`. In that case the method determines a default format for the column type.	`not_specified`
`insert_before`	to add column `name` at a defined position, one can specify its position to the left of column `insert_before`, given either as the name of an existing column or as an integer index (negative values are allowed).	`None`
`insert_after`	to add column `name` at a defined position, one can specify its position right-wise to column `insert_after`.	`None`
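
The difference between `format_=None` and the `not_specified` default can be sketched with the usual Python sentinel pattern. Everything below is illustrative: the object `not_specified`, the mapping `DEFAULT_FORMATS`, and the helper `resolve_format` are hypothetical stand-ins, and the default formats shown are assumptions rather than the formats emzed actually picks.

```python
# Hypothetical sentinel: a unique object that can never be confused with None.
not_specified = object()

# Assumed per-type defaults, for illustration only.
DEFAULT_FORMATS = {int: "%d", float: "%f", str: "%s"}


def resolve_format(type_, format_=not_specified):
    """Pick the format to store for a new column."""
    if format_ is not_specified:
        # Argument was omitted: fall back to a default for the type.
        return DEFAULT_FORMATS.get(type_, "%s")
    # Argument was given explicitly; None means "hide the column".
    return format_


omitted = resolve_format(int)
explicit_none = resolve_format(int, None)
explicit_format = resolve_format(float, "%.2f")

print(omitted, explicit_none, explicit_format)
```

A sentinel is needed because `None` already carries a meaning here (hide the column), so it cannot double as the "argument not provided" marker.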

`add_column_with_constant_value(name, value, type_, format_=not_specified, insert_before=None, insert_after=None)` ¶

adds column name in place, filled with the same constant value in every row.

Parameters:

Name	Description	Default
`name`	new column name.	required
`value`	any of accepted types int, float, bool, MzType, RtType, str, PeakMap, Table.	required
`type_`	target column type.	required
`format_`	target column format. The default `not_specified` is a sentinel (different from `None`) and means "use the default format for `type_`".	`not_specified`
`insert_before`	insertion position for the new column.	`None`
`insert_after`	insertion position for the new column.	`None`

`replace_column(name, what, type_=None, format_=not_specified)` ¶

replaces content of existing column name in place.

Parameters:

Name	Description	Default
`name`	the name of the existing column.	required
`what`	you can use a `list` with the same length as table or an `expression`.	required
`type_`	supported column types are int, float, bool, MzType, RtType, str, PeakMap, Table, object. In case you want to use Python objects like lists or dicts, use column type 'object' instead.	`None`
`format_`	is a format string such as "%d" or an executable string with python code. To suppress visibility set `format_ = None`. The default `not_specified` is a sentinel meaning "argument not provided" and is different from `None`. In that case the method determines a default format for the column type. The column keeps its existing position in the table.	`not_specified`

`add_or_replace_column(name, what, type_=None, format_=not_specified, insert_before=None, insert_after=None)` ¶

replaces the content of column name if it exists, else name is added (in place).

Parameters:

Name	Description	Default
`name`	column name to replace or create.	required
`what`	either a `list` with the same length as table or an `expression`.	required
`type_`	target column type. If `None` and the column exists, the current column type is reused.	`None`
`format_`	target column format. The default `not_specified` is a sentinel (different from `None`) and means "use the existing format or the default format for `type_`".	`not_specified`
`insert_before`	insertion position used when the column is created.	`None`
`insert_after`	insertion position used when the column is created.	`None`

`add_or_replace_column_with_constant_value(name, what, type_=None, format_=not_specified, insert_before=None, insert_after=None)` ¶

replaces the content of column name with one constant value if name exists, else name is added (in place).

Parameters:

Name	Description	Default
`name`	column name to replace or create.	required
`what`	scalar value assigned to all rows.	required
`type_`	target column type. If `None` and the column exists, the current column type is reused.	`None`
`format_`	target column format. The default `not_specified` is a sentinel (different from `None`) and means "use the existing format or the default format for `type_`".	`not_specified`
`insert_before`	insertion position used when the column is created.	`None`
`insert_after`	insertion position used when the column is created.	`None`

`replace_column_with_constant_value(name, what, type_=None, format_=not_specified)` ¶

replaces the content of column name with the constant value what.

Parameters:

Name	Description	Default
`name`	existing column name.	required
`what`	any of accepted types int, float, bool, MzType, RtType, str, PeakMap, Table.	required
`type_`	target column type. If `None`, the current column type is used.	`None`
`format_`	target column format. The default `not_specified` is a sentinel (different from `None`) and means "use the existing format or the default format for `type_`".	`not_specified`

`drop_columns(*col_names)` ¶

removes columns in place.

Parameters:

Name	Type	Description	Default
`col_names`		column names, either exact names or names containing wildcards like `?` and `*`. Example: for a Table `t` with column names `id, mz, mzmin, mzmax, sample_1k1, sample_1m1, sample_1k2`, `t.drop_columns('mz*', 'sample_1?1')` leaves `t` with columns `id, sample_1k2`.	`()`
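
The wildcard example above can be reproduced with shell-style pattern matching from Python's standard library. This sketch assumes that the wildcards behave like `fnmatch` patterns (`?` for one character, `*` for any run of characters) and that the first pattern in the example is `mz*`, which the stated result `id, sample_1k2` implies. The helper `remaining_columns` is hypothetical and is not how emzed implements `drop_columns`.

```python
from fnmatch import fnmatchcase


def remaining_columns(col_names, *patterns):
    """Return the names that match none of the given wildcard patterns."""
    return [
        name
        for name in col_names
        if not any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


cols = ["id", "mz", "mzmin", "mzmax", "sample_1k1", "sample_1m1", "sample_1k2"]

# Wildcard patterns: everything starting with "mz", plus sample_1<any char>1.
kept = remaining_columns(cols, "mz*", "sample_1?1")

# An exact name only removes that single column.
kept_exact = remaining_columns(cols, "mz")

print(kept)
print(kept_exact)
```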

`extract_columns(*col_names)` ¶

returns new Table with selected columns col_names.

Parameters:

Name	Type	Description	Default
`col_names`		list or tuple with selected, existing column names.	`()`

`rename_columns(**from_to)` ¶

Rename columns from current names to new names.

Parameters:

Name	Type	Description	Default
`from_to`		Keyword arguments mapping old column names to new column names, for example `a="b"`. Example: `t.rename_columns(a="b")` renames column `"a"` to `"b"`.	`{}`

`rename_postfixes(**from_to)` ¶

Rename postfixes in column names using keyword arguments.

Each key is the old postfix and each value is the new postfix.

Example: t.rename_postfixes(__0="_zero") changes columns like a__0 and b__0 to a_zero and b_zero.
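
The renaming rule in this example can be written out as a few lines of plain Python. This is a sketch of the described behaviour only; the function name `rename_postfixes_sketch` is hypothetical and it operates on a list of names rather than on a table.

```python
def rename_postfixes_sketch(col_names, **from_to):
    """Replace a trailing old postfix by its new postfix in each name."""
    renamed = []
    for name in col_names:
        for old, new in from_to.items():
            if name.endswith(old):
                # Cut off the old postfix and append the new one.
                name = name[: -len(old)] + new
                break
        renamed.append(name)
    return renamed


new_names = rename_postfixes_sketch(["a__0", "b__0", "c"], __0="_zero")
print(new_names)
```

Column `c` has no `__0` postfix and therefore keeps its name.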

`set_col_format(col_name, format_)` ¶

sets format of column col_name to format format_.

Parameters:

Name	Type	Description	Default
`col_name`		column name.	required
`format_`		accepted column format (see `add_column`).	required

Returns:

Type	Description
	`None`.

Row Selection, Ordering, Grouping, And Collapse¶

`filter(condition)` ¶

creates a new table containing only the rows that fulfil the given condition. Usage is similar to pandas' DataFrame.query.

Parameters:

Name	Type	Description	Default
`condition`		expression like `t.a < 0` or `t.a <= t.b`.	required

Returns:

Type	Description
	`emzed.Table` with filtered rows.

`sort_by(*col_names, ascending=True)` ¶

sort table by given column names in given order.

Parameters:

Name	Type	Description	Default
`col_names`		one or more column names as separate arguments.	`()`
`ascending`		either bool or list/tuple of bools of same number as specified column names.	`True`

Returns:

Type	Description
	`emzed.Table`.
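
How a list of `ascending` flags pairs up with the column names can be shown on plain tuples. The sketch below is an illustration in pure Python, not emzed code; the helper `sort_rows` and the sample data are hypothetical. It relies on the fact that Python's sort is stable, so sorting by the last key first and the first key last yields a multi-column ordering.

```python
def sort_rows(rows, col_names, sort_cols, ascending=True):
    """Sort rows by several columns, each with its own direction."""
    if isinstance(ascending, bool):
        ascending = [ascending] * len(sort_cols)
    if len(ascending) != len(sort_cols):
        raise ValueError("need one ascending flag per sort column")

    result = list(rows)
    # Stable sorts applied from the least to the most significant key.
    for col, asc in reversed(list(zip(sort_cols, ascending))):
        idx = col_names.index(col)
        result.sort(key=lambda row: row[idx], reverse=not asc)
    return result


col_names = ("sample", "rt")
rows = [("s1", 2.0), ("s2", 1.0), ("s1", 1.0), ("s2", 3.0)]

# sample ascending, then rt descending within each sample
ordered = sort_rows(rows, col_names, ["sample", "rt"], ascending=[True, False])
print(ordered)
```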

`split_by(*col_names, keep_view=False)` ¶

Split a table into subtables by unique values of selected columns.

Parameters:

Name	Type	Description	Default
`col_names`		Column names defining split groups.	`()`
`keep_view`		If `True`, return views. If `False`, return consolidated standalone tables.	`False`

Returns:

Type	Description
	List of `Table` subtables, one per unique group.

`collapse(*col_names, new_col_name='collapsed', path=None)` ¶

Collapse rows into grouped nested-table rows.

Parameters:

Name	Description	Default
`col_names`	Column names defining grouping keys.	`()`
`new_col_name`	Name of the column that stores collapsed subtables.	`'collapsed'`
`path`	Optional target database path for the result.	`None`

Returns:

Type	Description
	Collapsed `emzed.Table` with one nested table per group.
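
The difference between `split_by` (a list of subtables) and `collapse` (one row per group holding a nested table) can be pictured with plain Python data. In this sketch, lists of tuples stand in for tables; the helpers `split_rows` and `collapse_rows` are hypothetical, and the group order (first appearance) is an assumption, since the order emzed uses is not stated here.

```python
def split_rows(rows, col_names, key_cols):
    """Group rows by the values of the key columns."""
    idxs = [col_names.index(c) for c in key_cols]
    groups = {}
    for row in rows:
        key = tuple(row[i] for i in idxs)
        groups.setdefault(key, []).append(row)
    return groups


def collapse_rows(rows, col_names, key_cols):
    """One output row per group: key values plus the nested group rows."""
    groups = split_rows(rows, col_names, key_cols)
    return [key + (members,) for key, members in groups.items()]


col_names = ("sample", "name")
rows = [("s1", "glucose"), ("s1", "caffeine"), ("s2", "glucose")]

subtables = list(split_rows(rows, col_names, ["sample"]).values())
collapsed = collapse_rows(rows, col_names, ["sample"])

print(subtables)
print(collapsed)
```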

Joining And Combining Tables¶

`join(other, expression=None, *, path=None, overwrite=False)` ¶

Join two tables.

Parameters:

Name	Description	Default
`other`	Right-hand side table.	required
`expression`	Optional join condition expression. If `None`, a full row-wise cross product is returned.	`None`
`path`	Optional target database path for the result.	`None`
`overwrite`	Whether an existing `path` may be overwritten.	`False`

Returns:

Type	Description
	Joined `emzed.Table`.

`left_join(other, expression=None, *, path=None, overwrite=False)` ¶

Left-join two tables while keeping all rows from the left table.

Parameters:

Name	Description	Default
`other`	Right-hand side table.	required
`expression`	Optional join condition expression. If `None`, a full row-wise cross product is used.	`None`
`path`	Optional target database path for the result.	`None`
`overwrite`	Whether an existing `path` may be overwritten.	`False`

Returns:

Type	Description
	Joined `emzed.Table` with all left-side rows preserved.
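
The semantics of `join` and `left_join` described above (cross product without a condition, filtered pairs with a condition, and preserved left rows for the left join) can be sketched on tuples. This is a pure Python illustration; the helpers are hypothetical, and filling unmatched right-hand columns with `None` is an assumption about how missing values are represented.

```python
from itertools import product


def join_rows(left, right, condition=None):
    """All left/right pairs, optionally restricted by a condition."""
    return [
        l + r
        for l, r in product(left, right)
        if condition is None or condition(l, r)
    ]


def left_join_rows(left, right, right_width, condition=None):
    """Like join_rows, but every left row appears at least once."""
    result = []
    for l in left:
        matches = [l + r for r in right if condition is None or condition(l, r)]
        if not matches:
            # No partner found: pad the right-hand columns (assumed: None).
            matches = [l + (None,) * right_width]
        result.extend(matches)
    return result


left = [(1, "glucose"), (2, "caffeine")]
right = [(1, 181.071)]
same_id = lambda l, r: l[0] == r[0]

cross = join_rows(left, right)
inner = join_rows(left, right, same_id)
outer = left_join_rows(left, right, right_width=2, condition=same_id)

print(cross)
print(inner)
print(outer)
```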

`fast_join(other, col_name, col_name_other=None, atol=0.0, rtol=0.0, extra_condition=None, *, path=None, overwrite=False)` ¶

joins (combines) two tables based on comparing approximate equality of two numerical columns.

Parameters:

Name	Description	Default
`other`	second table for join.	required
`col_name`	column name to consider.	required
`col_name_other`	column name of other to consider in case it is different to `col_name`.	`None`
`atol`	absolute tolerance for approximate equality check.	`0.0`
`rtol`	relative tolerance for approximate equality check.	`0.0`
`extra_condition`	optional additional join expression that must also match for a row pair to be included.	`None`

Returns:

Type	Description
	`emzed.Table`. Performance: if `other` is significantly larger than `self`, it is recommended to swap the tables. The approximate equality check for two numbers a and b is `abs(a - b) <= atol + rtol * abs(a)`. To compare with an absolute tolerance only, set rtol to 0.0; to compare with a relative tolerance only, set atol to 0.0.
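
The tolerance check quoted above is easy to try out in isolation. The function below is a direct transcription of the formula `abs(a - b) <= atol + rtol * abs(a)` into plain Python; the name `approx_equal` and the sample values are hypothetical, and which of the two columns plays the role of `a` is not stated here.

```python
def approx_equal(a, b, atol=0.0, rtol=0.0):
    """Approximate equality as given by abs(a - b) <= atol + rtol * abs(a)."""
    return abs(a - b) <= atol + rtol * abs(a)


# Absolute tolerance only (rtol stays 0.0).
close_abs = approx_equal(181.071, 181.0715, atol=0.001)
far_abs = approx_equal(181.071, 181.073, atol=0.001)

# Relative tolerance only (atol stays 0.0), here 5 ppm.
close_rel = approx_equal(181.071, 181.0715, rtol=5e-6)
far_rel = approx_equal(181.071, 181.073, rtol=5e-6)

# The relative term scales with a only, so the check is not symmetric.
forward = approx_equal(100.0, 111.0, rtol=0.1)
backward = approx_equal(111.0, 100.0, rtol=0.1)

print(close_abs, far_abs, close_rel, far_rel, forward, backward)
```

With both tolerances left at their default of 0.0, the check reduces to exact equality.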

`fast_left_join(other, col_name, col_name_other=None, atol=0.0, rtol=0.0, extra_condition=None, *, path=None, overwrite=False)` ¶

joins (combines) two tables based on comparing approximate equality of two numerical columns.

In contrast to fast_join, this method also includes non-matching rows from self.

Parameters:

Name	Description	Default
`other`	second table for join.	required
`col_name`	column name to consider.	required
`col_name_other`	column name of other to consider in case it is different to `col_name`.	`None`
`atol`	absolute tolerance for approximate equality check.	`0.0`
`rtol`	relative tolerance for approximate equality check.	`0.0`
`extra_condition`	optional additional join expression that must also match for a row pair to be included.	`None`

Returns:

Type	Description
	`emzed.Table`. Performance: if `other` is significantly larger than `self`, it is recommended to swap the tables. The approximate equality check for two numbers a and b is `abs(a - b) <= atol + rtol * abs(a)`. To compare with an absolute tolerance only, set rtol to 0.0; to compare with a relative tolerance only, set atol to 0.0.

`stack_tables(tables, path=None, overwrite=False)` `staticmethod` ¶

builds a single Table from list or tuple of Tables.

Parameters:

Name	Description	Default
`tables`	list or tuple of Tables. All tables must have the same column names with same types and formats.	required
`path`	If specified the result will be a Table with a db file backend, else the result will be managed in memory.	`None`
`overwrite`	Indicate if an already existing database file should be overwritten.	`False`

Returns:

Type	Description
	`emzed.Table`.

Storage, Export, And Summary¶

`save(path, *, overwrite=False)` ¶

save table to a file.

Parameters:

Name	Type	Description	Default
`path`		path describing target location.	required
`overwrite`		If set to `True` an existing file will be overwritten, else an exception will be thrown.	`False`

`save_csv(path, delimiter=';', as_printed=False, dash_is_none=True, *, overwrite=False)` ¶

saves Table as csv in path.

Parameters:

Name	Description	Default
`path`	specifies path of the file. The path must end with `.csv`.	required
`delimiter`	Column separator. The default `';'` follows the Excel dialect.	`';'`
`as_printed`	If `True`, formatted values are stored. Note that format settings can lead to information loss: if the column format is `%.2f`, only the first 2 decimal places are saved.	`False`
`dash_is_none`	if `True`, missing values are written as `-` when `as_printed` is enabled.	`True`
`overwrite`	If set to `True` an existing file will be overwritten, else an exception will be thrown.	`False`
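
The information loss mentioned for `as_printed` can be demonstrated with Python's standard `csv` module. This is a sketch of the idea only, not of `save_csv` itself: the helper `write_csv`, the sample row, and the formats are hypothetical, and the text is written to an in-memory buffer instead of a file.

```python
import csv
import io


def write_csv(rows, formats, as_printed):
    """Write rows with ';' as delimiter, raw or formatted, and return the text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";")
    for row in rows:
        if as_printed:
            # Apply each column's display format before writing.
            row = [fmt % value for fmt, value in zip(formats, row)]
        writer.writerow(row)
    return buffer.getvalue()


rows = [["glucose", 181.0707123]]
formats = ["%s", "%.2f"]

raw_text = write_csv(rows, formats, as_printed=False)
printed_text = write_csv(rows, formats, as_printed=True)

print(raw_text.strip())
print(printed_text.strip())
```

The formatted variant keeps only two decimal places of the m/z value, so the original number cannot be recovered from that file.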

`save_excel(path, *, overwrite=False)` ¶

saves Table as xls or xlsx in path.

Parameters:

Name	Type	Description	Default
`path`		specifies path of the file. The path must end with `.xls` or `.xlsx`.	required

`to_pandas()` ¶

converts table to a pandas DataFrame object.

`consolidate(path=None, *, overwrite=False)` ¶

consolidates if underlying database table is a view.

Parameters:

Name	Type	Description	Default
`path`		If specified the result will be a Table with a db file backend, else the result will be managed in memory.	`None`
`overwrite`		Indicate if an already existing database file should be overwritten.	`False`

Returns:

Type	Description
	`emzed.Table`.

`summary()` ¶

`is_mutable()` ¶

returns boolean value to show whether the content of a Table is mutable.

`MzType` ¶

Bases: float

Represents Mass-to-Charge ratio (m/z). Inherits from float and provides high-precision formatting (6 decimal places) in tables.

`RtType` ¶

Bases: float

Represents Retention Time in seconds. Inherits from float and provides specialized formatting in tables (e.g., '12.34 m').
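
What "inherits from float" implies can be checked with a small stand-in class. The class `RtSketch` below is hypothetical and is not emzed's `RtType`; it only shows general Python behaviour of float subclasses, together with the two display formats mentioned above (minutes with two decimals, and six decimal places for m/z).

```python
class RtSketch(float):
    """Hypothetical float subclass standing in for a type like RtType."""


rt = RtSketch(740.4)  # retention time in seconds

# A float subclass can be used wherever a float is expected.
is_float = isinstance(rt, float)

# Arithmetic on a float subclass returns a plain float in Python.
shifted = rt + 60
shifted_type = type(shifted)

# Display formats as described: seconds shown as minutes, m/z with 6 decimals.
rt_label = f"{rt / 60:.2f} m"
mz_label = "%.6f" % 181.0707

print(is_float, shifted_type.__name__, rt_label, mz_label)
```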
