emzed.table
Table
col_formats
property
Column formats.
Returns:
| Type | Description |
|---|---|
| | tuple of format specifiers. |
col_names
property
Column names.
Returns:
| Type | Description |
|---|---|
| | tuple of strings. |
col_types
property
Column types.
Returns:
| Type | Description |
|---|---|
| | tuple of types. |
rows
property
Returns:
| Type | Description |
|---|---|
| | All rows as list of tuples. |
unique_id
property
computes unique identifier based on table content and meta data.
Returns:
| Type | Description |
|---|---|
| | unique identifier as string. |
add_column(name, what, type_, format_=not_specified, insert_before=None, insert_after=None)
adds a new column with name in place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | the name of the new column. | required |
| what | | either a | required |
| type_ | | supported column types are int, float, bool, MzType, RtType, str, PeakMap, Table, object. In case you want to use Python objects like lists or dicts, use column type 'object' instead. | required |
| format_ | | is a format string as "%d" or an executable string with python code. To suppress visibility set | not_specified |
| insert_before | | to add column | None |
| insert_after | | to add column | None |
add_column_with_constant_value(name, value, type_, format_=not_specified, insert_before=None, insert_after=None)
adds column name with unique value value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | new column name. | required |
| value | | any of accepted types int, float, bool, MzType, RtType, str, PeakMap, Table. | required |
| type_ | | target column type. | required |
| format_ | | target column format. By default the default format for | not_specified |
| insert_before | | insertion position for the new column. | None |
| insert_after | | insertion position for the new column. | None |
add_enumeration(col_name='id', insert_before=None, insert_after=None, start_with=0)
adds enumerated column as first column to table in place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_name | | name of added column. Default name is | 'id' |
| insert_before | | to add column | None |
| insert_after | | to add column | None |
| start_with | | start value for creating the ids. Default value is 0. | 0 |
add_or_replace_column(name, what, type_=None, format_=not_specified, insert_before=None, insert_after=None)
replaces the content of column name if it exists, else name is added (in place).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | column name to replace or create. | required |
| what | | either a | required |
| type_ | | target column type. If | None |
| format_ | | target column format. By default the existing format or the default format for | not_specified |
| insert_before | | insertion position used when the column is created. | None |
| insert_after | | insertion position used when the column is created. | None |
add_or_replace_column_with_constant_value(name, what, type_=None, format_=not_specified, insert_before=None, insert_after=None)
replaces the content of column name with unique value if name exists, else name is added (in place).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | column name to replace or create. | required |
| what | | scalar value assigned to all rows. | required |
| type_ | | target column type. If | None |
| format_ | | target column format. By default the existing format or the default format for | not_specified |
| insert_before | | insertion position used when the column is created. | None |
| insert_after | | insertion position used when the column is created. | None |
add_row(row)
adds row.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| row | | list or tuple of values. Length must match the number of columns. | required |
apply(function, *args, ignore_nones=True, result_type=None)
allows computing columns using a function with multiple arguments.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| function | | any function accepting arguments | required |
| args | | function arguments. Arguments can be column expressions like t['col_name'], or local or global variables accepted by the function. | () |
| ignore_nones | | since Example: the following code .. code-block:: python prints .. parsed-literal:: | True |
collapse(*col_names, new_col_name='collapsed', path=None)
collapses a table by grouping according to columns col_names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_names | | column names with values defining collapsing groups. | () |
| new_col_name | | column name of the new column holding the collapsed sub-tables. | 'collapsed' |
| path | | If specified the result will be a Table with a db file backend, else the result will be managed in memory. | None |
Returns:
| Type | Description |
|---|---|
| | Example: .. parsed-literal:: results .. parsed-literal:: |
consolidate(path=None, *, overwrite=False)
consolidates if underlying database table is a view.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | | If specified the result will be a Table with a db file backend, else the result will be managed in memory. | None |
| overwrite | | Indicate if an already existing database file should be overwritten. | False |
Returns:
| Type | Description |
|---|---|
| | |
create_table(col_names, col_types, col_formats=None, rows=None, title=None, meta_data=None, path=None)
staticmethod
creates a table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_names | | list or tuple of strings. | required |
| col_types | | list of types. | required |
| col_formats | | list of formats using format specifiers like "%.2f". If not specified emzed tries to guess appropriate formats based on column type and column name. | None |
| rows | | list of lists. | None |
| title | | table title as string. | None |
| meta_data | | dictionary to manage user defined meta data. | None |
| path | | path for the db backend, default is | None |
Returns:
| Type | Description |
|---|---|
| | |
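The col_formats entries are printf-style specifiers such as "%.2f". The following plain-Python sketch shows how such specifiers render values. It does not call emzed: the helper `format_cell` is hypothetical, and rendering `None` as "-" is an assumption that mirrors how missing values appear in the printed tables on this page.

```python
def format_cell(value, fmt):
    """Render one cell with a printf-style specifier (hypothetical helper)."""
    if value is None:
        # ASSUMPTION: missing values are shown as a dash, as in printed tables.
        return "-"
    return fmt % value


row = (3, 1.23456, "abc", None)
formats = ("%d", "%.2f", "%s", "%.2f")

# Apply each column's format specifier to the matching cell.
rendered = [format_cell(value, fmt) for value, fmt in zip(row, formats)]
print(rendered)
```

Choosing "%.2f" only changes how a value is displayed in this sketch; the underlying float keeps its full precision.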
drop_columns(*col_names)
removes columns in place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_names | | column names. Either exact names or names containing wild cards like Example: Table results | () |
extend(other, path=None, overwrite=False)
appends the rows of another compatible table in place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other | | table with the same columns, types, and formats. | required |
| path | | unused legacy argument kept for API compatibility. | None |
| overwrite | | unused legacy argument kept for API compatibility. | False |
extract_columns(*col_names)
returns new Table with selected columns col_names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_names | | list or tuple with selected, existing column names. | () |
fast_join(other, col_name, col_name_other=None, atol=0.0, rtol=0.0, extra_condition=None, *, path=None, overwrite=False)
joins (combines) two tables based on comparing approximate equality of two numerical columns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other | | second table for join. | required |
| col_name | | column name to consider. | required |
| col_name_other | | column name of other to consider in case it is different to | None |
| atol | | absolute tolerance for approximate equality check. | 0.0 |
| rtol | | relative tolerance for approximate equality check. | 0.0 |
| extra_condition | | optional additional join expression that must also match for a row pair to be included. | None |
Returns:
| Type | Description |
|---|---|
| | Performance: In case The approximate equality check for two numbers a and b is: So if you only need a comparison based on absolute tolerance you can set rtol to 0.0, and if you only need a relative tolerance check you can set atol to 0.0. |
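The formula for the approximate equality check did not survive in the text above. The sketch below therefore assumes the widely used `numpy.isclose` convention, `abs(a - b) <= atol + rtol * abs(b)`. Whether emzed uses exactly this expression (for instance `abs(a)` instead of `abs(b)`) is not confirmed by this page, and the function name `approx_equal` is hypothetical.

```python
def approx_equal(a, b, atol=0.0, rtol=0.0):
    """Tolerance check between two numbers (hypothetical helper)."""
    # ASSUMPTION: numpy.isclose-style combination of both tolerances;
    # the exact expression used by fast_join may differ.
    return abs(a - b) <= atol + rtol * abs(b)


# Absolute tolerance only: rtol stays at 0.0.
abs_only = approx_equal(100.0, 100.004, atol=0.005)

# Relative tolerance only: atol stays at 0.0; 5 ppm of about 500 is about 0.0025.
rel_only = approx_equal(500.0, 500.001, rtol=5e-6)
too_far = approx_equal(500.0, 500.01, rtol=5e-6)

print(abs_only, rel_only, too_far)
```

Under this assumption, leaving both tolerances at their default of 0.0 reduces the check to exact equality.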
fast_left_join(other, col_name, col_name_other=None, atol=0.0, rtol=0.0, extra_condition=None, *, path=None, overwrite=False)
joins (combines) two tables based on comparing approximate equality of two numerical columns.
In contrast to fast_join this method also includes non-matching rows from self.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other | | second table for join. | required |
| col_name | | column name to consider. | required |
| col_name_other | | column name of other to consider in case it is different to | None |
| atol | | absolute tolerance for approximate equality check. | 0.0 |
| rtol | | relative tolerance for approximate equality check. | 0.0 |
| extra_condition | | optional additional join expression that must also match for a row pair to be included. | None |
Returns:
| Type | Description |
|---|---|
| | Performance: In case The approximate equality check for two numbers a and b is: So if you only need a comparison based on absolute tolerance you can set rtol to 0.0, and if you only need a relative tolerance check you can set atol to 0.0. |
filter(condition)
creates a new table by filtering rows fulfilling the given condition.
Usage is similar to pandas query.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| condition | | expression like | required |
Returns:
| Type | Description |
|---|---|
| | |
from_pandas(df, col_names=None, col_types=None, col_formats=None)
staticmethod
converts pandas data frame into emzed Table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | | pandas data frame. | required |
| col_names | | list of column names, can be used to override data frame column names. | None |
| col_types | | list of column types, if not provided emzed determines types from column contents and names. | None |
| col_formats | | list of column formats, if not provided emzed determines formats from column contents and names. | None |
Returns:
| Type | Description |
|---|---|
| | |
get_column(name)
returns column expression object for column name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | existing column name or You can use | required |
group_by(*colums, group_nones=False)
returns a Table group_by object where rows are grouped by columns.
:param columns: table columns, e.g. t.a or t['b'].
:param group_nones: if True, rows where group columns are None form a group of their own; if False, such rows are ignored.
:returns: GroupBy object
Examples: For given Table t
.. parsed-literal::

    a    b    c
    int  int  int
    ---  ---  ---
    0    1    2
    1    -    1
    2    -    0
    2    2    3

.. code-block:: python

    t.add_column('ga', t.group_by(t.a).min(t.c), int)
    t.add_column('gb1', t.group_by(t.b).min(t.c), int)
    t.add_column('gb2', t.group_by(t.b, group_nones=True).min(t.c), int)
    print(t)

.. parsed-literal::

    a    b    c    ga   gb1  gb2
    int  int  int  int  int  int
    ---  ---  ---  ---  ---  ---
    0    1    2    2    2    2
    1    -    1    1    -    0
    2    -    0    0    -    0
    2    2    3    0    3    3
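The printed example can be traced by hand with a plain-Python sketch of a per-group minimum. This is not emzed's implementation: the helper `group_min` is hypothetical, and the reading of `group_nones` (None keys form their own group when True, and are skipped otherwise) is an assumption inferred from the `gb1` and `gb2` columns in the printed output.

```python
def group_min(keys, values, group_nones=False):
    """For every row, return the minimum of `values` within the row's group."""
    minima = {}
    for key, value in zip(keys, values):
        if key is None and not group_nones:
            continue  # ASSUMPTION: None keys are skipped unless group_nones is set
        minima[key] = value if key not in minima else min(minima[key], value)
    # Rows that were skipped get None, shown as "-" in printed tables.
    return [
        minima[key] if (key is not None or group_nones) else None
        for key in keys
    ]


a = [0, 1, 2, 2]
b = [1, None, None, 2]
c = [2, 1, 0, 3]

ga = group_min(a, c)
gb1 = group_min(b, c)
gb2 = group_min(b, c, group_nones=True)
print(ga, gb1, gb2)
```

The result has one entry per input row, not one per group, which is why it can be added back to the table as a column.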
is_mutable()
returns boolean value to show whether the content of a Table is mutable.
join(other, expression=None, *, path=None, overwrite=False)
joins (combines) two tables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other | | second table for join. | required |
| expression | | If | None |
Returns:
| Type | Description |
|---|---|
| | Example: if you have two tables .. parsed-literal:: and .. parsed-literal:: Then the result of .. parsed-literal:: If you do not provide an expression, this method returns the full cross product. |
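Since the example tables are missing above, here is a plain-Python sketch of the idea: a join keeps those pairs from the cross product of both tables that satisfy the condition, and without a condition every pair is kept. The function `join_rows` and the use of a Python callable as the condition are stand-ins for illustration; emzed itself takes a column expression.

```python
from itertools import product


def join_rows(left, right, condition=None):
    """Combine rows of two row lists; keep all pairs if no condition is given."""
    pairs = product(left, right)
    if condition is None:
        return [l + r for l, r in pairs]  # full cross product
    return [l + r for l, r in pairs if condition(l, r)]


left = [(1, "a"), (2, "b")]
right = [(1, 10.0), (3, 30.0)]

cross = join_rows(left, right)
matched = join_rows(left, right, lambda l, r: l[0] == r[0])
print(len(cross), matched)
```

The cross product grows with the product of both row counts, so a condition is usually what keeps the result manageable.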
left_join(other, expression=None, *, path=None, overwrite=False)
Combines two tables (also known as outer join).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| other | | Second table for join. | required |
| expression | | If | None |
Returns:
| Type | Description |
|---|---|
| | If we take the example from Then .. parsed-literal:: |
load(path)
classmethod
loads table from disk into memory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | | path to file. | required |
Returns:
| Type | Description |
|---|---|
| | |
load_csv(path, col_names=None, col_types=None, col_formats=None, *, delimiter=';', dash_is_none=True)
staticmethod
loads csv file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | | path to csv file. | required |
| col_names | | list of column names, if not provided first line of csv file is used instead. | None |
| col_types | | list of column types, if not provided emzed determines types from column contents and names. | None |
| col_formats | | list of column formats, if not provided emzed determines formats from column contents and names. | None |
| delimiter | | csv delimiter character. | ';' |
| dash_is_none | | cells with '-' are interpreted as None (missing value). types. In case | True |
Returns:
| Type | Description |
|---|---|
| | |
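To illustrate the delimiter and dash_is_none parameters, the sketch below reads semicolon-separated text with Python's standard `csv` module and maps "-" cells to `None`. It is a simplified stand-in, not emzed's loader: the function `read_rows` is hypothetical, and it leaves all other cells as strings instead of inferring column types.

```python
import csv
import io


def read_rows(text, delimiter=";", dash_is_none=True):
    """Parse delimited text; optionally treat '-' cells as missing values."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)  # first line holds the column names
    rows = [
        [None if (dash_is_none and cell == "-") else cell for cell in line]
        for line in reader
    ]
    return header, rows


sample = "id;mz;name\n1;100.5;-\n2;-;glucose\n"

header, rows = read_rows(sample)
_, raw_rows = read_rows(sample, dash_is_none=False)
print(header, rows)
```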
load_excel(path, col_names=None, col_types=None, col_formats=None)
staticmethod
loads excel file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | | path to file. | required |
| col_names | | list of column names, if not provided first line of .xlsx or .xls file is used instead. | None |
| col_types | | list of column types, if not provided emzed determines types from column contents and names. | None |
| col_formats | | list of column formats, if not provided emzed determines formats from column contents and names. | None |
Returns:
| Type | Description |
|---|---|
| | |
open(path)
classmethod
opens table on disk without loading data into memory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | | path to file. | required |
Returns:
| Type | Description |
|---|---|
| | |
print_(max_rows=30, max_col_width=None, stream=None)
print table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| max_rows | | Maximum number of rows to display. If the table is longer only head and tail of the table are shown. The missing part is denoted with "...". | 30 |
| max_col_width | | If specified the width of columns can be restricted. | None |
| stream | | file object to redirect printing, e.g. to a file. | None |
rename_columns(**from_to)
changes column names from current to new name using key word arguments.
:param from_to: key word arguments like a="b", see example below.
Example: t.rename_columns(a="b") renames column "a" to "b"
rename_postfixes(**from_to)
changes postfixes (endings) of column names from current to new postfix using key word arguments.
Example:
.. code-block:: python

    t = emzed.Table.create_table(
        ["a", "a__0", "a__1", "b__0", "b__1"],
        [int, int, int, int, int],
        rows=[[1, 2, 3, 4, 5]]
    )
    print(t)
    t.rename_postfixes(__0="_zero")
    print(t)

prints

.. parsed-literal::

    a    a__0  a__1  b__0  b__1
    int  int   int   int   int
    ---  ----  ----  ----  ----
    1    2     3     4     5

    a    a_zero  a__1  b_zero  b__1
    int  int     int   int     int
    ---  ------  ----  ------  ----
    1    2       3     4       5
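The renaming rule in this example can be sketched in plain Python: every column name that ends with one of the given postfixes gets that ending replaced. The function `rename_postfixes_sketch` is hypothetical and only mimics the behaviour visible in the printed output; how emzed resolves overlapping postfixes is not covered here.

```python
def rename_postfixes_sketch(col_names, **from_to):
    """Replace matching name endings; names without a match stay unchanged."""
    renamed = []
    for name in col_names:
        for old, new in from_to.items():
            if old and name.endswith(old):
                name = name[: -len(old)] + new
                break  # apply at most one replacement per name
        renamed.append(name)
    return renamed


names = ["a", "a__0", "a__1", "b__0", "b__1"]
result = rename_postfixes_sketch(names, __0="_zero")
print(result)
```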
replace_column(name, what, type_=None, format_=not_specified)
replaces content of existing column name in place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | the name of the existing column. | required |
| what | | you can use a | required |
| type_ | | supported column types are int, float, bool, MzType, RtType, str, PeakMap, Table, object. In case you want to use Python objects like lists or dicts, use column type 'object' instead. | None |
| format_ | | is a format string as "%d" or an executable string with python code. To suppress visibility set The column keeps its existing position in the table. | not_specified |
replace_column_with_constant_value(name, what, type_=None, format_=not_specified)
replaces the content of column name with unique value what.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | existing column name. | required |
| what | | any of accepted types int, float, bool, MzType, RtType, str, PeakMap, Table. | required |
| type_ | | target column type. If | None |
| format_ | | target column format. By default the existing format or the default format for | not_specified |
save(path, *, overwrite=False)
save table to a file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | | path describing target location. | required |
| overwrite | | If set to | False |
save_csv(path, delimiter=';', as_printed=False, dash_is_none=True, *, overwrite=False)
saves Table as csv in path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | | specifies path of the file. The path must end with | required |
| delimiter | | Alias for sep. Default value is set to Excel dialect ';'. | ';' |
| as_printed | | If | False |
| dash_is_none | | if | True |
| overwrite | | If set to | False |
save_excel(path, *, overwrite=False)
saves Table as xls or xlsx in path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | | specifies path of the file. The path must end with | required |
set_col_format(col_name, format_)
sets format of column col_name to format format_.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_name | | column name. | required |
| format_ | | accepted column format (see | required |
Returns:
| Type | Description |
|---|---|
| | |
set_col_type(col_name, type_)
sets type of column col_name to type type_.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_name | | column name. | required |
| type_ | | accepted column type (see | required |
Returns:
| Type | Description |
|---|---|
| | |
set_title(title)
sets the table title.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| title | | title string stored with the table metadata. | required |
sort_by(*col_names, ascending=True)
sorts table by given column names in given order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_names | | one or more column names as separate arguments. | () |
| ascending | | either a bool or a list/tuple of bools, one for each specified column name. | True |
Returns:
| Type | Description |
|---|---|
| | |
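A plain-Python sketch of sorting by several columns with one direction per column may help to read the ascending parameter. It relies on the stability of Python's sort and is not emzed's implementation; the function `sort_rows` is hypothetical, and handling of None values is left out.

```python
def sort_rows(rows, col_names, by, ascending=True):
    """Sort rows by the columns in `by`, each with its own direction."""
    if isinstance(ascending, bool):
        ascending = [ascending] * len(by)
    if len(ascending) != len(by):
        raise ValueError("need one ascending flag per sort column")
    result = list(rows)
    # Stable sort: apply the least significant key first, the primary key last.
    for name, asc in reversed(list(zip(by, ascending))):
        idx = col_names.index(name)
        result.sort(key=lambda row: row[idx], reverse=not asc)
    return result


col_names = ["a", "b"]
rows = [(2, 1), (1, 2), (2, 3), (1, 1)]

# Column a ascending, ties broken by column b descending.
sorted_rows = sort_rows(rows, col_names, ["a", "b"], ascending=[True, False])
print(sorted_rows)
```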
split_by(*col_names, keep_view=False)
generates a list of subtables, whereby split columns col_names contain unique values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_names | | column names with values defining split groups. | () |
Returns:
| Type | Description |
|---|---|
| | a list of sub_tables Example: If we have a table .. parsed-literal:: sub_tables[0] .. parsed-literal:: sub_tables[1] .. parsed-literal:: and sub_tables[2] .. parsed-literal:: |
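Because the example tables are missing above, the following plain-Python sketch shows the splitting idea on lists of row tuples: rows that share the same values in the split columns end up in the same sub-list. The function `split_rows` is hypothetical, and returning the groups in order of first appearance is an assumption, not a documented property of split_by.

```python
def split_rows(rows, col_names, by):
    """Group rows by their values in the columns listed in `by`."""
    idxs = [col_names.index(name) for name in by]
    groups = {}
    for row in rows:
        key = tuple(row[i] for i in idxs)
        groups.setdefault(key, []).append(row)
    # ASSUMPTION: groups are returned in order of first appearance.
    return list(groups.values())


col_names = ["a", "b"]
rows = [(1, "x"), (2, "y"), (1, "z"), (3, "w")]

sub_tables = split_rows(rows, col_names, ["a"])
print(sub_tables)
```

Within each sub-list the split column holds a single value, which matches the description of subtables whose split columns contain unique values.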
split_by_iter(*col_names, keep_view=False)
builds a generator yielding subtables, whereby subtable split columns col_names contain unique values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_names | | column names with values defining split groups. | () |
Returns:
| Type | Description |
|---|---|
| | a generator object of subtables referring to example table results .. parsed-literal:: hence the first sub_table of t, corresponding to sub_tables[0] in |
stack_tables(tables, path=None, overwrite=False)
staticmethod
builds a single Table from list or tuple of Tables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tables | | list or tuple of Tables. All tables must have the same column names with same types and formats. | required |
| path | | If specified the result will be a Table with a db file backend, else the result will be managed in memory. | None |
| overwrite | | Indicate if an already existing database file should be overwritten. | False |
Returns:
| Type | Description |
|---|---|
| | |
supported_postfixes(col_names)
returns common postfixes (endings) of column col_names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| col_names | | list or tuple of column names. | required |
Returns:
| Type | Description |
|---|---|
| | list of common postfixes. Examples: Assuming a Table with columns returns returns returns |
to_pandas()
converts table to pandas DataFrame object
to_table(name, values, type_, format_=None, title=None, meta_data=None, path=None)
generates a one-column Table from an iterable, e.g. from a list.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | name of the column. | required |
| values | | iterable with column values. | required |
| type_ | | supported column types are int, float, bool, MzType, RtType, str, PeakMap, Table, object. In case you want to use Python objects like lists or dicts, use column type 'object' instead. | required |
| format_ | | is a format string as "%d". To suppress visibility set | None |
| title | | Table title as string. | None |
| meta_data | | Python dictionary to assign meta data to the table. | None |
| path | | Path for the db backend, use | None |
Returns:
| Type | Description |
|---|---|
| | |