MultiOmics
- class openomics.MultiOmics(cohort_name, omics_data=None)
Bases: object
A data object which holds multiple -omics data for a single clinical cohort.
Methods Summary
add_clinical_data(clinical, **kwargs): Add a ClinicalData instance to the MultiOmics instance.
add_omic(omic_data[, init_annotations]): Add an omic object to the MultiOmics instance such that the samples in the omic match the samples existing in the other omics.
build_samples([agg_by]): Build a dataframe for all samples across the different omics (either by a union or intersection).
load_data(omics[, target, ...]): Prepare the multiomics data in format
match_samples(omics): Return the index of sample IDs of the intersection of samples from all modalities.
Removes duplicate genes between any omics such that the gene index across all omics has no duplicates.
Methods Documentation
- add_clinical_data(clinical, **kwargs)
Add a ClinicalData instance to the MultiOmics instance.
- Parameters
clinical (openomics.clinical.ClinicalData) –
**kwargs –
- add_omic(omic_data, init_annotations=True)
Add an omic object to the MultiOmics instance such that the samples in the omic match the samples existing in the other omics.
- Parameters
omic_data (Expression) – The omic to add, e.g., MessengerRNA, MicroRNA, LncRNA, etc.
init_annotations (bool) – If True, initializes the annotation dataframe in the omic object. Default True.
- build_samples(agg_by='union')
Build a dataframe for all samples across the different omics, aggregated by either a union or an intersection.
- Parameters
agg_by (str) – Either “union” or “intersection”. Default “union”.
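The difference between the two agg_by modes can be illustrated with a minimal sketch. This is not the openomics implementation: the function name build_sample_index and the sample IDs are hypothetical, and plain Python sets stand in for the dataframe index that build_samples works with.

```python
def build_sample_index(omics_samples, agg_by="union"):
    """Combine per-omic sample IDs into one sorted list (illustrative only).

    omics_samples: dict mapping a modality name to an iterable of sample IDs.
    agg_by: "union" keeps every sample seen in any modality;
            "intersection" keeps only samples present in all modalities.
    """
    if agg_by not in ("union", "intersection"):
        raise ValueError('agg_by must be "union" or "intersection"')

    sample_sets = [set(ids) for ids in omics_samples.values()]
    if not sample_sets:
        return []

    if agg_by == "union":
        combined = set().union(*sample_sets)
    else:
        combined = set.intersection(*sample_sets)
    return sorted(combined)


# Hypothetical sample IDs for two modalities.
samples = {
    "MessengerRNA": ["S1", "S2", "S3"],
    "MicroRNA": ["S2", "S3", "S4"],
}
union_ids = build_sample_index(samples, agg_by="union")
shared_ids = build_sample_index(samples, agg_by="intersection")
print(union_ids)   # all four samples
print(shared_ids)  # only the samples found in both modalities
```

With “union”, a sample missing from one modality is still listed; with “intersection”, only samples measured in every modality remain.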
- load_data(omics, target=['pathologic_stage'], pathologic_stages=None, histological_subtypes=None, predicted_subtypes=None, tumor_normal=None, samples_barcode=None, remove_duplicates=True)
Prepare the multiomics data in format
- Parameters
omics (list) – A list of the data modalities to load. Pass “all” to select all modalities.
target (list) – The clinical data fields to include in the returned labels y. Default is [“pathologic_stage”].
pathologic_stages (list) – Only fetch samples having certain stages in their corresponding patient’s clinical data. For instance, [“Stage I”, “Stage II”] will only fetch samples from Stage I and Stage II patients. Default is None, which fetches all pathologic stages.
histological_subtypes – A list specifying the histological subtypes to fetch. Default is None, which fetches all histological subtypes.
predicted_subtypes – A list specifying the predicted subtypes to fetch. Default is None, which fetches all predicted subtypes.
tumor_normal – A list of sample types to fetch, chosen from [“Tumor”, “Normal”]. Default is None, which fetches both tumor and normal samples.
samples_barcode – A list of sample barcodes. If not None, only fetch data for samples whose barcodes appear in this list.
remove_duplicates (bool) – If True, only selects samples with non-duplicated index.
- Returns
(X, y), where X is a dictionary containing the multiomics data with matched samples, and y contains the target labels for those samples.
- Return type
Tuple[Dict[str, pd.DataFrame], pd.DataFrame]
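The shape of the (X, y) result can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's code: load_data_sketch, the sample IDs, and the feature values are hypothetical, nested dicts stand in for the pandas DataFrames, and only the pathologic_stages filter is modelled.

```python
def load_data_sketch(omics_tables, clinical, target, pathologic_stages=None):
    """Return (X, y) restricted to samples shared by all modalities.

    omics_tables: dict of modality name -> {sample_id: feature values}
    clinical: dict of sample_id -> {clinical field name: value}
    target: list of clinical field names to use as labels
    pathologic_stages: optional list of stages to keep; None keeps all
    """
    # Keep samples present in the clinical table and in every modality.
    shared = set(clinical)
    for table in omics_tables.values():
        shared &= set(table)

    # Optionally keep only samples whose patient has one of the given stages.
    if pathologic_stages:
        shared = {
            s for s in shared
            if clinical[s].get("pathologic_stage") in pathologic_stages
        }

    ordered = sorted(shared)
    # X: one table per modality, all with the same samples in the same order.
    X = {
        name: {s: table[s] for s in ordered}
        for name, table in omics_tables.items()
    }
    # y: the requested clinical fields for those same samples.
    y = {s: {field: clinical[s][field] for field in target} for s in ordered}
    return X, y


# Hypothetical data for two modalities and four patients.
omics_tables = {
    "MessengerRNA": {"S1": [1.0, 2.0], "S2": [0.5, 0.1], "S3": [3.2, 1.1]},
    "MicroRNA": {"S2": [7.0], "S3": [4.0], "S4": [9.0]},
}
clinical = {
    "S1": {"pathologic_stage": "Stage I"},
    "S2": {"pathologic_stage": "Stage II"},
    "S3": {"pathologic_stage": "Stage III"},
    "S4": {"pathologic_stage": "Stage I"},
}
X, y = load_data_sketch(
    omics_tables,
    clinical,
    target=["pathologic_stage"],
    pathologic_stages=["Stage I", "Stage II"],
)
print(X)
print(y)
```

In this sketch every table in X shares the same sample keys as y, which is the property the matched-samples description above refers to.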