MultiOmics
-
class openomics.multiomics.MultiOmics(cohort_name, omics_data=None)
Bases: object
A data object which holds multiple -omics data for a single clinical cohort.
Methods Summary
add_clinical_data
(clinical, **kwargs) Add a ClinicalData instance to the MultiOmics instance.
add_omic
(omic_data[, initialize_annotations]) Adds an omic object to the MultiOmics instance such that the samples in the omic match the samples existing in the other omics.
annotate_samples
(dictionary) This function adds a “predicted_subtype” field to the patients’ clinical data.
build_samples
([agg_by]) Running this function will build a dataframe for all samples across the different omics (either by a union or intersection).
get_sample_attributes
(matched_samples) Fetch the patients’ clinical data for each of the sample barcodes in matched_samples.
load_data
(omics[, target, …]) Parameter omics: a list of the data modalities to load. Default “all”.
match_samples
(omics) Return the index of bcr_sample_barcodes for the intersection of samples from all modalities.
Removes duplicate genes between any omics such that the gene index across all omics has no duplicates.
Methods Documentation
-
add_clinical_data
(clinical, **kwargs) Add a ClinicalData instance to the MultiOmics instance.
- Parameters
clinical (openomics.ClinicalData) –
-
add_omic
(omic_data, initialize_annotations=True) Adds an omic object to the MultiOmics instance such that the samples in the omic match the samples existing in the other omics.
- Parameters
omic_data (Expression) – The omic to add, e.g., MessengerRNA, MicroRNA, LncRNA, etc.
initialize_annotations (bool) – Default True. If True, initializes the annotation dataframe in the omic object.
-
annotate_samples
(dictionary) This function adds a “predicted_subtype” field to the patients’ clinical data. For instance, if patients were classified into subtypes based on their expression profiles using k-means, call:
annotate_samples(dict(zip(<patient index>, <list of corresponding patients’ subtypes>)))
Adding this field to the patients’ clinical data allows openomics to query the patients’ data through the predicted_subtypes parameter of .load_data().
- Parameters
dictionary – A dictionary mapping each patient’s index to a subtype.
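The dictionary described above can be assembled with plain Python before it is handed to annotate_samples. The following is a minimal sketch; the patient identifiers, the subtype labels, and the instance name `cohort` are hypothetical placeholders, not values from openomics.

```python
# Hypothetical patient identifiers and cluster labels (e.g., from k-means).
patient_index = ["patient-01", "patient-02", "patient-03"]
predicted_subtypes = ["subtype_A", "subtype_B", "subtype_A"]

# zip() stops at the shorter sequence, so check the lengths first
# to avoid silently dropping patients.
assert len(patient_index) == len(predicted_subtypes)

# Map each patient's index to its predicted subtype.
subtype_by_patient = dict(zip(patient_index, predicted_subtypes))

# With a MultiOmics instance (here given the hypothetical name `cohort`),
# the mapping would then be passed as:
# cohort.annotate_samples(subtype_by_patient)
```

Checking the lengths up front is a design choice for this sketch: a mismatch between patients and labels would otherwise produce a shorter dictionary without any error.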
-
build_samples
(agg_by='union') Running this function will build a dataframe for all samples across the different omics (either by a union or intersection).
- Parameters
agg_by (str) – Either “union” or “intersection”. Default “union”.
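The difference between the two agg_by options can be illustrated with ordinary Python sets. This is a sketch of the concept only, not the openomics implementation; the helper `aggregate_samples` and the sample barcodes are hypothetical.

```python
def aggregate_samples(samples_by_omic, agg_by="union"):
    """Combine per-omic sample barcodes by union or intersection (illustrative helper)."""
    if agg_by not in ("union", "intersection"):
        raise ValueError(f"agg_by must be 'union' or 'intersection', got {agg_by!r}")
    sample_sets = [set(samples) for samples in samples_by_omic.values()]
    if not sample_sets:
        return []
    if agg_by == "union":
        # Every sample seen in at least one omic.
        combined = set().union(*sample_sets)
    else:
        # Only samples present in every omic.
        combined = set.intersection(*sample_sets)
    return sorted(combined)


# Hypothetical barcodes for two omics with partially overlapping samples.
samples_by_omic = {
    "MessengerRNA": ["S1", "S2", "S3"],
    "MicroRNA": ["S2", "S3", "S4"],
}
union_samples = aggregate_samples(samples_by_omic, agg_by="union")
shared_samples = aggregate_samples(samples_by_omic, agg_by="intersection")
```

Here the union keeps all four barcodes, while the intersection keeps only the two that appear in both omics.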
-
get_sample_attributes
(matched_samples) Fetch the patients’ clinical data for each of the sample barcodes in matched_samples.
- Returns
samples_index: Index of samples
- Parameters
matched_samples – A list of sample barcodes
-
load_data
(omics, target=['pathologic_stage'], pathologic_stages=None, histological_subtypes=None, predicted_subtypes=None, tumor_normal=None, samples_barcode=None)
- Parameters
omics (list) – A list of the data modalities to load. Default “all” to select all modalities
target (list) – The clinical data fields to include in the
pathologic_stages (list) – Only fetch samples having certain stages in their corresponding patient’s clinical data. For instance, [“Stage I”, “Stage II”] will only fetch samples from Stage I and Stage II patients. Default is None, which fetches all pathologic stages.
histological_subtypes – A list specifying the histological subtypes to fetch. Default is None, which fetches all histological subtypes.
predicted_subtypes – A list specifying the predicted subtypes to fetch. Default is None, which fetches all predicted subtypes.
tumor_normal – A list of sample types to fetch, chosen from [“Tumor”, “Normal”]. Default is None, which fetches both tumor and normal samples.
samples_barcode – A list of sample barcodes. If not None, only fetch data for samples whose barcodes appear in this list.
- Returns
X, a dictionary containing the multi-omics data for the samples that have data
- Return type
(X, y)
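The filter parameters above combine as successive restrictions on the set of samples. The sketch below illustrates that idea on plain dictionaries; it is not the openomics implementation. The helper `select_samples`, the keys "sample_barcode" and "tumor_normal", and the rows themselves are hypothetical, and it assumes that an unset filter (None) leaves the samples unrestricted.

```python
def select_samples(clinical_rows, pathologic_stages=None, tumor_normal=None):
    """Return barcodes of rows passing every filter that is set (illustrative helper)."""
    selected = []
    for row in clinical_rows:
        # A filter left as None (or empty) is skipped, so it excludes nothing.
        if pathologic_stages and row["pathologic_stage"] not in pathologic_stages:
            continue
        if tumor_normal and row["tumor_normal"] not in tumor_normal:
            continue
        selected.append(row["sample_barcode"])
    return selected


# Hypothetical clinical records, one per sample.
clinical_rows = [
    {"sample_barcode": "S1", "pathologic_stage": "Stage I", "tumor_normal": "Tumor"},
    {"sample_barcode": "S2", "pathologic_stage": "Stage III", "tumor_normal": "Tumor"},
    {"sample_barcode": "S3", "pathologic_stage": "Stage II", "tumor_normal": "Normal"},
]

early_stage = select_samples(clinical_rows, pathologic_stages=["Stage I", "Stage II"])
early_stage_tumor = select_samples(
    clinical_rows,
    pathologic_stages=["Stage I", "Stage II"],
    tumor_normal=["Tumor"],
)
all_samples = select_samples(clinical_rows)
```

Each additional filter can only narrow the selection, which is why calling the helper with no filters returns every barcode.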