RNAcentral
- class openomics.database.sequence.RNAcentral(path='https://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/', file_resources=None, col_rename={'GO terms': 'go_id', 'ensembl_gene_id': 'gene_id', 'external id': 'transcript_id', 'gene symbol': 'gene_id'}, species_id=None, index_col='RNAcentral id', keys=None, remove_version_num=True, remove_species_suffix=True, **kwargs)
Bases: openomics.database.sequence.SequenceDatabase
Loads the RNAcentral database from https://rnacentral.org/ and provides a series of methods to extract sequence data from it.
Default path: https://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/
Default file_resources: {
    "rnacentral_rfam_annotations.tsv": "go_annotations/rnacentral_rfam_annotations.tsv.gz",
    "database_mappings/gencode.tsv": "id_mapping/database_mappings/gencode.tsv",
    "gencode.fasta": "sequences/by-database/gencode.fasta",
    …
}
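The default file_resources values are relative paths, so they presumably resolve against the default path. The sketch below illustrates that reading using only the standard library. It is an assumption about how the two settings combine, not a copy of the openomics implementation, and resolve_resources is a hypothetical helper name. Only the three entries listed above are included; the elided remainder is left out.

```python
from urllib.parse import urljoin

# Default base path, copied from the class signature above.
BASE_PATH = "https://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/"

# The three default entries shown above (the rest of the dict is elided in the docs).
FILE_RESOURCES = {
    "rnacentral_rfam_annotations.tsv": "go_annotations/rnacentral_rfam_annotations.tsv.gz",
    "database_mappings/gencode.tsv": "id_mapping/database_mappings/gencode.tsv",
    "gencode.fasta": "sequences/by-database/gencode.fasta",
}


def resolve_resources(base_path, file_resources):
    """Hypothetical helper: join each relative resource path onto the base path."""
    return {name: urljoin(base_path, rel) for name, rel in file_resources.items()}


resolved = resolve_resources(BASE_PATH, FILE_RESOURCES)
for name, url in resolved.items():
    print(f"{name} -> {url}")
```

Because the base path ends with a slash, urljoin appends each relative path rather than replacing the last path segment.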
Attributes Summary
Methods Summary
add_rfam_annotation(transcripts_df, ...[, ...]): rtype Union[DataFrame, DataFrame]
get_sequences([index, omic, agg]): param index
load_dataframe(file_resources[, blocksize]): param file_resources
load_sequences(fasta_file[, index, keys, ...]): param index ()
Attributes Documentation
- COLUMNS_RENAME_DICT = {'GO terms': 'go_id', 'ensembl_gene_id': 'gene_id', 'external id': 'transcript_id', 'gene symbol': 'gene_id'}
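To make the effect of this mapping concrete, the sketch below applies it to a list of column names in plain Python. The header row and the rename_columns helper are hypothetical and chosen only for illustration; the library itself presumably applies the mapping to a DataFrame, which is not shown here.

```python
# Mapping copied from COLUMNS_RENAME_DICT above.
COLUMNS_RENAME_DICT = {
    "GO terms": "go_id",
    "ensembl_gene_id": "gene_id",
    "external id": "transcript_id",
    "gene symbol": "gene_id",
}


def rename_columns(columns, mapping):
    """Hypothetical helper: rename known columns and keep the others unchanged."""
    return [mapping.get(col, col) for col in columns]


# Hypothetical header row, not taken from an actual RNAcentral file.
header = ["RNAcentral id", "external id", "gene symbol", "GO terms"]
renamed = rename_columns(header, COLUMNS_RENAME_DICT)
print(renamed)
```

Note that both 'ensembl_gene_id' and 'gene symbol' map to 'gene_id', so a table containing both source columns would end up with two columns of the same name after renaming.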
Methods Documentation
- add_rfam_annotation(transcripts_df, file_resources, blocksize=None)
- Return type
Union[DataFrame, DataFrame]
- get_sequences(index='RNAcentral id', omic=None, agg='all', **kwargs)
- Parameters
index –
omic –
agg –
**kwargs –