GENCODE

class openomics.database.sequence.GENCODE(path='ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/', file_resources=None, col_rename=None, npartitions=0, replace_U2T=False, remove_version_num=False)

Bases: openomics.database.sequence.SequenceDatabase

Loads the GENCODE database from https://www.gencodegenes.org/.

Default path: ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/

Default file_resources:

{
    "basic.annotation.gtf": "gencode.v32.basic.annotation.gtf.gz",
    "long_noncoding_RNAs.gtf": "gencode.v32.long_noncoding_RNAs.gtf.gz",
    "lncRNA_transcripts.fa": "gencode.v32.lncRNA_transcripts.fa.gz",
    "transcripts.fa": "gencode.v32.transcripts.fa.gz",
}
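
To make the relationship between path and file_resources concrete, here is a minimal sketch of how the defaults above could be combined into full download URLs. It assumes each file name is simply appended to the base path; `resolve_urls` is a hypothetical helper written for this illustration and is not part of openomics.

```python
# Default location and file names quoted in the documentation above.
DEFAULT_PATH = "ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/"

DEFAULT_FILE_RESOURCES = {
    "basic.annotation.gtf": "gencode.v32.basic.annotation.gtf.gz",
    "long_noncoding_RNAs.gtf": "gencode.v32.long_noncoding_RNAs.gtf.gz",
    "lncRNA_transcripts.fa": "gencode.v32.lncRNA_transcripts.fa.gz",
    "transcripts.fa": "gencode.v32.transcripts.fa.gz",
}


def resolve_urls(path, file_resources):
    """Hypothetical helper: append each file name to the base path.

    Assumption: relative file names are resolved against `path`.
    """
    base = path if path.endswith("/") else path + "/"
    return {name: base + filename for name, filename in file_resources.items()}


urls = resolve_urls(DEFAULT_PATH, DEFAULT_FILE_RESOURCES)
for name, url in urls.items():
    print(name, "->", url)
```

Passing a custom file_resources dict with the same keys would, under this assumption, be the way to point the loader at a different release.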

Methods Summary

get_rename_dict([from_index, to_index])

get_sequences(index, omic, agg_sequences[, …])

load_dataframe(file_resources[, npartitions])

read_fasta(fasta_file, replace_U2T[, …])

Methods Documentation

get_rename_dict(from_index='gene_id', to_index='gene_name')
Parameters
  • from_index

  • to_index
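
The parameter names suggest a mapping from one identifier column to another. The sketch below shows that idea on plain dictionaries; `build_rename_dict` and the sample records are hypothetical, and the library's actual behaviour (for example, how duplicate or missing values are handled) may differ.

```python
def build_rename_dict(records, from_index="gene_id", to_index="gene_name"):
    """Hypothetical stand-in: map each `from_index` value to a `to_index` value.

    Assumptions: records missing either field are skipped, and the first
    value seen for a key wins.
    """
    mapping = {}
    for record in records:
        key = record.get(from_index)
        value = record.get(to_index)
        if key is None or value is None:
            continue
        mapping.setdefault(key, value)
    return mapping


# Illustrative annotation rows (not read from a GENCODE file).
records = [
    {"gene_id": "ENSG00000141510", "gene_name": "TP53"},
    {"gene_id": "ENSG00000141510", "gene_name": "TP53"},
    {"gene_id": "ENSG00000012048", "gene_name": "BRCA1"},
    {"gene_id": "ENSG00000000001"},  # no gene_name, so it is skipped
]

rename = build_rename_dict(records)
print(rename)
```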

get_sequences(index, omic, agg_sequences, biotypes=None)
Parameters
  • index (str)

  • omic (str)

  • agg_sequences (str)

  • biotypes ([str])
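
Because a gene usually has several transcripts, agg_sequences presumably selects how multiple sequences per index value are combined. The sketch below illustrates one plausible reading. The accepted values "all", "longest" and "shortest", the row field names, and the function `aggregate_sequences` are all assumptions made for this example; the omic parameter is not modelled.

```python
from collections import defaultdict


def aggregate_sequences(rows, index="gene_id", agg_sequences="longest", biotypes=None):
    """Hypothetical sketch: group sequences by `index`, then aggregate.

    Assumption: `biotypes`, when given, keeps only rows whose "biotype"
    field is in that list.
    """
    grouped = defaultdict(list)
    for row in rows:
        if biotypes is not None and row["biotype"] not in biotypes:
            continue
        grouped[row[index]].append(row["sequence"])

    if agg_sequences == "all":
        return dict(grouped)
    if agg_sequences == "longest":
        return {key: max(seqs, key=len) for key, seqs in grouped.items()}
    if agg_sequences == "shortest":
        return {key: min(seqs, key=len) for key, seqs in grouped.items()}
    raise ValueError("agg_sequences must be 'all', 'longest' or 'shortest'")


# Illustrative rows, one per transcript.
rows = [
    {"gene_id": "g1", "biotype": "lncRNA", "sequence": "ACGT"},
    {"gene_id": "g1", "biotype": "lncRNA", "sequence": "ACGTACGT"},
    {"gene_id": "g2", "biotype": "protein_coding", "sequence": "GGC"},
]

longest = aggregate_sequences(rows, agg_sequences="longest")
lnc_shortest = aggregate_sequences(rows, agg_sequences="shortest", biotypes=["lncRNA"])
print(longest)
print(lnc_shortest)
```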

load_dataframe(file_resources, npartitions=None)
Parameters
  • file_resources

  • npartitions
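
Since the default file_resources include GTF annotation files, loading them presumably involves turning GTF lines into tabular records. The following is a minimal, standard-library sketch of that step only; `parse_gtf_line` is a hypothetical helper, not the library's parser. It reads only quoted attributes (unquoted ones such as `level 2` are ignored) and does not model npartitions.

```python
import re

# The eight fixed GTF columns that precede the attribute column.
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end", "score", "strand", "frame"]

# Matches attributes of the form: key "value"
_ATTRIBUTE = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_line(line):
    """Hypothetical helper: parse one GTF line into a flat dict.

    Returns None for comment lines and lines without exactly nine fields.
    For repeated attribute keys, the first value is kept.
    """
    if line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    record = dict(zip(GTF_COLUMNS, fields[:8]))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    for key, value in _ATTRIBUTE.findall(fields[8]):
        record.setdefault(key, value)
    return record


# Illustrative line in GENCODE GTF style.
line = (
    "chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\t"
    'gene_id "ENSG00000223972.5"; gene_type "transcribed_unprocessed_pseudogene"; '
    'gene_name "DDX11L1"; level 2;'
)
record = parse_gtf_line(line)
print(record["gene_id"], record["gene_name"], record["start"], record["end"])
```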

read_fasta(fasta_file, replace_U2T, npartitions=None)
Parameters
  • fasta_file

  • replace_U2T

  • npartitions
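
As an illustration of what a flag named replace_U2T most likely does, the sketch below reads FASTA text and optionally converts RNA-style U bases to T. `read_fasta_records` is a hypothetical standard-library stand-in, not the library's read_fasta. It assumes uppercase sequences, returns a plain dict keyed by the full header line, and ignores npartitions.

```python
import io


def read_fasta_records(handle, replace_U2T=False):
    """Hypothetical stand-in: read FASTA lines into {header: sequence}.

    Assumption: `replace_U2T=True` replaces every uppercase "U" with "T".
    """
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)

    if replace_U2T:
        records = {name: seq.replace("U", "T") for name, seq in records.items()}
    return records


# Illustrative two-record FASTA text; the first sequence spans two lines.
example = io.StringIO(">tx1|gene1\nAUGGCU\nUAA\n>tx2|gene2\nACGT\n")
parsed = read_fasta_records(example, replace_U2T=True)
print(parsed)
```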