This function extracts the gene position from either NCBI or ENSEMBL CDS input.

Usage

cds2genepos(cds, source = "NCBI", keep.names = NULL)

Arguments

cds

DNAStringSet [mandatory]

source

source of the CDS input, either "NCBI" or "ENSEMBL" [default: NCBI]

keep.names

vector indicating gene ids to be kept before chromosomal position assignment [default: NULL]

Value

matrix
1: $gene.seq.id
2: $gene.chr
3: $gene.start
4: $gene.end
5: $gene.mid
6: $gene.strand
7: $gene.idx
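
As a rough illustration of how a result with these columns might be used downstream, the sketch below builds a small stand-in table by hand in base R and orders the genes by chromosome and midpoint. Everything in it is hypothetical: the gene ids, the coordinates, and the `toy.*` names are invented and are not output of `cds2genepos`. It also assumes that the result can be handled like a data.frame and that `gene.mid` is the mean of start and end, neither of which is stated above. The `gene.idx` column is left out because its meaning is not described here.

```r
## toy stand-in for a cds2genepos() result; all values are invented
toy.genepos <- data.frame(
    gene.seq.id = c("geneC", "geneA", "geneB"),
    gene.chr = c("2", "1", "1"),
    gene.start = c(500, 3000, 100),
    gene.end = c(900, 3600, 700),
    gene.strand = c("+", "-", "+"),
    stringsAsFactors = FALSE)
## assumption: midpoint taken as the mean of start and end
toy.genepos$gene.mid <- (toy.genepos$gene.start + toy.genepos$gene.end) / 2
## order by chromosome, then by midpoint within each chromosome
toy.sorted <- toy.genepos[order(toy.genepos$gene.chr, toy.genepos$gene.mid), ]
## keep only the genes on chromosome "1"
toy.chr1 <- toy.sorted[toy.sorted$gene.chr == "1", ]
```

The same indexing pattern should carry over to a real result as long as it exposes the column names listed above.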

Author

Kristian K Ullrich

Examples

## load example sequence data
data("ath", package="CRBHits")
ath.genepos <- cds2genepos(
    cds=ath,
    source="ENSEMBL")
if (FALSE) { # \dontrun{
## download example sequence data from EnsemblPlants
## set EnsemblPlants URL
ensemblPlants <- "ftp://ftp.ensemblgenomes.org/pub/plants/release-48/fasta/"
## set Arabidopsis thaliana CDS URL
ARATHA.cds.url <- paste0(ensemblPlants,
    "arabidopsis_thaliana/cds/Arabidopsis_thaliana.TAIR10.cds.all.fa.gz")
ARATHA.cds.file <- tempfile()
## download CDS
download.file(ARATHA.cds.url, ARATHA.cds.file, quiet=FALSE)
ARATHA.cds <- Biostrings::readDNAStringSet(ARATHA.cds.file)
## get gene position
ARATHA.cds.genepos <- cds2genepos(ARATHA.cds, "ENSEMBL")
} # }