This function computes partial transcriptome evolutionary index (TEI) values for each gene.
Each gene receives a TEI contribution profile across samples (cells), defined as
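$$TEI_{is} = f_{is} \cdot ps_i$$
where \(TEI_{is}\) is the partial TEI value of gene i in sample (cell) s, \(f_{is} = e_{is} / \sum_{i} e_{is}\) is the expression \(e_{is}\) of gene i relative to the total expression in s, and \(ps_i\) is the phylostratum of gene i.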
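The formula can be traced by hand on a toy example. The following is a minimal base R sketch, not the package's implementation: the matrix `expr`, the vector `ps`, and all gene and sample names are made up for illustration, and a dense matrix stands in for the sparse input that `pMatrixTEI()` expects.

```r
# Toy expression matrix: 3 genes (rows) x 2 samples (columns); values are made up.
expr <- matrix(
  c(10,  0,
    30,  5,
    60, 15),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
)

# Hypothetical phylostratum assignment per gene.
ps <- c(geneA = 1, geneB = 2, geneC = 3)

# f_is: expression of gene i divided by the total expression in sample s.
f <- sweep(expr, 2, colSums(expr), "/")

# Partial TEI: each row (gene) of f is multiplied by that gene's phylostratum.
partial_tei <- f * ps[rownames(expr)]

# Summing the partial values over genes gives the global TEI per sample.
tei <- colSums(partial_tei)
```

Here sample `s1` has total expression 100, so `geneC` contributes 0.6 * 3 = 1.8 to a global TEI of 2.5.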
pMatrixTEI(
  ExpressionSet,
  Phylostratum = NULL,
  split = 1e+05,
  showprogress = TRUE,
  threads = 1
)
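ExpressionSet: an expression object with GeneID as row names (dgCMatrix) or a standard PhyloExpressionSet object.
Phylostratum: a named numeric vector giving the phylostratum of each gene, with GeneID as names (not used if ExpressionSet is a PhyloExpressionSet).
split: the number of columns used to split the expression matrix into chunks (default: 1e+05).
showprogress: a boolean indicating whether a progress bar should be shown.
threads: the number of threads to use.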
a numeric sparse matrix storing the partial TEI values for each gene.
The partial TEI matrix can be used to perform different cluster analyses and also gives an overall impression of the contribution of each gene to the global TEI pattern.
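As an illustration of such downstream analyses, the sketch below ranks genes by their relative contribution and groups them by contribution profile. It is only one possible approach under stated assumptions: `pm` is a small made-up dense matrix standing in for the sparse output of `pMatrixTEI()`, and hierarchical clustering with two groups is an arbitrary choice, not a recommendation from the package.

```r
# Made-up partial TEI matrix: 4 genes (rows) x 3 cells (columns).
pm <- matrix(
  c(0.10, 0.00, 0.20,
    0.60, 0.50, 0.40,
    1.80, 2.25, 1.50,
    0.50, 0.25, 0.90),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

# Relative contribution of each gene to the global TEI of each cell.
share <- sweep(pm, 2, colSums(pm), "/")

# Genes ordered by their mean relative contribution, largest first.
top_genes <- names(sort(rowMeans(share), decreasing = TRUE))

# Group genes with similar contribution profiles (Euclidean distance,
# complete linkage; both are defaults of dist() and hclust()).
hc <- hclust(dist(pm))
gene_groups <- cutree(hc, k = 2)
```

For a real sparse result, `Matrix::colSums()` and `Matrix::rowMeans()` would take the place of the base functions, and a large matrix would typically be filtered before computing distances.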
Domazet-Loso T. and Tautz D. (2010). A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature (468): 815-818.
Quint M et al. (2012). A transcriptomic hourglass in plant embryogenesis. Nature (490): 98-101.
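Drost HG et al. (2015). Evidence for active maintenance of phylotranscriptomic hourglass patterns in animal and plant embryogenesis. Mol Biol Evol. 32 (5): 1221-1231. doi:10.1093/molbev/msv012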
## get Seurat object
celegans <- readRDS(
  file = system.file("extdata", "celegans.embryo.SeuratData.rds", package = "scTEI")
)
## load Caenorhabditis elegans gene age estimation
celegans_ps <- readr::read_tsv(
  file = system.file("extdata", "Sun2021_Orthomap.tsv", package = "scTEI")
)
#> Rows: 20040 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): GeneID
#> dbl (1): Phylostratum
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## define Phylostratum
ps_vec <- setNames(
  as.numeric(celegans_ps$Phylostratum),
  celegans_ps$GeneID
)
## set cell identities to the embryo time bins
Seurat::Idents(celegans) <- "embryo.time.bin"
## get partial TEI values
pM <- pMatrixTEI(
  ExpressionSet = Seurat::GetAssayData(celegans, assay = "RNA", layer = "counts"),
  Phylostratum = ps_vec
)
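A possible follow-up step is to summarise the result per cell group. The sketch below assumes the returned matrix has genes as rows and cells as columns; `tei_by_group()` is a hypothetical helper that is not part of scTEI, and `pm_toy` and `toy_groups` are made-up stand-ins so the block runs without the example data.

```r
library(Matrix)

# Hypothetical helper (not part of scTEI): sum partial TEI values per cell
# to obtain the global TEI, then average it within each group of cells.
tei_by_group <- function(pm, groups) {
  tei_per_cell <- Matrix::colSums(pm)
  tapply(tei_per_cell, groups, mean)
}

# Made-up sparse matrix: 2 genes (rows) x 4 cells (columns), filled column-wise.
pm_toy <- Matrix::Matrix(
  matrix(c(0.5, 0, 1.0, 2.0, 0, 1.5, 0.5, 2.5), nrow = 2),
  sparse = TRUE
)
toy_groups <- c("early", "early", "late", "late")

group_means <- tei_by_group(pm_toy, toy_groups)
```

With the objects from the example above, a call such as `tei_by_group(pM, Seurat::Idents(celegans))` might give the mean TEI per embryo time bin, provided the columns of `pM` are in the same order as the cells of `celegans`.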