scTEI
vignettes/scTEI.Rmd
scTEI.Rmd
Abstract
scTEI add any phylogenetically based transcriptome evolutionary index (TEI) to single-cell data objects
The goal of scTEI
is to provide easy functionality to
add phylogenetically based transcriptome evolutionary index (TEI) to
single-cell data objects.
For a comprehensive overview about the topic of gene age assignments, transcriptome age index (TAI) and its derivates (TDI, TPI, Adjusted SD, PhastCons, …) see e.g.:
Since all of these values deal with the topic of weigthing transcriptome data with an evolutionary index, I combine all of them under one term transcriptome evolutionary index, short TEI.
The following sections introduce main single-cell gene
expression data analysis techniques implemented in scTEI
,
which basically update basic myTAI
functions to deal with
large single-cell data objects:
Adding transcriptome evolutionary index (TEI) to single-cell data objects
Seurat
monocle3
Visualizing TEI values
A variety of data objects are used to deal with sparse count data
(dgCMatrix
) from single-cell RNA sequencing
(scRNA-seq).
Here, we focus on the common used R packages Seurat
and
monocle3
, which both use a sparseMatrix
object
to store scRNA count data.
Note: Apparently, one important note is that count data or subsequent normalized counts or scaled data needs to be positive.
One should consider transforming scaled data to positive scale prior applying TEI calculations.
In this section we will introduce how to add TEI values to either a
seurat
or monocle3
cell data set.
Note that prior calculating TEI one needs to retrieve phylogenetic or taxonomic information for your focal species.
This might be a phylostratigraphic map (as introduced by (Domazet-Lošo, Brajković, and Tautz 2007)) or an ortho map (I call them, see e.g. (Julca et al. 2021) or (Cazet et al. 2022)), which can be obtained by assigning to each orthogroup (OG) or hierachical orthogroup (HOG) along a given species tree the ancestral node.
Please have a look at the introduction of the great
myTAI
package for possible sources, how to get such
phylostratigraphic maps (myTAI
- Introduction).
To create an ortho map, one simply needs to get OGs or HOGs with e.g. OrthoFinder ((Emms and Kelly 2019)) or Proteinortho ((Lechner et al. 2011)) or any other ortholog prediction tool (see (Linard et al. 2021)) using a set of species that cover the species range of your interest. Parse each OG or HOG for the oldest clade as compared to a species tree and your focal species of interest.
Or e.g. use pre-calculated OGs from e.g. https://omabrowser.org/oma/home ((Schneider, Dessimoz, and Gonnet 2007)) or https://bioinformatics.psb.ugent.be/plaza/ for plants ((Proost et al. 2009)).
Please visit the following documentation to install
Seurat
and monocole3
org.Hs.eg.db
and org.Mm.eg.db
to
be able to convert ensembl IDs into gene name alias
BiocManager::install(
c(
"org.Hs.eg.db",
"org.Mm.eg.db")
)
ggplot2 viridis cowplot ComplexHeatmap
install.packages("ggplot2")
install.packages("viridis")
install.packages("cowplot")
BiocManager::install("ComplexHeatmap")
For Seurat
, we use scRNA data obtained from
SeuratData
for the model organism Caenorhabditis
elegans (Packer et al. 2019) and
phylogenetic data obtained from (Sun,
Rödelsperger, and Sommer 2021)
(Sun2021_Supplemental_Table_S6).
Here, for demonstration purposes, we will be using the 6k
Caenorhabditis elegans embryos object that is available via the
SeuratData
package.
# load Packer and Zhu et al (2019) data set
SeuratData::InstallData("celegans.embryo.SeuratData")
library(celegans.embryo.SeuratData)
data("celegans.embryo")
celegans <- UpdateSeuratObject(celegans.embryo)
#celegans <- SeuratData::LoadData("celegans.embryo")
dim(celegans)
## [1] 20222 6188
## [1] "WBGene00010957" "WBGene00010958" "WBGene00010959" "WBGene00010960"
## [5] "WBGene00010961" "WBGene00000829"
# preprocess scRNA
all.genes <- rownames(celegans)
celegans <- Seurat::NormalizeData(
celegans,
normalization.method = "LogNormalize",
scale.factor = 10000) |>
Seurat::FindVariableFeatures(selection.method = "vst",
nfeatures = 2000) |>
Seurat::ScaleData(features = all.genes) |>
Seurat::RunPCA(dims=50) |>
Seurat::RunUMAP(dims = 1:10)
# overwrite NA in cell.type + add embryo.time.bin x cell.type
celegans@meta.data$cell.type[is.na(celegans@meta.data$cell.type)] <- "notClassified"
celegans@meta.data["embryo.time.bin.cell.type"] <- paste0(
unlist(celegans@meta.data["embryo.time.bin"]),
"-",
unlist(celegans@meta.data["cell.type"]))
# load Caenorhabditis elegans gene age estimation
celegans_ps <- readr::read_tsv(file = system.file("extdata",
"Sun2021_Orthomap.tsv", package = "scTEI"))
table(celegans_ps$Phylostratum)
##
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 1290 5434 4666 603 1039 808 274 884 590 511 384 1113 1277 1167
# define Phylostratum
ps_vec <- setNames(as.numeric(celegans_ps$Phylostratum),
celegans_ps$GeneID)
# add TEI values
celegans@meta.data["TEI"] <- TEI(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="counts"),
Phylostratum = ps_vec
)
## Use multiple threads to calculate TEI on sparseMatrix
celegans@meta.data["TEI"] <- TEI(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="counts"),
Phylostratum = ps_vec,
split = 1000,
threads = 2)
# make FeaturePlot
p2 <- FeaturePlot(
object = celegans,
features = "TEI",
min.cutoff='q05',
max.cutoff='q95',
cols = viridis(3))
print(p2)
# make RidgePlot
p3 <- RidgePlot(object = celegans,
features = "TEI",
group.by = "embryo.time.bin")
print(p3)
## Picking joint bandwidth of 0.0512
# make RidgePlot by cell type
Seurat::Idents(celegans) <- "cell.type"
p4 <- RidgePlot(object = celegans,
features = "TEI",
group.by = "cell.type") +
Seurat::NoLegend()
print(p4)
## Picking joint bandwidth of 0.054
# subset to specific cell type - ADF + notClassified
ADF <- subset(celegans,
cells = c(grep("ADF",
celegans@meta.data$cell.type),
grep("notClassified",
celegans@meta.data$cell.type)))
p5 <- DimPlot(object = ADF)
Seurat::Idents(ADF) <- "embryo.time.bin"
p6 <- DimPlot(object = ADF)
p7 <- RidgePlot(ADF, "TEI")
Seurat::Idents(ADF) <- "embryo.time.bin.cell.type"
p8 <- RidgePlot(object = ADF, features = "TEI") +
Seurat::NoLegend()
# make grid plot
print(plot_grid(p5, p6))
print(plot_grid(p7, p8))
## Picking joint bandwidth of 0.0608
## Picking joint bandwidth of 0.0704
# use pMatrix as data to cluster
# Seurat v4
#celegans.TEI <- Seurat::CreateSeuratObject(counts = celegans@assays$RNA@counts,
# meta.data = celegans@meta.data, row.names = rownames(celegans@assays$RNA@counts))
#celegans.TEI@assays$RNA@data <- pMatrixTEI(
# ExpressionSet = celegans.TEI@assays$RNA@counts,
# Phylostratum = ps_vec
#)
# Seurat v5
celegans.TEI <- Seurat::CreateSeuratObject(
counts = GetAssayData(celegans, assay="RNA", layer="counts"),
meta.data = celegans@meta.data)
celegans.TEI <- SetAssayData(
object = celegans.TEI,
layer = "data",
new.data = pMatrixTEI(
ExpressionSet = GetAssayData(celegans.TEI, assay="RNA", layer="counts"),
Phylostratum = ps_vec
)
)
all.genes <- rownames(GetAssayData(celegans.TEI, assay="RNA", layer="data"))
celegans.TEI <- Seurat::FindVariableFeatures(
celegans.TEI,
selection.method = "vst",
nfeatures = 2000) %>%
Seurat::ScaleData(do.scale = FALSE, do.center = FALSE,
features = all.genes) %>%
Seurat::RunPCA(dims=50) %>%
Seurat::RunUMAP(dims = 1:20)
## Finding variable features for layer counts
## Warning in PrepDR5(object = object, features = features, layer = layer, : The
## following features were not available: WBGene00018440, WBGene00219374,
## WBGene00008328, WBGene00008881, WBGene00020009, WBGene00010424, WBGene00015276,
## WBGene00219372.
## PC_ 1
## Positive: WBGene00006399, WBGene00194983, WBGene00012214, WBGene00017584, WBGene00022648, WBGene00020261, WBGene00021804, WBGene00269431, WBGene00012254, WBGene00045208
## WBGene00015774, WBGene00012301, WBGene00219421, WBGene00006051, WBGene00219280, WBGene00016724, WBGene00044562, WBGene00018234, WBGene00044502, WBGene00045272
## WBGene00022395, WBGene00018850, WBGene00194680, WBGene00206383, WBGene00022339, WBGene00195086, WBGene00043054, WBGene00077773, WBGene00020775, WBGene00206507
## Negative: WBGene00002085, WBGene00009212, WBGene00018352, WBGene00015610, WBGene00022112, WBGene00044066, WBGene00018606, WBGene00016071, WBGene00012899, WBGene00013553
## WBGene00002253, WBGene00014149, WBGene00020537, WBGene00016711, WBGene00020205, WBGene00016457, WBGene00012420, WBGene00021299, WBGene00017713, WBGene00011156
## WBGene00019039, WBGene00000217, WBGene00077531, WBGene00009861, WBGene00014156, WBGene00000468, WBGene00010100, WBGene00013098, WBGene00009180, WBGene00019918
## PC_ 2
## Positive: WBGene00002085, WBGene00002086, WBGene00015081, WBGene00019516, WBGene00001534, WBGene00006047, WBGene00006399, WBGene00194983, WBGene00012214, WBGene00022648
## WBGene00020261, WBGene00017584, WBGene00021804, WBGene00269431, WBGene00012254, WBGene00045208, WBGene00015774, WBGene00012301, WBGene00219280, WBGene00006051
## WBGene00219421, WBGene00044562, WBGene00016724, WBGene00018234, WBGene00044502, WBGene00022395, WBGene00045272, WBGene00018850, WBGene00206383, WBGene00194680
## Negative: WBGene00018352, WBGene00018606, WBGene00009212, WBGene00015610, WBGene00016071, WBGene00016711, WBGene00021915, WBGene00017611, WBGene00014149, WBGene00007641
## WBGene00017326, WBGene00019039, WBGene00016463, WBGene00016297, WBGene00007883, WBGene00012420, WBGene00011763, WBGene00020772, WBGene00044066, WBGene00016457
## WBGene00045386, WBGene00019482, WBGene00013553, WBGene00021272, WBGene00022112, WBGene00020537, WBGene00009983, WBGene00020205, WBGene00002253, WBGene00012899
## PC_ 3
## Positive: WBGene00018352, WBGene00009212, WBGene00018606, WBGene00016071, WBGene00015610, WBGene00019039, WBGene00014149, WBGene00044066, WBGene00020205, WBGene00020537
## WBGene00022112, WBGene00013553, WBGene00017611, WBGene00009861, WBGene00077531, WBGene00002253, WBGene00021299, WBGene00019035, WBGene00017713, WBGene00012899
## WBGene00000217, WBGene00014156, WBGene00007641, WBGene00045183, WBGene00016578, WBGene00020362, WBGene00022557, WBGene00007934, WBGene00019591, WBGene00019703
## Negative: WBGene00045386, WBGene00016463, WBGene00003890, WBGene00007883, WBGene00194921, WBGene00009983, WBGene00045336, WBGene00011500, WBGene00016711, WBGene00017068
## WBGene00003642, WBGene00021915, WBGene00010864, WBGene00011763, WBGene00010848, WBGene00008484, WBGene00021272, WBGene00004346, WBGene00003934, WBGene00011869
## WBGene00016254, WBGene00044020, WBGene00022441, WBGene00003851, WBGene00019177, WBGene00005246, WBGene00019482, WBGene00021395, WBGene00002085, WBGene00015901
## PC_ 4
## Positive: WBGene00018352, WBGene00002085, WBGene00017611, WBGene00021915, WBGene00007641, WBGene00016297, WBGene00011763, WBGene00020772, WBGene00016463, WBGene00019482
## WBGene00007883, WBGene00015062, WBGene00044068, WBGene00016254, WBGene00007490, WBGene00007675, WBGene00011500, WBGene00044419, WBGene00021272, WBGene00016532
## WBGene00009983, WBGene00019177, WBGene00022419, WBGene00010632, WBGene00017184, WBGene00044232, WBGene00003642, WBGene00020196, WBGene00016005, WBGene00004346
## Negative: WBGene00009212, WBGene00044066, WBGene00015610, WBGene00022112, WBGene00018606, WBGene00002253, WBGene00016071, WBGene00012899, WBGene00013553, WBGene00016457
## WBGene00020537, WBGene00021299, WBGene00014149, WBGene00011156, WBGene00019039, WBGene00020205, WBGene00017713, WBGene00014156, WBGene00007934, WBGene00010100
## WBGene00019918, WBGene00016578, WBGene00077531, WBGene00009861, WBGene00045183, WBGene00011475, WBGene00020362, WBGene00022557, WBGene00008510, WBGene00016520
## PC_ 5
## Positive: WBGene00045386, WBGene00018352, WBGene00003890, WBGene00016071, WBGene00015610, WBGene00010848, WBGene00018606, WBGene00019039, WBGene00020205, WBGene00020537
## WBGene00005246, WBGene00194921, WBGene00019416, WBGene00013553, WBGene00009861, WBGene00022112, WBGene00010693, WBGene00017713, WBGene00019035, WBGene00077531
## WBGene00021299, WBGene00006778, WBGene00045183, WBGene00021773, WBGene00045247, WBGene00022617, WBGene00001242, WBGene00020785, WBGene00009862, WBGene00022063
## Negative: WBGene00016463, WBGene00045336, WBGene00016711, WBGene00007883, WBGene00021915, WBGene00009983, WBGene00011763, WBGene00016254, WBGene00017068, WBGene00044020
## WBGene00011500, WBGene00003934, WBGene00003642, WBGene00016297, WBGene00004346, WBGene00002105, WBGene00011869, WBGene00020582, WBGene00019177, WBGene00010864
## WBGene00022441, WBGene00019482, WBGene00003745, WBGene00018164, WBGene00017076, WBGene00164973, WBGene00017236, WBGene00003851, WBGene00044232, WBGene00010632
## 13:17:23 UMAP embedding parameters a = 0.9922 b = 1.112
## 13:17:23 Read 6188 rows and found 20 numeric columns
## 13:17:23 Using Annoy for neighbor search, n_neighbors = 30
## 13:17:23 Building Annoy index with metric = cosine, n_trees = 50
## 0% 10 20 30 40 50 60 70 80 90 100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 13:17:23 Writing NN index file to temp file /var/folders/wy/d_ysgzdd3_bdgz02jfdy12z00000gp/T//RtmpqwpWAk/filee94538642fb1
## 13:17:23 Searching Annoy index using 1 thread, search_k = 3000
## 13:17:24 Annoy recall = 100%
## 13:17:24 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
## 13:17:25 Initializing from normalized Laplacian + noise (using RSpectra)
## 13:17:25 Commencing optimization for 500 epochs, with 252246 positive edges
## 13:17:25 Using rng type: pcg
## 13:17:31 Optimization finished
## Use multiple threads to calculate pMatrix
# Seurat v4
#celegans.TEI@assays$RNA@data <- pMatrixTEI(
# ExpressionSet = celegans.TEI@assays$RNA@counts,
# Phylostratum = ps_vec,
# split = 1000,
# threads = 2)
# Seurat v5
celegans.TEI <- SetAssayData(
object = celegans.TEI,
layer = "data",
new.data = pMatrixTEI(
ExpressionSet = GetAssayData(celegans.TEI, assay="RNA", layer="counts"),
Phylostratum = ps_vec,
split = 1000,
threads = 2
)
)
# make FeaturePlot
p10 <- FeaturePlot(
object = celegans.TEI,
features = "TEI",
min.cutoff='q05',
max.cutoff='q95',
cols = viridis(3))
print(p10)
Seurat::Idents(celegans) <- "cell.type"
p11 <- DimPlot(celegans)
Seurat::Idents(celegans.TEI) <- "cell.type"
p12 <- DimPlot(celegans.TEI)
# make grid plot
print(plot_grid(p2, p11))
print(plot_grid(p10, p12))
# get TEI per strata
pS <- pStrataTEI(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="counts"),
Phylostratum = ps_vec
)
## Use multiple threads to calculate pStrata
# Seurat v4
#pS <- pStrataTEI(
# ExpressionSet = celegans@assays$RNA@counts,
# Phylostratum =
# setNames(as.numeric(celegans_ps$Phylostratum),
# celegans_ps$GeneID),
# split = 1000,
# threads = 2)
# Seurat v5
pS <- pStrataTEI(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="counts"),
Phylostratum =
setNames(as.numeric(celegans_ps$Phylostratum),
celegans_ps$GeneID),
split = 1000,
threads = 2)
# get permutations
# Seurat v4
#bM <- bootTEI(
# ExpressionSet = celegans@assays$RNA@counts,
# Phylostratum = ps_vec,
# permutations = 100
#)
# Seurat v5
bM <- bootTEI(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="counts"),
Phylostratum = ps_vec,
permutations = 100
)
## Use multiple threads to get permutations
# Seurat v4
#bM <- bootTEI(
# ExpressionSet = celegans@assays$RNA@counts,
# Phylostratum = ps_vec,
# permutations = 100,
# split = 1000,
# threads = 2
#)
# Seurat v5
bM <- bootTEI(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="counts"),
Phylostratum = ps_vec,
permutations = 100,
split = 1000,
threads = 2
)
# get mean expression matrix
# Seurat v4
#meanMatrix <- REMatrix(
# ExpressionSet = celegans@assays$RNA@data,
# Phylostratum = ps_vec
#)
# Seurat v5
meanMatrix <- REMatrix(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="data"),
Phylostratum = ps_vec
)
# get mean expression matrix with groups
cell_groups <- setNames(
lapply(names(table(celegans@meta.data$cell.type)),
function(x){which(celegans@meta.data$cell.type==x)}),
names(table(celegans@meta.data$cell.type))
)
# Seurat v4
#meanMatrix_by_cell.type <- REMatrix(
# ExpressionSet = celegans@assays$RNA@scale.data,
# Phylostratum = ps_vec,
# groups = cell_groups
#)
# Seurat v5
meanMatrix_by_cell.type <- REMatrix(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="scale.data"),
Phylostratum = ps_vec,
groups = cell_groups
)
ComplexHeatmap::Heatmap(meanMatrix_by_cell.type,
cluster_rows = FALSE, cluster_columns = FALSE,
col = viridis::viridis(3))
# computing relative expression profile over cell types
cell_groups <- setNames(
lapply(names(table(celegans@meta.data$cell.type)),
function(x){which(celegans@meta.data$cell.type==x)}),
names(table(celegans@meta.data$cell.type))
)
# Seurat v4
#reMatrix_by_cell.type <- REMatrix(
# ExpressionSet = celegans@assays$RNA@scale.data,
# Phylostratum = ps_vec,
# groups = cell_groups,
# by = "row"
#)
# Seurat v5
reMatrix_by_cell.type <- REMatrix(
ExpressionSet = GetAssayData(celegans, assay="RNA", layer="scale.data"),
Phylostratum = ps_vec,
groups = cell_groups,
by = "row"
)
ComplexHeatmap::Heatmap(reMatrix_by_cell.type,
cluster_rows = FALSE)
For Monocle3
, we again use scRNA data obtained from
(Packer et al. 2019) for the model
organism Caenorhabditis elegans (Packer
et al. 2019) and phylogenetic data obtained from (Sun, Rödelsperger, and Sommer 2021)
(Sun2021_Supplemental_Table_S6).
# load Packer and Zhu et al (2019) data set
#expression_matrix <- readRDS(
# url(
# paste0("http://staff.washington.edu/hpliner/data/",
# "packer_embryo_expression.rds")
# )
#)
#cell_metadata <- readRDS(
# url(
# paste0("http://staff.washington.edu/hpliner/data/",
# "packer_embryo_colData.rds")
# )
#)
#gene_annotation <- readRDS(
# url(
# paste0("http://staff.washington.edu/hpliner/data/",
# "packer_embryo_rowData.rds")
# )
#)
packer_embryo_expression_path <- system.file("extdata",
"packer_embryo_expression.rds",
package = "scTEI")
expression_matrix <- readRDS(packer_embryo_expression_path)
packer_embryo_colData_path <- system.file("extdata",
"packer_embryo_colData.rds",
package = "scTEI")
cell_metadata <- readRDS(packer_embryo_colData_path)
packer_embryo_rowData_path <- system.file("extdata",
"packer_embryo_rowData.rds",
package = "scTEI")
gene_annotation <- readRDS(packer_embryo_rowData_path)
cds <- new_cell_data_set(
expression_data = expression_matrix,
cell_metadata = cell_metadata,
gene_metadata = gene_annotation
)
# preprocess scRNA
cds <- preprocess_cds(cds, num_dim = 50)
cds <- align_cds(cds, alignment_group = "batch",
residual_model_formula_str = "~ bg.300.loading +
bg.400.loading + bg.500.1.loading + bg.500.2.loading +
bg.r17.loading + bg.b01.loading + bg.b02.loading")
cds <- reduce_dimension(cds)
cds <- cluster_cells(cds)
cds <- learn_graph(cds)
## | | | 0% | |======================================================================| 100%
p1 <- plot_cells(cds,
label_groups_by_cluster=FALSE,
color_cells_by = "embryo.time.bin",
group_label_size = 5,
label_cell_groups = FALSE,
label_leaves = TRUE,
label_branch_points = TRUE,
graph_label_size=1.5)
print(p1)
# order cells - select youngest time point according to embryo.time
# - center of the plot
cds <- order_cells(cds,
root_cells = colnames(cds)[which(
colData(cds)$embryo.time == min(colData(cds)$embryo.time))])
colData(cds)["pseudotime"] <- pseudotime(cds)
# plot by pseudotime
p2 <- plot_cells(cds,
label_groups_by_cluster=FALSE,
color_cells_by = "pseudotime",
group_label_size = 5,
label_cell_groups = FALSE,
label_leaves = TRUE,
label_branch_points = TRUE,
graph_label_size=1.5)
print(p2)
# load Caenorhabditis elegans gene age estimation
celegans_ps <- readr::read_tsv(file = system.file("extdata",
"Sun2021_Orthomap.tsv", package = "scTEI"))
## Rows: 20040 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (1): GeneID
## dbl (1): Phylostratum
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
table(celegans_ps$Phylostratum)
##
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 1290 5434 4666 603 1039 808 274 884 590 511 384 1113 1277 1167
# define Phylostratum
ps_vec <- setNames(as.numeric(celegans_ps$Phylostratum),
celegans_ps$GeneID)
# add TEI values
colData(cds)["TEI"] <- TEI(
ExpressionSet = counts(cds),
Phylostratum = ps_vec
)
## Use multiple threads to calculate TEI on sparseMatrix
colData(cds)["TEI"] <- TEI(ExpressionSet = counts(cds),
Phylostratum =
setNames(as.numeric(celegans_ps$Phylostratum),
celegans_ps$GeneID),
split = 1000,
threads = 2)
# make FeaturePlot
p3 <- plot_cells(cds,
label_groups_by_cluster=FALSE,
color_cells_by = "TEI",
group_label_size = 5,
label_cell_groups = FALSE,
label_leaves = TRUE,
label_branch_points = TRUE,
graph_label_size=1.5,
cell_size = 1)
print(p3)
# make Boxplot
p4 <- ggplot2::ggplot(data.frame(colData(cds)),
aes(x=embryo.time.bin, y=TEI, fill=embryo.time.bin)) +
geom_violin() +
geom_boxplot(width=0.1)
print(p4)
# make scatter plot - TEI vs pseudotime
p5 <- ggplot(data.frame(colData(cds)),
aes(x=pseudotime, y=TEI, col=pseudotime)) +
geom_point()
print(p5)
p6 <- ggplot(data.frame(colData(cds)),
aes(x=pseudotime, y=TEI, col=embryo.time.bin)) +
geom_point()
print(p6)
# make grid plot
print(plot_grid(p1, p2, p3, p4))
## R version 4.4.2 (2024-10-31)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.2
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/Berlin
## tzcode source: internal
##
## attached base packages:
## [1] stats4 grid stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] monocle3_1.3.7 SingleCellExperiment_1.28.1
## [3] SummarizedExperiment_1.36.0 GenomicRanges_1.58.0
## [5] GenomeInfoDb_1.42.3 IRanges_2.40.1
## [7] S4Vectors_0.44.0 MatrixGenerics_1.18.1
## [9] matrixStats_1.5.0 Biobase_2.66.0
## [11] BiocGenerics_0.52.0 celegans.embryo.SeuratData_0.1.0
## [13] SeuratData_0.2.2.9001 Seurat_5.2.1
## [15] SeuratObject_5.0.2 sp_2.2-0
## [17] ComplexHeatmap_2.22.0 ggplot2_3.5.1
## [19] cowplot_1.1.3 viridis_0.6.5
## [21] viridisLite_0.4.2 readr_2.1.5
## [23] plyr_1.8.9 dplyr_1.1.4
## [25] scTEI_0.1.1 BiocStyle_2.34.0
##
## loaded via a namespace (and not attached):
## [1] fs_1.6.5 spatstat.sparse_3.1-0
## [3] httr_1.4.7 RColorBrewer_1.1-3
## [5] doParallel_1.0.17 tools_4.4.2
## [7] sctransform_0.4.1 ResidualMatrix_1.16.0
## [9] R6_2.6.1 lazyeval_0.2.2
## [11] uwot_0.2.3 GetoptLong_1.0.5
## [13] withr_3.0.2 gridExtra_2.3
## [15] myTAI_2.0.0 progressr_0.15.1
## [17] taxizedb_0.3.1 cli_3.6.4
## [19] textshaping_1.0.0 Cairo_1.6-2
## [21] spatstat.explore_3.3-4 fastDummies_1.7.5
## [23] labeling_0.4.3 sass_0.4.9
## [25] spatstat.data_3.1-4 proxy_0.4-27
## [27] ggridges_0.5.6 pbapply_1.7-2
## [29] pkgdown_2.1.1 systemfonts_1.2.1
## [31] parallelly_1.42.0 limma_3.62.2
## [33] rstudioapi_0.17.1 RSQLite_2.3.9
## [35] generics_0.1.3 shape_1.4.6.1
## [37] ica_1.0-3 spatstat.random_3.3-2
## [39] vroom_1.6.5 Matrix_1.7-2
## [41] ggbeeswarm_0.7.2 abind_1.4-8
## [43] lifecycle_1.0.4 yaml_2.3.10
## [45] SparseArray_1.6.2 Rtsne_0.17
## [47] blob_1.2.4 promises_1.3.2
## [49] crayon_1.5.3 miniUI_0.1.1.1
## [51] lattice_0.22-6 beachmat_2.22.0
## [53] magick_2.8.5 pillar_1.10.1
## [55] knitr_1.49 rjson_0.2.23
## [57] boot_1.3-31 future.apply_1.11.3
## [59] codetools_0.2-20 glue_1.8.0
## [61] leidenbase_0.1.32 spatstat.univar_3.1-1
## [63] data.table_1.17.0 vctrs_0.6.5
## [65] png_0.1-8 spam_2.11-1
## [67] Rdpack_2.6.2 gtable_0.3.6
## [69] assertthat_0.2.1 cachem_1.1.0
## [71] xfun_0.51 rbibutils_2.3
## [73] S4Arrays_1.6.0 mime_0.12
## [75] reformulas_0.4.0 survival_3.8-3
## [77] iterators_1.0.14 statmod_1.5.0
## [79] fitdistrplus_1.2-2 ROCR_1.0-11
## [81] nlme_3.1-167 bit64_4.6.0-1
## [83] RcppAnnoy_0.0.22 bslib_0.9.0
## [85] irlba_2.3.5.1 vipor_0.4.7
## [87] KernSmooth_2.23-26 colorspace_2.1-1
## [89] DBI_1.2.3 ggrastr_1.0.2
## [91] tidyselect_1.2.1 bit_4.5.0.1
## [93] compiler_4.4.2 curl_6.2.1
## [95] BiocNeighbors_2.0.1 desc_1.4.3
## [97] DelayedArray_0.32.0 plotly_4.10.4
## [99] bookdown_0.42 scales_1.3.0
## [101] lmtest_0.9-40 rappdirs_0.3.3
## [103] stringr_1.5.1 digest_0.6.37
## [105] goftest_1.2-3 spatstat.utils_3.1-2
## [107] minqa_1.2.8 rmarkdown_2.29
## [109] XVector_0.46.0 htmltools_0.5.8.1
## [111] pkgconfig_2.0.3 lme4_1.1-36
## [113] sparseMatrixStats_1.18.0 dbplyr_2.5.0
## [115] fastmap_1.2.0 rlang_1.1.5
## [117] GlobalOptions_0.1.2 htmlwidgets_1.6.4
## [119] UCSC.utils_1.2.0 shiny_1.10.0
## [121] DelayedMatrixStats_1.28.1 farver_2.1.2
## [123] jquerylib_0.1.4 zoo_1.8-13
## [125] jsonlite_1.9.0 BiocParallel_1.40.0
## [127] BiocSingular_1.22.0 magrittr_2.0.3
## [129] scuttle_1.16.0 GenomeInfoDbData_1.2.13
## [131] dotCall64_1.2 patchwork_1.3.0
## [133] munsell_0.5.1 Rcpp_1.0.14
## [135] reticulate_1.41.0 stringi_1.8.4
## [137] zlibbioc_1.52.0 MASS_7.3-64
## [139] parallel_4.4.2 listenv_0.9.1
## [141] ggrepel_0.9.6 deldir_2.0-4
## [143] splines_4.4.2 tensor_1.5
## [145] hms_1.1.3 circlize_0.4.16
## [147] igraph_2.1.4 spatstat.geom_3.3-5
## [149] RcppHNSW_0.6.0 ScaledMatrix_1.14.0
## [151] reshape2_1.4.4 evaluate_1.0.3
## [153] hoardr_0.5.5 BiocManager_1.30.25
## [155] batchelor_1.22.0 nloptr_2.1.1
## [157] tzdb_0.4.0 foreach_1.5.2
## [159] httpuv_1.6.15 RANN_2.6.2
## [161] tidyr_1.3.1 purrr_1.0.4
## [163] polyclip_1.10-7 future_1.34.0
## [165] clue_0.3-66 scattermore_1.2
## [167] rsvd_1.0.5 xtable_1.8-4
## [169] RSpectra_0.16-2 later_1.4.1
## [171] ragg_1.3.3 tibble_3.2.1
## [173] memoise_2.0.1 beeswarm_0.4.0
## [175] cluster_2.1.8 globals_0.16.3