Fun data visualization: drawing raincloud plots with the R package ggplot2 to show dinosaur body length across geological periods

Data source

https://www.kaggle.com/datasets/kjanjua/jurassic-park-the-exhaustive-dinosaur-dataset?resource=download

Screenshot of part of the data:

image.png
Read the data and preprocess it

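library(tidyverse)

# Split the period column into the era name and the two ends of its time range,
# then strip the "m" unit from the length column
read_csv("2024.data/20240421/data.csv") %>%
  mutate(X1 = str_extract(period, "^[A-Za-z]+ [A-Za-z]+"),     # era name, e.g. "Late Jurassic"
         X2 = str_extract(period, "[0-9]+") %>% as.numeric(),  # first number of the range
         X3 = str_extract(period, "-[0-9]+") %>%
           str_replace("-", "") %>% as.numeric()) %>%          # second number of the range
  mutate(X4 = (X2 + X3) / 2) %>%                               # midpoint of the range
  mutate(X5 = str_replace(length, "m", "") %>%
           as.numeric()) %>%                                   # body length in metres
  na.omit() -> dat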

dat %>% pull(X1) %>% table()
The geological periods in the data are, from oldest to youngest:
Late Triassic, Early Jurassic, Mid Jurassic, Late Jurassic, Early Cretaceous, Late Cretaceous
A box plot showing the time span of these geological periods
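
To make the regular expressions in the preprocessing step easier to follow, here is a minimal sketch that applies them to two hand-written strings. The sample values are hypothetical and only assume that `period` looks like "era name, then start-end million years ago" and that `length` carries a trailing "m"; check them against the actual CSV before relying on this.

```r
library(stringr)

# Hypothetical sample values, assumed to match the shape of the real columns
period      <- c("Late Cretaceous 74-70 million years ago",
                 "Early Jurassic 199-189 million years ago")
body_length <- c("8.0m", "1.2m")

era      <- str_extract(period, "^[A-Za-z]+ [A-Za-z]+")          # two leading words
start    <- as.numeric(str_extract(period, "[0-9]+"))            # first run of digits
end      <- as.numeric(str_replace(str_extract(period, "-[0-9]+"), "-", ""))
midpoint <- (start + end) / 2                                    # same as X4 in the post
len_m    <- as.numeric(str_replace(body_length, "m", ""))        # same as X5 in the post

print(data.frame(era, start, end, midpoint, len_m))
```

The character class is written as `[A-Za-z]` rather than `[A-z]`, because the latter also matches a few punctuation characters that sit between `Z` and `a` in ASCII.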
# Median time of each period, reused for the y-axis breaks, the labels and the dashed lines
period_medians <- dat %>%
  group_by(X1) %>%
  summarise(median_value = median(X4)) %>%
  arrange(median_value) %>%
  mutate(new_col = paste0(X1, "\n", median_value))

dat %>%
  mutate(X6 = "A") %>%  # dummy x value so that all boxes share one position
  mutate(X1 = factor(X1, levels = c("Late Triassic", "Early Jurassic",
                                    "Mid Jurassic", "Late Jurassic",
                                    "Early Cretaceous", "Late Cretaceous"))) %>%
  ggplot(aes(x = X6, y = X4)) +
  geom_boxplot(aes(fill = X1), position = position_dodge(0),
               show.legend = FALSE) +
  scale_y_continuous(breaks = period_medians$median_value,
                     labels = period_medians$new_col) +
  theme_bw(base_size = 20) +
  theme(panel.grid = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) +
  geom_segment(data = period_medians,
               aes(x = -Inf, xend = Inf, y = median_value, yend = median_value),
               lty = "dashed") +
  # geom_rect(data = dat %>%
  #             group_by(X1) %>%
  #             summarise(max_value = max(X4),
  #                       min_value = min(X4)) %>%
  #             ungroup(),
  #           aes(xmin = -Inf, xmax = Inf, ymin = min_value, ymax = max_value, fill = X1),
  #           inherit.aes = FALSE,
  #           alpha = 0.4) +
  labs(x = NULL, y = "Million years ago")

image.png
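
The y-axis of the box plot is labelled at the median time of each period instead of at regular intervals. The sketch below shows how those breaks and two-line labels can be derived, using a small made-up table in place of `dat`; the numbers are invented for illustration and `toy` and `label` are names chosen for this example.

```r
library(dplyr)

# Made-up data standing in for dat: X1 = period, X4 = midpoint in million years ago
toy <- tibble(
  X1 = c("Late Jurassic", "Late Jurassic",
         "Late Cretaceous", "Late Cretaceous", "Late Cretaceous"),
  X4 = c(150, 154, 70, 72, 80)
)

period_medians <- toy %>%
  group_by(X1) %>%
  summarise(median_value = median(X4)) %>%
  arrange(median_value) %>%                            # breaks must be in increasing order
  mutate(label = paste0(X1, "\n", median_value))       # period name above its median

print(period_medians)
```

Passing `period_medians$median_value` as `breaks` and `period_medians$label` as `labels` to `scale_y_continuous()` then puts one tick per period, with the period name and its median on separate lines.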
A raincloud plot showing dinosaur body length

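library(smplot2)  # provides sm_raincloud()

# Order the periods from oldest to youngest on the x-axis
dat %>%
  mutate(X1 = factor(X1, levels = c("Late Triassic", "Early Jurassic",
                                    "Mid Jurassic", "Late Jurassic",
                                    "Early Cretaceous", "Late Cretaceous"))) -> dat
dat

# ggwater2() is not defined in this post; it is assumed to be a watermark helper loaded elsewhere
ggplot(dat, aes(x = X1, y = X5, fill = X1)) +
  sm_raincloud() +
  theme(text = element_text(size = 13),
        axis.text.x = element_text(angle = 30, hjust = 1, vjust = 1)) +
  labs(x = NULL, y = NULL) +
  ggwater2(text = "小明的数据分析笔记本",
           scale = 0.6)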

image.png
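
For readers who do not have the dinosaur CSV at hand, the same kind of plot can be tried on R's built-in `iris` data. This is only a sketch: it assumes the smplot2 package is installed, uses `sm_raincloud()` with its default settings, and leaves out the watermark layer.

```r
library(ggplot2)
library(smplot2)

# One raincloud (half violin, box plot and jittered points) per species
p <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  sm_raincloud() +
  labs(x = NULL, y = "Sepal length (cm)")

print(p)
```

As in the dinosaur plot, mapping the grouping variable to both `x` and `fill` gives each group its own colour.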

image.png
You are welcome to follow my official account (公众号):
小明的数据分析笔记本

