ggstatsplot: Made for Academic Plotting (Part 1)
The plotting powerhouse ggstatsplot: made for academic papers
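CRAN (the Comprehensive R Archive Network) already hosts more than 13,000 R packages.
Simply put, ggstatsplot is a package that puts richer information into a plot; in effect, it draws high-quality figures
without requiring us to spend much effort on fine-tuning plot details. For example, a typical exploratory data analysis has two parts, data visualization and statistical analysis, and ggstatsplot is designed to combine the two.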
Examples
Between-group comparisons: ggbetweenstats
library(ggstatsplot)
library(ggplot2)
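For the type argument, "p" stands for parametric tests and "np" for non-parametric tests.
In the call below, mpaa is a categorical variable, and y (rating here) is a numeric variable.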
head(movies_long)
## # A tibble: 6 x 8
##   title                        year length budget rating  votes mpaa  genre
##   <chr>                       <int>  <int>  <dbl>  <dbl>  <int> <fct> <fct>
## 1 Shawshank Redemption, The    1994    142     25    9.1 149494 R     Drama
## 2 Lord of the Rings: The Ret~  2003    251     94    9   103631 PG-13 Acti~
## 3 Lord of the Rings: The Fel~  2001    208     93    8.8 157608 PG-13 Acti~
## 4 Lord of the Rings: The Two~  2002    223     94    8.8 114797 PG-13 Acti~
## 5 Pulp Fiction                 1994    168      8    8.8 132745 R     Drama
## 6 Schindler's List             1993    195     25    8.8  97667 R     Drama
ggbetweenstats(
  data = movies_long,
  x = mpaa, # > 2 groups
  y = rating,
  type = "p", # default
  messages = FALSE
)
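As a rough cross-check on what the type argument switches between, the sketch below runs the kind of tests I understand to sit behind "p" and "np" when there are more than two groups: Welch's one-way ANOVA and the Kruskal-Wallis test. That mapping is an assumption about the package internals rather than something stated in this post, and the built-in PlantGrowth data stands in for movies_long so the snippet runs with base R alone.

```r
# Stand-in data: PlantGrowth ships with base R (plant weight in three groups).
data(PlantGrowth)

# Parametric route: Welch's one-way ANOVA (no equal-variance assumption).
welch <- oneway.test(weight ~ group, data = PlantGrowth, var.equal = FALSE)

# Non-parametric route: Kruskal-Wallis rank sum test.
kw <- kruskal.test(weight ~ group, data = PlantGrowth)

print(welch)
print(kw)
```

Comparing these numbers with the subtitle of the corresponding ggbetweenstats plot is a quick way to see which test a given type value actually selected.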
Plotting with default arguments
ggbetweenstats(
  data = movies_long,
  x = mpaa,
  y = rating
)
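Pairwise comparisons
The pairwise.display argument controls which comparisons are shown: "ns" shows only the non-significant ones, "all" shows all of them, and "s" shows only the significant ones.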
ggbetweenstats(
  data = movies_long,
  x = mpaa,
  y = rating,
  type = "np",
  mean.ci = TRUE,
  pairwise.comparisons = TRUE,
  pairwise.display = "s",
  p.adjust.method = "fdr",
  messages = FALSE
)
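To make the p.adjust.method = "fdr" setting more concrete, here is a minimal sketch of the same correction using base R's p.adjust(). The three raw p-values are hypothetical numbers invented for illustration, not results from movies_long, and I am assuming the package applies the correction the same way base R does.

```r
# Hypothetical raw p-values from three pairwise comparisons (illustration only).
raw_p <- c(a_vs_b = 0.01, a_vs_c = 0.04, b_vs_c = 0.03)

# In p.adjust(), "fdr" is an alias for the Benjamini-Hochberg ("BH") correction.
adj_p <- p.adjust(raw_p, method = "fdr")
print(adj_p)

# Comparisons that would still count as significant at the 0.05 level,
# i.e. the ones a display restricted to significant pairs would keep.
print(names(adj_p)[adj_p < 0.05])
```

Swapping method for "holm" or "bonferroni" gives more conservative adjusted values for the same inputs.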
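Adjusting colors, theme, confidence level, and outlier labels
conf.level sets the confidence level, ggtheme sets the theme, and palette (together with package) selects the color palette.
The outlier.* arguments (outlier.tagging, outlier.label, outlier.coef) label points that fall beyond the outlier threshold.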
ggbetweenstats(
  data = movies_long,
  x = mpaa,
  y = rating,
  type = "r",
  conf.level = 0.99,
  pairwise.comparisons = TRUE,
  pairwise.annotation = "p",
  outlier.tagging = TRUE,
  outlier.label = title,
  outlier.coef = 2,
  ggtheme = hrbrthemes::theme_ipsum_tw(),
  palette = "Darjeeling2",
  package = "wesanderson",
  messages = FALSE
)
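The effect of outlier.coef can be sketched with a Tukey-style rule: a point is flagged when it lies more than coef times the interquartile range outside the quartiles. I am assuming this is the rule the package applies. The flag_outliers() helper and the rating values below are hypothetical and written only for this illustration.

```r
# Hypothetical helper (not part of ggstatsplot): flag values outside
# [Q1 - coef * IQR, Q3 + coef * IQR].
flag_outliers <- function(x, coef = 1.5) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  x < (q[1] - coef * iqr) | x > (q[2] + coef * iqr)
}

# Made-up ratings with one value far above the rest.
ratings <- c(5.5, 6.0, 6.2, 6.4, 6.5, 6.8, 7.0, 9.8)

# With coef = 2 only the last value is flagged.
flags <- flag_outliers(ratings, coef = 2)
print(ratings[flags])
```

A larger coef widens the fences, so fewer points get labeled on the plot.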
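Within-group comparisons: ggwithinstats
The plot is again very attractive, so I will not go through every argument; just use them as needed, which is also what the package author intends.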
ggwithinstats(
  data = WRS2::WineTasting,
  x = Wine, # > 2 groups
  y = Taste,
  pairwise.comparisons = TRUE,
  pairwise.annotation = "p",
  ggtheme = hrbrthemes::theme_ipsum_tw(),
  ggstatsplot.layer = FALSE,
  messages = FALSE
)
Correlation plots: ggscatterstats
The code is concise and the output is rich in detail.
ggscatterstats(
  data = movies_long,
  x = budget,
  y = rating,
  type = "p", # default
  conf.level = 0.99,
  marginal = FALSE,
  messages = TRUE
)
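For comparison, the statistic summarized on such a plot can be approximated with base R. The sketch below assumes that type = "p" corresponds to a Pearson correlation and that conf.level sets the width of its confidence interval. It uses the built-in mtcars data as a stand-in for movies_long.

```r
# Stand-in data: mtcars ships with base R.
data(mtcars)

# Pearson correlation between weight and fuel economy, with a 99% interval.
ct <- cor.test(mtcars$wt, mtcars$mpg, method = "pearson", conf.level = 0.99)

print(ct$estimate)  # correlation coefficient r
print(ct$conf.int)  # 99% confidence interval for r
print(ct$p.value)   # p-value for the null hypothesis r = 0
```

Setting method = "spearman" would be the rough base R analogue of a non-parametric correlation.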
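The package can draw many other kinds of plots as well, all of them very good-looking; I will not cover them here. It truly makes a picture worth a thousand words.
To sum up the limitations of this package:
Although the plots carry a lot of information, in some settings, such as a presentation with limited time, too much information in a figure gets in the way of conveying the message concisely.
In addition, the range of statistics it computes is fairly limited.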
References: [Official documentation] https://indrajeetpatil.github.io/ggstatsplot_slides/slides/ggstatsplot_presentation.html#35