Two-way ANOVA in R

The two-way ANOVA (analysis of variance) is a statistical method that allows you to evaluate the simultaneous effect of two categorical variables on a quantitative continuous variable.

The two-way ANOVA is an extension of the one-way ANOVA, since it allows you to evaluate the effects on a numerical response of two categorical variables instead of one. The advantage of a two-way ANOVA over a one-way ANOVA is that we test the relationship between two variables while taking into account the effect of a third variable. Moreover, it also allows you to include the possible interaction of the two categorical variables on the response.

The advantage of a two-way ANOVA over a one-way ANOVA is quite similar to the advantage of a multiple linear regression over a correlation:

  • The correlation measures the relationship between two quantitative variables. The multiple linear regression also measures the relationship between two variables, but this time taking into account the potential effect of other covariates.
  • The one-way ANOVA tests whether a quantitative variable is different between groups. The two-way ANOVA also tests whether a quantitative variable is different between groups, but this time taking into account the effect of another qualitative variable.

Previously, we discussed the one-way ANOVA in R. Here, we show when, why, and how to perform a two-way ANOVA in R.

Before going further, I would like to mention and briefly describe some related statistical methods and tests, in order to avoid any confusion:

A Student's t-test is used to evaluate the effect of one categorical variable on a quantitative continuous variable, when the categorical variable has exactly 2 levels:

  • Student's t-test for independent samples if the observations are independent (for example: if we compare the age between men and women)
  • Student's t-test for paired samples if the observations are dependent, that is, when they come in pairs (this is the case when the same subjects are measured twice, at two different points in time, before and after a treatment for instance)

To evaluate the effect of one categorical variable on a quantitative variable, when the categorical variable has 3 or more levels:1

  • one-way ANOVA (often simply referred to as ANOVA) if the groups are independent (for example a group of patients who received treatment A, another group of patients who received treatment B, and a last group of patients who received no treatment or a placebo)
  • repeated measures ANOVA if the groups are dependent (when the same subjects are measured three times, at three different points in time: before, during and after a treatment for instance)

A two-way ANOVA is used to evaluate the effects of two categorical variables (and their potential interaction) on a quantitative continuous variable. This is the topic of this post.

Linear regression is used to evaluate the relationship between a quantitative continuous dependent variable and one or several independent variables:

  • simple linear regression if there is only one independent variable (which can be quantitative or qualitative)
  • multiple linear regression if there are at least two independent variables (which can be quantitative, qualitative, or a mix of both)

An ANCOVA (analysis of covariance) is used to evaluate the effect of a categorical variable on a quantitative variable, while controlling for the effect of another quantitative variable (called a covariate). ANCOVA is actually a special case of multiple linear regression with a mix of one qualitative and one quantitative independent variable.
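For illustration only, here is a minimal sketch of what such an ANCOVA could look like in R, using the penguins dataset introduced below and flipper length as a hypothetical covariate (this example is my addition and is not part of the analysis in this post):

# illustrative ANCOVA sketch: effect of species on body mass,
# controlling for flipper length (a quantitative covariate)
library(palmerpenguins)
library(car)
mod_ancova <- lm(body_mass_g ~ species + flipper_length_mm, data = penguins)
Anova(mod_ancova) # type II tests for each term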

In this post, we start by explaining when and why a two-way ANOVA is useful, we then do some preliminary descriptive analyses and present how to conduct a two-way ANOVA in R. Finally, we show how to interpret and visualize the results. We also briefly mention and illustrate how to verify the underlying assumptions.

To illustrate how to perform a two-way ANOVA in R, we use the penguins dataset, available from the {palmerpenguins} package.

We do not need to import the dataset, but we need to load the package first and then call the dataset:

# install.packages("palmerpenguins")
library(palmerpenguins)
dat <- penguins # rename dataset
str(dat) # structure of dataset
## tibble [344 × 8] (S3: tbl_df/tbl/data.frame)
##  $ species          : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ island           : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ bill_length_mm   : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
##  $ bill_depth_mm    : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
##  $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
##  $ body_mass_g      : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
##  $ sex              : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
##  $ year             : int [1:344] 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...

The dataset contains 8 variables for 344 penguins, summarized below:

summary(dat)
##       species          island    bill_length_mm  bill_depth_mm  
##  Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10  
##  Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60  
##  Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30  
##                                  Mean   :43.92   Mean   :17.15  
##                                  3rd Qu.:48.50   3rd Qu.:18.70  
##                                  Max.   :59.60   Max.   :21.50  
##                                  NA's   :2       NA's   :2      
##  flipper_length_mm  body_mass_g       sex           year     
##  Min.   :172.0      Min.   :2700   female:165   Min.   :2007  
##  1st Qu.:190.0      1st Qu.:3550   male  :168   1st Qu.:2007  
##  Median :197.0      Median :4050   NA's  : 11   Median :2008  
##  Mean   :200.9      Mean   :4202                Mean   :2008  
##  3rd Qu.:213.0      3rd Qu.:4750                3rd Qu.:2009  
##  Max.   :231.0      Max.   :6300                Max.   :2009  
##  NA's   :2          NA's   :2                                 

In this post, we will focus on the following three variables:

  • species: the species of the penguin (Adelie, Chinstrap or Gentoo)
  • sex: sex of the penguin (female and male)
  • body_mass_g: body mass of the penguin (in grams)

If needed, more details about this dataset can be found by running ?penguins in R.

body_mass_g is the quantitative continuous variable and will be the dependent variable, whereas species and sex are both qualitative variables.

These two last variables will be our independent variables, also referred to as factors. Make sure that they are read as factors by R. If that is not the case, they will need to be transformed into factors.
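A minimal sketch of that check (and of the conversion, should it be needed) could look like this:

# check that the two independent variables are read as factors by R
is.factor(dat$species)
is.factor(dat$sex)
# if not, convert them to factors (not needed here, shown for illustration only)
dat$species <- as.factor(dat$species)
dat$sex <- as.factor(dat$sex)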

As mentioned above, a two-way ANOVA is used to evaluate simultaneously the effect of two categorical variables on one quantitative continuous variable.

It is called two-way ANOVA because we are comparing groups that are formed by two independent categorical variables.

Here, we would like to know if body mass depends on species and/or sex. In particular, we are interested in:

  1. measuring and testing the relationship between species and body mass,
  2. measuring and testing the relationship between sex and body mass, and
  3. potentially checking whether the relationship between species and body mass is different for females and males (which is equivalent to checking whether the relationship between sex and body mass depends on the species)

The first two relationships are referred to as main effects, while the third point is known as the interaction effect.

The main effects test whether at least one group is different from another one (while controlling for the other independent variable). On the other hand, the interaction effect aims at testing whether the relationship between two variables differs depending on the level of a third variable.

When performing a two-way ANOVA, testing the interaction effect is not mandatory. However, omitting an interaction effect may lead to erroneous conclusions if the interaction effect is present.

If we go back to our example, we have the following hypothesis tests:

Main effect of sex on body mass:

  • H0: mean body mass is equal between females and males
  • H1: mean body mass is different between females and males

Main effect of species on body mass:

  • H0: mean body mass is equal between all 3 species
  • H1: mean body mass is different for at least one species

Interaction between sex and species:

  • H0: there is no interaction between sex and species, meaning that the relationship between species and body mass is the same for females and males (similarly, the relationship between sex and body mass is the same for all 3 species)
  • H1: there is an interaction between sex and species, meaning that the relationship between species and body mass is different for females than for males (similarly, the relationship between sex and body mass depends on the species)

Most statistical tests require some assumptions for the results to be valid, and a two-way ANOVA is not an exception.

Assumptions of a two-way ANOVA are similar to those of a one-way ANOVA. To summarize:

  • Variable type: the dependent variable must be quantitative continuous, while the two independent variables must be categorical (with at least two levels).
  • Independence: the observations should be independent between groups and within each group.
  • Normality:
  • For small samples, data should follow approximately a normal distribution
  • For large samples (usually n ≥ 30 in each group/sample), normality is not required (thanks to the central limit theorem)
  • Equality of variances: variances should be equal across groups.
  • Outliers: there should be no significant outliers in any group.

More details about these assumptions can be found in the assumptions of a one-way ANOVA.

Now that we have seen the underlying assumptions of the two-way ANOVA, we review them specifically for our dataset before applying the test and interpreting the results.

The dependent variable body mass is quantitative continuous, while both independent variables sex and species are qualitative (with at least 2 levels).

Therefore, this assumption is met.

Independence is usually checked based on the design of the experiment and the way data have been collected.

To keep it simple, observations are usually:

  • independent if each experimental unit (here a penguin) has been measured only once and the observations are collected from a representative and randomly selected portion of the population, or
  • dependent if each experimental unit has been measured at least twice (as is often the case in the medical field, for example, with two measurements on the same subjects; one before and one after the treatment).

In our case, body mass has been measured only once on each penguin, and on a representative and random sample of the population, so the independence assumption is met.

We have a large sample in all subgroups (each combination of the levels of the two factors, called a cell):

table(dat$species, dat$sex)
##            
##             female male
##   Adelie        73   73
##   Chinstrap     34   34
##   Gentoo        58   61

so normality does not need to be checked.

For completeness, we still show how to verify normality, as if we had small samples.

There are several methods to test the normality assumption, the most common being:

  • a QQ-plot by group or on the residuals, and/or
  • a histogram by group or on the residuals, and/or
  • a normality test (Shapiro-Wilk test for instance) by group or on the residuals (a by-group sketch follows this list).
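Since the rest of this section relies on the residual-based approach, here is only a brief sketch of what the by-group approach could look like, namely a Shapiro-Wilk test within each sex/species cell (this sketch is my addition):

# illustrative sketch: Shapiro-Wilk p-value within each sex/species cell
tapply(dat$body_mass_g,
  list(dat$sex, dat$species),
  FUN = function(x) shapiro.test(x)$p.value
)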

The easiest/shortest way is to verify normality with a QQ-plot on the residuals. To draw this plot, we first need to save the model:

# save model
mod <- aov(body_mass_g ~ sex * species,
data = dat
)

This piece of code will be explained further.

Now we can draw the QQ-plot on the residuals. We show two ways to do so, first with the plot() function and second with the qqPlot() function from the {car} package:

# method 1
plot(mod, which = 2)
Plot by author
# method 2
library(car)
qqPlot(mod$residuals,
id = FALSE # remove point identification
)
Plot by author

Code for method 1 is slightly shorter, but it misses the confidence interval around the reference line.

If points follow the straight line (called Henry's line) and fall within the confidence band, we can assume normality. This is the case here.

If you prefer to verify normality based on a histogram of the residuals, here is the code:

# histogram
hist(mod$residuals)
Plot by author

The histogram of the residuals shows a Gaussian distribution, which is in line with the conclusion from the QQ-plot.

Although the QQ-plot and histogram are already enough to verify normality, if you want to test it more formally with a statistical test, the Shapiro-Wilk test can be applied on the residuals as well:

# normality test
shapiro.test(mod$residuals)
## 
## Shapiro-Wilk normality test
##
## data: mod$residuals
## W = 0.99776, p-value = 0.9367

⇒ We do not reject the null hypothesis that the residuals follow a normal distribution (p-value = 0.937).

From the QQ-plot, histogram and Shapiro-Wilk test, we conclude that we do not reject the null hypothesis of normality of the residuals.

The normality assumption is thus verified; we can now check the equality of the variances.2

Equality of variances, also referred to as homogeneity of variances or homoscedasticity, can be verified visually with the plot() function:

plot(mod, which = 3)
Plot by author

Since the spread of the residuals is constant, the red smooth line is horizontal and flat, so it looks like the constant variance assumption is satisfied here.

The diagnostic plot above is sufficient, but if you prefer, it can also be tested more formally with Levene's test (also from the {car} package):3

leveneTest(mod)
## Levene's Test for Homogeneity of Variance (center = median)
##        Df F value Pr(>F)
## group   5  1.3908 0.2272
##       327

⇒ We do not reject the null hypothesis that the variances are equal (p-value = 0.227).

Both the visual and formal approaches give the same conclusion; we do not reject the hypothesis of homogeneity of the variances.

The easiest and most common way to detect outliers is visually, thanks to boxplots by group.

For females and males:

library(ggplot2)
# boxplots by sex
ggplot(dat) +
aes(x = sex, y = body_mass_g) +
geom_boxplot()
Plot by author

For the three species:

# boxplots by species
ggplot(dat) +
aes(x = species, y = body_mass_g) +
geom_boxplot()
Plot by author

There are, as defined by the interquartile range criterion, two outliers for the species Chinstrap. These points are, however, not extreme enough to bias the results.

Therefore, we consider that the assumption of no significant outliers is met.
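If you would like to list these two points rather than just spot them on the plot, here is a minimal sketch based on the same interquartile range criterion, via boxplot.stats() (this sketch is my addition):

# illustrative sketch: extract the outlying body masses for Chinstrap penguins,
# according to the 1.5 * IQR rule used by boxplots
chinstrap_mass <- dat$body_mass_g[dat$species == "Chinstrap"]
boxplot.stats(chinstrap_mass)$out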

We have shown that all assumptions are met, so we can now proceed to the implementation of the two-way ANOVA in R.

This will allow us to answer the following research questions:

  • Controlling for the species, is body mass significantly different between the two sexes?
  • Controlling for the sex, is body mass significantly different for at least one species?
  • Is the relationship between species and body mass different between female and male penguins?

Before performing any statistical test, it is good practice to compute some descriptive statistics in order to have a first overview of the data and, perhaps, a glimpse of the results to be expected.

This can be done via descriptive statistics or plots.

If we want to keep it simple, we can compute only the mean for each subgroup:

# mean by group
aggregate(body_mass_g ~ species + sex,
data = dat,
FUN = mean
)
##     species    sex body_mass_g
## 1    Adelie female    3368.836
## 2 Chinstrap female    3527.206
## 3    Gentoo female    4679.741
## 4    Adelie   male    4043.493
## 5 Chinstrap   male    3938.971
## 6    Gentoo   male    5484.836

Or alternatively, the mean and standard deviation for each subgroup using the {dplyr} package:

# mean and sd by group
library(dplyr)
group_by(dat, sex, species) %>%
summarise(
mean = round(mean(body_mass_g, na.rm = TRUE)),
sd = round(sd(body_mass_g, na.rm = TRUE))
)
## # A tibble: 8 × 4
## # Groups:   sex [3]
##   sex    species    mean    sd
##   <fct>  <fct>     <dbl> <dbl>
## 1 female Adelie     3369   269
## 2 female Chinstrap  3527   285
## 3 female Gentoo     4680   282
## 4 male   Adelie     4043   347
## 5 male   Chinstrap  3939   362
## 6 male   Gentoo     5485   313
## 7 <NA>   Adelie     3540   477
## 8 <NA>   Gentoo     4588   338

If you are a frequent reader of the blog, you know that I like to draw plots to visualize the data at hand before interpreting the results of a test.

The most appropriate plot when we have one quantitative and two qualitative variables is a boxplot by group. This can easily be made with the {ggplot2} package:

# boxplot by group
library(ggplot2)
ggplot(dat) +
aes(x = species, y = body_mass_g, fill = sex) +
geom_boxplot()
Plot by author

Some observations are missing for sex; we can remove them to have a more concise plot:

dat %>%
filter(!is.na(sex)) %>%
ggplot() +
aes(x = species, y = body_mass_g, fill = sex) +
geom_boxplot()
Plot by author

Note that we could also have made the following plot:

dat %>%
filter(!is.na(sex)) %>%
ggplot() +
aes(x = sex, y = body_mass_g, fill = species) +
geom_boxplot()
Plot by author

But for a more readable plot, I tend to prefer putting the variable with the smallest number of levels as the color (which is in fact the argument fill in the aes() layer) and the variable with the largest number of categories on the x-axis (i.e., the argument x in the aes() layer).

From the means and the boxplots by subgroup, we can already see that, in our sample:

  • female penguins tend to have a lower body mass than males, and this is the case for all the considered species, and
  • body mass is higher for Gentoo penguins than for the other two species.

Keep in mind that these conclusions are only valid within our sample! To generalize these conclusions to the population, we need to perform the two-way ANOVA and check the significance of the explanatory variables. That is the aim of the next section.

As mentioned earlier, including an interaction effect in a two-way ANOVA is not compulsory. However, in order to avoid flawed conclusions, it is recommended to first check whether the interaction is significant or not and, depending on the results, include it or not.

If the interaction is not significant, it is safe to remove it from the final model. On the contrary, if the interaction is significant, it should be included in the final model which will be used to interpret the results.

We thus start with a model which includes the two main effects (i.e., sex and species) and the interaction:

# Two-way ANOVA with interaction
# save model
mod <- aov(body_mass_g ~ sex * species,
data = dat
)
# print results
summary(mod)
##              Df    Sum Sq  Mean Sq F value   Pr(>F)    
## sex           1  38878897 38878897 406.145  < 2e-16 ***
## species       2 143401584 71700792 749.016  < 2e-16 ***
## sex:species   2   1676557   838278   8.757 0.000197 ***
## Residuals   327  31302628    95727                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 11 observations deleted due to missingness

The sum of squares (column Sum Sq) shows that species explains a large part of the variability in body mass. It is the most important factor in explaining this variability.

The p-values are displayed in the last column of the output above (Pr(>F)). From these p-values, we conclude that, at the 5% significance level:

  • controlling for the species, body mass is significantly different between the two sexes,
  • controlling for the sex, body mass is significantly different for at least one species, and
  • the interaction between sex and species (displayed at the line sex:species in the output above) is significant.

So from the significant interaction effect, we have just seen that the relationship between body mass and species is different for males and females. Since it is significant, we have to keep it in the model and we should interpret the results from that model.

If, on the contrary, the interaction were not significant (that is, if the p-value ≥ 0.05) we would have removed this interaction effect from the model. For illustrative purposes, below is the code for a two-way ANOVA without interaction, referred to as an additive model:

# Two-way ANOVA without interaction
aov(body_mass_g ~ sex + species,
data = dat
)

For readers who are used to performing linear regressions in R, you will notice that the structure of the code for a two-way ANOVA is actually similar:

  • the formula is dependent variable ~ independent variables
  • the + sign is used to include independent variables without an interaction4
  • the * sign is used to include independent variables with an interaction (an equivalent formulation is shown right after this list)
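The * notation is simply shorthand for the main effects plus their interaction, so the model with interaction could equivalently be written as follows (a sketch for illustration):

# equivalent formulation of the model with interaction:
# sex * species is shorthand for sex + species + sex:species
aov(body_mass_g ~ sex + species + sex:species,
  data = dat
)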

The resemblance with a linear regression is not a surprise, because a two-way ANOVA, like any ANOVA, is actually a linear model.

Note that the following code works as well, and gives the same results:

# method 2
mod2 <- lm(body_mass_g ~ sex * species,
data = dat
)
Anova(mod2)
## Anova Table (Type II tests)
##
## Response: body_mass_g
## Sum Sq Df F value Pr(>F)
## sex 37090262 1 387.460 < 2.2e-16 ***
## species 143401584 2 749.016 < 2.2e-16 ***
## sex:species 1676557 2 8.757 0.0001973 ***
## Residuals 31302628 327
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note that the aov() function assumes a balanced design, meaning that we have equal sample sizes within the levels of our independent grouping variables.

For an unbalanced design, that is, unequal numbers of subjects in each subgroup, the recommended methods are:

  • the Type-II ANOVA when there is no significant interaction, which can be done in R with Anova(mod, type = "II"),5 and
  • the Type-III ANOVA when there is a significant interaction, which can be done in R with Anova(mod, type = "III").

This is beyond the scope of this post and we assume a balanced design here. For the interested reader, see this detailed discussion about type I, type II and type III ANOVA.
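For readers who nonetheless want to see the syntax, here is a minimal sketch of the two calls (setting sum-to-zero contrasts before a Type-III ANOVA is a common recommendation when using the {car} package; this sketch is my addition and is not used in the rest of the post):

# illustrative sketch: Type-II and Type-III ANOVA with the {car} package
library(car)
Anova(mod, type = "II")
# for a Type-III ANOVA, sum-to-zero contrasts are usually set first
mod_sum <- aov(body_mass_g ~ sex * species,
  data = dat,
  contrasts = list(sex = contr.sum, species = contr.sum)
)
Anova(mod_sum, type = "III")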

From the two significant main effects, we concluded that:

  • controlling for the species, body mass is different between females and males, and
  • controlling for the sex, body mass is different for at least one species.

If body mass is different between the two sexes, given that there are exactly two sexes, it must be because body mass is significantly different between females and males.

If one wants to know which sex has the highest body mass, it can be deduced from the means and/or boxplots by subgroup. Here, it is clear that males have a significantly higher body mass than females.

However, it is not so straightforward for the species. Let me explain why it is not as easy as for the sexes.

There are three species (Adelie, Chinstrap and Gentoo), so there are 3 pairs of species:

  1. Adelie and Chinstrap
  2. Adelie and Gentoo
  3. Chinstrap and Gentoo

If body mass is significantly different for at least one species, it could be that:

  • body mass is significantly different between Adelie and Chinstrap but not significantly different between Adelie and Gentoo, and not significantly different between Chinstrap and Gentoo, or
  • body mass is significantly different between Adelie and Gentoo but not significantly different between Adelie and Chinstrap, and not significantly different between Chinstrap and Gentoo, or
  • body mass is significantly different between Chinstrap and Gentoo but not significantly different between Adelie and Chinstrap, and not significantly different between Adelie and Gentoo.

Or, it could be that:

  • body mass is significantly different between Adelie and Chinstrap, and between Adelie and Gentoo, but not significantly different between Chinstrap and Gentoo, or
  • body mass is significantly different between Adelie and Chinstrap, and between Chinstrap and Gentoo, but not significantly different between Adelie and Gentoo, or
  • body mass is significantly different between Chinstrap and Gentoo, and between Adelie and Gentoo, but not significantly different between Adelie and Chinstrap.

Last, it could be that body mass is significantly different between all species.

As for a one-way ANOVA, we cannot, at this stage, know precisely which species differs from which in terms of body mass. To know this, we need to compare each pair of species with post-hoc tests (also known as pairwise comparisons).

There are several post-hoc tests, the most common one being the Tukey HSD, which tests all possible pairs of groups. As mentioned earlier, this test only needs to be done on the species variable, because there are only two levels for sex.

As for the one-way ANOVA, the Tukey HSD can be done in R as follows:

# method 1
TukeyHSD(mod,
which = "species"
)
##   Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = body_mass_g ~ sex * species, data = dat)
##
## $species
## diff lwr upr p adj
## Chinstrap-Adelie 26.92385 -80.0258 133.8735 0.8241288
## Gentoo-Adelie 1377.65816 1287.6926 1467.6237 0.0000000
## Gentoo-Chinstrap 1350.73431 1239.9964 1461.4722 0.0000000

or using the {multcomp} package:

# method 2
library(multcomp)
summary(glht(
aov(body_mass_g ~ sex + species,
data = dat
),
linfct = mcp(species = "Tukey")
))
## 
## Simultaneous Tests for General Linear Hypotheses
##
## Multiple Comparisons of Means: Tukey Contrasts
##
##
## Fit: aov(formula = body_mass_g ~ sex + species, data = dat)
##
## Linear Hypotheses:
## Estimate Std. Error t value Pr(>|t|)
## Chinstrap - Adelie == 0 26.92 46.48 0.579 0.83
## Gentoo - Adelie == 0 1377.86 39.10 35.236 <1e-05 ***
## Gentoo - Chinstrap == 0 1350.93 48.13 28.067 <1e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

or using the pairwise.t.test() function with the p-value adjustment method of your choice:6

# method 3
pairwise.t.test(dat$body_mass_g, dat$species,
p.adjust.method = "BH"
)
## 
## Pairwise comparisons using t tests with pooled SD
##
## data: dat$body_mass_g and dat$species
##
## Adelie Chinstrap
## Chinstrap 0.63 -
## Gentoo <2e-16 <2e-16
##
## P value adjustment method: BH

Note that when using the second method, it is the model without the interaction that needs to be specified in the glht() function, even if the interaction is significant. Moreover, do not forget to replace mod and species in my code with the name of your model and the name of your independent variable.

All three methods give the same results, that is:

  • body mass is not significantly different between Chinstrap and Adelie (adjusted p-value = 0.83),
  • body mass is significantly different between Gentoo and Adelie (adjusted p-value < 0.001), and
  • body mass is significantly different between Gentoo and Chinstrap (adjusted p-value < 0.001).

Remember that it is the adjusted p-values that are reported, to prevent the issue of multiple testing which occurs when comparing several pairs of groups.

If you would like to compare all combinations of groups, it can be done with the TukeyHSD() function by specifying the interaction in the which argument:

# all combos of sex and species
TukeyHSD(mod,
which = "sex:species"
)
##   Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = body_mass_g ~ sex * species, data = dat)
##
## $`sex:species`
## diff lwr upr p adj
## male:Adelie-female:Adelie 674.6575 527.8486 821.4664 0.0000000
## female:Chinstrap-female:Adelie 158.3703 -25.7874 342.5279 0.1376213
## male:Chinstrap-female:Adelie 570.1350 385.9773 754.2926 0.0000000
## female:Gentoo-female:Adelie 1310.9058 1154.8934 1466.9181 0.0000000
## male:Gentoo-female:Adelie 2116.0004 1962.1408 2269.8601 0.0000000
## female:Chinstrap-male:Adelie -516.2873 -700.4449 -332.1296 0.0000000
## male:Chinstrap-male:Adelie -104.5226 -288.6802 79.6351 0.5812048
## female:Gentoo-male:Adelie 636.2482 480.2359 792.2606 0.0000000
## male:Gentoo-male:Adelie 1441.3429 1287.4832 1595.2026 0.0000000
## male:Chinstrap-female:Chinstrap 411.7647 196.6479 626.8815 0.0000012
## female:Gentoo-female:Chinstrap 1152.5355 960.9603 1344.1107 0.0000000
## male:Gentoo-female:Chinstrap 1957.6302 1767.8040 2147.4564 0.0000000
## female:Gentoo-male:Chinstrap 740.7708 549.1956 932.3460 0.0000000
## male:Gentoo-male:Chinstrap 1545.8655 1356.0392 1735.6917 0.0000000
## male:Gentoo-female:Gentoo 805.0947 642.4300 967.7594 0.0000000

Or with the HSD.test() function from the {agricolae} package, which denotes subgroups that are not significantly different from each other with the same letter:

library(agricolae)
HSD.test(mod,
trt = c("sex", "species"),
console = TRUE # print results
)
## 
## Study: mod ~ c("sex", "species")
##
## HSD Test for body_mass_g
##
## Mean Square Error: 95726.69
##
## sex:species, means
##
## body_mass_g std r Min Max
## female:Adelie 3368.836 269.3801 73 2850 3900
## female:Chinstrap 3527.206 285.3339 34 2700 4150
## female:Gentoo 4679.741 281.5783 58 3950 5200
## male:Adelie 4043.493 346.8116 73 3325 4775
## male:Chinstrap 3938.971 362.1376 34 3250 4800
## male:Gentoo 5484.836 313.1586 61 4750 6300
##
## Alpha: 0.05 ; DF Error: 327
## Critical Value of Studentized Range: 4.054126
##
## Groups according to probability of means differences and alpha level( 0.05 )
##
## Treatments with the same letter are not significantly different.
##
## body_mass_g groups
## male:Gentoo 5484.836 a
## female:Gentoo 4679.741 b
## male:Adelie 4043.493 c
## male:Chinstrap 3938.971 c
## female:Chinstrap 3527.206 d
## female:Adelie 3368.836 d

If you have many groups to compare, plotting them can make interpretation easier:

# set axis margins so labels don't get cut off
par(mar = c(4.1, 13.5, 4.1, 2.1))
# create confidence interval for every comparison
plot(TukeyHSD(mod, which = "sex:species"),
las = 2 # rotate x-axis ticks
)
Plot by author

From the outputs and plot above, we conclude that all combinations of sex and species are significantly different, except between female Chinstrap and female Adelie (p-value = 0.138) and between male Chinstrap and male Adelie (p-value = 0.581).

These results, which are by the way consistent with the boxplots shown above and which will be confirmed with the visualizations below, conclude the two-way ANOVA in R.

If you would like to visualize the results in a different way from what has already been presented in the preliminary analyses, below are some ideas for useful plots.

First, with the mean and standard error of the mean by subgroup, using the allEffects() function from the {effects} package:

# method 1
library(effects)
plot(allEffects(mod))
Plot by author

Or using the {ggpubr} package:

# method 2
library(ggpubr)
ggline(subset(dat, !is.na(sex)), # remove NA level for sex
x = "species",
y = "body_mass_g",
color = "sex",
add = c("mean_se") # add mean and standard error
) +
labs(y = "Mean of body mass (g)")
Plot by author

Alternatively, using {Rmisc} and {ggplot2}:

library(Rmisc)
# compute mean and standard error of the mean by subgroup
summary_stat <- summarySE(dat,
measurevar = "body_mass_g",
groupvars = c("species", "sex")
)
# plot mean and standard error of the mean
ggplot(
subset(summary_stat, !is.na(sex)), # remove NA level for sex
aes(x = species, y = body_mass_g, color = sex)
) +
geom_errorbar(aes(ymin = body_mass_g - se, ymax = body_mass_g + se), # add error bars
width = 0.1 # width of error bars
) +
geom_point() +
labs(y = "Mean of body mass (g)")
Plot by author

Second, if you prefer to draw only the mean by subgroup:

with(
dat,
interaction.plot(species, sex, body_mass_g)
)
Plot by author

Last but not least, for those of you who are familiar with GraphPad, you are most likely familiar with plotting means and error bars as follows:

# plot mean and standard error of the mean as barplots
ggplot(
subset(summary_stat, !is.na(sex)), # remove NA level for sex
aes(x = species, y = body_mass_g, fill = sex)
) +
geom_bar(position = position_dodge(), stat = "identity") +
geom_errorbar(aes(ymin = body_mass_g - se, ymax = body_mass_g + se), # add error bars
width = 0.25, # width of error bars
position = position_dodge(.9)
) +
labs(y = "Mean of body mass (g)")

In this post, we started with a few reminders of the different tests that exist to compare a quantitative variable across groups. We then focused on the two-way ANOVA, from its goal and hypotheses to its implementation in R, along with the interpretation and some visualizations. We also briefly mentioned its underlying assumptions and a post-hoc test to compare all subgroups.

All this was illustrated with the penguins dataset available from the {palmerpenguins} package.

Thanks for reading.

I hope this article will help you to conduct a two-way ANOVA with your data.

As always, if you have a question or a suggestion related to the topic covered in this article, please add it as a comment so other readers can benefit from the discussion.
