2. Icelandic vowels

This set is based on Coretta (2017, https://goo.gl/NrfgJm). The dissertation deals with the relation between vowel duration and aspiration in consonants. The author collected data from 5 native speakers of Icelandic and then extracted the duration of vowels followed by aspirated versus non-aspirated consonants. Check whether vowel duration differs significantly before consonants with different places of articulation.

Use read.csv("https://goo.gl/7gIjvK") to download the data.

2.1

Calculate mean values for vowel duration in your data grouped by place (of articulation) and speaker.

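library(tidyverse)  # provides %>%, group_by(), summarise() and ggplot()

# stringsAsFactors = TRUE keeps the text columns as factors in R >= 4.0
df <- read.csv("https://goo.gl/7gIjvK", stringsAsFactors = TRUE)

df %>% 
  group_by(place, speaker) %>% 
  summarise(mean(vowel.dur))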
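The same grouped summary is easier to reuse when the result column has a name. The sketch below runs on a small made-up data frame (only the column names mirror the Icelandic data) and assumes dplyr 1.0 or later for the `.groups` argument.

```r
library(dplyr)

# Made-up toy data; only the column names mirror the Icelandic data set.
toy <- data.frame(
  speaker   = rep(c("s1", "s2"), each = 4),
  place     = rep(c("labial", "velar"), times = 4),
  vowel.dur = c(80, 95, 84, 91, 70, 88, 74, 92)
)

toy_means <- toy %>%
  group_by(place, speaker) %>%
  summarise(mean_dur = mean(vowel.dur),  # named column instead of `mean(vowel.dur)`
            n = n(),                     # number of tokens behind each mean
            .groups = "drop")            # return an ungrouped tibble

toy_means
```

Reporting `n` next to each mean makes it visible when a group mean rests on very few tokens.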

Let’s visualize the data:

df %>% 
  ggplot(aes(place, vowel.dur)) + 
    geom_point(alpha = 0.2) + 
    facet_wrap(~ speaker) + 
    xlab("place of articulation") + 
    ylab("vowel duration")

2.2 Calculate mean values for vowel duration in your data grouped by word.

df <- read.csv("https://goo.gl/7gIjvK", stringsAsFactors = TRUE)  # factors are needed for speaker:word below

df %>% 
  group_by(word) %>% 
  summarise(mean(vowel.dur))

2.3 Fit a mixed-effects linear regression model

taking into account speaker as a random effect. Plot the residuals of the fitted model.

library(lme4)  # provides lmer()

fit <- lmer(vowel.dur ~ place + (1|speaker), data = df)
summary(fit)
## Linear mixed model fit by REML ['lmerMod']
## Formula: vowel.dur ~ place + (1 | speaker)
##    Data: df
## 
## REML criterion at convergence: 7292.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9226 -0.5708 -0.0926  0.4450  5.0208 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  speaker  (Intercept) 111.6    10.57   
##  Residual             495.8    22.27   
## Number of obs: 806, groups:  speaker, 5
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  90.9320     4.8361  18.803
## placelabial -13.3303     1.7503  -7.616
## placevelar   -0.7663     2.5516  -0.300
## 
## Correlation of Fixed Effects:
##             (Intr) plclbl
## placelabial -0.126       
## placevelar  -0.086  0.238
plot(fit)

qqnorm(resid(fit))
qqline(resid(fit))
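To read the fixed-effects table above: with R's default treatment coding, the intercept is the estimated mean vowel duration for the reference level of place, and the other two coefficients are differences from that level. The sketch below only redoes this arithmetic with the printed estimates. The label `reference` is a placeholder, since the output does not name the baseline level. It also computes the share of the random variance attributed to speakers.

```r
# Estimates copied from the summary(fit) output above
b <- c(intercept = 90.9320, labial = -13.3303, velar = -0.7663)

# Estimated mean vowel duration per place of articulation
pred <- c(
  reference = b[["intercept"]],                   # baseline level of `place` (name not shown in the output)
  labial    = b[["intercept"]] + b[["labial"]],
  velar     = b[["intercept"]] + b[["velar"]]
)
round(pred, 2)

# Share of the random variance attributed to speakers
var_speaker  <- 111.6
var_residual <- 495.8
icc_speaker  <- var_speaker / (var_speaker + var_residual)
round(icc_speaker, 3)
```

On these numbers, vowels before labials come out roughly 13 units of vowel.dur shorter than the baseline, while velars are almost indistinguishable from it.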

2.4 Fit a mixed-effects linear regression model

taking into account speaker and word as random effects. Note that random factors can be nested.

If our groups are nested (as happens with speakers and words), the following model is wrong, because it treats the two grouping factors as crossed:

fit2.WRONG <- lmer(vowel.dur ~ place + (1|speaker) + (1|word), data = df)  # treats the two random effects as if they are crossed
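Whether two grouping factors are nested or crossed in a given data set can be checked by cross-tabulating them. The sketch below uses a hypothetical helper, `is_nested()`, and made-up toy data. It shows why word labels that are reused across speakers lead lmer to treat `(1|speaker) + (1|word)` as crossed, and how a combined label makes the nesting explicit.

```r
# Hypothetical helper: TRUE if every level of `inner` occurs
# together with exactly one level of `outer`
is_nested <- function(inner, outer) {
  tab <- table(factor(inner), factor(outer))
  all(rowSums(tab > 0) == 1)
}

# Made-up toy data: both speakers produce both words
toy <- data.frame(
  speaker = c("s1", "s1", "s2", "s2"),
  word    = c("a",  "b",  "a",  "b")
)
toy$sample <- paste(toy$speaker, toy$word, sep = ":")

is_nested(toy$word, toy$speaker)    # FALSE: each word occurs with both speakers
is_nested(toy$sample, toy$speaker)  # TRUE: each sample belongs to one speaker
```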

To avoid future confusion, let us create a new variable that is explicitly nested. Let’s call it sample:

df <- within(df, sample <- factor(speaker:word))
head(summary(df$sample))
##        bte03:dögg         tt01:dögg        tt01:kampa       brs02:detta 
##                 6                 4                 4                 3 
##        brs02:duld        brs02:dult 
##                 3                 3
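lme4 also offers a shorthand for this kind of nesting: `(1|speaker/word)` expands to `(1|speaker) + (1|speaker:word)`, which plays the same role as an explicit factor such as `sample`. The sketch below checks that equivalence on simulated data. All numbers are invented rather than taken from the Icelandic set, and lme4 is assumed to be installed.

```r
library(lme4)

set.seed(1)
# Simulated data: 5 speakers x 10 words x 3 repetitions (invented numbers)
sim <- expand.grid(rep     = 1:3,
                   word    = paste0("w", 1:10),
                   speaker = paste0("s", 1:5))
sim$sample <- factor(paste(sim$speaker, sim$word, sep = ":"))

speaker_eff <- rnorm(5,  sd = 10)   # one shift per speaker
sample_eff  <- rnorm(50, sd = 20)   # one shift per speaker-word pair
sim$y <- 90 +
  speaker_eff[as.integer(sim$speaker)] +
  sample_eff[as.integer(sim$sample)] +
  rnorm(nrow(sim), sd = 10)

m_explicit  <- lmer(y ~ 1 + (1 | speaker) + (1 | sample), data = sim)
m_shorthand <- lmer(y ~ 1 + (1 | speaker / word), data = sim)

# Same model, so the REML log-likelihoods should agree up to optimizer tolerance
c(explicit  = as.numeric(logLik(m_explicit)),
  shorthand = as.numeric(logLik(m_shorthand)))
```

The explicit factor has the advantage that the nesting stays visible in the data frame itself, whichever formula is written later.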

Now let’s fit the nested mixed-effect model properly:

fit2 <- lmer(vowel.dur ~ place + (1|speaker) + (1|sample), data = df)  # treats the two random effects as if they are nested
summary(fit2)
## Linear mixed model fit by REML ['lmerMod']
## Formula: vowel.dur ~ place + (1 | speaker) + (1 | sample)
##    Data: df
## 
## REML criterion at convergence: 6745.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.6053 -0.4657 -0.0336  0.4225  5.7479 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  sample   (Intercept) 390.9    19.77   
##  speaker  (Intercept) 105.6    10.28   
##  Residual             111.5    10.56   
## Number of obs: 806, groups:  sample, 273; speaker, 5
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  91.0944     4.8795  18.669
## placelabial -13.4685     2.7927  -4.823
## placevelar   -0.9022     4.1740  -0.216
## 
## Correlation of Fixed Effects:
##             (Intr) plclbl
## placelabial -0.197       
## placevelar  -0.132  0.231
plot(fit2)

qqnorm(resid(fit2))
qqline(resid(fit2))
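summary() for lmer models prints t values but no p-values, so the question of whether place of articulation matters significantly needs an extra step. One common option, given here as a sketch on simulated data and not as the authors' solution, is a likelihood-ratio test between models with and without place, both fitted by maximum likelihood (`REML = FALSE`). All numbers, and the level name `coronal`, are invented for illustration.

```r
library(lme4)

set.seed(42)
# Simulated data: 5 speakers, 3 places, 40 tokens per cell (invented numbers)
sim <- expand.grid(token   = 1:40,
                   place   = c("coronal", "labial", "velar"),
                   speaker = paste0("s", 1:5))
place_eff   <- c(coronal = 0, labial = -13, velar = 0)
speaker_eff <- rnorm(5, sd = 10)
sim$vowel.dur <- 90 +
  unname(place_eff[as.character(sim$place)]) +
  speaker_eff[as.integer(sim$speaker)] +
  rnorm(nrow(sim), sd = 20)

# ML fits (REML = FALSE) so that models with different fixed effects are comparable
m0 <- lmer(vowel.dur ~ 1     + (1 | speaker), data = sim, REML = FALSE)
m1 <- lmer(vowel.dur ~ place + (1 | speaker), data = sim, REML = FALSE)

lrt <- anova(m0, m1)  # likelihood-ratio (chi-squared) test for `place`
lrt
```

On the real data the same comparison would use `df` and the random-effects structure of `fit2` in both models, with only the `place` term dropped from the null model.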





© O. Lyashevskaya, I. Shchurov, G. Moroz, code on GitHub