Pavel Duryagin ran an experiment on the perception of vowel reduction in Russian. The dataset shva includes the following variables:
- time1: reaction time 1
- duration: duration of the vowel in the stimuli (in milliseconds, ms)
- time2: reaction time 2
- f1, f2, f3: the 1st, 2nd and 3rd formants of the vowel, measured in Hz (for a short introduction to formants, see here)
- vowel: the vowel, classified according to a three-way classification: A is a under stress; a is a/o in the first syllable before the stressed one; y (stands for schwa) is a/o in the second, third, etc. syllable before the stressed one or after the stressed syllable; cf. g[y]g[a]t[A]l[y] gogotala ‘guffawed’

[…] shva.

[…] f1 and f2 using ggplot(). Design it to look like the following:

[…] f1 and f2 for each vowel using ggplot().

[…] f1 can be considered outliers in a vowel? We assume outliers to be those observations that lie more than 1.5 * IQR below the 1st quartile or above the 3rd quartile, where IQR, the interquartile range, is the difference between the 3rd and the 1st quartile (the 75th and 25th percentiles).
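
The outlier rule above can be illustrated with a minimal sketch in base R. The f1 values below are invented for illustration and are not taken from the shva data; quantile() is called with its default settings, which is an assumption about how the quartiles should be computed.

```r
# Toy F1 values in Hz (made up for illustration; NOT from the shva data)
f1 <- c(420, 450, 470, 480, 500, 510, 530, 900)

# 1st and 3rd quartiles (default quantile type)
q <- quantile(f1, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]              # same value as IQR(f1)

# Fences: 1.5 * IQR below the 1st quartile and above the 3rd quartile
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

# Observations outside the fences are treated as outliers
outliers <- f1[f1 < lower | f1 > upper]
outliers
```

In this toy vector the quartiles are 465 and 515, so the fences are 390 and 590 and only the value 900 falls outside them. For the task itself the same computation would have to be repeated within each vowel category.
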

[…] f1 and f2 (all data)

[…] f1 and f2 for each vowel

[…] f2 by f1.

[…] f2 by f1 using vowel intercept as a random effect

880 nouns, adjectives and verbs from the English Lexicon Project data (Balota et al. 2007).
Format – A data frame with 880 observations on the following 5 variables.
- Word – a factor with lexical stimuli.
- Length – a numeric vector with word lengths.
- SUBTLWF – a numeric vector with frequencies in film subtitles.
- POS – a factor with levels JJ (adjective), NN (noun), VB (verb).
- Mean_RT – a numeric vector with mean reaction times in a lexical decision task.

Source: http://elexicon.wustl.edu/WordStart.asp
Data from Natalya Levshina’s RLing package, available here: https://raw.githubusercontent.com/agricolamz/2018-MAG_R_course/master/data/ELP.csv

[…] elp.

I’ve used scale_color_continuous(low = "lightblue", high = "red").

[…] Mean_RT by log(SUBTLWF) using POS intercept as a random effect

A data set with examples of two Dutch periphrastic causatives from newspaper corpora.
A data frame with 100 observations on the following 7 variables.
- Cx – a factor with levels doen_V and laten_V.
- CrSem – a factor that contains the semantic class of the Causer with levels Anim (animate) and Inanim (inanimate).
- CeSem – a factor that describes the semantic class of the Causee with levels Anim (animate) and Inanim (inanimate).
- CdEv – a factor that describes the semantic domain of the caused event expressed by the Effected Predicate. The levels are Ment (mental), Phys (physical) and Soc (social).
- Neg – a factor with levels No (absence of negation) and Yes (presence of negation).
- Coref – a factor with levels No (no coreferentiality) and Yes (coreferentiality).
- Poss – a factor with levels No (no overt expression of possession) and Yes (overt expression of possession).

Data from Natalya Levshina’s RLing package, available here: https://raw.githubusercontent.com/agricolamz/2018-MAG_R_course/master/data/dutch_causatives.csv

[…] d_caus.

[…] Aux and other categorical variables (Aux ~ CrSem, Aux ~ CeSem, etc.) is statistically significant. The association with which variable should be analysed using Fisher’s Exact Test rather than Pearson’s Chi-squared Test? Is this association statistically significant?

[…] Aux and EPTrans are not independent with the help of Pearson’s Chi-squared Test.

[…] Aux and EPTrans variables.

Use the mosaic() function from the vcd library.
Below is an example of how to use mosaic() with three variables.
vcd::mosaic(~ Aux + CrSem + Country, data=d_caus, shade=TRUE, legend=TRUE)
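
The choice between Pearson’s Chi-squared Test and Fisher’s Exact Test raised in the task above is often made by inspecting the expected cell counts. The sketch below follows a common rule of thumb, which is an assumption here and not something stated in this assignment: prefer Fisher’s Exact Test when any expected count is below 5. The table is invented, and the variable and level names are hypothetical.

```r
# Toy 2x2 contingency table (made up for illustration; NOT the d_caus data).
# The dimension names rowvar/colvar and their levels are hypothetical.
tab <- matrix(c(12, 3, 2, 5), nrow = 2,
              dimnames = list(rowvar = c("a", "b"), colvar = c("No", "Yes")))

# Expected counts under independence, as computed by chisq.test();
# the warning about small expected counts is silenced on purpose here
expected <- suppressWarnings(chisq.test(tab)$expected)
expected

# Rule of thumb (assumption): use Fisher's test if any expected count < 5
result <- if (any(expected < 5)) fisher.test(tab) else chisq.test(tab)
result$method
result$p.value
```

With real data, the table would come from something like table() applied to two columns of the data frame, and the expected counts can be inspected in the same way before choosing a test.
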