class: center, middle, inverse, title-slide

# lingtypology:
## easy mapping for linguistic typology in R
goo.gl/Xi57pu
### G. Moroz
NRU HSE, Laboratory of the languages of the Caucasus
### ConCorT (15–17 October 2017)
HSE Voronovo Learning Center

---

class: inverse, center, middle

# Why lingtypology?

---

## Why lingtypology?

* create uniform access to data across publications
  * [Glottolog](http://glottolog.org/) (Hammarström, Forkel, Haspelmath 2017)
  * [Cross-Linguistic Linked Data project](http://clld.org/)
* easy map creation
* multiple visualisations for typological features
* API to typological databases

`lingtypology` is based on the `leaflet` (Cheng et al. 2017) and `leaflet.minicharts` (Guillem 2017) R packages.

---

class: inverse, center, middle

# Installation

---

## Installation

Install the stable release from CRAN (recommended):

```r
install.packages("lingtypology")
```

Install the development version from GitHub:

```r
install.packages("devtools", dependencies = TRUE)
devtools::install_github("ropensci/lingtypology")
```

Load the package:

```r
library(lingtypology)
```

In this presentation I use the following version:

```r
packageVersion("lingtypology")
```

```
## [1] '1.0.9'
```

---

class: inverse, center, middle

# Glottolog functions

---

## Glottolog functions

Function names follow the pattern **what you need**.**what you have**: for example, `aff.lang()` returns the affiliation of a language.

---

## Glottolog functions: by language

```r
aff.lang("Adyghe")
```

```
##                                        Adyghe
## "North Caucasian, West Caucasian, Circassian"
```

```r
area.lang(c("Basque", "Ewe", "Dyirbal", "Tobo"))
```

```
##      Basque         Ewe     Dyirbal        Tobo
##   "Eurasia"    "Africa" "Australia"     "Papua"
```

```r
gltc.lang(c("Au", "Bau", "Iau", "Lau"))
```

```
##         Au        Bau        Iau        Lau
## "auuu1241" "bauu1244" "iauu1242" "lauu1247"
```

```r
iso.lang(c("Wan", "Han", "Dan"))
```

```
##   Wan   Han   Dan
## "wan" "haa" "daf"
```

---

## Glottolog functions: get languages

```r
lang.aff("Kartvelian")
```

```
## [1] "Mingrelian"     "Judeo-Georgian" "Laz"            "Georgian"
## [5] "Svan"
```

```r
lang.iso(c("wan", "haa", "daf"))
```

```
##   wan   haa   daf
## "Wan" "Han" "Dan"
```

```r
lang.gltc(c("auuu1241", "bauu1244", "iauu1242", "lauu1247"))
```

```
## auuu1241 bauu1244 iauu1242 lauu1247
##     "Au"    "Bau"    "Iau"    "Lau"
```

---

## Glottolog functions: ISO 639-3 ↔ Glottocodes

```r
iso.gltc(c("auuu1241", "bauu1244", "iauu1242", "lauu1247"))
```

```
## auuu1241 bauu1244 iauu1242 lauu1247
##    "avt"    "bbd"    "tmu"    "llu"
```

```r
gltc.iso(c("wan", "haa", "daf"))
```

```
##        wan        haa        daf
## "wann1242" "hann1241" "dann1241"
```

--

It is possible to use the output of one function as the input for another:

```r
gltc.lang(lang.aff("Circassian"))
```

```
##     Adyghe  Kabardian
## "adyg1241" "kaba1278"
```

---

## Glottolog functions: Abbreviations

```r
lang.country("Cape Verde")
```

```
## [1] "Kabuverdianu" "Portuguese"
```

```r
lang.country("Cabo Verde")
```

```
## [1] "Kabuverdianu" "Portuguese"
```

```r
head(lang.country("USA"))
```

```
## [1] "Holikachuk"       "Hopi"             "Palewyami Yokuts"
## [4] "Finnish"          "Mbum"             "Lower Sorbian"
```

---

## Glottolog functions: Spell Checker

```r
iso.gltc(c("auuu1241 ", " lau u1247"))
```

```
## auuu1241 lauu1247
##    "avt"    "llu"
```

```r
aff.lang(c("Adyge", "Katalan"))
```

```
## Warning: Language Adyge is absent in our version of the Glottolog database.
## Did you mean Adyghe, Aduge?
```

```
## Warning: Language Katalan is absent in our version of the Glottolog
## database. Did you mean Kavalan, Catalan?
```

```
##   Adyge Katalan
##      NA      NA
```

---

class: inverse, center, middle

# Creating a map

---

## Map: base map

```r
map.feature("Estonian")
```
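When only a language name is given, the point is placed using coordinates from the package's copy of Glottolog. A minimal sketch for inspecting those coordinates, assuming `lat.lang()` and `long.lang()` are available in your installed version (`coords` is just an illustrative name):

```r
library(lingtypology)

# Languages that appear elsewhere in these slides
langs <- c("Estonian", "Adyghe", "Basque")

# Same naming pattern as the other lookup functions:
# latitude/longitude (what you need) for a language (what you have)
coords <- data.frame(
  language  = langs,
  latitude  = unname(lat.lang(langs)),
  longitude = unname(long.lang(langs))
)
coords
```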

---

## Map: multiple languages

```r
map.feature(lang.aff("Turkic"))
```

```
## Warning: There is no coordinates for languages Pecheneg
```
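The warning means that languages without coordinates are left off the map. One possible way to see which ones are affected and to map only the rest is sketched here, under the assumption that `lat.lang()` and `long.lang()` return `NA` for such languages (`turkic` and `has_coords` are illustrative names):

```r
library(lingtypology)

turkic <- lang.aff("Turkic")

# Assumption: missing coordinates are reported as NA
has_coords <- !is.na(lat.lang(turkic)) & !is.na(long.lang(turkic))

# Languages that cannot be placed on the map
turkic[!has_coords]

# Map only the languages that have coordinates
m <- map.feature(turkic[has_coords])
m
```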

---

## Built-in datasets

```r
head(ejective_and_n_consonants)
```

```
##         language ejectives consonants vowels
## 1        Turkish        no         25      8
## 2         Korean        no         21     11
## 3           Tiwi        no         22      4
## 4 Liberia Kpelle        no         22     12
## 5           Tulu        no         24     13
## 6     Mapudungun        no         20      6
```

```r
head(circassian)
```

```
##   longitude latitude        village district dialect  language
## 1  40.23000 45.02000 Khakurinokhabl       ra Abadzex    Adyghe
## 2  43.11143 43.82294        Dzhenal      kbr  Baksan Kabardian
## 3  44.36366 43.48558        Inarkoy      kbr  Baksan Kabardian
## 4  43.71569 43.75747      Psynshoko      kbr  Baksan Kabardian
## 5  43.93194 43.52444  Nizhny Cherek      kbr  Baksan Kabardian
## 6  43.91274 43.34232        Anzorey      kbr  Baksan Kabardian
```

---

## Color languages by a categorical variable

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$ejectives)
```
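Before coloring by a feature, it can help to check how its values are distributed. A small sketch using only base R on the built-in dataset:

```r
library(lingtypology)

# How many languages fall into each category of the categorical feature
table(ejective_and_n_consonants$ejectives)

# Range of the numeric feature used for the color gradient
summary(ejective_and_n_consonants$consonants)
```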

---

## Color languages by a numeric variable

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$consonants)
```

---

## Change colors

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$ejectives,
            color = c("yellow", "navy"))
```

---

## Change colors

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$consonants,
            color = "magma")
```

---

## Add labels

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$ejectives,
            label = ejective_and_n_consonants$language)
```

---

## Add labels: permanent labels

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$ejectives,
            label = ejective_and_n_consonants$language,
            label.hide = FALSE)
```

---

## Add labels: emphasize

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$ejectives,
            label = ejective_and_n_consonants$language,
            label.hide = FALSE,
            label.emphasize = list(17:19, "red"))
```
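The first element of `label.emphasize` holds row positions, the second a color. Rather than hard-coding positions, they can be looked up by name. A sketch, where `df` and `emph` are illustrative names and the three languages are taken from the `head()` output of the dataset:

```r
library(lingtypology)

df <- ejective_and_n_consonants

# Row positions of the languages whose labels should stand out
emph <- which(df$language %in% c("Turkish", "Korean", "Tiwi"))

m <- map.feature(df$language,
                 features = df$ejectives,
                 label = df$language,
                 label.hide = FALSE,
                 label.emphasize = list(emph, "red"))
m
```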

---

## Add minicharts: bar

```r
map.feature(ejective_and_n_consonants$language,
            minichart.data = ejective_and_n_consonants[, 3:4],
            minichart = "bar")
```

---

## Add minicharts: pie

```r
map.feature(ejective_and_n_consonants$language,
            minichart.data = ejective_and_n_consonants[, 3:4],
            minichart = "pie")
```

---

## Add minicharts: pie

```r
map.feature(ejective_and_n_consonants$language,
            minichart.data = ejective_and_n_consonants[, 3:4],
            minichart = "pie",
            minichart.labels = TRUE)
```
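Columns `3:4` are the `consonants` and `vowels` counts. Selecting them by name keeps the call readable and robust to column reordering. A sketch (`df` and `counts` are illustrative names):

```r
library(lingtypology)

df <- ejective_and_n_consonants

# Same two columns as df[, 3:4], selected by name
counts <- df[, c("consonants", "vowels")]

m <- map.feature(df$language,
                 minichart.data = counts,
                 minichart = "pie",
                 minichart.labels = TRUE)
m
```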

---

## Add your own coordinates

```r
map.feature(circassian$language,
            features = circassian$dialect,
            latitude = circassian$latitude,
            longitude = circassian$longitude)
```
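The same call works with any data frame that has one row per point. A sketch with a hand-built table (`villages` is an illustrative name; the three rows copy values from the `head(circassian)` output):

```r
library(lingtypology)

# One row per village; several rows may share a language
villages <- data.frame(
  language  = c("Adyghe", "Kabardian", "Kabardian"),
  village   = c("Khakurinokhabl", "Dzhenal", "Inarkoy"),
  latitude  = c(45.02000, 43.82294, 43.48558),
  longitude = c(40.23000, 43.11143, 44.36366)
)

m <- map.feature(villages$language,
                 features = villages$language,
                 label = villages$village,
                 latitude = villages$latitude,
                 longitude = villages$longitude)
m
```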

---

## Add strokes

```r
map.feature(circassian$language,
            features = circassian$dialect,
            latitude = circassian$latitude,
            longitude = circassian$longitude,
            stroke.features = circassian$language)
```

---

## Add a rectangle

```r
map.feature(circassian$language,
            features = circassian$dialect,
            latitude = circassian$latitude,
            longitude = circassian$longitude,
            rectangle.lng = c(42.7, 45),
            rectangle.lat = c(42.7, 44.3))
```

---

## Add a density estimation

```r
map.feature(circassian$language,
            features = circassian$dialect,
            latitude = circassian$latitude,
            longitude = circassian$longitude,
            density.estimation = circassian$language)
```

---

## Add a density estimation without dots

```r
map.feature(circassian$language,
            features = circassian$dialect,
            latitude = circassian$latitude,
            longitude = circassian$longitude,
            density.estimation = circassian$language,
            density.points = FALSE)
```

---

## Change tiles

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$ejectives,
            tile = "OpenStreetMap.BlackAndWhite")
```

---

## Add minimaps

```r
map.feature(ejective_and_n_consonants$language,
            features = ejective_and_n_consonants$ejectives,
            minimap = TRUE)
```

---

class: inverse, center, middle

# Databases API

---

## Databases API

* `wals.feature()` World Atlas of Language Structures
* `autotyp.feature()` AUTOTYP
* `phoible.feature()` PHOIBLE
* `afbo.feature()` Affix Borrowing database
* `sails.feature()` South American Indigenous Language Structures
* `abvd.feature()` Austronesian Basic Vocabulary Database

---

## WALS (Dryer, Haspelmath 2013)

```r
df <- wals.feature("1a")
head(df)
```

```
##   wals.code               1a  latitude  longitude glottocode      language
## 1       abi Moderately small -29.00000  -61.00000   abip1241        Abipon
## 2       abk            Large  43.08333   41.00000   abkh1244        Abkhaz
## 3       abm            Small  32.33333  -87.41667   alab1237       Alabama
## 4       ach            Small -25.25000  -55.16667   ache1246 Ache (Tupian)
## 5       acm Moderately small  41.50000 -121.00000   achu1247      Achumawi
## 6       aco            Large  34.91667 -107.58333   west2632 Western Keres
```

---

## AUTOTYP (Bickel et al. 2017)

```r
df <- autotyp.feature(c('Numeral classifiers'))
head(df)
```

```
##   LID NumClass.n NumClass.Presence Glottocode      language
## 1   6          0             FALSE   ambu1247       Ambulas
## 2   7          0             FALSE   abkh1244        Abkhaz
## 3   9          9              TRUE   achi1257      Achinese
## 4  10          0             FALSE   west2632 Western Keres
## 5  12          2              TRUE   ainu1240 Hokkaido Ainu
## 6  14          0             FALSE   alam1246      Alamblak
```

---

class: inverse, center, middle

# About package development

---

## About package development

.pull-left[
* I've created
  * 15 issues in [ropensci/lingtypology](https://github.com/ropensci/lingtypology)
  * 4 issues in [clld/glottolog-data](https://github.com/clld/glottolog-data)
  * 1 issue in [clld/glottolog](https://github.com/clld/glottolog)
  * 1 issue in [clld/wals3](https://github.com/clld/wals3)
  * 1 issue in [clld/sails](https://github.com/clld/sails)
  * 1 issue in [rstudio/leaflet](https://github.com/rstudio/leaflet)
  * 1 issue in [yihui/xaringan](https://github.com/yihui/xaringan)
]

--

.pull-right[
* I received some comments and suggestions from
  * [Samira Verhees](https://github.com/sverhees)
  * [Calle Börstell](https://github.com/borstell)
  * [Robert Forkel](https://github.com/xrotwang)
  * Natalia Levshina
  * Timo Roettger
* and from [rOpenSci](https://github.com/ropensci) peer review
  * [Scott Chamberlain](https://github.com/sckott)
  * [Kent Russell](https://github.com/timelyportfolio/)
  * [Taras Zakharko](https://github.com/tzakharko)
  * [Languagespacelabs](https://github.com/languagespacelabs)
]

---

class: middle

* Don't hesitate to write (<agricolamz@gmail.com>) or open [an issue](https://github.com/ropensci/lingtypology/issues).
* `lingtypology` tutorial: <https://ropensci.github.io/lingtypology/>
* Slides on **GitHub**: <https://agricolamz.github.io/2017_ConCorT_lingtypology>
* Slides created with the beautiful R package [**xaringan**](https://github.com/yihui/xaringan)