Review of: Gard B. Jenset, Barbara McGillivray. Quantitative historical linguistics: A corpus framework. Oxford: Oxford University Press, 2017. 256 p. ISBN 9780198718178

 
Article code: S0373658X0008306-5-1
DOI: 10.31857/S0373658X0008306-5
Publication type: Review
Reviewed work: Gard B. Jenset, Barbara McGillivray. Quantitative historical linguistics: A corpus framework. Oxford: Oxford University Press, 2017. 256 p. ISBN 9780198718178
Publication status: Published
Author affiliations:
Russian State University for the Humanities
Russian Presidential Academy of National Economy and Public Administration
Stockholm University
Address: Russian Federation, Moscow; Kingdom of Sweden, Stockholm
Journal: Voprosy Jazykoznanija (Вопросы языкознания)
Issue: No. 1
Pages: 155-160
Funding: This work was supported by the Russian State University for the Humanities (RSUH) project “Texts and practices of folklore: Typology, semiotics, new research methods”.
Publication date: 2 March 2020

The publication under review is a comparatively rare specimen in contemporary linguistics: it is essentially a book-length argument in favour of a particular approach to research in historical linguistics. The authors aim “to introduce the framework for quantitative historical linguistics, and to provide some examples of how this framework can be applied in research” (p. 1; emphasis in the original), and they do precisely this. Along the way, however, they also devote a great deal of effort to persuading the reader that their framework is actually the best possible way of doing historical linguistics and to refuting alternative takes on the matter.
Jenset and McGillivray begin their argument by positing that the family of statistical models developed in corpus linguistics must be adopted by the historical-linguistics community. They note that even though historical linguistics is known to be highly “data-centric”, “quantitative corpus methods are still underused and often misused in historical linguistics, and an overarching methodological structure inside which to place such methods is missing” (p. 4). The book therefore endeavours to show “what it means to be empirical in historical linguistics research and how to go about doing it” (ibid.).
The authors then pose and resolve several methodological questions, the most important of which are:
Why should historical linguistics be corpus-based and quantitative? (Because otherwise it is impossible to reproduce other people’s research, properly formulate and refute claims, and compare models.)
and
Why should historical linguistics be probabilistic? (Because rigid symbolic models tend to be vulnerable to linguistic variation and performance factors. Jenset and McGillivray underline that it is possible to adhere to strict symbolic models of grammar on the theoretical level but still investigate their realisations using probabilistic methods.)
The scholars also note that the methods used to analyse corpus data must be adequate to the task. This boils down to two postulates: (i) presenting uncontextualised raw frequencies of occurrence of different phenomena is not enough; and (ii) as historical-linguistic trends are usually shaped by an array of factors, researchers should use multivariate methods to model them (multivariate models are also useful for directly estimating the explanatory power of competing hypotheses).
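Postulate (i) can be illustrated with a minimal sketch of why raw counts may mislead when the subcorpora for different periods differ in size. The period labels, counts, and corpus sizes below are invented for the illustration and do not come from the book; `per_million` is a hypothetical helper, not part of the authors' framework.

```python
# Hypothetical data: occurrences of some construction in three periods,
# together with the size (in tokens) of each period's subcorpus.
counts = {"1500-1599": 120, "1600-1699": 300, "1700-1799": 450}
corpus_sizes = {"1500-1599": 400_000, "1600-1699": 1_500_000, "1700-1799": 3_000_000}


def per_million(count: int, size: int) -> float:
    """Return the frequency of a phenomenon normalised per million tokens."""
    return count * 1_000_000 / size


# Normalise every raw count by the size of its own subcorpus.
normalised = {
    period: per_million(counts[period], corpus_sizes[period])
    for period in counts
}

for period, value in normalised.items():
    print(f"{period}: raw={counts[period]}, per million={value:.1f}")
```

In this invented data set the raw counts rise from period to period while the normalised rate falls, so the two measures would support opposite conclusions about the direction of change.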
Jenset and McGillivray then explore a sociological angle. They survey the current state of the art in historical linguistics by counting the number of quantitative and corpus-based articles in the latest issues of several historical-linguistics journals. They then compare the proportion of quantitative articles in each journal with the proportion of quantitative articles in the journal Language, used as a baseline representing best practices in general linguistics. The scholars note that publications in Language tend on average to be more quantitative and empirical in nature than those from historical-linguistics journals and conclude that historical linguistics is still not a truly empirical, data-driven discipline.
They contextualise this issue using the Moore-ian technology-adoption life cycle. In this perspective, the adoption of corpus-based quantitative historical linguistics has reached a perilous “chasm” between the “early adopter” and “early majority” stages. A failure to cross this adoption threshold, caused by the general community’s refusal or hesitance to embrace empirical methods, may prove fatal to the discipline or at least seriously set back its development.
In order to push quantitative historical linguistics forward at this crucial juncture and propel it over the chasm, in Chapter 2 Jenset and McGillivray propose a new framework in which to conduct research in historical linguistics.
First, they solidify the terminology needed for such a framework. The following are regarded as the foundational terms:
  • Evidence: things that can be independently observed and verified by different researchers. Evidence can be quantitative (i.e. count-based) or distributional in nature; both types of evidence should be quantified in a way that makes independent verification feasible.
  • Claim: any statement based on the evidence, which does not repeat the evidence itself. Claims can be used as constituent elements for making further claims.
  • Probability. The researchers argue in favour of following the Bayesian approach, where probabilistic statements reflect the degree of their authors’ certainty, as this approach “is explicitly made contingent on our knowledge and our argumentation in a manner that is different and better than in the [frequentist] case” (p. 41).
  • Historical corpus: a machine-readable systematically sampled collection of natural-language texts representative of some state of the language. The scholars note that non-systematic samples, such as collections of examples, can be biased and should not be regarded as corpora.
  • Linguistic annotation scheme: a consistent way to annotate texts from a corpus.
  • Hypothesis: a claim that can be empirically verified.
  • Model: a representation of some linguistic phenomenon derived from statistical verification of hypotheses on corpus data.
  • Trend: a directional change in the probability of some linguistic phenomenon over time, detectable and verifiable using statistical methods on corpus data.
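
The notions of probability and trend defined above can be made concrete with a small sketch. It assumes a simple Beta-Binomial model with a uniform prior and estimates, by Monte Carlo sampling, how probable it is that a variant became more likely from one period to the next. This is one possible illustration, not the authors' own procedure; the counts, period labels, and the helper names `posterior_params` and `prob_increase` are all hypothetical.

```python
import random

# Hypothetical counts: (period, occurrences of the innovative variant,
# number of contexts in which the variant could have occurred).
observations = [
    ("1400-1499", 12, 200),
    ("1500-1599", 45, 220),
    ("1600-1699", 90, 210),
]


def posterior_params(successes: int, total: int) -> tuple:
    """Beta posterior parameters under a uniform Beta(1, 1) prior."""
    return 1 + successes, 1 + (total - successes)


def prob_increase(earlier, later, draws: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(p_later > p_earlier) given both posteriors."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    a1, b1 = posterior_params(*earlier)
    a2, b2 = posterior_params(*later)
    hits = sum(
        rng.betavariate(a2, b2) > rng.betavariate(a1, b1)
        for _ in range(draws)
    )
    return hits / draws


# Compare each period with the one that follows it.
for (p1, k1, n1), (p2, k2, n2) in zip(observations, observations[1:]):
    estimate = prob_increase((k1, n1), (k2, n2))
    print(f"{p1} -> {p2}: P(increase) is approximately {estimate:.3f}")
```

A value close to 1 for every pair of adjacent periods would be consistent with a directional trend in the sense defined above. A fuller analysis would model all periods jointly and, following the book's recommendation, include additional explanatory factors in a multivariate model.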
