Word Frequency Distributions

Author:   R. Harald Baayen
Publisher:   Springer
Edition:   2001 ed.
Volume:   18
ISBN:   9780792370178
Pages:   335
Publication Date:   31 July 2001
Format:   Hardback
Availability:   Out of print, replaced by POD (print on demand)
We will order this item for you from a manufactured-on-demand supplier.

Our Price:   $578.16

Overview

This work is a comprehensive introduction to the statistical analysis of word frequency distributions, intended for computational linguists, corpus linguists, psycholinguists, and researchers in the field of quantitative stylistics. Word frequency distributions are characterized by very large numbers of rare words. This property leads to strange phenomena such as mean frequencies that systematically change as the number of observations is increased, relative frequencies that even in large samples are not fully reliable estimators of population probabilities, and model parameters that vary with text or corpus size. Special statistical techniques for the analysis of distributions with large numbers of rare events can be found in various technical journals. The aim of this book is to make these techniques more accessible for non-specialists, both theoretically, by means of a careful introduction to the underlying probabilistic and statistical concepts, and practically, by providing a program library implementing the main models for word frequency distributions.
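The sample-size dependence described above can be illustrated with a small simulation. The following is only a minimal sketch and is not taken from the book's program library: the function names `zipf_sample` and `summarize`, the vocabulary size, and the choice of probabilities proportional to 1/rank are all assumptions made for illustration. It draws tokens from a Zipf-like population and reports, for growing prefixes of the sample, the number of tokens N, the number of types V, the number of types seen exactly once V1, and the mean frequency N/V.

```python
import random
from collections import Counter
from itertools import accumulate


def zipf_sample(n_tokens, vocab_size=50_000, seed=0):
    """Draw n_tokens word ids with probability proportional to 1/rank.

    Hypothetical helper: the 1/rank population is an illustrative
    assumption, not a model fitted to any real corpus.
    """
    rng = random.Random(seed)
    # Cumulative weights 1/1, 1/1 + 1/2, ... for ranks 1..vocab_size.
    cum_weights = list(accumulate(1.0 / r for r in range(1, vocab_size + 1)))
    return rng.choices(range(vocab_size), cum_weights=cum_weights, k=n_tokens)


def summarize(tokens):
    """Return N (tokens), V (types), V1 (types seen once), and N / V."""
    counts = Counter(tokens)
    n = len(tokens)
    v = len(counts)
    v1 = sum(1 for c in counts.values() if c == 1)
    return {"N": n, "V": v, "V1": v1, "mean_freq": n / v}


tokens = zipf_sample(100_000)
# Summaries for nested prefixes of the same sample.
rows = [summarize(tokens[:n]) for n in (1_000, 10_000, 100_000)]
for row in rows:
    print(row)
```

Under these assumptions the mean frequency keeps rising as the sample grows, and a substantial share of the types still occurs only once even in the largest sample, which is the kind of behavior the overview attributes to distributions with many rare words.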

Full Product Details

Author:   R. Harald Baayen
Publisher:   Springer
Imprint:   Springer
Edition:   2001 ed.
Volume:   18
Dimensions:   Width: 15.50cm , Height: 2.00cm , Length: 23.50cm
Weight:   1.520kg
ISBN:   9780792370178
ISBN 10:   0792370171
Pages:   335
Publication Date:   31 July 2001
Audience:   College/higher education ,  Professional and scholarly ,  Postgraduate, Research & Scholarly ,  Professional & Vocational
Format:   Hardback
Publisher's Status:   Active
Availability:   Out of print, replaced by POD (print on demand)
We will order this item for you from a manufactured-on-demand supplier.

Table of Contents

1 Word Frequencies
    1.1 Introduction
    1.2 The frequency spectrum
    1.3 Zipf
    1.4 The quest for characteristic constants
    1.5 The lognormal distribution
    1.6 Discussion
    1.7 Bibliographical Comments
    1.8 Questions
2 Non-parametric models
    2.1 Basic concepts
    2.2 The Urn model
    2.3 The Structural Type Distribution
    2.4 The LNRE zone
    2.5 Good-Turing estimates
    2.6 Interpolation and Extrapolation
    2.7 Discussion
    2.8 Bibliographical Comments
    2.9 Questions
3 Parametric models
    3.1 Introduction
    3.2 LNRE models
    3.3 Evaluating Goodness of Fit
    3.4 Parameter estimation
    3.5 A comparative study
    3.6 Comparing Lexical Measures Across Texts
    3.7 Discussion
    3.8 Bibliographical Comments
    3.9 Questions
4 Mixture distributions
    4.1 Introduction
    4.2 Expectations, variances, and covariances
    4.3 Examples of mixture distributions
    4.4 Morphological Productivity
    4.5 Discussion
    4.6 Bibliographical Comments
    4.7 Questions
5 The Randomness Assumption
    5.1 The Randomness Assumption
    5.2 Adjusted LNRE models
    5.3 Discussion
    5.4 Bibliographical Comments
6 Examples of Applications
    6.1 Distributional properties of the lexicon
    6.2 Morphological productivity
    6.3 Authorship and Style
    6.4 Beyond word frequency distributions
    6.5 Some practical guidelines
A List of Symbols
B Solutions to the exercises
C Software
D Data sets

Reviews

From the reviews: "Baayen's book must surely in the future become the standard point of departure for statistical studies of vocabulary." (Geoffrey Sampson, Computational Linguistics, 28:04)


Countries Available

All regions