Read the following text, paying particular attention to the highlighted words.

## Statistical Inference

## 4.1 The problem

Until now we have been considering how to describe or summarise a set of data considered simply as an object in its own right. Very often we want to do more than this: we wish to use a collection of observed values to make inferences about a larger set of potential values; we would like to consider a particular set of data we have obtained as representing a larger class. It turns out that to accomplish this is by no means straightforward. What is more, an exhaustive treatment of the difficulties involved is beyond the scope of this book. In this chapter we can only provide the reader with a general outline of the problem of making inferences from observed values. A full understanding of this exposition will depend to some degree on familiarity with the content of later chapters. For this reason we suggest that this chapter is first read to obtain a general grasp of the problem, and returned to later for re-reading in the light of subsequent chapters. We will illustrate the problem of
## 4.2 Populations

A population then, for statistical purposes, is a set of

Common sense would suggest that a sample should be representative
of the population, that is, it should not, by overt or covert
bias, have a structure which is too different from the
target population. But more
technically (remembering that the
statistical population is a set of values), we
need to be sure that the values that constitute
the sample somehow reflect the target
statistical population. So, for example, if the
possible range of values for length of
utterance for 3-year-olds is 1 to 11 morphemes, with larger utterances possible
but very unusual, we need to ensure that we do
not introduce bias into the sample by only
collecting data from a conversational setting
in which an excessive number of

Fortunately there is a method of
sampling, known as

While we can never be entirely sure that a sample is
representative (that it has roughly the characteristics of the population
relevant to our investigation), our best defence against the
introduction of experimenter bias is to follow
a procedure that ensures random
samples (one such procedure will be described
in chapter 5). This can give us reasonable
confidence that our inferences from sample
values to population values are valid.
Conversely, if our sample is not
constructed according to a
random procedure we

How are the samples constructed in the studies we consider in this book? Is generalisation possible from them to a target population?

## 4.3 The theoretical solution

It will perhaps help us to answer these questions if we introduce
the notion of a

Suppose researchers are interested in the birth weights of children born in Britain in 1984 (with a view ultimately to comparing birth weights in that year with those of 1934). As is usual with any investigation, their resources will only allow them to collect a subset of these measurements - but a fairly large subset. They have to decide where and how this subset of values is to be collected.

The first decision they have to make concerns the sources of their information. Maternity hospital records are the most obvious choice, but this leaves out babies born at home. Let us assume that health visitors (who are required to visit all new-born children and their mothers) have accessible records which can be used. What is now required is some well-motivated limits on these records, to constitute a sampling frame within which a random sample of birth weights can be constructed.

The most common type of sampling frame is a list (actual or notional) of all the subjects in the group to which generalisation is intended. Here, for example, we could extract a list of all the babies with birth-dates in the relevant year from the records of all health visitors in Great Britain. We could then choose a simple random sample (chapter 5) of n of these babies and note the birth weights in their record. If n is large, the mean weight of the sample should be very similar (chapter 7) to the mean for all the babies born in that year. At the very least we will be able to say how big the discrepancy is likely to be (in terms of what is known as a 'confidence interval' - see chapter 7). The problem with this solution is that the
construction of the sampling frame would be
extremely time-consuming and costly. Other options are available. For example, a sampling frame could be
constructed in two or more stages. The country
(Britain) could be divided into large regions,
Scotland, Wales, North-East, West Midlands, etc., and a few
regions chosen from this

The major constraint is of course resources - the time and money available for data collection and analysis. In the light of this, sensible decisions have to be made about, for example, the number of Health Districts in Britain to be included in the frame; or it may be necessary to limit the inquiry to children born in four months in the year instead of a complete year. In this example, the sampling frame mediates between the population of interest (which is the birth weights of all children born in Britain in 1984) and the sample, and allows us to generalise from the sample values to those in the population of interest.

If we now return to an earlier linguistic example, we can see how
the sampling frame would enable us to
link our sample with a population of interest.
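Before turning to the linguistic case, the birth-weight procedure described above (a list serving as sampling frame, a simple random sample of n values, and a sample mean with an interval around it) can be sketched in a few lines of Python. This is only an illustrative sketch: the frame is simulated, the figures of 3300 g and 500 g, the frame size and n = 400 are invented for the example and do not come from any real records, and the interval uses the normal approximation with z = 1.96, an assumption made here ahead of the proper treatment in chapter 7. The function name `sample_mean_with_interval` is hypothetical.

```python
import math
import random
import statistics


def sample_mean_with_interval(frame, n, seed=None, z=1.96):
    """Draw a simple random sample of n values from `frame` (without
    replacement) and return (mean, lower, upper), where lower/upper bound
    an approximate 95% interval based on the normal approximation."""
    rng = random.Random(seed)
    sample = rng.sample(frame, n)          # every subset of size n equally likely
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return mean, mean - z * standard_error, mean + z * standard_error


# Hypothetical sampling frame: simulated birth weights in grams.
# The parameters below are made up purely for illustration.
frame_rng = random.Random(0)
frame = [frame_rng.gauss(3300, 500) for _ in range(20000)]
population_mean = statistics.mean(frame)

mean, lower, upper = sample_mean_with_interval(frame, n=400, seed=1)
print(f"population mean: {population_mean:.1f} g")
print(f"sample mean:     {mean:.1f} g  (interval {lower:.1f} to {upper:.1f})")
```

Because the sample here is a small fraction of the frame, no finite-population correction is applied; with a sample that made up a large share of the frame, that simplification would overstate the width of the interval.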
Take word-initial VOTs. Our interest will
always be in the individuals of a relatively
large group and in the measurements we derive
from their behaviour. In the present case we are likely to be concerned with
English children between 1;6 and 2;6, because this seems to be the time when
they are learning to differentiate voiceless
from voiced initial stops using VOT as a
crucial phonetic dimension. Our resources will be limited. We should, however, at
least have a sampling frame which sets time limits (for
instance, we could choose for the lower limit
of our age-range children who are 1;6 in a
particular week in 1984); we would like it to be geographically
well-distributed (we might again use Health Districts); within the sampling
frame we must select a random sample of a reasonable size. That is how we might go about selecting children for such a study. But how are
language samples to be selected from a child?
Changing the example, consider the problem of selecting utterances from a young child to measure
his mean length of utterance (mlu - see chapter
13

In a similar way, it will always be possible to imagine how a sampling frame could be drawn up for any finite population if time and other resources were unlimited. The theory which underlies all the usual statistical methods assumes that, if the results obtained from a sample are to be generalised to a wider population, a suitable sampling frame has been established and the sample chosen randomly from the frame. In practice, however, it is frequently impossible to draw up an acceptable sampling frame - so what, then, can be done?

## 4.4 The pragmatic solution

In any year a large number of linguistic studies of an empirical nature are carried out by many researchers in many different locations. The great majority of these studies will be exploratory in nature; they will be designed to test a new hypothesis which has occurred to the investigator or to look at a modification of some idea already studied and reported on by other researchers. Most investigators have very limited resources and, in any case, it would be extravagant to carry out a large and expensive study unless it was expected to confirm and give more detailed information on a hypothesis which was likely to be true and whose implications had deep scientific, social or economic significance. Of necessity, each investigator will have to make use of the experimental material (including human subjects) to which he can gain access easily. This will almost always preclude the setting up of sampling frames and the random selection of subjects.

At first sight it may look as if there is an unbridgeable gap here. Statistical theory requires that sampling should be done in a special way before generalisation can be made formally from a sample to a population. Most studies do not involve samples selected in the required fashion. Does this mean that statistical techniques will be inapplicable to these studies?
Before addressing this question directly, let us step back for a moment and ask what it is, in the most general sense, that the discipline of statistics offers to linguistics if its techniques are applicable. What the techniques of statistics offer is a common ground, a common measuring stick by which experimenters can measure and compare the strength of evidence for one hypothesis or another that can be obtained from a sample of subjects, language tokens, etc. This is worth striving after. Different studies will measure quantities which are more or less variable and will include different numbers of subjects and language tokens. Language researchers may find several different directions from which to tackle the same issue. Unless a common ground can be established on which the results of different investigations can be compared using a common yardstick, it would be almost impossible to assess the quality of the evidence contained in different studies or to judge how much weight to give to conflicting claims.

Returning to the question of applicability, we would suggest that
a sensible way to proceed is to accept the
results of each study, in the first place, as though any sampling had been
carried out in a theoretically
'correct' fashion. If these results are interesting - suggesting some
new hypothesis or contradicting a previously accepted one, for example - then is time
enough to question how the sample was obtained
and

In chapter 11 we discuss a study
by Hughes & Lascaratou (1981) on the gravity of errors in written English as perceived by two different groups: native
English-speaking teachers of English and Greek teachers of English. We
conclude that there seems to be a difference in
the way that the two groups judge errors, the
Greek teachers tending to be more severe in their judgements. How much does
this tell us about possible differences between the complete population of
native-speaking English teachers and Greek teachers of English? The results of
the experiment would carry over to those populations - in the sense to be
explained in the following four chapters - if
the samples had been selected carefully from
complete sampling frames. This was certainly not done. Hughes and Lascaratou
had to gain the co-operation of those teachers
to whom they had ready access. The formally
correct alternative would have been
prohibitively expensive. However, both samples
of teachers at least contained individuals from
different institutions. If all the English
teachers had come from a single English institution and all the Greek teachers from a single
Greek school of languages then it could be argued that the difference in
error gravity scores could be due to the
attitudes of the

This then seems a reasonable way to proceed. Judge the results

When the subjects themselves determine to which experimental group they belong, whether deliberately or accidentally, the sampling needs to be done rather more carefully. An important objective of the Fletcher & Peters (1984) study mentioned earlier was to compare the speech of language-normal with that of language-impaired children. In this case the investigators could not randomly assign children to one of these groups - they had already been classified before they were selected. It is important in this kind of study to try to avoid choosing either of the samples such that they belong obviously to some special subgroup.

There is one type of investigation
for which proper random sampling is

With this brief introduction to some of the problems of the relation between sample and population, we now turn in chapter 5 to the concept of probability as a crucial notion in providing a link between the properties of a sample and the structure of its parent population. In the final section of that chapter we outline a procedure for random sampling. Chapter 6 deals with the modelling of statistical populations, and introduces the normal distribution, an important model for our purposes in characterising the relation between sample and population.
## SUMMARY

In this chapter the basic problem of the

- A **statistical population** was defined as a set of all the values which might ever be included in a particular study. The **target population** is the set to which generalisation is intended from a study based on a sample.
- Generalisation from a sample to a population can be made formally only if the sample is collected **randomly** from a **sampling frame** which allows each element of the population a known chance of being selected.
- The point was made that, for the majority of linguistic investigations, resource constraints prohibit the collection of data in this way. It was argued that statistical theory and methodology still have an important role to play in the planning and analysis of language studies.
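The idea of a sampling frame that gives each element "a known chance of being selected" can be made concrete with the staged frame mentioned in section 4.3, where regions are chosen first and records are then chosen within them. The following is a rough sketch under stated assumptions, not a description of any procedure in this book: the function name `two_stage_sample` is hypothetical, the region names are borrowed from the text, and the region sizes and record identifiers are invented. Both stages are assumed to use simple random sampling, so that a record in a region holding N records has chance (k / R) × (m / N) of selection when k of R regions and m records per region are drawn.

```python
import random


def two_stage_sample(regions, k, m, seed=None):
    """Two-stage sample from `regions`, a dict mapping region name to a
    list of record identifiers.

    Stage 1: choose k regions by simple random sampling.
    Stage 2: choose m records by simple random sampling in each chosen region.

    Returns a list of (region, record, inclusion_probability) tuples.
    Assumes every region holds at least m records."""
    rng = random.Random(seed)
    chosen_regions = rng.sample(sorted(regions), k)
    selected = []
    for name in chosen_regions:
        records = regions[name]
        # Known chance of selection for any one record in this region.
        probability = (k / len(regions)) * (m / len(records))
        for record in rng.sample(records, m):
            selected.append((name, record, probability))
    return selected


# Hypothetical frame: region sizes and record ids are invented.
regions = {
    "Scotland": [f"S{i}" for i in range(500)],
    "Wales": [f"W{i}" for i in range(300)],
    "North-East": [f"N{i}" for i in range(200)],
    "West Midlands": [f"M{i}" for i in range(400)],
}

picked = two_stage_sample(regions, k=2, m=50, seed=3)
print(len(picked), "records drawn from", sorted({name for name, _, _ in picked}))
```

In this sketch records in smaller regions have a higher chance of selection than records in larger ones. The chances are unequal but known, which is the property the summary point requires; an analysis would need to take the differing chances into account.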

Now try the exercises: Exercise a, Exercise b, Exercise c, Exercise d, Exercise e, Exercise f, Exercise g, Exercise h.