Surname Letter Analysis
An Interactive Guide to Mean, Median, and Mode for Grouped Data
Before We Start...
Let's check our understanding of some key concepts. Try to answer these questions:
- What are the three main measures of central tendency?
- What is the difference between grouped and ungrouped data?
- How do you find the 'class mark' for a class interval?
- What does 'cumulative frequency' mean?
The Problem
100 surnames were randomly picked from a local telephone directory, and the frequency distribution of the number of letters of the English alphabet in the surnames was obtained as follows:
Number of letters | Number of surnames (Frequency) |
---|---|
1 - 4 | 6 |
4 - 7 | 30 |
7 - 10 | 40 |
10 - 13 | 16 |
13 - 16 | 4 |
16 - 19 | 4 |
1. What is Given?
- A frequency distribution table of 100 surnames.
- Class intervals representing the number of letters.
- The number of surnames (frequency) for each class.
- Total number of surnames (Σf) = 100.
2. What Do We Need to Find?
- The mean number of letters.
- The median number of letters.
- The modal size of the surnames.
3. How Can We Find It?
To find the mean, we'll use the Direct Method. The formula is:
$$\text{Mean } (\bar{x}) = \frac{\sum f_i x_i}{\sum f_i}$$
Where fᵢ is the frequency and xᵢ is the class mark, that is, the midpoint of the class interval: (lower limit + upper limit) / 2.
Number of letters | Frequency (fᵢ) | Class Mark (xᵢ) | fᵢxᵢ |
---|---|---|---|
1 - 4 | 6 | 2.5 | 15.0 |
4 - 7 | 30 | 5.5 | 165.0 |
7 - 10 | 40 | 8.5 | 340.0 |
10 - 13 | 16 | 11.5 | 184.0 |
13 - 16 | 4 | 14.5 | 58.0 |
16 - 19 | 4 | 17.5 | 70.0 |
Total | Σfᵢ = 100 | | Σfᵢxᵢ = 832.0 |
Mean = 832 / 100 = 8.32
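The Direct Method above can be checked with a few lines of Python. This is only a minimal sketch: the function name `grouped_mean` and the variable names are illustrative choices, not part of the original problem.

```python
# Class intervals (lower limit, upper limit) and frequencies from the table.
classes = [(1, 4), (4, 7), (7, 10), (10, 13), (13, 16), (16, 19)]
freqs = [6, 30, 40, 16, 4, 4]


def grouped_mean(classes, freqs):
    """Direct Method: sum(f_i * x_i) / sum(f_i), with x_i the class mark."""
    # Class mark = midpoint of each class interval.
    class_marks = [(lower + upper) / 2 for lower, upper in classes]
    total_fx = sum(f * x for f, x in zip(freqs, class_marks))
    return total_fx / sum(freqs)


mean = grouped_mean(classes, freqs)
print(round(mean, 2))  # 8.32
```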
The formula for the median is:
$$\text{Median} = l + \left(\frac{\frac{n}{2} - cf}{f}\right) \times h$$
First, we need the cumulative frequency (cf).
Number of letters | Frequency (fᵢ) | Cumulative Frequency (cf) |
---|---|---|
1 - 4 | 6 | 6 |
4 - 7 | 30 | 36 |
7 - 10 | 40 | 76 |
10 - 13 | 16 | 92 |
13 - 16 | 4 | 96 |
16 - 19 | 4 | 100 |
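The cumulative frequency column is just a running total of the frequencies. As a small illustrative sketch (variable names are our own), Python's standard `itertools.accumulate` reproduces it:

```python
from itertools import accumulate

# Frequencies from the table, in class order.
freqs = [6, 30, 40, 16, 4, 4]

# Each entry is the sum of all frequencies up to and including that class.
cum_freqs = list(accumulate(freqs))
print(cum_freqs)  # [6, 36, 76, 92, 96, 100]
```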
Step 1: Find n⁄2. Here n = 100, so n⁄2 = 50.
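Step 2: Find the median class. This is the first class whose cumulative frequency reaches or exceeds n⁄2 = 50. The cumulative frequency is 36 up to the class 4 - 7 and 76 up to the class 7 - 10, so the median class is 7 - 10.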
Step 3: Identify the values:
- l (lower limit of median class) = 7
- h (class size) = 3
- f (frequency of median class) = 40
- cf (cumulative frequency of class before median class) = 36
Step 4: Substitute into the formula:
$$\begin{aligned}
\text{Median} &= 7 + \left(\frac{50 - 36}{40}\right) \times 3 \\
&= 7 + \frac{14}{40} \times 3 \\
&= 7 + \frac{42}{40} = 7 + 1.05 = 8.05
\end{aligned}$$
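The four steps above can be sketched in Python as follows. The function name `grouped_median` is hypothetical, and the sketch assumes the classes are listed in increasing order, as in the table.

```python
from itertools import accumulate

classes = [(1, 4), (4, 7), (7, 10), (10, 13), (13, 16), (16, 19)]
freqs = [6, 30, 40, 16, 4, 4]


def grouped_median(classes, freqs):
    """Median = l + ((n/2 - cf) / f) * h for grouped data."""
    half = sum(freqs) / 2                       # Step 1: n/2
    cum_freqs = list(accumulate(freqs))
    # Step 2: median class = first class whose cumulative frequency >= n/2.
    idx = next(i for i, cf in enumerate(cum_freqs) if cf >= half)
    # Step 3: read off l, h, f and cf for that class.
    lower, upper = classes[idx]
    h = upper - lower
    f = freqs[idx]
    cf_before = cum_freqs[idx - 1] if idx > 0 else 0
    # Step 4: substitute into the formula.
    return lower + (half - cf_before) / f * h


median = grouped_median(classes, freqs)
print(round(median, 2))  # 8.05
```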
The formula for the mode is:
$$\text{Mode} = l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h$$
Number of letters | Frequency (fᵢ) |
---|---|
1 - 4 | 6 |
4 - 7 | 30 (f₀) |
7 - 10 | 40 (f₁) |
10 - 13 | 16 (f₂) |
13 - 16 | 4 |
16 - 19 | 4 |
Step 1: Find the modal class. This is the class with the highest frequency, which is 7 - 10.
Step 2: Identify the values:
- l (lower limit of modal class) = 7
- h (class size) = 3
- f₁ (frequency of modal class) = 40
- f₀ (frequency of preceding class) = 30
- f₂ (frequency of succeeding class) = 16
Step 3: Substitute into the formula:
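$$\begin{aligned}
\text{Mode} &= 7 + \left(\frac{40 - 30}{(2 \times 40) - 30 - 16}\right) \times 3 \\
&= 7 + \frac{10}{80 - 46} \times 3 \\
&= 7 + \frac{10}{34} \times 3 = 7 + \frac{30}{34} \approx 7 + 0.88 = 7.88
\end{aligned}$$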
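The same substitution can be sketched in Python. The name `grouped_mode` is hypothetical. The sketch also makes two assumptions that do not matter for this data set: if the modal class were the first or last class, the missing neighbouring frequency is taken as 0, and if two classes tied for the highest frequency, the first one is used.

```python
classes = [(1, 4), (4, 7), (7, 10), (10, 13), (13, 16), (16, 19)]
freqs = [6, 30, 40, 16, 4, 4]


def grouped_mode(classes, freqs):
    """Mode = l + ((f1 - f0) / (2*f1 - f0 - f2)) * h for grouped data."""
    # Step 1: modal class = class with the highest frequency (first if tied).
    idx = max(range(len(freqs)), key=lambda i: freqs[i])
    # Step 2: read off l, h, f1, f0 and f2.
    lower, upper = classes[idx]
    h = upper - lower
    f1 = freqs[idx]
    f0 = freqs[idx - 1] if idx > 0 else 0               # assumed 0 at the edge
    f2 = freqs[idx + 1] if idx < len(freqs) - 1 else 0  # assumed 0 at the edge
    # Step 3: substitute into the formula.
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * h


mode = grouped_mode(classes, freqs)
print(round(mode, 2))  # 7.88
```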
4. Best Conclusions
Mean: 8.32
On average, the number of letters in a surname from this directory is 8.32.
Median: 8.05
If we were to list all 100 surnames in order of the number of letters, the middle value would be about 8.05. This means roughly half the surnames have fewer than 8.05 letters and roughly half have more.
Mode: 7.88
The most frequently occurring length of surnames is around 7.88 letters. This value falls within the most common group (7-10 letters).
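Notice how all three values are close to each other! This suggests a fairly symmetrical distribution of data. Since the mean (8.32) is slightly larger than the median (8.05), which is slightly larger than the mode (7.88), there is a small tail towards longer surnames.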