Suppose we want to estimate the mean age of students in a college. Our population therefore comprises all the students in the college. Since recording the age of every student is very time-consuming, we take a sample of some students (say 10) and use the sample mean of those 10 students to estimate the mean age of all students.
There are two types of estimates:
1) the mean age of students is 20 years -------- Point Estimate
2) the mean age lies between 18 and 22 years -------- Interval Estimate
Now let’s see the formal definitions:
Point Estimate: Point estimation uses sample data to calculate a single value (known as a point estimate, since it identifies a point in some parameter space) that serves as a "best guess" or "best estimate" of an unknown population parameter (for example, the population mean).
Interval Estimate: Interval estimation is the use of sample data to calculate an interval of possible values of an unknown population parameter; this is in contrast to point estimation, which gives a single value.
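Suppose we take 100 samples, each of size 10. For each sample we calculate the sample mean; the distribution of these 100 sample means approximates the sampling distribution of the mean. By the central limit theorem, this distribution is approximately Normal when the sample size is reasonably large, or at any sample size if the ages themselves are roughly Normally distributed.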
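To make the idea concrete, here is a minimal simulation sketch. The population below is made up purely for illustration (5,000 ages drawn uniformly between 17 and 25), and the names population and sample_means are hypothetical helpers, not something from a particular library.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: ages of 5,000 students, drawn uniformly from 17 to 25.
population = [random.randint(17, 25) for _ in range(5000)]

def sample_means(population, n_samples=100, sample_size=10):
    """Draw n_samples random samples and return the mean of each one."""
    return [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(n_samples)
    ]

means = sample_means(population)

print("population mean:        ", round(statistics.mean(population), 2))
print("mean of sample means:   ", round(statistics.mean(means), 2))
print("std dev of population:  ", round(statistics.stdev(population), 2))
print("std dev of sample means:", round(statistics.stdev(means), 2))
```

The sample means should cluster around the population mean, with a noticeably smaller spread than the individual ages. That spread is what a confidence interval is built from.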
Now let’s see what confidence intervals are.
A confidence interval is an estimated range of values that seems reasonable given what we have observed. Its center is still the sample mean, but it leaves some room on either side for uncertainty. So if we say the mean age of students is 18-22 years, we are stating the estimate together with the uncertainty attached to it.
A 95% confidence interval means that if we calculate a confidence interval from each of 100 different samples, about 95 of those intervals would contain the true population mean.
So when we estimate the mean age of students this way, the procedure gives an interval containing the true mean about 95% of the time. (But it misses about 5% of the time too!)
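This "about 95 out of 100" reading can be checked with a small simulation. The sketch below assumes ages are Normally distributed with mean 20 and standard deviation 2 (illustrative values, not real data) and uses the standard t critical value 2.262 for 95% confidence with 9 degrees of freedom, which matches a sample of size 10. The function names are hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN = 20.0   # assumed population mean age (illustrative)
TRUE_SD = 2.0      # assumed population standard deviation (illustrative)
SAMPLE_SIZE = 10
T_CRIT = 2.262     # t critical value for 95% confidence, 9 degrees of freedom

def confidence_interval(sample):
    """Return (low, high) of a 95% t-interval for the mean of a size-10 sample."""
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / len(sample) ** 0.5
    margin = T_CRIT * std_err
    return mean - margin, mean + margin

def coverage(n_intervals=1000):
    """Fraction of simulated intervals that contain TRUE_MEAN."""
    hits = 0
    for _ in range(n_intervals):
        sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
        low, high = confidence_interval(sample)
        if low <= TRUE_MEAN <= high:
            hits += 1
    return hits / n_intervals

print("fraction of intervals containing the true mean:", coverage())
```

The printed fraction should land close to 0.95. Note that the 95% describes how often the procedure succeeds over many samples, not the chance that any one computed interval is right.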
So why don’t we take 100% confidence intervals?
To be 100% certain with a usefully narrow interval, we would need to examine the entire population. We can always take larger samples so that our estimate gets closer to the true population parameter, but that requires a lot of time and money.
From a sample alone, a 100% confidence interval under the usual Normal model would have to span -∞ to +∞, i.e. the entire real line. It would always contain the true population parameter, so it is not of much use. We should find a confidence interval that is narrow enough to be useful yet wide enough to be likely to contain the population parameter. In other words, we need to balance confidence (how often the interval contains the true value) against precision (how narrow the interval is). We give up a little confidence to gain precision: a 95% confidence interval gives us a more useful range, one that is not infinitely long.
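The trade-off can be seen by computing how wide the interval becomes as the confidence level rises. This is a rough sketch using the Normal approximation from Python's standard library; the standard error of 1 year is an assumed value chosen only for illustration, and z_critical is a hypothetical helper.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided z critical value; infinite when confidence is 100%."""
    if confidence >= 1.0:
        return float("inf")
    return NormalDist().inv_cdf(0.5 + confidence / 2)

STD_ERR = 1.0  # assumed standard error of the sample mean, in years (illustrative)

for confidence in (0.80, 0.90, 0.95, 0.99, 0.999, 1.0):
    half_width = z_critical(confidence) * STD_ERR
    print(f"{confidence:.1%} confidence -> 20 ± {half_width:.2f} years")
```

The half-width grows slowly at first and then without bound as the confidence level approaches 100%, which is why a level such as 95% is a common compromise.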
Margin of error: The margin of error reflects the uncertainty that surrounds sample estimates of population parameters. It is half the width of the confidence interval.
In our example of estimating the age of students in a college, our sample mean is 20 years.
We have taken a confidence interval of 18-22 years, i.e. 20 ± 2 years. So here our margin of error is 2 years.
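As a rough sketch of where a margin of error comes from, the snippet below computes one from a single sample of 10 ages. The ages are invented for illustration, so the resulting margin will not match the 2 years used in the text. It assumes a 95% t-interval with the standard critical value 2.262 for 9 degrees of freedom, and margin_of_error is a hypothetical helper.

```python
import statistics

T_CRIT = 2.262  # t critical value for 95% confidence, 9 degrees of freedom

def margin_of_error(sample, t_crit=T_CRIT):
    """Margin of error = critical value * (sample std dev / sqrt(n))."""
    std_err = statistics.stdev(sample) / len(sample) ** 0.5
    return t_crit * std_err

# Made-up ages of 10 sampled students (illustrative only).
ages = [18, 19, 19, 20, 20, 20, 21, 21, 22, 20]

mean_age = statistics.mean(ages)
margin = margin_of_error(ages)

print(f"point estimate:    {mean_age:.2f} years")
print(f"margin of error:   {margin:.2f} years")
print(f"interval estimate: {mean_age - margin:.2f} to {mean_age + margin:.2f} years")
```

A larger sample or less variable ages would shrink the margin, while a higher confidence level would widen it.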
Author: Tanya Gupta
https://www.linkedin.com/in/tanya-gupta-805407160