Testing of hypothesis and Prerequisite Knowledge
A hypothesis is a statement or assumption about a population parameter (e.g., mean, proportion) that can be tested using statistical methods. It is the foundation of hypothesis testing, which determines whether there is enough statistical evidence in a sample to infer a conclusion about the entire population.
Karl Popper
Whenever there is a conjecture, that is, a statement that has not yet been proved, Karl Popper's view is that it is easier to disprove it with empirical evidence than to prove it. The conjecture is called the Null Hypothesis and its opposite is called the Alternative Hypothesis.
Attributes
- Population: In testing of hypothesis, the entire set of subjects about which we want to draw a conclusion is called the population.
- Sample: A smaller subset of the population on which data is actually collected.
    - Sampling is used largely by governments, industries, etc. when data on the whole population is too hard to collect, so we collect a sample of the data instead.
- $H_0$: Null Hypothesis
- $H_1$: Alternative Hypothesis

The testing workflow is:
- Formulate Hypothesis
- Collect Sample Data
- Analyze Data
- Is there sufficient evidence?
    - Yes: Reject Null Hypothesis
    - No: Fail to Reject Null Hypothesis
- Make an Inference About the Population

Surveys
- Surveys are used to collect data from a population.
- The data is collected from a sample of the population.
- The data is then analyzed to make a statement about the population.
Parameters & Statistics
We use Greek letters for the population and Latin letters for the sample. Population measures like the mean ($\mu$) and variance ($\sigma^2$) are called parameters. Sample measures like the mean ($\bar{x}$) and variance ($s^2$) are called statistics.
Statistical Hypothesis
- Hypothesis
    - A new drug significantly reduces blood pressure.
- Null Hypothesis
    - Definition: A definite statement about a population parameter which is tested for possible rejection under the assumption that it is true. It is usually a hypothesis of no difference. It is represented by $H_0$.
    - Example: The new drug does not reduce blood pressure compared to a placebo.
- Testing Process
    - Researchers would conduct a clinical trial. If the data show a statistically significant decrease in blood pressure in the drug group, $H_0$ is rejected in favour of the hypothesis that the drug works. Otherwise we fail to reject $H_0$.
- Alternative Hypothesis
    - Any hypothesis that is complementary to the null hypothesis is called an alternative hypothesis and is denoted by $H_1$.
Types of errors
- Type 1 Error
    - Rejecting a true null hypothesis. The probability of making a type 1 error is denoted by $\alpha$:
      $P(\text{Rejecting } H_0 \mid H_0 \text{ is true}) = \alpha$
    - Example
        - Convicting an innocent person.
        - Out of 100 phones produced, 10 were sampled for testing. We found 1 defective phone and rejected the lot even though the lot as a whole was acceptable, so it was a type 1 error.
- Type 2 Error
    - Accepting a false null hypothesis. The probability of making a type 2 error is denoted by $\beta$:
      $P(\text{Accepting } H_0 \mid H_0 \text{ is false}) = \beta$
    - Example
        - Acquitting a guilty person.
        - Out of 100 phones produced, 10 were sampled for testing. No defective phones were found, so the lot was accepted even though it was actually defective. At this point we have a type 2 error.
- $\alpha$ and $\beta$ are referred to as Producer's Risk and Consumer's Risk respectively.
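
As a rough illustration of the two error probabilities, the sketch below computes $\alpha$ and $\beta$ for a two-tailed Z test on a mean. The function name `error_probabilities` and all the numbers (a hypothesised mean of 165, a true mean of 163, standard deviation 10, sample size 100) are assumptions chosen for illustration, not values taken from these notes.

```python
from statistics import NormalDist

def error_probabilities(mu0, true_mu, sigma, n, alpha=0.05):
    """Return (type 1, type 2) error probabilities for a two-tailed Z test of H0: mu = mu0."""
    se = sigma / n ** 0.5                          # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    lower = mu0 - z_crit * se                      # acceptance region for the sample mean
    upper = mu0 + z_crit * se

    # Type 1: the sample mean lands outside the acceptance region although H0 is true.
    under_h0 = NormalDist(mu0, se)
    type1 = 1 - (under_h0.cdf(upper) - under_h0.cdf(lower))

    # Type 2: the sample mean lands inside the acceptance region although the mean is true_mu.
    under_h1 = NormalDist(true_mu, se)
    type2 = under_h1.cdf(upper) - under_h1.cdf(lower)
    return type1, type2

# Hypothetical numbers, for illustration only.
type1, type2 = error_probabilities(mu0=165, true_mu=163, sigma=10, n=100)
print(f"alpha = {type1:.3f}, beta = {type2:.3f}")
```

Lowering `alpha` widens the acceptance region, which in this setup makes `type2` larger: reducing one risk tends to raise the other for a fixed sample size.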
Example Problems
Example 1
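Average marks of boys are not the same as average marks of girls.
Let the average marks for boys be $\mu_1$ and the average marks of girls be $\mu_2$. Then $H_0: \mu_1 = \mu_2$ and $H_1: \mu_1 \neq \mu_2$.
Example 2
Average height of boys is more than average height of girls.
Let the average height for boys be $\mu_1$ and the average height of girls be $\mu_2$. Then $H_0: \mu_1 = \mu_2$ and $H_1: \mu_1 > \mu_2$.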
One Tailed & Two Tailed Test
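Suppose we take a sample of size $n$ and compute its average, then take a different sample and compute its average, and so on for repeated samples. The sample averages are compared with a hypothesised population mean $\mu_0$.
- $H_0$: $\mu = \mu_0$
- $H_1$: $\mu > \mu_0$ Right Tailed Test
- $H_1$: $\mu < \mu_0$ Left Tailed Test
- Two Tailed
- $H_0$: $\mu = \mu_0$ against $H_1$: $\mu \neq \mu_0$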
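
The three kinds of test differ only in where the rejection region sits. A minimal sketch of that decision rule for a Z statistic follows; the function name `reject_null` and the tail labels `"two"`, `"right"`, `"left"` are made up for this example.

```python
from statistics import NormalDist

def reject_null(z, alpha, tail):
    """Return True if H0 is rejected for the Z statistic z.

    tail is "two", "right" or "left" (labels chosen for this sketch).
    """
    std_normal = NormalDist()
    if tail == "two":
        # alpha is split equally between both tails
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        # all of alpha sits in the upper tail
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        # all of alpha sits in the lower tail
        return z < std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")

print(reject_null(1.80, 0.05, "two"))    # compared against about 1.96
print(reject_null(1.80, 0.05, "right"))  # compared against about 1.645
```

The same statistic can lead to different conclusions depending on the tail, which is why the alternative hypothesis has to be fixed before looking at the data.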


Level of significance
The probability of rejecting a true null hypothesis is called the level of significance. It is denoted by $\alpha$.
If we know $\alpha$ then we can find the corresponding critical $Z$ value from the Z table. A $Z$ value is the number of standard deviations a data point is from the mean.
Example
If $\alpha = 0.05$ in a two-tailed test, we look closely at the Z table and find that the value of $Z_{\alpha/2}$ is 1.96.
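
Instead of a printed Z table, the critical value can be looked up numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_z` is an assumption for this example.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Critical Z value for a given level of significance."""
    tail_area = alpha / 2 if two_tailed else alpha   # area left in one tail
    return NormalDist().inv_cdf(1 - tail_area)

# Print two-tailed and one-tailed critical values for common levels of significance.
for alpha in (0.10, 0.05, 0.01):
    two = critical_z(alpha)
    one = critical_z(alpha, two_tailed=False)
    print(f"alpha={alpha}: two-tailed {two:.2f}, one-tailed {one:.2f}")
```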
Confidence Interval
The confidence interval is the range of values within which the true value of the parameter is expected to lie with a certain level of confidence. The confidence level is denoted by $(1 - \alpha) \times 100\%$ where $\alpha$ is the level of significance.
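
For a large sample, a confidence interval for the mean is commonly taken as the sample mean plus or minus the critical $Z$ value times the standard error. The sketch below assumes that form; the function name `mean_confidence_interval` and the numbers (mean 160, standard deviation 10, sample size 100) are illustrative.

```python
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sd, n, alpha=0.05):
    """(1 - alpha) confidence interval for the mean, large-sample Z version."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. about 1.96 for alpha = 0.05
    margin = z * sd / n ** 0.5                # critical value times standard error
    return sample_mean - margin, sample_mean + margin

# Illustrative numbers only.
low, high = mean_confidence_interval(sample_mean=160, sd=10, n=100)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```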
Tests of significance Problems
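Test of significance between population mean and sample mean. A sample is considered large if its size is greater than 30. If the size is less than 30 then the sample is considered small.
Question 1
Sample size $n = 100$, standard deviation $\sigma = 10$ cm, sample mean $\bar{x} = 160$ cm, population mean height $\mu = 165$ cm.
$H_0: \mu = 165$ against $H_1: \mu \neq 165$ (Two Tailed Test)
Solving for the test statistic $Z$:
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{160 - 165}{10 / \sqrt{100}} = -5$$
Since $|Z| = 5$ exceeds the two-tailed critical value at the usual levels of significance (1.96 at 5%, 2.58 at 1%), reject $H_0$.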
Question 2
A random sample of 200 measurements from a large population has a mean of 50 and a standard deviation of 10. Test the hypothesis that the population mean is 52 against the alternative hypothesis that the population mean is not 52. Use a level of significance of 0.05.
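
The large-sample test in Questions 1 and 2 can be sketched as below. The function name `one_sample_z_test` is made up for this example, and Question 1 does not state a level of significance, so 5% is assumed for it.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, sd, n, alpha=0.05):
    """Two-tailed Z test of H0: population mean equals pop_mean."""
    z = (sample_mean - pop_mean) / (sd / sqrt(n))     # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # two-tailed critical value
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
    return z, z_crit, p_value, abs(z) > z_crit

# Question 1: n = 100, SD = 10, sample mean = 160, hypothesised mean = 165 (alpha assumed 0.05)
q1 = one_sample_z_test(160, 165, 10, 100)
# Question 2: n = 200, SD = 10, sample mean = 50, hypothesised mean = 52, alpha = 0.05
q2 = one_sample_z_test(50, 52, 10, 200)

for name, (z, z_crit, p, reject) in (("Q1", q1), ("Q2", q2)):
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"{name}: Z = {z:.3f}, critical = {z_crit:.2f}, p = {p:.4f}, {decision}")
```

In both questions the statistic falls in the rejection region, so under these assumptions $H_0$ is rejected.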
T Test
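If the sample size is less than 30 then the sample is considered small and the test statistic is calculated using the t-distribution:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \quad \text{degrees of freedom} = n - 1$$
Here $s$ is the sample standard deviation (computed with divisor $n - 1$), and $t_{\alpha, n-1}$ is the t value for the level of significance $\alpha$ and degrees of freedom $n - 1$.
When $\bar{x}$ is known, only one of the $n$ values is no longer free to vary, thus the degrees of freedom are $n - 1$.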
Properties of t-distribution
- The t-distribution is symmetric about the mean.
- The t-distribution has a mean of 0.
- The t-distribution is more spread out than the standard normal distribution.
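
A small-sample test statistic can be computed as sketched below. The solve times are hypothetical data invented for this example, and since the standard library has no t-distribution, the critical value is typed in from a t table (one-tailed 5%, 8 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, pop_mean):
    """Return (t statistic, degrees of freedom) for H0: mean equals pop_mean."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)          # stdev uses the n - 1 divisor
    return (mean(sample) - pop_mean) / se, n - 1

# Hypothetical solve times in seconds, for illustration only.
times = [1.85, 1.90, 1.78, 1.82, 1.88, 1.80, 1.92, 1.84, 1.86]
t, df = one_sample_t(times, pop_mean=1.75)

T_CRIT = 1.860  # from a t table: one-tailed, alpha = 0.05, 8 degrees of freedom
print(f"t = {t:.3f}, df = {df}, reject H0: {t > T_CRIT}")
```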
Problems Small Dataset
Question 1
A machine solves a problem in 1.75 seconds. A new machine is introduced and the time taken to solve the problem is 1.85 seconds. The standard deviation is 0.1. Test the hypothesis that the new machine is inferior to the old machine. Use a level of significance of 0.05.
- $H_0$: $\mu = 1.75$, the new machine is not inferior.
- $H_1$: $\mu > 1.75$, the new machine is inferior. Because only a longer solving time counts as inferior, this is a right-tailed test rather than a two-tailed test.
Question 2
A certain injection is administered will it always
There is a significant difference
TOS of difference between two large sample means
Testing of significance of difference between two large sample means. We will now have two values of $n$ ($n_1$, $n_2$), two values of $\bar{x}$ ($\bar{x}_1$, $\bar{x}_2$) and two values of $s$ ($s_1$, $s_2$). We will also calculate the $Z$ value for the two samples. Suppose student 1 is asked to get a sample of college students with marks and student 2 is asked to get another sample. The population standard deviation will remain the same, because student 1 and student 2 are sampling from the same population.
When the samples are large the test statistic follows the standard normal distribution. When the samples are small it follows the t-distribution. The choice is made on the basis of sample size.
Cases:
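- Case 1: $\sigma_1 = \sigma_2 = \sigma$ and known
- Case 2: $\sigma_1 = \sigma_2$ and unknown
- Case 3: $\sigma_1 \neq \sigma_2$ and known
- Case 4: $\sigma_1 \neq \sigma_2$ and unknown
Formulas:
Case 4 Formula:
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Case 3 Formula:
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
Case 1 Formula:
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Case 2 Formula:
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \quad \hat{\sigma}^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$$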
Test of significance of difference between two sample means ( Small Samples )
Questions
Samples of two types of electric bulbs are given:
Sample | Size | Mean | Standard Deviation
---|---|---|---
Sample 1 | 8 | 1214 | 36
Sample 2 | 7 | 1036 |
Questions based on TOS
The average marks scored by 32 boys is 72 with an SD of 8, while that for 36 girls is 70 with an SD of 6. Test at the 1% level of significance (LOS) whether the boys perform better than the girls.
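
One way to work this question is sketched below. It assumes the population standard deviations are unknown and not assumed equal, so each sample's own SD is used in the standard error. The function name `two_sample_z` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for the difference between two large-sample means."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# H0: mu_boys = mu_girls, H1: mu_boys > mu_girls (right-tailed test)
z = two_sample_z(72, 8, 32, 70, 6, 36)
z_crit = NormalDist().inv_cdf(1 - 0.01)   # right-tailed critical value at the 1% level

print(f"Z = {z:.3f}, critical value = {z_crit:.3f}")
print("reject H0" if z > z_crit else "fail to reject H0")
```

Under these assumptions the statistic is well below the critical value, so the data do not support the claim that boys perform better at the 1% level.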
Paired Testing
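When there are two different measurements on the same sample we can use paired testing. Take the example of students' first and second exams. Let $x$ be the marks of the first exam and $y$ be the marks of the second exam. $d$ is the difference between the two exams: $d = x - y$ or $d = y - x$. The test statistic is given by
$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
with $n - 1$ degrees of freedom, where $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences and $n$ is the number of pairs.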
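
A minimal sketch of the paired statistic follows, using the usual form (mean difference divided by its standard error). The marks are hypothetical, and the difference is taken as second exam minus first exam.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [b - a for a, b in zip(first, second)]   # d = second - first
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))       # stdev uses the n - 1 divisor
    return t, n - 1

# Hypothetical marks for six students, for illustration only.
first_exam = [60, 55, 70, 65, 58, 62]
second_exam = [64, 58, 71, 70, 60, 65]

t, df = paired_t(first_exam, second_exam)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting `t` is then compared with the t table value for `df` degrees of freedom at the chosen level of significance.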
F Test
We move from comparing means to comparing variances. Proportions cannot be compared for very small samples, so for samples of small size we use the F test, that is, we compare variances. This is the test of significance of difference between two small sample variances. For this we use the F test and the F distribution table. It is always right tailed. It is defined only for positive values.
- Step 1: Sample size (F test is only for small samples)
- Step 2: $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \neq \sigma_2^2$
- Step 3: The F test statistic is calculated as
$$F = \frac{s_1^2}{s_2^2}$$
where $s_1^2$ and $s_2^2$ are the sample variances of the two samples. If the calculated value of F is greater than the critical value of F then we reject the null hypothesis. If the calculated value of F is less than the critical value of F then we fail to reject the null hypothesis.
The larger variance is taken as the numerator and labelled $s_1^2$. The degrees of freedom for the F distribution are $n_1 - 1$ and $n_2 - 1$ where $n_1$ and $n_2$ are the sample sizes of the two samples. If $\alpha$ changes, the F table to be used changes.
- Step 4: Critical value $F_{\alpha}(n_1 - 1, n_2 - 1)$ from the F table
- Step 5: Conclusion
Notice how the larger variance is written in the numerator.
Questions on F Test
A company wants to compare the variability in the productivity of two machines A and B. The company takes a sample of 10 observations from machine A and 12 observations from machine B. The sample variances are 4.5 and 2.5 respectively. Test the hypothesis that the variability in the productivity of the two machines is the same at 5% level of significance.
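
The question above can be worked along the following lines. The sketch takes the stated sample variances as given, and the critical value is typed in from an F table rather than computed, so it should be checked against your own table. The function name `f_statistic` is made up for this example.

```python
def f_statistic(var_a, n_a, var_b, n_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Machine A: 10 observations, variance 4.5. Machine B: 12 observations, variance 2.5.
f, df1, df2 = f_statistic(4.5, 10, 2.5, 12)

F_CRIT = 2.90  # from an F table: 5% level, (9, 11) degrees of freedom
print(f"F = {f:.2f} with ({df1}, {df2}) degrees of freedom")
print("reject H0" if f > F_CRIT else "fail to reject H0")
```

With these numbers the calculated F is below the table value, so the hypothesis of equal variability is not rejected at the 5% level.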
Information
- date: 2025.03.12
- time: 14:05