# Statistical Tests

A telecom company **surveyed** smartphone owners in a certain town 5 years ago and found that **73%** of the population owned a **smartphone**. It has used this figure to make its business decisions ever since.

Now a new marketing manager has joined, and he believes this value is no longer valid. He conducts a survey of **500 people** and finds that **420** of them say they own a smartphone. Which **statistical test** would you use to **compare** the results of these two surveys?

**Test of proportions, z-test:** Applicability: This is the correct option. The z-test for proportions compares a sample proportion with a reference proportion (one-sample) or the proportions of two independent samples (two-sample). Here you are comparing the proportion of smartphone owners from 5 years ago (73%) with the recent survey, where 420 out of 500 respondents (84%) own a smartphone. Since only the 73% figure is known from the old survey and not its sample size, it is treated as the reference value in a one-sample test.

Reasoning: The z-test for proportions lets you assess whether the observed difference in proportions is statistically significant. It is appropriate when the sample is large enough for the normal approximation to hold, commonly checked by requiring both n·p and n·(1 − p) to be at least about 10. Here 500 × 0.73 = 365 and 500 × 0.27 = 135, so the condition is comfortably met.
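As a rough sketch of the calculation for the survey above, assuming the 73% figure is treated as the reference proportion in a one-sample, two-sided test. The function name `one_sample_proportion_ztest` is illustrative and does not come from any library.

```python
import math

def one_sample_proportion_ztest(successes, n, p0):
    """Two-sided one-sample z-test for a proportion (normal approximation)."""
    p_hat = successes / n                      # observed sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)          # standard error under H0: p = p0
    z = (p_hat - p0) / se                      # test statistic
    # Two-sided p-value from the standard normal distribution:
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_hat, z, p_value

# Survey numbers from the question: 420 of 500 respondents, reference value 73%
p_hat, z, p_value = one_sample_proportion_ztest(420, 500, 0.73)
print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.2e}")
```

The z statistic comes out at roughly 5.5, far beyond the usual 1.96 cutoff for a 5% significance level, so under these assumptions the data would support the manager's belief that the 73% figure is outdated.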

**Test of independence, chi-square test:** Applicability: The chi-square test of independence is used when you have categorical data and want to test if there is a significant association between two variables.

Reasoning: While the chi-square test is useful in certain scenarios, it is not the best choice for comparing proportions between two independent samples. It is more suitable for analyzing contingency tables with categorical data.

**Test of means, t-test:** Applicability: The t-test is used when comparing means of two independent samples, not proportions.

Reasoning: Since you are interested in comparing the proportion of smartphone owners, the t-test is not the appropriate choice. The t-test is used for continuous data (such as comparing the means of two groups) and is not suitable for proportions.

## Kolmogorov–Smirnov Test (KS Test)

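This test checks whether two samples come from the same distribution by comparing their empirical cumulative distribution functions (CDFs). A one-sample variant compares a single sample against a reference distribution.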
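To make the idea concrete, here is a minimal sketch of the two-sample KS statistic: the largest vertical gap between the two empirical CDFs. It computes only the statistic, not the p-value, and the helper name `ks_statistic` and the sample values are made up for illustration.

```python
import bisect

def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The maximum gap is always reached at one of the observed values
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)  # share of a <= x
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)  # share of b <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d

same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])         # identical samples
overlap = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])      # partially overlapping
disjoint = ks_statistic([1, 2, 3, 4], [11, 12, 13, 14])  # no overlap at all
print(same, overlap, disjoint)
```

A statistic near 0 suggests similar distributions, while a value near 1 means the samples barely overlap. A full test converts the statistic into a p-value based on the sample sizes.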

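## Kruskal-Wallis Test

This is a rank-based alternative to one-way ANOVA for comparing two or more groups. It does not assume that the data are normally distributed. It does assume that the groups have distributions of the same shape, so groups with different standard deviations (and therefore different distributions) violate that assumption.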

## ANOVA

## Levene

This test checks whether the data arrays (groups) passed to it have equal variances.

## Shapiro-Wilk Test

This test checks whether the data are normally distributed.

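## Chi-Square Test

This is used to check whether 2 categorical variables are related.

Null Hypothesis: the 2 variables are independent

Alternate Hypothesis: the 2 variables are dependent

The expected-value table is computed under the assumption that the variables are independent: each expected count equals (row total × column total) / grand total. The test statistic then measures how far the observed counts deviate from these expected counts.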
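The expected-value calculation can be sketched as follows. The observed counts are hypothetical, chosen only to keep the arithmetic easy to follow, and the helper names are illustrative.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 contingency table (rows: group A / group B, columns: yes / no)
observed = [[30, 10],
            [20, 40]]

expected = expected_counts(observed)
chi2 = chi_square_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1) * (columns - 1)
print(expected, round(chi2, 3), dof)
```

For 1 degree of freedom the 5% critical value is about 3.84, so a statistic well above that would lead to rejecting the null hypothesis of independence for this made-up table.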

## Different combinations and corresponding tests

- 1-sample z-test for mean
- 1-sample t-test for mean
- 1-sample z-test for proportion
- 1-sample t-test for proportion
- 2-sample independent test for mean
- 2-sample independent test for proportion
- Paired test

### Cheat sheet for different tests

### Hopkins test to check clustering tendency

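If the output is close to 1, the data does not have clusters. If it is close to 0, the data has clusters. A value around 0.5 indicates randomly distributed data. Some implementations report the statistic the other way round (values near 1 indicating clusters), so check which convention your library uses.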
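The sketch below shows one common way to compute the statistic under the convention described above, where values near 0 indicate clustering. It is an illustrative implementation and not the code of any particular library: the function names, the sample size `m`, and the synthetic data are all assumptions, and other definitions (for example, ones that raise distances to the power of the dimension) will give different numbers.

```python
import math
import random

def nearest_distance(point, others):
    """Euclidean distance from point to its closest neighbour in others."""
    return min(math.dist(point, o) for o in others)

def hopkins(data, m, seed=0):
    """Hopkins statistic as x / (x + y); values near 0 suggest clusters."""
    rng = random.Random(seed)
    dims = len(data[0])
    lows = [min(p[d] for p in data) for d in range(dims)]
    highs = [max(p[d] for p in data) for d in range(dims)]

    # x: nearest-neighbour distances for m randomly chosen real points
    x = 0.0
    for i in rng.sample(range(len(data)), m):
        x += nearest_distance(data[i], data[:i] + data[i + 1:])

    # y: distances from m uniform random points (in the bounding box) to the data
    y = 0.0
    for _ in range(m):
        u = [rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
        y += nearest_distance(u, data)

    return x / (x + y)

# Synthetic data: two tight, well-separated clusters in 2D
gen = random.Random(42)
clustered = [(gen.gauss(0, 0.1), gen.gauss(0, 0.1)) for _ in range(50)]
clustered += [(gen.gauss(10, 0.1), gen.gauss(10, 0.1)) for _ in range(50)]

h = hopkins(clustered, m=10)
print(f"Hopkins statistic: {h:.3f}")
```

For strongly clustered data like this, real points sit very close to their neighbours while uniform random points mostly land in empty space, so the ratio ends up close to 0.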
