Correlation is used to indicate dependence or association is any statistical relationship, whether causal or not, between two random variables or bivariate data. It is a measure of relationship between two mathematical variables or measured data values, which includes the Pearson correlation coefficient as a special case.Correlation is any of a broad class of statistical relationships involving dependence, though in common usage it most often refers to how close two variables are to having a linear relationship with each other.

The strength of the linear association between two variables is quantified by the correlation coefficient.

Formula for correlation is as below

- The correlation coefficient always takes a value between -1 and 1,
- Value of 1 or -1 indicating perfect correlation (all points would lie along a straight line in this case).
- A correlation value close to 0 indicates no association between the variables.The closer the value of r to 0 the greater the variation around the line of best fit.
- A positive correlation indicates a positive association between the variables (increasing values in one variable correspond to increasing values in the other variable),
- while a negative correlation indicates a negative association between the variables (increasing values is one variable correspond to decreasing values in the other variable).

The square of the correlation coefficient, r², is a useful value in linear regression. This value represents the fraction of the variation in one variable that may be explained by the other variable. Thus, if a correlation of 0.8 is observed between two variables (say, height and weight, for example), then a linear regression model attempting to explain either variable in terms of the other variable will account for 64% of the variability in the data^{1}

the least-squares regression line will always pass through the means of x and y, the regression line may be entirely described by the means, standard deviations, and correlation of the two variables under investigation.

## Pearson correlation coefficient

Pearsons correlation coefficient is a measure of the linear correlation between two variables X and Y. It has a value between +1 and −1 ^{2}.t is obtained by dividing the covariance of the two variables by the product of their standard deviations.

Formula for Pearson Correlation Coefficient

## Rank correlation coefficients

### Spearman’s rank correlation coefficient

The Spearman correlation coefficient is defined as the Pearson correlation coefficient between the ranked variables.^{3}

### Kendall rank correlation coefficient

the Kendall correlation between two variables will be high when observations have a similar (or identical for a correlation of 1) rank (i.e. relative position label of the observations within the variable: 1st, 2nd, 3rd, etc.) between the two variables, and low when observations have a dissimilar (or fully different for a correlation of -1) rank between the two variables.^{4}

## Goodman and Kruskal’s gamma

Goodman and Kruskal’s gamma is a measure of rank correlation, i.e., the similarity of the orderings of the data when ranked by each of the quantities. ^{5}

You can find his report here

Now let us try to calculate these correlations using python, you can find code below

output is as below: