2 types of chi-squared test

Most people have heard of the chi-squared test, but not many know that there are (at least) two types of chi-squared test.

The two most common chi-squared tests are:

  • 1-way classification: Goodness-of-fit test
  • 2-way classification: Contingency test

The goodness-of-fit chi-squared test is for testing proportions or, to be precise, for testing whether an observed distribution fits an expected distribution.
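As a rough illustration of what the goodness-of-fit test computes, here is a minimal Python sketch that works out the statistic by hand. The function name and the die-roll counts are invented for this example. In practice, scipy.stats.chisquare returns the same statistic together with a p-value.

```python
def goodness_of_fit_statistic(observed, expected):
    """Pearson chi-squared statistic and degrees of freedom for a 1-way table."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Sum of (observed - expected)^2 / expected over all categories
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - 1
    return statistic, dof


# Invented counts: a die rolled 120 times; a fair die expects 20 per face.
observed = [18, 22, 20, 15, 25, 20]
expected = [20] * 6
statistic, dof = goodness_of_fit_statistic(observed, expected)
print(statistic, dof)
```

A large statistic relative to the chi-squared distribution with "dof" degrees of freedom suggests that the observed counts do not fit the expected ones.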

The contingency test (the more classical type of chi-squared test) is for testing whether two categorical variables are independent or related.
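The contingency test can be sketched in the same way. The example below is a hand-rolled NumPy version with an invented 2x2 table and a hypothetical function name. The expected counts come from the row and column totals under the assumption of independence. Note that scipy.stats.chi2_contingency applies Yates' continuity correction to 2x2 tables by default, so its statistic will differ from this one unless correction=False is passed.

```python
import numpy as np


def contingency_statistic(table):
    """Pearson chi-squared statistic, degrees of freedom and expected counts
    for a 2-way table (no continuity correction)."""
    table = np.asarray(table, dtype=float)
    row_totals = table.sum(axis=1, keepdims=True)   # shape (rows, 1)
    col_totals = table.sum(axis=0, keepdims=True)   # shape (1, cols)
    # Expected count under independence: row total * column total / grand total
    expected = row_totals @ col_totals / table.sum()
    statistic = ((table - expected) ** 2 / expected).sum()
    dof = (table.shape[0] - 1) * (table.shape[1] - 1)
    return statistic, dof, expected


# Invented counts: rows are two groups, columns are two outcomes.
table = [[30, 10],
         [20, 40]]
statistic, dof, expected = contingency_statistic(table)
print(statistic, dof)
print(expected)
```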

The best website I found on how to code the two chi-squared tests in practice (in R) is: https://web.stanford.edu/class/psych252/cheatsheets/chisquare.html

I created a PDF copy of the above site, in case it becomes unavailable in the future:

Chi-squared Stanford PDF

Best Videos on each type of Chi-squared test

Goodness of fit Chi-squared test video by Khan Academy:

Contingency table chi-square test:


Calculate Cronbach Alpha using Python

R has the “psych” package, which lets you calculate Cronbach’s alpha very easily in a single line:

psych::alpha(your_data, column_list)

For Python, the situation is trickier, since there does not seem to be any package for calculating Cronbach’s alpha. Fortunately, the formula is not very complicated and can be computed in a few lines.

Existing code can be found on StackOverflow, but it has some small “bugs”. The corrected version is:

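import numpy as np

def CronbachAlpha(itemscores):
    # Rows are respondents, columns are items.
    itemscores = np.asarray(itemscores)
    itemvars = itemscores.var(axis=0, ddof=1)  # sample variance of each item
    tscores = itemscores.sum(axis=1)           # total score of each respondent
    nitems = itemscores.shape[1]

    return (nitems / (nitems-1)) * (1 - (itemvars.sum() / tscores.var(ddof=1)))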

The input “itemscores” can be your Pandas DataFrame or any 2-D numpy array, with one row per respondent and one column per item. (Note that this function requires you to “import numpy as np”.)
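The function implements the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. Here is a small usage sketch with invented scores (the function is repeated so that the snippet runs on its own):

```python
import numpy as np


def CronbachAlpha(itemscores):
    # Rows are respondents, columns are items.
    itemscores = np.asarray(itemscores)
    itemvars = itemscores.var(axis=0, ddof=1)  # sample variance of each item
    tscores = itemscores.sum(axis=1)           # total score of each respondent
    nitems = itemscores.shape[1]
    return (nitems / (nitems - 1)) * (1 - (itemvars.sum() / tscores.var(ddof=1)))


# Invented data: three respondents answering three identical items.
# Perfectly consistent items should give an alpha of 1.
identical_items = [[1, 1, 1],
                   [2, 2, 2],
                   [3, 3, 3]]
alpha_identical = CronbachAlpha(identical_items)

# Invented data: two items that agree only partly.
# Item variances are 1 and 1, total-score variance is 3,
# so alpha = 2 * (1 - 2/3) = 2/3.
partly_consistent = [[1, 2],
                     [2, 1],
                     [3, 3]]
alpha_partial = CronbachAlpha(partly_consistent)

print(alpha_identical, alpha_partial)
```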

Python code for PCA Rotation “varimax” matrix

The R programming language has an excellent package, “psych”, for which Python has no real equivalent.

For example, in R the principal() function can be called as follows:

principal(r=dat, nfactors=num_pcs, rotate="varimax")

This returns the “rotation matrix” (the rotated component loadings) of a principal component analysis of the data “dat” with “num_pcs” principal components, using the “varimax” rotation method.

The closest equivalent in Python is the factor_analyzer package. First, import it:

from factor_analyzer import FactorAnalyzer

Then, we use the following code to get the “rotation matrix”:

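# n_factors plays the role of num_pcs in the R call (3 in this example)
fa = FactorAnalyzer(n_factors=3, method='principal', rotation="varimax")
fa.fit(dat)
print(fa.loadings_.round(2))  # rotated loadings, rounded to 2 decimals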
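To give an idea of what the “varimax” rotation does to a loading matrix, here is a hand-written NumPy sketch of the commonly used SVD-based varimax iteration. This is not the code used by psych or factor_analyzer: it applies no Kaiser normalization, so its output may differ from theirs in values, sign and column order. The function name and the loadings are invented for illustration.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Rotate a (variables x factors) loading matrix towards the varimax criterion.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    loadings = np.asarray(loadings, dtype=float)
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Matrix proportional to the gradient of the varimax criterion
        col_sumsq = (rotated ** 2).sum(axis=0)
        target = rotated ** 3 - rotated @ np.diag(col_sumsq) / n_vars
        # The closest orthogonal matrix to loadings.T @ target is u @ vt
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_objective = s.sum()
        # Stop once the objective no longer improves noticeably
        if objective != 0.0 and new_objective / objective < 1 + tol:
            break
        objective = new_objective
    return loadings @ rotation, rotation


# Invented unrotated loadings: 5 variables, 2 components.
unrotated = np.array([[0.7, 0.5],
                      [0.8, 0.4],
                      [0.6, 0.6],
                      [0.5, -0.6],
                      [0.4, -0.7]])
rotated, rotation = varimax(unrotated)
print(rotated.round(2))
```

Because the rotation is orthogonal, the sum of squared loadings of each variable (its communality) is the same before and after rotation. Only the way that variance is spread across the components changes.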