CrossTabs, also known as contingency table analysis, is a tool for examining whether an association exists between variables and how strong it is.
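Define
$X_i$ are the distinct values of the row variable in ascending order, i.e. $X_1 < X_2 < \cdots < X_R$
$Y_j$ are the distinct values of the column variable in ascending order, i.e. $Y_1 < Y_2 < \cdots < Y_C$
$f_{ij}$ is the frequency with respect to cell $(X_i, Y_j)$
$r_i = \sum_{j=1}^{C} f_{ij}$ is the subtotal of the $i$th row
$c_j = \sum_{i=1}^{R} f_{ij}$ is the subtotal of the $j$th column
$N = \sum_{i=1}^{R}\sum_{j=1}^{C} f_{ij}$ is the total number.

| Statistics | Formula and Explanation |
|---|---|
| Count | $f_{ij}$ |
| Expected Count | $E_{ij} = \dfrac{r_i c_j}{N}$ |
| Row Percent | $100 \times \dfrac{f_{ij}}{r_i}$ |
| Column Percent | $100 \times \dfrac{f_{ij}}{c_j}$ |
| Total Percent | $100 \times \dfrac{f_{ij}}{N}$ |
| Residual | $f_{ij} - E_{ij}$ |
| Std. Residual | $\dfrac{f_{ij} - E_{ij}}{\sqrt{E_{ij}}}$ |
| Adj. Residual | $\dfrac{f_{ij} - E_{ij}}{\sqrt{E_{ij}\left(1 - \frac{r_i}{N}\right)\left(1 - \frac{c_j}{N}\right)}}$ |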
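The cell statistics can be illustrated with a minimal sketch. It assumes the usual textbook definitions (expected count $r_i c_j / N$, standardized residual divided by $\sqrt{E_{ij}}$, adjusted residual additionally scaled by the row and column proportions). The function name `cell_statistics` and the dictionary keys are hypothetical and are not part of the tool.

```python
import math

def cell_statistics(table):
    """Per-cell statistics for a two-way table given as a list of rows."""
    r = [sum(row) for row in table]            # row subtotals r_i
    c = [sum(col) for col in zip(*table)]      # column subtotals c_j
    n = sum(r)                                 # total count N
    out = {}
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            e = r[i] * c[j] / n                # expected count under independence
            resid = f - e
            out[(i, j)] = {
                "count": f,
                "expected": e,
                "row_pct": 100.0 * f / r[i],
                "col_pct": 100.0 * f / c[j],
                "total_pct": 100.0 * f / n,
                "residual": resid,
                "std_residual": resid / math.sqrt(e),
                # assumed adjusted-residual form, see lead-in
                "adj_residual": resid / math.sqrt(
                    e * (1 - r[i] / n) * (1 - c[j] / n)),
            }
    return out

stats = cell_statistics([[10, 20], [30, 40]])
print(stats[(0, 0)])
```

Indices in the returned dictionary are zero-based, so `stats[(0, 0)]` corresponds to the first cell of the table.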
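| Statistics | Formula and Explanation | Degree of Freedom |
|---|---|---|
| Pearson Chi-Square | $\chi^2_P = \sum_{i}\sum_{j} \dfrac{(f_{ij} - E_{ij})^2}{E_{ij}}$, where $E_{ij} = r_i c_j / N$ is the expected count | $(R-1)(C-1)$, for a table with $R$ rows and $C$ columns |
| Likelihood Ratio | $\chi^2_{LR} = 2\sum_{i}\sum_{j} f_{ij} \ln\dfrac{f_{ij}}{E_{ij}}$ | $(R-1)(C-1)$ |
| Linear Association | $\chi^2_{LA} = (N-1)r^2$, where $r$ is the Pearson correlation coefficient. | 1 |
| Continuity Correction | $\chi^2_{C} = \dfrac{N\left(\lvert f_{11}f_{22} - f_{12}f_{21}\rvert - N/2\right)^2}{r_1 r_2 c_1 c_2}$, which is calculated only for a 2 x 2 table | 1 |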
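As a rough illustration of the chi-square statistics, the sketch below computes the Pearson and likelihood ratio statistics and, for a 2 x 2 table, the continuity-corrected statistic. It assumes the standard forms of these statistics. Treating the corrected statistic as 0 when $\lvert f_{11}f_{22} - f_{12}f_{21}\rvert \le N/2$ is a common convention and is an assumption here. The function name `chi_square_tests` is hypothetical.

```python
import math

def chi_square_tests(table):
    """Pearson, likelihood-ratio and (2 x 2 only) Yates-corrected chi-square."""
    r = [sum(row) for row in table]
    c = [sum(col) for col in zip(*table)]
    n = sum(r)
    pearson = 0.0
    likelihood = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            e = r[i] * c[j] / n
            pearson += (f - e) ** 2 / e
            if f > 0:                      # a zero cell contributes nothing
                likelihood += 2.0 * f * math.log(f / e)
    result = {
        "pearson": pearson,
        "likelihood_ratio": likelihood,
        "df": (len(r) - 1) * (len(c) - 1),
    }
    if len(r) == 2 and len(c) == 2:
        d = abs(table[0][0] * table[1][1] - table[0][1] * table[1][0])
        # assumed convention: the corrected statistic is 0 when d <= N/2
        result["continuity"] = (
            n * max(d - n / 2, 0.0) ** 2 / (r[0] * r[1] * c[0] * c[1]))
    return result

res = chi_square_tests([[10, 20], [30, 40]])
print(res)
```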
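Fisher's exact test is useful when some expected cell count is low (less than 5). It is calculated only for a 2 x 2 table. Suppose we have the following table:

| | Column 1 | Column 2 | Subtotal/Total |
|---|---|---|---|
| Row 1 | $f_{11}$ | $f_{12}$ | $r_1$ |
| Row 2 | $f_{21}$ | $f_{22}$ | $r_2$ |
| Subtotal/Total | $c_1$ | $c_2$ | $N$ |

Under the null hypothesis (independence), the count of the first cell, $N_{11}$, follows a hypergeometric distribution with probability given by
$$P(N_{11} = k) = \frac{\binom{r_1}{k}\binom{r_2}{c_1 - k}}{\binom{N}{c_1}},$$
$$\max(0,\, c_1 - r_2) \le k \le \min(r_1,\, c_1).$$
The one-sided test significance level is calculated by
$$p_1 = P(N_{11} \ge f_{11}), \quad \text{if } f_{11} > E(N_{11})$$
$$p_1 = P(N_{11} \le f_{11}), \quad \text{if } f_{11} \le E(N_{11})$$
Here $E(N_{11}) = r_1 c_1 / N$ denotes the expected count of the first cell under independence.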
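The one-sided calculation can be sketched as follows, assuming the usual hypergeometric form $P(N_{11}=k) = \binom{r_1}{k}\binom{r_2}{c_1-k}/\binom{N}{c_1}$ and a tail taken in the direction of the observed departure from the expected count. The function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, r1, r2, c1):
    """P(N11 = k) for a 2 x 2 table with fixed margins r1, r2 and c1."""
    return comb(r1, k) * comb(r2, c1 - k) / comb(r1 + r2, c1)

def fisher_one_sided(f11, f12, f21, f22):
    """One-sided Fisher exact significance for a 2 x 2 table."""
    r1, r2, c1 = f11 + f12, f21 + f22, f11 + f21
    lo, hi = max(0, c1 - r2), min(r1, c1)      # support of N11
    expected = r1 * c1 / (r1 + r2)
    if f11 > expected:
        ks = range(f11, hi + 1)                # upper tail
    else:
        ks = range(lo, f11 + 1)                # lower tail
    return sum(hypergeom_pmf(k, r1, r2, c1) for k in ks)

p = fisher_one_sided(3, 1, 1, 3)
print(p)
```

For the table with counts 3, 1, 1, 3 the upper tail contains $k = 3$ and $k = 4$, giving $16/70 + 1/70 = 17/70$.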
The two-tail significance is
where
, if /math-ad6e799464206e320a500025fbb12f7f.png?v=0)
, if /math-dff4b325fc0edc3fc4cf9134afe80cd6.png?v=0)
/math-fb4549dbe0388469b28f1e469e8dbccc.png?v=0)
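Define
$P = \sum_{i}\sum_{j} f_{ij} C_{ij}$ is twice the number of concordant pairs, where $C_{ij} = \sum_{h<i}\sum_{k<j} f_{hk} + \sum_{h>i}\sum_{k>j} f_{hk}$
$Q = \sum_{i}\sum_{j} f_{ij} D_{ij}$ is twice the number of discordant pairs, where $D_{ij} = \sum_{h<i}\sum_{k>j} f_{hk} + \sum_{h>i}\sum_{k<j} f_{hk}$
$r_i$ is the subtotal of the $i$th row
$c_j$ is the subtotal of the $j$th column
$N$ is the total number.

| Statistics | | Formula and Explanation | Standard Error |
|---|---|---|---|
| Phi Coefficient | | $\phi = \sqrt{\chi^2_P / N}$, which is calculated for a table that is not 2 x 2. For a 2 x 2 table, it is equal to the Pearson correlation coefficient, $\phi = \dfrac{f_{11}f_{22} - f_{12}f_{21}}{\sqrt{r_1 r_2 c_1 c_2}}$. In the 2 x 2 case the value ranges from $-1$ to $1$. | |
| Cramer's V | | $V = \sqrt{\dfrac{\chi^2_P}{N(q-1)}}$, where $q = \min(R, C)$ | |
| Contingency Coefficient | | $CC = \sqrt{\dfrac{\chi^2_P}{\chi^2_P + N}}$ | |
| Gamma | | $\gamma = \dfrac{P - Q}{P + Q}$ | |
| Kendall | Tau-b | $\tau_b = \dfrac{P - Q}{\sqrt{D_r D_c}}$, where $D_r = N^2 - \sum_i r_i^2$ and $D_c = N^2 - \sum_j c_j^2$ | |
| | Tau-c | $\tau_c = \dfrac{q(P - Q)}{N^2 (q - 1)}$, where $q = \min(R, C)$ | |
| Somers' D | C\|R | $d_{C \mid R} = \dfrac{P - Q}{D_r}$ | |
| | R\|C | $d_{R \mid C} = \dfrac{P - Q}{D_c}$ | |
| | Symmetric | $d = \dfrac{P - Q}{(D_r + D_c)/2}$ | |
| Lambda | C\|R | $\lambda_{C \mid R} = \dfrac{\sum_i f_{i,\max} - c_{\max}}{N - c_{\max}}$, where $f_{i,\max}$ is the largest count in the $i$th row, and $c_{\max}$ is the largest column subtotal. | |
| | R\|C | $\lambda_{R \mid C} = \dfrac{\sum_j f_{\max,j} - r_{\max}}{N - r_{\max}}$, where $f_{\max,j}$ is the largest count in the $j$th column, and $r_{\max}$ is the largest row subtotal. | |
| | Symmetric | $\lambda = \dfrac{\sum_i f_{i,\max} + \sum_j f_{\max,j} - c_{\max} - r_{\max}}{2N - c_{\max} - r_{\max}}$ | |
| Uncertainty | C\|R | $U_{C \mid R} = \dfrac{U(X) + U(Y) - U(XY)}{U(Y)}$, where $U(X) = -\sum_i \frac{r_i}{N}\ln\frac{r_i}{N}$, and $U(Y) = -\sum_j \frac{c_j}{N}\ln\frac{c_j}{N}$, and $U(XY) = -\sum_i\sum_j \frac{f_{ij}}{N}\ln\frac{f_{ij}}{N}$ (cells with $f_{ij} = 0$ are skipped) | |
| | R\|C | $U_{R \mid C} = \dfrac{U(X) + U(Y) - U(XY)}{U(X)}$ | |
| | Symmetric | $U = \dfrac{2\left[U(X) + U(Y) - U(XY)\right]}{U(X) + U(Y)}$ | |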
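The ordinal measures in this table share the same building blocks, which a short sketch can make concrete. It assumes the common convention that $P$ and $Q$ are twice the numbers of concordant and discordant pairs, with $D_r = N^2 - \sum_i r_i^2$ and $D_c = N^2 - \sum_j c_j^2$. Standard errors are not computed, and the function name `ordinal_measures` is hypothetical.

```python
import math

def ordinal_measures(table):
    """Gamma, Kendall's tau-b/tau-c and Somers' d for an R x C table."""
    R, C = len(table), len(table[0])
    r = [sum(row) for row in table]
    c = [sum(col) for col in zip(*table)]
    n = sum(r)
    P = Q = 0   # twice the number of concordant / discordant pairs
    for i in range(R):
        for j in range(C):
            conc = sum(table[h][k] for h in range(R) for k in range(C)
                       if (h < i and k < j) or (h > i and k > j))
            disc = sum(table[h][k] for h in range(R) for k in range(C)
                       if (h < i and k > j) or (h > i and k < j))
            P += table[i][j] * conc
            Q += table[i][j] * disc
    d_r = n * n - sum(x * x for x in r)   # pairs not tied on the row variable
    d_c = n * n - sum(x * x for x in c)   # pairs not tied on the column variable
    q = min(R, C)
    return {
        "gamma": (P - Q) / (P + Q),
        "tau_b": (P - Q) / math.sqrt(d_r * d_c),
        "tau_c": q * (P - Q) / (n * n * (q - 1)),
        "somers_d_c_given_r": (P - Q) / d_r,
        "somers_d_r_given_c": (P - Q) / d_c,
    }

m = ordinal_measures([[10, 20], [30, 40]])
print(m)
```

For a 2 x 2 table, `gamma` reduces to Yule's Q and `tau_b` coincides with the phi coefficient, which gives a quick sanity check.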
The Kappa table is calculated only when two conditions are satisfied: (1) the table is square, i.e. $R = C$, and (2) the row variable and the column variable have the same values.
The Kappa statistic is calculated by
$$\kappa = \frac{N\sum_{i=1}^{R} f_{ii} - \sum_{i=1}^{R} r_i c_i}{N^2 - \sum_{i=1}^{R} r_i c_i}$$
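The standard error is estimated by:
$$SE = \sqrt{\frac{A + B - C}{N(1 - P_e)^2}}$$
where $P_e = \frac{1}{N^2}\sum_{i=1}^{R} r_i c_i$,
$$A = \sum_{i=1}^{R} \frac{f_{ii}}{N}\left[1 - \frac{r_i + c_i}{N}(1 - \kappa)\right]^2,$$
$$B = (1 - \kappa)^2 \sum_{i \ne j} \frac{f_{ij}}{N}\left(\frac{c_i + r_j}{N}\right)^2,$$
and
$$C = \left[\kappa - P_e(1 - \kappa)\right]^2.$$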
The corresponding asymptotic standard error under the null hypothesis $\kappa = 0$ is given by
$$SE_0 = \sqrt{\frac{1}{N\left(N^2 - \sum_{i=1}^{R}r_ic_i\right)^2} \left[N^2\sum_{i=1}^{R}r_ic_i + \left(\sum_{i=1}^{R}r_ic_i\right)^2 - N \sum_{i=1}^{R}r_ic_i(r_i+c_i)\right]}$$
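A minimal sketch of the Kappa calculation follows. It assumes the usual count-based form $\kappa = (N\sum_i f_{ii} - \sum_i r_i c_i)/(N^2 - \sum_i r_i c_i)$ and implements the null-hypothesis standard error $SE_0$ exactly as written above. The function name `kappa_with_null_se` is hypothetical.

```python
import math

def kappa_with_null_se(table):
    """Cohen's kappa and its asymptotic standard error under kappa = 0."""
    R = len(table)
    r = [sum(row) for row in table]
    c = [sum(col) for col in zip(*table)]
    n = sum(r)
    diag = sum(table[i][i] for i in range(R))            # observed agreements
    s = sum(r[i] * c[i] for i in range(R))               # sum of r_i * c_i
    kappa = (n * diag - s) / (n * n - s)
    t = sum(r[i] * c[i] * (r[i] + c[i]) for i in range(R))
    se0 = math.sqrt((n * n * s + s * s - n * t) / (n * (n * n - s) ** 2))
    return kappa, se0

kappa, se0 = kappa_with_null_se([[20, 5], [10, 15]])
print(kappa, se0, kappa / se0)
```

The ratio `kappa / se0` is the usual approximate z statistic for testing $\kappa = 0$.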
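Another related statistic is Bowker's test, which is used to test the symmetry hypothesis $H_0: p_{ij} = p_{ji}$ for all pairs $(i, j)$, where $p_{ij}$ is the probability of cell $(i, j)$. For an $R \times R$ table, the statistic is calculated as
$$B = \sum_{i<j} \frac{(f_{ij} - f_{ji})^2}{f_{ij} + f_{ji}}$$
For larger samples, $B$ asymptotically follows a chi-square distribution with $R(R-1)/2$ degrees of freedom.
Note that for a 2 x 2 table, Bowker's test is equal to McNemar's test, so only Bowker's test is given.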
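Bowker's test of symmetry can be sketched in a few lines, assuming the standard form $B = \sum_{i<j}(f_{ij}-f_{ji})^2/(f_{ij}+f_{ji})$ with $R(R-1)/2$ degrees of freedom. Skipping pairs with $f_{ij} + f_{ji} = 0$ is an assumption made here to avoid division by zero. The function name is hypothetical.

```python
def bowker_statistic(table):
    """Bowker's symmetry statistic and its degrees of freedom (square table)."""
    R = len(table)
    stat = 0.0
    for i in range(R):
        for j in range(i + 1, R):
            total = table[i][j] + table[j][i]
            if total > 0:               # assumed: empty pairs are skipped
                stat += (table[i][j] - table[j][i]) ** 2 / total
    return stat, R * (R - 1) // 2

stat, df = bowker_statistic([[20, 5], [10, 15]])
print(stat, df)
```

For this 2 x 2 example the result, $(5-10)^2/(5+10) = 25/15$, is also the uncorrected McNemar statistic.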
These statistics are calculated only for a 2 x 2 table.
The Odds Ratio is calculated as
$$OR = \frac{f_{11} f_{22}}{f_{12} f_{21}}$$
The Relative Risks are given by
$$RR_1 = \frac{f_{11}/r_1}{f_{21}/r_2} \quad \text{(for the first column)}$$
$$RR_2 = \frac{f_{12}/r_1}{f_{22}/r_2} \quad \text{(for the second column)}$$
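As an illustration only, the sketch below computes the odds ratio and the two column-wise relative risks for a 2 x 2 table, assuming the conventional definitions $OR = f_{11}f_{22}/(f_{12}f_{21})$ and $RR_j = (f_{1j}/r_1)/(f_{2j}/r_2)$. No confidence limits are computed, and the function name is hypothetical.

```python
def odds_ratio_and_relative_risks(f11, f12, f21, f22):
    """Odds ratio and relative risks (columns 1 and 2) of a 2 x 2 table."""
    r1, r2 = f11 + f12, f21 + f22          # row subtotals
    odds_ratio = (f11 * f22) / (f12 * f21)
    rr_col1 = (f11 / r1) / (f21 / r2)      # risk of column 1: row 1 vs row 2
    rr_col2 = (f12 / r1) / (f22 / r2)      # risk of column 2: row 1 vs row 2
    return odds_ratio, rr_col1, rr_col2

odds, rr1, rr2 = odds_ratio_and_relative_risks(10, 20, 30, 40)
print(odds, rr1, rr2)
```

Note that the odds ratio equals `rr1 / rr2`, which links the three quantities.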
Let
$K$ be the number of layers
$f_{ijk}$ be the frequency in the $i$th row, $j$th column and $k$th layer
$c_{jk}$ be the $j$th column, $k$th layer subtotal
$r_{ik}$ be the $i$th row, $k$th layer subtotal
$n_k$ be the $k$th layer subtotal
$E_{ijk} = \dfrac{r_{ik}\, c_{jk}}{n_k}$ be the expected frequency of the $i$th row, $j$th column, $k$th layer cell
The Mantel-Haenszel statistic is given by
where sgn is the sign function: $\operatorname{sgn}(x) = 1$ if $x > 0$, $0$ if $x = 0$, and $-1$ if $x < 0$.
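The Breslow-Day statistic is
$$BD = \sum_{k=1}^{K} V_k \left[f_{11k} - \hat{f}_{11k}\right]^2$$
where $\hat{f}_{ijk}$ is the expected frequency of cell $(i, j)$ in the $k$th layer under a common odds ratio, and
$$V_k = \frac{1}{\hat{f}_{11k}} + \frac{1}{\hat{f}_{12k}} + \frac{1}{\hat{f}_{21k}} + \frac{1}{\hat{f}_{22k}}.$$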
Tarone's statistic is
$$T = \sum_{k=1}^{K} V_k \left[f_{11k}-\hat{f}_{11k}\right]^2 - \frac{\left[\sum_{k=1}^{K}\left(f_{11k}-\hat{f}_{11k}\right)\right]^2}{\sum_{k=1}^{K}\frac{1}{V_k}}$$
where
.
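For a 2×2×K table, the odds ratio at the $k$th layer is
$$OR_k = \frac{f_{11k} f_{22k}}{f_{12k} f_{21k}}.$$
Assuming that a true common odds ratio exists, that is, $OR_1 = OR_2 = \cdots = OR_K$, Mantel-Haenszel's estimator of the common odds ratio is
$$\hat{OR}_{MH} = \frac{\sum_{k=1}^{K} f_{11k} f_{22k} / n_k}{\sum_{k=1}^{K} f_{12k} f_{21k} / n_k}$$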
The asymptotic variance for $\ln(\hat{OR}_{MH})$ is:
$$\hat{Var}[\ln(\hat{OR}_{MH})]=\frac{\sum_{k=1}^{K}\frac{(f_{11k}+f_{22k})f_{11k} f_{22k}}{n_{k}^2}}{2\left(\sum_{k=1}^{K}\frac{f_{11k} f_{22k}}{n_{k}}\right)^2}+\frac{\sum_{k=1}^{K}\frac{(f_{11k}+f_{22k})f_{12k} f_{21k}+(f_{12k}+f_{21k})f_{11k} f_{22k}}{n_{k}^2}}{2\sum_{k=1}^{K}\frac{f_{11k} f_{22k}}{n_{k}}\sum_{k=1}^{K}\frac{f_{12k} f_{21k}}{n_{k}}}+\frac{\sum_{k=1}^{K}\frac{(f_{12k}+f_{21k})f_{12k} f_{21k}}{n_{k}^2}}{2\left(\sum_{k=1}^{K}\frac{f_{12k} f_{21k}}{n_{k}}\right)^2}$$
The lower confidence limit (LCL) and upper confidence limit (UCL) for $\ln(\hat{OR}_{MH})$ are:
$$\ln(\hat{OR}_{MH}) - z(\alpha/2)\sqrt{\hat{Var}[\ln(\hat{OR}_{MH})]}$$
and
$$\ln(\hat{OR}_{MH}) + z(\alpha/2)\sqrt{\hat{Var}[\ln(\hat{OR}_{MH})]}$$
where $z(\alpha/2)$ is the upper $\alpha/2$ critical value of the standard normal distribution.
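The common odds ratio and its confidence limits can be sketched as below. The sketch assumes the Robins-Breslow-Greenland form of the variance, in which the denominators of the first and third terms are squared sums. The function name `mantel_haenszel_or` and the `alpha` parameter are hypothetical.

```python
import math
from statistics import NormalDist

def mantel_haenszel_or(layers, alpha=0.05):
    """layers: list of 2 x 2 tables [[f11, f12], [f21, f22]], one per layer."""
    sum_r = sum_s = sum_pr = sum_ps_qr = sum_qs = 0.0
    for (f11, f12), (f21, f22) in layers:
        n = f11 + f12 + f21 + f22
        p, q = (f11 + f22) / n, (f12 + f21) / n
        r, s = f11 * f22 / n, f12 * f21 / n
        sum_r += r
        sum_s += s
        sum_pr += p * r
        sum_ps_qr += p * s + q * r
        sum_qs += q * s
    or_mh = sum_r / sum_s
    # assumed Robins-Breslow-Greenland variance of ln(OR_MH)
    var_log = (sum_pr / (2 * sum_r ** 2)
               + sum_ps_qr / (2 * sum_r * sum_s)
               + sum_qs / (2 * sum_s ** 2))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * math.sqrt(var_log)
    log_or = math.log(or_mh)
    return or_mh, var_log, (log_or - half, log_or + half)

or_mh, var_log, (lcl, ucl) = mantel_haenszel_or(
    [[[10, 20], [30, 40]], [[5, 10], [15, 20]]])
print(or_mh, var_log, math.exp(lcl), math.exp(ucl))
```

The limits are returned on the log scale. Exponentiating them, as in the last line, gives limits for the odds ratio itself. With a single layer the variance reduces to the familiar $1/f_{11} + 1/f_{12} + 1/f_{21} + 1/f_{22}$.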