17.1.1.1 The Statistics on Columns Dialog Box
Supporting Information
Recalculate
See Recalculating Analysis Results for details of the recalculation options.
Input
Exclude Empty Dataset
Select this check box to exclude empty datasets from the calculation.
Exclude Text Dataset
Select this check box to exclude Text datasets from the calculation.
Input Data
Specify the input data mode: indexed or raw.
Quantities
Moments
Let \(x_i\,\) be the \(i\,\)th sample and \(w_i\,\) be the \(i\,\)th weight:
| Quantity | Definition |
|---|---|
| N Total | Total number of data points, denoted by \(n\) |
| N Missing | Number of missing values |
| Mean | The mean (average) \(\bar{x}=\frac 1w\sum_{i=1}^n x_iw_i\), where \(w=\sum_{i=1}^n w_i\) is the sum of weights. If there is no WEIGHT variable, the formula reduces to \(\frac 1n\sum_{i=1}^n x_i\). |
| Standard deviation | \[s=\sqrt{\sum_{i=1}^n w_i(x_i-\bar{x})^2/d}\] where \(d=n-1\,\). Note: In OriginPro, \(d\) has four more options, which are defined in the Variance Divisor of Moment branch. |
| SE of Mean | Standard error of the mean: \[\frac s{\sqrt{w}}\] |
| Lower 95% CI of Mean | Lower limit of the 95% confidence interval of the mean: \[\bar{x}-t_{(1-\alpha /2)}\frac s{\sqrt{n}}\] where \(\alpha=0.05\) and \(t_{(1-\alpha /2)}\) is the \((1-\alpha /2)\) critical value of the Student's t-statistic with \(n-1\) degrees of freedom. |
| Upper 95% CI of Mean | Upper limit of the 95% confidence interval of the mean: \[\bar{x}+t_{(1-\alpha /2)}\frac s{\sqrt{n}}\] where \(\alpha=0.05\) and \(t_{(1-\alpha /2)}\) is the \((1-\alpha /2)\) critical value of the Student's t-statistic with \(n-1\) degrees of freedom. |
| Variance | \[s^2\] |
| Sum | \(\sum_{i=1}^n x_iw_i\). If there is no WEIGHT variable, the formula reduces to \(\sum_{i=1}^n x_i\). |
| Skewness | Skewness measures the degree of asymmetry of a distribution. It is defined as \[\gamma_1=\frac n{(n-1)(n-2)}\sum_{i=1}^n w_i^{\frac 32}\left(\frac{x_i-\bar{x}}s\right)^3,\mbox{ for DF}\] \[\gamma_1=\frac 1n\sum_{i=1}^n w_i^{\frac 32}\left(\frac{x_i-\bar{x}}s\right)^3,\mbox{ for N}\] \[\gamma_1=\frac 1d\sum_{i=1}^n w_i^{\frac 32}\left(\frac{x_i-\bar{x}}s\right)^3,\mbox{ for WVR}\] Note: When the WDF or WS method is chosen, skewness is returned as a missing value. |
| Kurtosis | Kurtosis depicts the degree of peakedness of a distribution. It is defined as \[\gamma_2=\frac{n(n+1)}{(n-1)(n-2)(n-3)}\sum_{i=1}^n w_i^2\left(\frac{x_i-\bar{x}}s\right)^4-\frac{3(n-1)^2}{(n-2)(n-3)},\mbox{ for DF}\] \[\gamma_2=\frac 1n\sum_{i=1}^n w_i^2\left(\frac{x_i-\bar{x}}s\right)^4 -3,\mbox{ for N}\] \[\gamma_2=\frac 1d\sum_{i=1}^n w_i^2\left(\frac{x_i-\bar{x}}s\right)^4 -3,\mbox{ for WVR}\] Note: When the WDF or WS method is chosen, kurtosis is returned as a missing value. |
| Uncorrected Sum of Squares | \[\sum_{i=1}^n w_ix_i^2\] |
| Corrected Sum of Squares | \[\sum_{i=1}^n w_i(x_i-\bar{x})^2\] |
| Coefficient of Variation | \[\frac s{\bar{x}}\] |
| Mean Absolute Deviation | \[\frac{\sum_{i=1}^n w_i\lvert x_i-\bar{x}\rvert}w\] |
| SD times 2 | Standard deviation times 2. \[2s \,\] |
| SD times 3 | Standard deviation times 3. \[3s \,\] |
| Geometric Mean | \[\bar{x}_g=\left( \prod_{i=1}^n x_i\right) ^{\frac 1n}\] Note: Weights are ignored for the geometric mean. |
| Geometric SD | The geometric standard deviation \(e^{std(\log x_i)}\), where std is the unweighted sample standard deviation. Note: Weights are ignored for the geometric standard deviation. |
| Mode | The mode is the element that appears most often in the data range. If multiple modes are found, the smallest is chosen. |
| Sum of Weights | \[w=\sum_{i=1}^n w_i\] |
| Harmonic Mean | The harmonic mean (sometimes called the subcontrary mean). With weights: \(\frac {\sum_{i=1}^n w_i}{\sum_{i=1}^n \frac {w_i}{x_i}}=\left(\frac {\sum_{i=1}^n w_i x_i^{-1}}{\sum_{i=1}^n w_i}\right)^{-1}\). If any \(x_i\) or weight is negative, a missing value is returned; if any \(x_i\) or weight is 0, 0 is returned. |
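The moment formulas above can be checked by hand with a short script. The following is a minimal Python sketch, not Origin's implementation: the function names are hypothetical, the skewness and kurtosis use only the DF divisor (\(d=n-1\)), and a missing value is represented by NaN as an assumption.

```python
import math
from statistics import stdev

def weighted_moments(x, w=None):
    """Mean, SD, SE, skewness and kurtosis with the DF divisor (d = n - 1).

    Needs n >= 4 and a non-zero SD, otherwise the kurtosis terms divide by zero.
    """
    n = len(x)
    if w is None:
        w = [1.0] * n                      # no WEIGHT variable: unit weights
    sw = sum(w)                            # sum of weights (w in the table)
    mean = sum(xi * wi for xi, wi in zip(x, w)) / sw
    css = sum(wi * (xi - mean) ** 2 for xi, wi in zip(x, w))  # corrected sum of squares
    s = math.sqrt(css / (n - 1))
    z = [(xi - mean) / s for xi in x]      # standardized deviations
    skew = n / ((n - 1) * (n - 2)) * sum(wi ** 1.5 * zi ** 3 for wi, zi in zip(w, z))
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
            * sum(wi ** 2 * zi ** 4 for wi, zi in zip(w, z))
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {"mean": mean, "sd": s, "se": s / math.sqrt(sw),
            "skewness": skew, "kurtosis": kurt}

def geometric_mean_sd(x):
    """Unweighted geometric mean and geometric SD (all x must be positive)."""
    logs = [math.log(xi) for xi in x]
    return math.exp(sum(logs) / len(logs)), math.exp(stdev(logs))

def harmonic_mean(x, w=None):
    """Weighted harmonic mean with the negative/zero rules from the table."""
    if w is None:
        w = [1.0] * len(x)
    values = list(x) + list(w)
    if any(v < 0 for v in values):
        return math.nan                    # assumption: NaN stands in for "missing"
    if any(v == 0 for v in values):
        return 0.0
    return sum(w) / sum(wi / xi for xi, wi in zip(x, w))
```

For example, `weighted_moments([1, 2, 3, 4, 5])` gives a mean of 3 and a skewness of 0, since the data are symmetric about the mean.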
Quantiles
A quantile is a value below which a given proportion of the data points in a set falls. For example, 25% of the data points in a set lie below the first quartile, and 50% lie below the second quartile, or median.
Sort the input dataset in ascending order, and let \(x_{(i)}\,\) be the \(i\,\)th element of the sorted dataset.
| Quantity | Definition |
|---|---|
| Minimum | \[x_{(1)}\,\] |
| Index of Minimum | The index number of the Minimum in the original (input) dataset |
| 1st Quartile (Q1) | First (25%) quantile, Q1. See Interpolation of quantiles for computational methods. |
| Median | Median or second (50%) quantile, Q2. See Interpolation of quantiles for computational methods. |
| 3rd Quartile (Q3) | Third (75%) quantile, Q3. See Interpolation of quantiles for computational methods. |
| Maximum | \[x_{(n)}\,\] |
| Index of Maximum | The index number of the Maximum in the original (input) dataset |
| Interquartile Range (Q3-Q1) | \[Q_3-Q_1\,\] |
| Range (Maximum-Minimum) | Maximum - Minimum |
| Custom Percentile(s) | Request computation of custom percentiles. |
| Percentile list | This option is only available when Custom Percentile(s) is checked. Percentiles are computed for all the values listed. |
| Median Absolute Deviation | For a univariate dataset \(X_1, X_2, \ldots, X_n\), the MAD is defined as the median of the absolute deviations from the data's median: \[MAD = median(\lvert X_i - median(X)\rvert)\,\] That is, starting with the residuals (deviations) from the data's median, the MAD is the median of their absolute values. |
| Robust Coefficient of Variation | \[(MAD/norminv(0.75))/Median\,\] |
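As an illustration of the last two rows, here is a minimal Python sketch of the median absolute deviation and the robust coefficient of variation. It assumes that `norminv(0.75)` denotes the 0.75 quantile of the standard normal distribution and that the median is the conventional one (the mean of the two middle values for even \(n\)); Origin's result may differ if another quantile interpolation method is selected. The function names are hypothetical.

```python
from statistics import NormalDist, median

def mad(x):
    """Median of the absolute deviations from the data's median."""
    med = median(x)
    return median([abs(xi - med) for xi in x])

def robust_cv(x):
    """(MAD / norminv(0.75)) / Median, with norminv taken from the standard normal."""
    z75 = NormalDist().inv_cdf(0.75)   # roughly 0.6745
    return (mad(x) / z75) / median(x)
```

With `x = [1, 2, 3, 4, 100]`, the median is 3 and the absolute deviations are 2, 1, 0, 1, 97, so `mad(x)` is 1; the outlier 100 has no effect on either quantity.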
Extreme Values
Return extreme values. The extreme values are the \(l\) highest and the \(l\) lowest values, where
\[l =
\begin{cases}
5,& \mbox{if }\ n\geq 10 \\
n/2, & \mbox{otherwise}
\end{cases}\]
and \(n\) is the length of the dataset.
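The rule for \(l\) can be sketched in a few lines of Python. The function name is hypothetical, and taking the integer part of \(n/2\) for odd \(n\) is an assumption, since the text above does not say how a fractional value is handled.

```python
def extreme_values(x):
    """Return the l lowest and the l highest values of x."""
    n = len(x)
    # Assumption: n/2 is truncated to an integer when n is odd.
    l = 5 if n >= 10 else n // 2
    ordered = sorted(x)
    return ordered[:l], ordered[n - l:]
```

For a dataset of 12 values this returns the 5 smallest and the 5 largest; for 4 values it returns 2 of each.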
Computation Control
Weight Method
Choose the weighting method for the input data.
| Method | Weight |
|---|---|
| Direct Weighting | \(w_{i}=c_{i}\,\!\), where \(c_{i}\,\!\) is the \(i\)th value of the weighting dataset. |
| Instrumental | \(w_{i}=\frac 1{\sigma _{i}^2}\,\!\), where \(\sigma_{i}\,\!\) is the value in a designated error bar column. |
| Statistical | \(w_{i}=\frac 1{x_{i}}\,\!\), where \(x_{i}\,\!\) is the input data. |
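The three weighting methods map directly to code. This is a small Python sketch with a hypothetical function name and hypothetical method keys; it only restates the formulas above and does not validate zero or negative inputs.

```python
def make_weights(method, x, c=None, sigma=None):
    """Build the weights w_i for the input data x.

    c     -- weighting dataset (used by "direct")
    sigma -- error bar values (used by "instrumental")
    """
    if method == "direct":
        return [float(ci) for ci in c]          # w_i = c_i
    if method == "instrumental":
        return [1.0 / si ** 2 for si in sigma]  # w_i = 1 / sigma_i^2
    if method == "statistical":
        return [1.0 / xi for xi in x]           # w_i = 1 / x_i
    raise ValueError(f"unknown weight method: {method}")
```

For instance, instrumental weighting with error bars `[0.5, 1, 2]` yields the weights `[4.0, 1.0, 0.25]`, so the most precise point counts the most.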
Variance Divisor of Moment
Controls how the variance divisor \(d\) is computed.
| Method | Divisor |
|---|---|
| DF | Degrees of freedom: \[d=n-1\,\] |
| N | Number of non-missing observations: \[d=n\,\] |
| WDF | Sum of weights DF: \[d=w-1\,\] |
| WS | Sum of weights: \[d=w\,\] |
| WVR | \[d=w-\sum_{i=1}^n w_i^2/w\] |
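The five divisor options can be compared side by side with a short Python sketch. The function name is hypothetical; the method keys follow the table above.

```python
def variance_divisor(method, weights):
    """Return the divisor d for the given method and list of weights."""
    n = len(weights)
    w = sum(weights)                       # sum of weights
    if method == "DF":
        return n - 1
    if method == "N":
        return n
    if method == "WDF":
        return w - 1
    if method == "WS":
        return w
    if method == "WVR":
        return w - sum(wi ** 2 for wi in weights) / w
    raise ValueError(f"unknown divisor method: {method}")
```

With unit weights, \(w=n\) and \(\sum w_i^2/w=1\), so WVR and WDF both reduce to the DF divisor \(n-1\), and WS reduces to N.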
Interpolation of quantiles
Methods for calculating Q1, Q2, and Q3:
Let \(y\) be the \(i\,\)th percentile, set \(p=i/100\,\), and let
\[\begin{cases} (n+1)p=j+g, & \mbox{for Weighted Average Right } \\ np=j+g,& \mbox{for other methods } \end{cases} \]
where \(j\) is the integer part and \(g\) is the fractional part of \((n+1)p\) for Weighted Average Right, and of \(np\) for the other methods. The methods then define the \(i\,\)th percentile, \(y\), as follows:
| Method | Definition |
|---|---|
| Empirical Distribution with Averaging | \[y = \begin{cases} \frac 12(x_{(j)}+x_{(j+1)}),& \mbox{if }\ g=0 \\ x_{(j+1)},& \mbox{if }\ g>0 \end{cases}\] |
| Nearest Neighbor | Observation numbered closest to \(np\,\): \(y = \begin{cases} x_{(k)},& \mbox{if }\ g\neq \frac 12 \\ x_{(j)}, & \mbox{if }\ g=\frac 12 \mbox{ and j is even} \\ x_{(j+1)},& \mbox{if }\ g=\frac 12 \mbox{ and j is odd} \end{cases}\) where \(k\) is the integer part of \(np+\frac 12\,\) |
| Empirical Distribution | \[y= \begin{cases} x_{(j)}, & \mbox{if }\ g=0 \\ x_{(j+1)},& \mbox{if }\ g>0 \end{cases} \] |
| Weighted Average Right | Weighted average aimed at \(x_{((n+1)p)}\,\): \(y=(1-g)x_{(j)}+gx_{(j+1)}\,\) where \(x_{(n+1)}\,\) is taken to be \(x_{(n)}\,\) |
| Weighted Average Left | Weighted average aimed at \(x_{(np)}\,\): \(y=(1-g)x_{(j)}+gx_{(j+1)}\,\) where \(x_{(0)}\,\) is taken to be \(x_{(1)}\,\) |
| Tukey Hinges | Let: \(m = \begin{cases} \frac n2,& \mbox{if n is even} \\ \frac{(n+1)}2,& \mbox{if n is odd} \end{cases} \) \( k = \begin{cases} \frac m2,& \mbox{if m is even} \\ \frac{(m+1)}2,& \mbox{if m is odd} \end{cases}\) Then: \[Minimum=x_{(1)}\,\] \( Q_1= \begin{cases} x_{(k)},& \mbox{if m is odd} \\ \frac 12(x_{(k)}+x_{(k+1)}),& \mbox{if m is even} \end{cases}\) \(Q_2= \begin{cases} x_{(m)},& \mbox{if n is odd} \\ \frac 12(x_{(m)}+x_{(m+1)}),& \mbox{if n is even} \end{cases}\) \[Q_3= \begin{cases} x_{(n-k+1)},& \mbox{if m is odd} \\ \frac 12(x_{(n-k)}+x_{(n-k+1)}),& \mbox{if m is even} \end{cases}\] \[Maximum=x_{(n)}\,\] |
Note: If weights are specified, weighted percentiles are calculated. The \(p\)th weighted percentile \(y\) is computed from the Empirical Distribution Function with Averaging: \[y= \begin{cases} \frac 12(x_{(i)}+x_{(i+1)}),& \mbox{if } \sum_{j=1}^i w_j=pw \\ x_{(i+1)},& \mbox{if } \sum_{j=1}^{i} w_j<pw<\sum_{j=1}^{i+1}w_j\\ x_{(1)},& \mbox{if } \ pw<w_1 \\ x_{(n)},& \mbox{if } \ pw<w_n \end{cases}\]
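The five unweighted interpolation rules (all methods except Tukey Hinges) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not Origin's code: the function name and method keys are hypothetical, exact fractions are used so that the tests \(g=0\) and \(g=\frac 12\) are not disturbed by floating-point rounding, and out-of-range indices are clamped to \(x_{(1)}\) and \(x_{(n)}\) for every method, although the text states this convention only for the two weighted averages.

```python
import math
from fractions import Fraction

def percentile(data, i, method="averaging"):
    """Return the i-th percentile (0 <= i <= 100) of data for one interpolation rule."""
    xs = sorted(data)
    n = len(xs)

    def x(k):
        # 1-based order statistic x_(k); indices outside 1..n are clamped (assumption).
        return xs[min(max(k, 1), n) - 1]

    p = Fraction(i) / 100
    t = (n + 1) * p if method == "weighted_right" else n * p
    j = math.floor(t)          # integer part
    g = t - j                  # fractional part, kept exact

    if method == "averaging":                      # Empirical Distribution with Averaging
        return (x(j) + x(j + 1)) / 2 if g == 0 else x(j + 1)
    if method == "nearest":                        # Nearest Neighbor
        if g != Fraction(1, 2):
            return x(math.floor(t + Fraction(1, 2)))
        return x(j) if j % 2 == 0 else x(j + 1)
    if method == "empirical":                      # Empirical Distribution
        return x(j) if g == 0 else x(j + 1)
    if method in ("weighted_left", "weighted_right"):
        return float((1 - g) * x(j) + g * x(j + 1))
    raise ValueError(f"unknown method: {method}")
```

For the dataset 1 to 8 and \(i=25\), \(np=2\) exactly, so the averaging rule returns the midpoint of \(x_{(2)}\) and \(x_{(3)}\), the empirical rule returns \(x_{(2)}\), and Weighted Average Right interpolates at position \((n+1)p=2.25\).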
Output
Beginning with Origin 2022, the column Format of an input Group column is preserved in the output sheet (e.g., DescStatsQuantities). For instance, when outputting statistics on a column of Date-Time data, the column Format is set to Date-Time in the output sheet (previously, the column would have been formatted as Text). You can restore the old behavior by setting @SCCSF = 0. For information on changing the value of a system variable, see FAQ-708 How do I permanently change the value of a system variable?
| Option | Description |
|---|---|
| Graph | Controls the arrangement of the resulting plots. |
| Dataset Identifier | Chooses an identifier for the source datasets. |
| Report Tables | Destination for the report worksheet tables. |
| Quantities | Specifies the destination of the quantities. |
| Optional Report Tables | Specifies what is output to the report worksheet. |
Plots
| Plot | Description |
|---|---|
| Histograms | Outputs a histogram to the result sheet. When this box is selected, the branch is expanded. |
| Box Charts | Outputs a box chart to the report sheet. If the input data has a group column, the box chart is grouped accordingly. If the group column is Set as Categorical, the box chart is plotted in the categorical order defined on the Categories tab of the Column Properties dialog for the grouping range. |

