15.2.6 Algorithms (Fit Linear with X Error)

The Fitting Model

Given a dataset \((X_i,Y_i), i=1,2,\ldots n\), where X is the independent variable, Y is the dependent variable, and \((\sigma_{x_i},\sigma_{y_i})\) are the errors of X and Y, respectively, Fit Linear with X Error fits the data to a model of the following form:

\[y=\beta _0+\beta _1x+\varepsilon\]	(1)
\[\left\{\begin{matrix} x_i=X_i+\sigma_{x_i}\\ y_i=Y_i+\sigma_{y_i} \end{matrix}\right.\]	(2)

Fit Control

Computation Method

York Method

York Method is the computation method of D. York, described in Unified equations for the slope, intercept, and standard error of the best straight line.
FV Method

FV Method is the computation method of Giovanni Fasano & Roberto Vio, described in Fitting a Straight Line with Errors on Both Coordinates.
Deming Method

Deming regression is the maximum likelihood estimation of an errors-in-variables model in which the X and Y errors are assumed to be independent and identically distributed.
Correlation Between X and Y Errors
Correlation Between X and Y Errors \(r_i\) (For York method only)
Standard Deviation of X/Y
Standard Deviation of X/Y (For Deming method only)

Quantities (York Method)

When you perform a linear fit, you generate a list of computed quantities. The Parameters table reports the model slope and intercept (numbers in parentheses show how the quantities are derived):

Fit Parameters

Fitted Value and Standard Errors

Define \(W_i\), which combines the weights (errors) of both x and y:

\[W_i = \frac{\omega_{x_i}\omega_{y_i}}{\omega_{x_i}+\beta_1^2\omega_{y_i}-2\beta_1 r_i\alpha_i} =\frac{1}{\sigma_{y_i}^2+\beta_1^2\sigma_{x_i}^2 - 2\beta_1 r_i \sigma_{x_i} \sigma_{y_i}}\]	(3)

Therein, \(\omega_{x_i}=\frac{1}{\sigma_{x_i}^2}, \ \omega_{y_i}=\frac{1}{\sigma_{y_i}^2}\) are weights of \((X_i, Y_i)\), \(r_i\) is Correlation between X and Y Errors (i.e. \(\sigma_{x_i}\) and \(\sigma_{y_i}\)), and \(\alpha_i=\sqrt{\omega_{x_i} \omega_{y_i}}\).
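
As a short check that the two forms of \(W_i\) agree, divide the numerator and denominator of the first form by \(\omega_{x_i}\omega_{y_i}\) and use \(\alpha_i/(\omega_{x_i}\omega_{y_i})=1/\sqrt{\omega_{x_i}\omega_{y_i}}=\sigma_{x_i}\sigma_{y_i}\):

\[W_i=\frac{1}{\frac{1}{\omega_{y_i}}+\frac{\beta_1^2}{\omega_{x_i}}-\frac{2\beta_1 r_i\alpha_i}{\omega_{x_i}\omega_{y_i}}}=\frac{1}{\sigma_{y_i}^2+\beta_1^2\sigma_{x_i}^2-2\beta_1 r_i\sigma_{x_i}\sigma_{y_i}}\]

With \(r_i=0\) this reduces to the weighting used by the FV method in formula (25).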

The slope of the unweighted linear fit to \((X_i, Y_i)\) (no errors) is used as the initial value of \(\beta_1\). Because \(W_i\) depends on \(\beta_1\), the equations below are solved iteratively until successive estimates of \(\beta_1\) agree within the desired tolerance.

The concise equations that estimate the parameters \(\hat{\beta_0}\) and \(\hat{\beta_1}\) of the best-fit line with X and Y errors are:

\[\hat{\beta_0}=\bar{Y}-\hat{\beta_1}\bar{X}\]	(4)

\[\hat{\beta_1}=\frac{\sum{W_i b_i V_i}}{\sum{W_i b_i U_i}}\]	(5)

where \(\bar{X} = \frac{ \sum{W_i X_i} }{ \sum{W_i} }, \ \bar{Y} = \frac{ \sum{W_i Y_i} }{ \sum{W_i} }\).

\(U_i\) and \(V_i\) are the deviations of \(X_i\) and \(Y_i\) from the weighted means:

\[\left\{\begin{matrix} U_i=X_i-\bar{X}\\ V_i=Y_i-\bar{Y} \end{matrix}\right. \]

and

\[b_i=W_i \left[\frac{U_i}{\omega_{y_i}}+\frac{\hat{\beta_1}}{\omega_{x_i}}{V_i}-(\hat{\beta_1} U_i+V_i)\frac{r_i}{\alpha_i} \right]\]	(6)

The corresponding variances \(\sigma^2\) of the parameters are:

\[\sigma_{\hat{\beta_0}}^2=\frac{1}{\sum{W_i}}+\bar{x}^2\sigma_{\hat{\beta_1}}^2\]	(7)
\[\sigma_{\hat{\beta_1}}^2=\frac{1}{\sum{W_i u_i^2}}\]	(8)

where \(\bar{x} = \frac{ \sum{W_i x_i} }{ \sum{W_i} }\), \(x_i\) is the expectation value of \(X_i\), and \(u_i=x_i - \bar{x}\).

The standard errors of the parameters are finally given by:

\[\varepsilon_{\hat{\beta_0}}=\sqrt{\sigma _{\hat{\beta_0}}^2}\sqrt{\frac{S}{n-2}}\]	(9)

\[\varepsilon_{\hat{\beta_1}}=\sqrt{\sigma _{\hat{\beta_1}}^2}\sqrt{\frac{S}{n-2}}\]	(10)

where \(S\) is:

\[S=\sum W_i(Y_i - \beta_1 X_i- \beta_0)^2\]	(11)
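
The iteration described by formulas (3) to (11) can be sketched in plain Python as follows. This is an illustrative sketch, not Origin's implementation: the function name `york_fit`, the relative convergence test, and the default tolerance are assumptions, and the adjusted values are taken as \(x_i=\bar{X}+b_i\), following York's paper rather than anything stated explicitly above. All errors are assumed to be nonzero.

```python
import math

def york_fit(X, Y, sx, sy, r=None, tol=1e-12, max_iter=100):
    """Sketch of formulas (3)-(11); returns (b0, b1, se_b0, se_b1)."""
    n = len(X)
    r = [0.0] * n if r is None else r
    wx = [1.0 / s ** 2 for s in sx]          # omega_x
    wy = [1.0 / s ** 2 for s in sy]          # omega_y
    alpha = [math.sqrt(p * q) for p, q in zip(wx, wy)]

    def terms(b1):
        # Formula (3), the weighted means, the deviations, and formula (6).
        W = [wx[i] * wy[i] / (wx[i] + b1 ** 2 * wy[i] - 2 * b1 * r[i] * alpha[i])
             for i in range(n)]
        sw = sum(W)
        Xbar = sum(w * x for w, x in zip(W, X)) / sw
        Ybar = sum(w * y for w, y in zip(W, Y)) / sw
        U = [x - Xbar for x in X]
        V = [y - Ybar for y in Y]
        b = [W[i] * (U[i] / wy[i] + b1 * V[i] / wx[i]
                     - (b1 * U[i] + V[i]) * r[i] / alpha[i]) for i in range(n)]
        return W, Xbar, Ybar, U, V, b

    # Initial slope: ordinary (unweighted) least squares of Y on X.
    mx, my = sum(X) / n, sum(Y) / n
    b1 = (sum((x - mx) * (y - my) for x, y in zip(X, Y))
          / sum((x - mx) ** 2 for x in X))

    for _ in range(max_iter):
        W, Xbar, Ybar, U, V, b = terms(b1)
        new_b1 = (sum(W[i] * b[i] * V[i] for i in range(n))
                  / sum(W[i] * b[i] * U[i] for i in range(n)))   # formula (5)
        done = abs(new_b1 - b1) <= tol * max(1.0, abs(new_b1))
        b1 = new_b1
        if done:
            break

    W, Xbar, Ybar, U, V, b = terms(b1)
    b0 = Ybar - b1 * Xbar                                        # formula (4)
    sw = sum(W)
    x_adj = [Xbar + bi for bi in b]   # assumed adjusted x_i (from York's paper)
    xbar = sum(w * x for w, x in zip(W, x_adj)) / sw
    var_b1 = 1.0 / sum(w * (x - xbar) ** 2 for w, x in zip(W, x_adj))  # (8)
    var_b0 = 1.0 / sw + xbar ** 2 * var_b1                              # (7)
    S = sum(w * (y - b1 * x - b0) ** 2 for w, x, y in zip(W, X, Y))     # (11)
    scale = math.sqrt(S / (n - 2))
    return b0, b1, math.sqrt(var_b0) * scale, math.sqrt(var_b1) * scale
```

For data lying exactly on a line, such as `york_fit([0, 1, 2, 3, 4], [1, 3, 5, 7, 9], [0.1] * 5, [0.2] * 5)`, the sketch recovers the intercept and slope of that line and the standard errors vanish because \(S=0\).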

t-Value and Confidence Level

If the regression assumptions hold, we have:

\(\frac{{\hat \beta _0}-\beta _0}{\varepsilon _{\hat \beta _0}}\sim t_{n-2}\) and \(\frac{{\hat \beta _1}-\beta _1}{\varepsilon _{\hat \beta _1}}\sim t_{n-2}\)	(12)

The t-test can be used to examine whether the fitting parameters are significantly different from zero, which means that we can test whether \(\beta _0= 0\,\!\) (if true, this means that the fitted line passes through the origin) or \(\beta _1= 0\,\!\). The hypotheses of the t-tests are:

\(H_0 : \beta _0= 0\,\! \) \(H_0 : \beta _1= 0\,\!\)

\(H_\alpha : \beta _0 \neq 0\,\!\) \(H_\alpha : \beta _1 \neq 0\,\!\)

The t-values can be computed by:

\(t_{\hat \beta _0}=\frac{{\hat \beta _0}-0}{\varepsilon _{\hat \beta _0}}\) and \(t_{\hat \beta _1}=\frac{{\hat \beta _1}-0}{\varepsilon _{\hat \beta _1}}\)	(13)

With the computed t-value, we can decide whether or not to reject the corresponding null hypothesis. Usually, for a given significance level \(\alpha\,\!\), we can reject \(H_0 \,\!\) when \(|t|>t_{\frac \alpha 2}\). Additionally, the p-value is reported with the t-test. We also reject the null hypothesis \(H_0 \,\!\) if the p-value is less than \(\alpha\,\!\).

Prob>|t|

The p-value of the t-test above, that is, the probability of obtaining a t-value at least as extreme as the computed one if \(H_0 \,\!\) is true.

\[prob=2(1-tcdf(|t|,df_{Error}))\,\!\]	(14)

where tcdf(t, df) computes the lower tail probability for the Student's t distribution with df degrees of freedom.
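
Formula (14) can be illustrated with a small standard-library sketch. The `tcdf` below is a hypothetical stand-in that integrates the Student's t density with Simpson's rule; it is not the routine Origin uses, and the step count is an arbitrary choice.

```python
import math

def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)

def tcdf(t, df, steps=2000):
    """Lower-tail probability P(T <= t), via Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    half = total * h / 3               # integral of the density from 0 to |t|
    return 0.5 + half if t >= 0 else 0.5 - half

def prob_gt_abs_t(t, df):
    """Two-sided p-value of formula (14)."""
    return 2 * (1 - tcdf(abs(t), df))
```

With one degree of freedom the t distribution is the Cauchy distribution, so `prob_gt_abs_t(1.0, 1)` is 0.5 up to integration error, which makes a convenient sanity check.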

LCL and UCL

From the t-value, we can calculate the \((1-\alpha )\times 100\%\) Confidence Interval for each parameter by:

\[\hat \beta _j-t_{(\frac \alpha 2,n-2)}\varepsilon _{\hat \beta _j}\leq \beta _j\leq \hat \beta _j+t_{(\frac \alpha 2,n-2)}\varepsilon _{\hat \beta _j}\]	(15)

where the left and right bounds are the \(LCL\) and \(UCL\), short for the Lower Confidence Limit and Upper Confidence Limit, respectively.

CI Half Width

The Confidence Interval Half Width is:

\[CI=\frac{UCL-LCL}2\]	(16)

where UCL and LCL are the Upper Confidence Limit and Lower Confidence Limit, respectively.
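
As a worked illustration with made-up numbers (not taken from any dataset), suppose \(\hat\beta_1=2.0\), \(\varepsilon_{\hat\beta_1}=0.1\), \(n=12\) and \(\alpha=0.05\). The critical value for \(n-2=10\) degrees of freedom is \(t_{(0.025,10)}\approx 2.228\), so:

\[LCL=2.0-2.228\times 0.1=1.7772,\quad UCL=2.0+2.228\times 0.1=2.2228\]
\[CI=\frac{2.2228-1.7772}{2}=0.2228\]

The half width is simply the critical t-value multiplied by the standard error of the parameter.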


Fit Statistics

Degrees of Freedom

\[df=n-2\]	(17)

where n is the total number of data points.

Residual Sum of Squares

\[RSS=\sum^n_{i=1} \frac{(\beta_0+\beta_1 x_i - y_i)^2}{\sigma^2_{y_i}+\beta_1^2\sigma^2_{x_i}}\]	(18)

Reduced Chi-Sqr

\[\sigma^2=\frac{RSS}{n-2}\]	(19)
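
Formulas (18) and (19) translate directly into code. The sketch below uses hypothetical helper names and assumes the fitted parameters are already available.

```python
def rss_xy_errors(x, y, sx, sy, b0, b1):
    """Residual sum of squares of formula (18), weighted by both errors."""
    return sum((b0 + b1 * xi - yi) ** 2 / (syi ** 2 + b1 ** 2 * sxi ** 2)
               for xi, yi, sxi, syi in zip(x, y, sx, sy))

def reduced_chi_sqr(rss, n):
    """Reduced chi-square of formula (19): RSS over n - 2 degrees of freedom."""
    return rss / (n - 2)
```

Each squared residual is divided by the effective variance \(\sigma^2_{y_i}+\beta_1^2\sigma^2_{x_i}\), so a point with a large X error contributes less when the slope is steep.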

Pearson's r

In simple linear regression, the correlation coefficient between x and y, denoted by r, equals:

\(r=-R\,\!\) if \(\beta _1\,\!\) is negative
\(r=R\,\!\) if \(\beta _1\,\!\) is positive	(20)

\(R^2\) can be computed as:

\[TSS=\sum_{i=1}^n(y_i-\bar{y})^2\]
\[R^2=\frac{SXY^2}{SXX*TSS}=1-\frac{RSS}{TSS}\]	(21)

Root-MSE (SD)

Root mean square of the error, or residual standard deviation, which equals:

\[df_{Error}=n-2\]
\[RootMSE=\sqrt{\frac{RSS}{df_{Error}}}\]	(22)

Covariance and Correlation Matrix

The Covariance matrix of linear regression is calculated by:

\[ \begin{pmatrix} Cov(\beta _0,\beta _0) & Cov(\beta _0,\beta _1)\\ Cov(\beta _1,\beta _0) & Cov(\beta _1,\beta _1) \end{pmatrix}=\sigma ^2\frac 1{SXX}\begin{pmatrix} \sum \frac{x_i^2}n & -\bar x \\-\bar x & 1 \end{pmatrix}\]	(23)

The correlation between any two parameters is:

\[ \rho (\beta _i,\beta _j)=\frac{Cov(\beta _i,\beta _j)}{\sqrt{Cov(\beta _i,\beta _i)}\sqrt{Cov(\beta _j,\beta _j)}} \]	(24)
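
Formulas (23) and (24) can be evaluated with a few lines of Python. This is a minimal sketch with hypothetical function names; `sigma2` stands for the reduced chi-square \(\sigma^2\) and \(SXX=\sum (x_i-\bar x)^2\) is assumed.

```python
import math

def covariance_matrix(x, sigma2):
    """2x2 covariance matrix of (beta0, beta1) as in formula (23)."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    c00 = sigma2 * (sum(xi * xi for xi in x) / n) / sxx
    c01 = -sigma2 * xbar / sxx
    c11 = sigma2 / sxx
    return [[c00, c01], [c01, c11]]

def correlation_matrix(cov):
    """Parameter correlations of formula (24)."""
    d = [math.sqrt(cov[i][i]) for i in range(2)]
    return [[cov[i][j] / (d[i] * d[j]) for j in range(2)] for i in range(2)]
```

The off-diagonal correlation does not depend on `sigma2`: it is determined by the x values alone, and it is negative whenever the mean of x is positive.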

Quantities (FV Method)

FV Method is the computation method of Giovanni Fasano & Roberto Vio, described in Fitting a Straight Line with Errors on Both Coordinates.

The weighting is defined as:

\[W_i=\frac{1}{\beta_1^2\sigma_{x_{i}}^2+\sigma_{y_{i}}^2}\]	(25)

The slope of the unweighted linear fit to \((X_i, Y_i)\) (no errors) is used as the initial value of \(\beta_1\).

Let

\[\bar{x}=\frac{\sum{W_i x_i}}{\sum W_i}\]	(26)
\[\bar{y}=\frac{\sum{W_i y_i}}{\sum W_i}\]	(27)

By minimizing the sum \(K^2=\sum{W_i (y_i-\beta_0-\beta_1 x_i)^2}\), that is, by setting its partial derivatives with respect to \(\beta_0\) and \(\beta_1\) to 0, we obtain the estimates \(\hat{\beta_0}\) and \(\hat{\beta_1}\):

\[\hat{\beta_0}=\bar{y}-\hat{\beta_1}\bar{x}\]	(28)
\[a\hat{\beta_1}^2+b\hat{\beta_1}-c=0\]	(29)

where

\[a=\sum{W_i^2\sigma_{x_i}^2(y_i-\bar{y})(x_i-\bar{x})}\]	(30)
\[b=\sum{W_i^2[\sigma_{y_i}^2(x_i-\bar{x})^2-\sigma_{x_i}^2(y_i-\bar{y})^2]}\]	(31)
\[c=\sum{W_i^2\sigma_{y_i}^2(y_i-\bar{y})(x_i-\bar{x})}\]	(32)

\(\hat{\beta_1}\) should be solved iteratively, until successive estimates of \(\hat{\beta_1}\) agree within desired tolerance.
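
The iteration over formulas (25) to (32) can be sketched as below. The function name `fv_fit`, the convergence test, and in particular the choice of the "+" root of the quadratic (29) are assumptions made for illustration; that root has the same sign as \(a\), and hence as \(c\) when the two agree in sign. The sketch falls back to \(c/b\) when all X errors are zero and \(a\) vanishes.

```python
import math

def fv_fit(x, y, sx, sy, tol=1e-12, max_iter=200):
    """Sketch of the FV iteration; returns (b0, b1)."""
    n = len(x)

    def weighted_means(b1):
        W = [1.0 / (b1 ** 2 * s ** 2 + t ** 2) for s, t in zip(sx, sy)]  # (25)
        sw = sum(W)
        xbar = sum(w * xi for w, xi in zip(W, x)) / sw                  # (26)
        ybar = sum(w * yi for w, yi in zip(W, y)) / sw                  # (27)
        return W, xbar, ybar

    # Initial slope: ordinary (unweighted) least squares of y on x.
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))

    for _ in range(max_iter):
        W, xbar, ybar = weighted_means(b1)
        a = b = c = 0.0
        for w, xi, yi, s, t in zip(W, x, y, sx, sy):
            u, v = xi - xbar, yi - ybar
            a += w ** 2 * s ** 2 * u * v                       # (30)
            b += w ** 2 * (t ** 2 * u ** 2 - s ** 2 * v ** 2)  # (31)
            c += w ** 2 * t ** 2 * u * v                       # (32)
        if a == 0.0:
            new_b1 = c / b             # (29) degenerates to a linear equation
        else:
            # Assumed root choice; math.sqrt raises if the discriminant < 0.
            new_b1 = (-b + math.sqrt(b * b + 4 * a * c)) / (2 * a)
        done = abs(new_b1 - b1) <= tol * max(1.0, abs(new_b1))
        b1 = new_b1
        if done:
            break

    W, xbar, ybar = weighted_means(b1)
    return ybar - b1 * xbar, b1        # (28)
```

The coefficients \(a\), \(b\) and \(c\) depend on \(\hat{\beta_1}\) through \(W_i\) and the weighted means, which is why the quadratic has to be re-solved until the slope stops changing.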

For each parameter standard error, please refer to


Quantities (Deming Method)

When you perform a linear fit, you generate a list of computed quantities. The Parameters table reports the model slope and intercept (numbers in parentheses show how the quantities are derived):

Fit Parameters

Deming regression is used in situations where both x and y are subject to measurement error.

\[y=\beta _0+\beta _1x+\varepsilon\]
\[\left\{\begin{matrix} x_i=X_i+\sigma_{x_i}\\ y_i=Y_i+\sigma_{y_i} \end{matrix}\right.\]

Assume that the \(\sigma_{x_i}\) are independent and identically distributed with \(\sigma_{x_i} \sim \mathcal{N}(0,\sigma^2)\), and that the \(\sigma_{y_i}\) are independent and identically distributed with \(\sigma_{y_i} \sim \mathcal{N}(0,\lambda \sigma^2)\), where \(\mathcal{N}(0,\sigma^2)\) denotes the normal distribution with mean 0 and standard deviation \(\sigma\). If \(\lambda=1\), the fit is an orthogonal regression. The weighted sum of squared residuals of the model is minimized:

\[RSS=\sum^n_{i=1}\left ((x_i-X_i)^2+\frac{(y_i-\beta_0-\beta_1X_i)^2}{\lambda}\right)\]	(33)

Fitted Value and Standard Errors

We can solve the parameters:

\[\hat{\beta_1}=\frac{SYY-\lambda SXX+\sqrt{(SYY-\lambda SXX)^2+4\lambda SXY^2}}{2SXY}\]	(34)

\[\hat{\beta_0}=\bar{y}-\hat{\beta_1}\bar{x}\]	(35)

where:

\[\bar{x}=\frac{1}{n}\sum_{i=1}^n{x_i}, \bar{y}=\frac{1}{n}\sum_{i=1}^n{y_i}\]

\[u_i=x_i-\bar{x}\]
\[v_i=y_i-\bar{y}\]

and:

\[SXX=\sum_{i=1}^n u_i^2\]
\[SYY=\sum_{i=1}^n v_i^2\]
\[SXY=\sum_{i=1}^n u_iv_i\]
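
Formulas (34) and (35) have a closed form, so no iteration is needed. The sketch below uses a hypothetical function name, takes `lam` as the ratio \(\lambda\) of the Y error variance to the X error variance, and assumes \(SXY\neq 0\).

```python
import math

def deming_fit(x, y, lam=1.0):
    """Deming slope and intercept from formulas (34) and (35)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    u = [xi - xbar for xi in x]
    v = [yi - ybar for yi in y]
    sxx = sum(ui * ui for ui in u)
    syy = sum(vi * vi for vi in v)
    sxy = sum(ui * vi for ui, vi in zip(u, v))
    d = syy - lam * sxx
    b1 = (d + math.sqrt(d * d + 4 * lam * sxy ** 2)) / (2 * sxy)   # (34)
    b0 = ybar - b1 * xbar                                          # (35)
    return b0, b1
```

For example, `deming_fit([0, 1, 2, 3], [0, 1, 1, 2])` with the default \(\lambda=1\) gives a slope of \((\sqrt5-1)/2\approx 0.618\), slightly steeper than the ordinary least squares slope of 0.6 for the same data, because part of the scatter is attributed to X.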

The corresponding variances of the parameters are:

\[\sigma^2_{\hat \beta _0}=\frac{1}{nw}+2(\bar{x}+2\bar{z})\bar{z}Q+(\bar{x}+2\bar{z})^2 \sigma_{\hat{\beta_1}}^2\]
\[\sigma^2_{\hat \beta _1}=Q^2w^2\sigma^2\sum^n_{i=1}(\lambda u_i^2+v_i^2)\]	(36)

The standard error for parameters can be estimated by:

\[\varepsilon _{\hat \beta _0}=\sqrt{\sigma^2_{\hat \beta _0}}\]	(37)
\[\varepsilon _{\hat \beta _1}=\sqrt{\sigma^2_{\hat \beta _1}}\]	(38)

where:

\[w=\frac{1}{\sigma^2(\lambda+\hat{\beta_1}^2)}\]

\[z_i=w\sigma^2(\lambda u_i+\hat{\beta_1} v_i)\]

\[\bar{z}=\frac{1}{n}\sum_{i=1}^n z_i\]

\[Q=\frac{1}{w \sum_{i=1}^n \left(\frac{u_iv_i}{\hat{\beta_1}}+4(z_i-\bar{z})(z_i-u_i)\right)}\]

\[\sigma=\sqrt{\frac{\sum^n_{i=1}\left((x_i-X_i)^2+\frac{(y_i-\hat{\beta_0}-\hat{\beta_1}X_i)^2}{\lambda}\right)}{n-2}}\]

t-Value and Confidence Level

If the regression assumptions hold, we have:

\(\frac{{\hat \beta _0}-\beta _0}{\varepsilon _{\hat \beta _0}}\sim t_{n-2}\) and \(\frac{{\hat \beta _1}-\beta _1}{\varepsilon _{\hat \beta _1}}\sim t_{n-2}\)

\(H_0 : \beta _0= 0\,\! \) \(H_0 : \beta _1= 0\,\!\)

\(H_\alpha : \beta _0 \neq 0\,\!\) \(H_\alpha : \beta _1 \neq 0\,\!\)

The t-values can be computed by:

\(t_{\hat \beta _0}=\frac{{\hat \beta _0}-0}{\varepsilon _{\hat \beta _0}}\) and \(t_{\hat \beta _1}=\frac{{\hat \beta _1}-0}{\varepsilon _{\hat \beta _1}}\)	(38)

Prob>|t|

The p-value of the t-test above, that is, the probability of obtaining a t-value at least as extreme as the computed one if \(H_0 \,\!\) is true.

\[prob=2(1-tcdf(|t|,df_{Error}))\,\!\]	(39)

where tcdf(t, df) computes the lower tail probability for the Student's t distribution with df degrees of freedom.

LCL and UCL

From the t-value, we can calculate the \((1-\alpha )\times 100\%\) Confidence Interval for each parameter by:

\[\hat \beta _j-t_{(\frac \alpha 2,n-2)}\varepsilon _{\hat \beta _j}\leq \beta _j\leq \hat \beta _j+t_{(\frac \alpha 2,n-2)}\varepsilon _{\hat \beta _j}\]	(40)

where the left and right bounds are the \(LCL\) and \(UCL\), short for the Lower Confidence Limit and Upper Confidence Limit, respectively.

CI Half Width

The Confidence Interval Half Width is:

\[CI=\frac{UCL-LCL}2\]	(41)

where UCL and LCL are the Upper Confidence Limit and Lower Confidence Limit, respectively.


Fit Statistics

Degrees of Freedom

\[df=n-2\]	(42)

where n is the total number of data points.

Residual Sum of Squares

See formula (33)

Reduced Chi-Sqr

\[\sigma^2=\frac{RSS}{n-2}\]	(43)

Pearson's r

In simple linear regression, the correlation coefficient between x and y, denoted by r, equals:

\(r=-R\,\!\) if \(\beta _1\,\!\) is negative
\(r=R\,\!\) if \(\beta _1\,\!\) is positive	(44)

\(R^2\) can be computed as:

\[TSS=\sum_{i=1}^n(y_i-\bar{y})^2\]
\[R^2=\frac{SXY^2}{SXX*TSS}=1-\frac{RSS}{TSS}\]	(45)

Root-MSE (SD)

Root mean square of the error, which equals:

\[df_{Error}=n-2\]
\[RootMSE=\sqrt{\frac{RSS}{df_{Error}}}\]	(46)

Covariance and Correlation Matrix

The Covariance matrix of linear regression is calculated by:

\[ \begin{pmatrix} Cov(\beta _0,\beta _0) & Cov(\beta _0,\beta _1)\\ Cov(\beta _1,\beta _0) & Cov(\beta _1,\beta _1) \end{pmatrix}=\begin{pmatrix} \ \sigma^2_{\hat{\beta_0}} & -\bar{x}\sigma^2_{\hat \beta _1} \\-\bar{x}\sigma^2_{\hat \beta _1} &\sigma^2_{\hat{\beta_1}} \end{pmatrix}\]	(47)

The correlation between any two parameters is:

\[ \rho (\beta _i,\beta _j)=\frac{Cov(\beta _i,\beta _j)}{\sqrt{Cov(\beta _i,\beta _i)}\sqrt{Cov(\beta _j,\beta _j)}} \]	(48)

Residual Plots

Residual vs. Independent

Scatter plot of residual \(res\) vs. independent variable \(x_1,x_2,\dots,x_k\); each plot is located in a separate graph.

Residual vs. Predicted Value

Scatter plot of residual \(res\) vs. fitted results \(\hat{y_i}\).

Residual vs. Order of the Data

\(res_i\) vs. sequence number \(i\)

Histogram of the Residual

The Histogram plot of the Residual \(res_i\)

Residual Lag Plot

Residuals \(res_i\) vs. lagged residual \(res_{(i-1)}\).

Normal Probability Plot of Residuals

A normal probability plot of the residuals can be used to check whether the error terms are normally distributed. If the resulting plot is approximately linear, we proceed on the assumption that the error terms are normally distributed. The plot shows the percentiles versus the ordered residuals, and the percentiles are estimated by

\[\frac{(i-\frac{3}{8})}{(n+\frac{1}{4})}\]

where n is the total number of data points and i is the index of the i-th ordered residual.
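
The percentile estimate above can be computed as in the following sketch. The function name is hypothetical; the residuals are sorted in ascending order and paired with their estimated percentiles.

```python
def normal_plot_percentiles(residuals):
    """Pair each ordered residual with its percentile (i - 3/8) / (n + 1/4)."""
    n = len(residuals)
    ordered = sorted(residuals)
    return [(res, (i - 3 / 8) / (n + 1 / 4))
            for i, res in enumerate(ordered, start=1)]
```

The percentiles are symmetric about 0.5: the values for the i-th smallest and the i-th largest residual always add up to 1.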

Reference

D. York et al., "Unified equations for the slope, intercept, and standard error of the best straight line", American Journal of Physics, Vol. 72, No. 3, pp. 367-375 (2004).
G. Fasano and R. Vio, "Fitting straight lines with errors on both coordinates", Newsletter of Working Group for Modern Astronomical Methodology, No. 7, pp. 2-7 (Sept. 1988).
