We would not reject the hypothesis. If they are similar enough, exit the iterations. Our iterations produce the two slope estimates. Under the hypothesis, the model is the one given in the section cited in the text; the restricted estimate is given in the equations which follow.
To obtain them, we make a small modification in our algorithm above. The iterations produce the estimates shown. The likelihood ratio statistic is far below the critical value of 3. The algebraic result is a little tedious, but straightforward. But the determinant appears both in the leading matrix, which is inverted, and in the trailing vector, which is not.
We divide every element of the matrix to be inverted by n, then, because of the inversion, divide the vector on the right by n as well. There are two special cases worthy of note, though. The second of these is similar to our finding for omitted variables in the classical regression model. The n's in the inverse and in the vector cancel.
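This scaling argument is easy to verify numerically; a minimal sketch with arbitrary illustrative data (nothing here depends on the particular numbers, the point is purely algebraic):

```python
import numpy as np

# Illustrative data only; any X and y would do.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = rng.normal(size=n)

b_unscaled = np.linalg.solve(X.T @ X, X.T @ y)
# Dividing the moment matrix by n multiplies its inverse by n, so dividing
# the moment vector by n as well leaves the solution unchanged.
b_scaled = np.linalg.solve((X.T @ X) / n, (X.T @ y) / n)
print(np.allclose(b_unscaled, b_scaled))  # True: the n's cancel
```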
As in the previous exercise, we replace elements of the inverse with elements from the original matrix and cancel the determinant which multiplies the matrix after inversion and divides the vector. The inverse of the matrix is straightforward. We can proceed simply by dropping the third equation.
We will eliminate the third share equation. Invariance is achieved by using a maximum likelihood estimator. The five parameters eliminated by the restrictions can be estimated after the others are obtained, just by using the restrictions. The restrictions are linear, so the standard errors are also straightforward to obtain. The least squares estimates are shown below. Estimated standard errors appear in parentheses. The parameters in the cost function are extremely large, owing primarily to rather severe multicollinearity among the price terms.
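As a sketch of the mechanics only, the following iterates FGLS for a generic two-equation seemingly unrelated regressions system on hypothetical data; the variable names and values are illustrative, and the cross-equation restrictions of the actual share system are not imposed. Under normality, iterating FGLS to convergence yields the maximum likelihood solution, which is what delivers the invariance to the choice of dropped equation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 25

# Hypothetical data: two equations with (partly) different regressors.
x1 = rng.normal(size=(n, 2)); X1 = np.column_stack([np.ones(n), x1])
x2 = rng.normal(size=(n, 1)); X2 = np.column_stack([np.ones(n), x2])
y1 = X1 @ np.array([0.4, 0.05, -0.02]) + 0.05 * rng.normal(size=n)
y2 = X2 @ np.array([0.3, 0.06]) + 0.05 * rng.normal(size=n)

def stack(blocks):
    """Block-diagonal regressor matrix for the stacked two-equation system."""
    k = sum(b.shape[1] for b in blocks)
    Z = np.zeros((n * len(blocks), k))
    r = c = 0
    for b in blocks:
        Z[r:r + n, c:c + b.shape[1]] = b
        r += n; c += b.shape[1]
    return Z

Z = stack([X1, X2])
y = np.concatenate([y1, y2])

# Start with equation-by-equation OLS, then iterate FGLS to convergence.
beta = np.linalg.solve(Z.T @ Z, Z.T @ y)
for _ in range(200):
    e = (y - Z @ beta).reshape(2, n)                  # residuals by equation
    S = e @ e.T / n                                   # 2 x 2 disturbance covariance
    Omega_inv = np.kron(np.linalg.inv(S), np.eye(n))  # inverse of S kron I
    beta_new = np.linalg.solve(Z.T @ Omega_inv @ Z, Z.T @ Omega_inv @ y)
    if np.max(np.abs(beta_new - beta)) < 1e-10:       # similar enough: stop
        beta = beta_new
        break
    beta = beta_new

print(beta)   # at convergence this is the iterated FGLS (maximum likelihood) fit
```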
The results of estimation of the system by direct maximum likelihood are shown. The convergence criterion is the measure suggested by Belsley, discussed near the end of Section 5. The sample means are given with the data. The matrix of elasticities, computed accordingly, has rows and columns corresponding to capital (k), labor (l), and fuel (f). This may be due to the very small sample size.
The cross elasticities, however, do conform to what one might expect, the primary one being the evident substitution between capital and fuel. The iterations for this restricted model are shown below. There are two restrictions since only two of the three parameters are free. The critical value from the chi-squared table is 5. Application: separate regressions and aggregation test. This saves the residuals to be used later. Constrained SUR model with one coefficient vector.
We cannot simply take logs of both sides of the equation as the disturbance is additive rather than multiplicative. So, we must treat the model as a nonlinear regression. The iteration could be continued until convergence. Starting values are always a problem.
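For concreteness, a minimal nonlinear least squares sketch for a model with an additive disturbance, here y = a x^b + e; the functional form, data, and starting values are illustrative rather than those of the exercise. The starting values p0 must be supplied by the user, which is the practical problem noted above.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(2)
x = rng.uniform(1.0, 10.0, size=200)
y = 2.0 * x**0.5 + rng.normal(scale=0.3, size=200)   # additive error: no log trick

def f(x, a, b):
    return a * x**b

# Rough starting values; poor choices can prevent convergence.
params, cov = curve_fit(f, x, y, p0=[1.0, 1.0])
print(params)                     # estimates of (a, b)
print(np.sqrt(np.diag(cov)))      # asymptotic standard errors
```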
Methods for solving nonlinear equations such as these are discussed in Appendix E. The proof can be done by mathematical induction. For convenience, denote the ith derivative by fi. The first derivative appears in the equation cited in the text; this will complete the proof. Use L'Hospital's rule once again. First, the two simple regressions, linear and log-linear, produce the results shown. Since this is not significantly different from zero, this evidence favors the linear model. Therefore, this contradicts the preceding result and favors the log-linear model.
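The l'Hospital argument can also be checked numerically: the Box-Cox transform (y^lam - 1)/lam approaches ln y as lam goes to zero (the values of y below are arbitrary).

```python
import numpy as np

def box_cox(y, lam):
    # The lam = 0 case is defined as the limit, which is ln(y).
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

y = np.array([0.5, 1.0, 2.0, 10.0])
for lam in (1.0, 0.1, 0.01, 0.001):
    print(lam, np.max(np.abs(box_cox(y, lam) - np.log(y))))
# The discrepancy shrinks toward zero as lam -> 0.
```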
An alternative approach is to fit the Box-Cox model in the fashion of Exercise 4. The log-likelihoods and estimated standard errors are reported with the results. The estimates found for Zellner and Revankar's model, and the corresponding values for the simple log-linear model, are shown there as well.
The Wald test is based on the unrestricted model. The likelihood ratio statistic is based on both models. The sum of squared residuals for both unrestricted and restricted models is given above. All three statistics suggest the same conclusion: the hypothesis should be rejected. These are quite similar to the estimates given above. The sum of the two output elasticities for the states given in the example in the text is given below for the model estimated with and without transforming the dependent variable.
Note that the first of these makes the model look much more similar to the Cobb-Douglas model, for which this sum is constant. This is a surprising outcome. The sum of squared residuals for the restricted model is given above, as is the sum of the logs of the outputs. The likelihood ratio statistic is -2[ln L(restricted) - ln L(unrestricted)]; once again, the statistic is small. Finally, to compute the Lagrange multiplier statistic, we now use the method described in the example in the text. All of these suggest that the log-linear model is not a significant restriction on the Box-Cox model.
This rather peculiar outcome would appear to arise because of the rather substantial reduction in the log-likelihood function which occurs when the dependent variable is transformed along with the right hand side. This is not a contradiction because the model with only the right hand side transformed is not a parametric restriction on the model with both sides transformed. Some further evidence is given in the next exercise. Linearized regression. There is no need for a separate proof different from the usual for OLS.
The only change here is the nonzero mean probability limit of the vector in brackets. The consistency, asymptotic normality, and asymptotic covariance matrix follow as before. A logical solution to this one is simple. The numerator is the same. The denominator is still obviously larger, so the same result holds when both variables are measured with error. If the mean squared error matrix of the OLS estimator is smaller than that of the 2SLS estimator, then its inverse is larger.
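Returning to the measurement error result above, a small simulation illustrates the attenuation of the OLS slope; the parameter values are illustrative. With measurement error variance equal to the regressor variance, the slope is biased toward zero by a factor of one half.

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta = 100_000, 1.0
x_star = rng.normal(size=n)                      # true regressor
y = beta * x_star + rng.normal(size=n)           # true relationship
x_obs = x_star + rng.normal(size=n)              # observed with error, Var(u) = 1

b = np.polyfit(x_obs, y, deg=1)[0]               # OLS slope on the mismeasured x
print(b)                                         # roughly beta * 1/(1 + 1) = 0.5
```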
Use A to do the inversion. Is it possible? The statement of the problem is actually a bit optimistic. The problem is that the model as stated would not be identified: the supply equation would be, but the demand equation would not.
The way out would be to assume that at least one of Ed, Union, and Fem does not appear in the demand equation. Since education surely would appear, that leaves one or both of Union and Fem. We will assume both of them are omitted. The results show that the hypothesis is rejected; we conclude that the instruments are relevant. With the parameter fixed, both equations are identified. The first equation becomes a regression which can be estimated by ordinary least squares.
However, the second equation continues to fail the order condition. To see the problem, consider that even with the restriction, any linear combination of the two equations has the same variables as the original second equation. We may treat it as known. It is instructive to analyze this from the standpoint of false structures as done in the text.
Therefore, the restrictions identify the model. Consider again the false structure. The three restrictions on four unknown elements of F do not serve to pin down any of them. This restriction does not even partially identify the model.
The remaining model is not identified by the usual rank and order conditions. The second equation fails the order condition. All reduced form parameters are estimable directly by using least squares, so the reduced form is identified in all cases. Following the method in the example in the text, the matrix has rank five. The second is obviously not identified. In (1), none of the three columns can be written as a linear combination of the other two, so it has rank 3.
Although the second and last columns have nonzero elements in the same positions, for the matrix to have short rank, we would require that the third column be a multiple of the second, since the first cannot appear in the linear combination which is to replicate the second column.
By the same logic, (3) and (4) are identified. Obtain the reduced form for the model in Exercise 1 under each of the assumptions made in parts a and b1, b6, and b9. Substituting it into the second equation provides the second reduced form. The estimated standard errors are the square roots of the diagonal elements of the inverse matrix. To compute the limited information maximum likelihood estimator, we require the matrices of sums of squares and cross products of residuals of the regressions of y1 and y2 on x1 and on x1, x2, and x3.
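The two residual cross-product matrices can be formed directly. The sketch below uses hypothetical data with the same variable layout (endogenous y1 and y2, exogenous x1, x2, x3); the smallest root of |W1 - lam W0| = 0 computed at the end is the usual LIML lambda.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
x1, x2, x3 = rng.normal(size=(3, n))
X1 = np.column_stack([np.ones(n), x1])              # included exogenous only
X  = np.column_stack([np.ones(n), x1, x2, x3])      # all exogenous variables

# Hypothetical endogenous variables, for illustration only.
y2 = 0.5 * x1 + 0.8 * x2 - 0.3 * x3 + rng.normal(size=n)
y1 = 1.0 * y2 + 0.4 * x1 + rng.normal(size=n)
Y = np.column_stack([y1, y2])

def resid(Y, X):
    # Residuals from the least squares regression of each column of Y on X.
    return Y - X @ np.linalg.solve(X.T @ X, X.T @ Y)

W1 = resid(Y, X1).T @ resid(Y, X1)   # regressions on x1 only
W0 = resid(Y, X).T @ resid(Y, X)     # regressions on x1, x2, and x3

# Smallest root of |W1 - lam*W0| = 0, i.e., smallest eigenvalue of inv(W0) W1.
lam = np.min(np.linalg.eigvals(np.linalg.solve(W0, W1)).real)
print(W1, W0, lam, sep="\n")         # lam is the LIML lambda (at least 1)
```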
Describe a procedure for estimating the model while incorporating the restrictions. We could treat the system as a nonlinear seemingly unrelated regressions model. This nonlinear system could now be estimated by nonlinear GLS. Needless to say, this would be quite involved. We would require that all three characteristic roots have modulus less than one. An intuitive guess that the diagonal element greater than one would preclude this would be correct.
Expanding this produces the result directly; there is no need to go any further. The system is unstable. Prove that an underidentified equation cannot be estimated by two stage least squares. If the equation fails the order condition, then the number of excluded exogenous variables is less than the number of included endogenous. The program proceeds as follows: read the data; for convenience, rename the variables so they correspond; impose artificially the adding-up condition on total demand; estimate the equations by 2SLS and save the coefficients.
Create the coefficients of the reduced form; we only need the parts used in the second half of the example. Construct the matrix that governs the dynamics of the system; note that it is a function of y(t-1) and c(t-1). The dominant (largest) root is larger than one in absolute value. But the payoff to this imprecision is that the semiparametric formulation is more likely to be robust to failures of the assumptions of the parametric model.
Consider, for example, the binary probit model of Chapter 21, which makes a strong assumption of normality and homoscedasticity. If the assumptions are correct, the probit estimator is the most efficient use of the data. However, if the normality assumption or the homoscedasticity assumption is incorrect, then the probit estimator becomes inconsistent in an unknown fashion. The semiparametric estimator, in contrast, will remain consistent if the normality assumption is violated, and it is even robust to certain kinds of heteroscedasticity.
Applications. 1. Using the gasoline market data in Appendix Table F2. (Note that R-squared and F can be negative.) Using the probit model and the Klein and Spady semiparametric models, the two sets of coefficient estimates are somewhat similar.
The probit output reports the dependent variable (P), no weighting variable, the number of observations, 13 iterations to convergence, the log likelihood, and the Schwarz information criterion. Some of these statistics cannot be computed for the Klein and Spady estimator. They are computed at the means of the Xs; all observations are used for the means. The probit model fits better by all measures computed. The models are obviously similar, though there is substantial difference in the fitted values. Note that the two plots are based on different coefficient vectors, so it is not possible to merge the two figures.
We require Asy. Var[m3], which will be the lower right element of the asymptotic covariance matrix. The needed parts are the asymptotic variances and covariances of the sample moments. Inserting these parts in the expansion, multiplying it out, and collecting terms produces the lower right element equal to 24, as expected. The necessary data are given in the examples in the text. Thus, it is clear that the log likelihood is of the form for an exponential family, and the sufficient statistics are the sum and sum of squares of the observations.
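Writing the normal log likelihood out makes the factorization explicit (standard algebra, not specific to the numbers in the exercise):

```latex
\ln L(\mu,\sigma^2)
  = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2
  = \left[-\frac{n}{2}\ln(2\pi\sigma^2) - \frac{n\mu^2}{2\sigma^2}\right]
    + \frac{\mu}{\sigma^2}\sum_{i} x_i - \frac{1}{2\sigma^2}\sum_{i} x_i^2 .
```

The bracketed term depends only on the parameters, so the data enter only through the sum and the sum of squares, which are therefore the sufficient statistics.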
The log likelihood is found by summing these functions. The third term does not factor in the fashion needed to produce an exponential family. There are no sufficient statistics for this distribution. The question is deliberately misleading. We showed in Chapter 8 and in this chapter that in the classical regression model with heteroscedasticity, the OLS estimator is the GMM estimator. The asymptotic covariance matrix of the OLS estimator is given in Section 8.
This provides an initial set of estimates that can be used to compute the optimal weighting matrix. The asymptotic covariance matrix is computed from the first order conditions for the optimization. This is the comparison between the two asymptotic covariance matrices. The proof can be done by comparing the inverses of the two covariance matrices. Thus, if the claim is correct, the first matrix is larger than that in (12), or its inverse is smaller. It might not seem obvious, but we can also derive asymptotic standard errors for these estimates by constructing them as method of moments estimators.
Observe, first, that the two estimates are based on moment estimators of the probabilities. Let xi denote one of the observations drawn from the normal distribution. Then, the two proportions are obtained as follows: let zi be the indicator of the event in question. So, the two proportions are simply the means of functions of the sample observations, and the expectation of each indicator is the corresponding probability.
The covariance of the two sample means is a bit trickier, but we can deduce it from the results of random sampling. The covariance of the two indicators for a single observation can be computed directly, and because the observations are independent, the covariance of the two sample proportions is that quantity divided by n; the variances of the proportions, and Var[x-bar], follow the same way. Of course, this is what we found in part b. Cancelling terms and gathering the factors involving x, we can obtain f(y) by summing over x in the joint density. This produces the required result. Since we found f(y) by factoring f(x,y) into f(y)f(x|y), apparently, given our result, the answer follows immediately. Just divide the expression used in part e.
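A short sketch of the moment-based calculation; the indicator cutoffs used here are purely illustrative stand-ins for the events in the exercise. The proportions are sample means, so their covariance matrix follows from the usual random sampling results.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000
x = rng.normal(size=n)

# Hypothetical indicator functions; the cutoffs are illustrative only.
z1 = (x <= -1.0).astype(float)
z2 = (x <= 1.0).astype(float)

p1, p2 = z1.mean(), z2.mean()            # the two proportions are sample means

# Random sampling: Cov(p1, p2) = [E(z1*z2) - p1*p2] / n, estimated by
cov_p1p2 = (np.mean(z1 * z2) - p1 * p2) / n
V = np.array([[p1 * (1 - p1) / n, cov_p1p2],
              [cov_p1p2, p2 * (1 - p2) / n]])
print(p1, p2)
print(V)    # estimated asymptotic covariance matrix of (p1, p2)
```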
The statistic is well below the critical value for a test of the usual size; once again, this is a small value. The LM statistic is, therefore, 9. This is also well under the critical value for the chi-squared distribution, so the hypothesis is not rejected on the basis of any of the three tests. We can solve the first order conditions in each case. The asymptotic variance obtained from the first estimator would be the negative inverse of the expected second derivative.
The negative of the expected Hessian is shown below. Unfortunately, this is an error in the text. This makes sense. The cross term has expectation zero. The second term has expectation zero. Assembling these in a block diagonal matrix, then taking the negative inverse produces the result given earlier. Inserting the covariance matrix given above produces the suggested statistic.
The asymptotic variance of the MLE is, in fact, equal to the Cramér-Rao lower bound for the variance of a consistent, asymptotically normally distributed estimator, so this completes the argument. As in the example cited in the text, the need for the restriction on P will emerge shortly; see Section E2 of Appendix E.
In what follows, we will focus on the slope estimators. Note that the MLE is ill defined if P is less than 2. For the logit model, the result is very simple. Denote by H the actual second derivatives matrix derived in the previous part. The method of scoring uses the expected Hessian instead of the actual Hessian in the iterations. The methods are the same for the logit model, since the Hessian does not involve yi. The methods are different for the probit model, since the expected Hessian does not equal the actual one.
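A compact Newton iteration for the logit MLE, on simulated data, makes the point concrete: for the logit the actual and expected Hessians coincide, so Newton's method and the method of scoring are the same algorithm.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([-0.5, 1.0, -0.75])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))       # fitted probabilities
    g = X.T @ (y - p)                         # gradient of the log likelihood
    H = -(X * (p * (1 - p))[:, None]).T @ X   # Hessian (= expected Hessian here)
    step = np.linalg.solve(-H, g)             # Newton / scoring step
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break

print(beta)                                   # MLE; close to true_beta
print(np.sqrt(np.diag(np.linalg.inv(-H))))    # standard errors from -H^{-1}
```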
The restricted log likelihood given with the initial results is the log likelihood for a model that contains only a constant term. Twice the difference between the unrestricted and restricted log likelihoods vastly exceeds the critical chi-squared value with 5 degrees of freedom.
The hypothesis would be rejected. The Jondrow measure was then computed and plotted against output. There does not appear to be any relationship, though the weak relationship, such as it is, is indeed negative. We will need a bivariate sample on x and y to compute the random variable, then average the draws on it.
The precise method of using a Gibbs sampler to draw this bivariate sample is shown in the example in the text. As noted there, the Gibbs sampler is not much of a simplification for this particular problem; it is simple to draw a sample directly from a bivariate normal distribution. (A sketch of the Gibbs approach is given below.) The Monte Carlo simulation program is organized as follows: the sample size is set; one procedure studies the LM statistic; three kinds of disturbances are created; another procedure studies the Wald statistic.
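A sketch of the Gibbs approach referred to above, for a standard bivariate normal with correlation rho (the value of rho is illustrative): each conditional is univariate normal, x | y ~ N(rho*y, 1 - rho^2), and symmetrically for y | x.

```python
import numpy as np

rng = np.random.default_rng(7)
rho = 0.6
n_draws, burn_in = 10_000, 1_000

x, y = 0.0, 0.0
draws = np.empty((n_draws, 2))
for t in range(n_draws + burn_in):
    # Draw each variable from its conditional given the current value of the other.
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))
    if t >= burn_in:
        draws[t - burn_in] = (x, y)

print(np.corrcoef(draws.T))   # sample correlation close to rho
# As noted above, one could instead draw directly from the bivariate normal.
```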
The posterior, p(. | y1, ..., yn), defines a two-parameter gamma distribution, G(n, n*ybar). There is no need to do the integration. For economic data, this is likely to be fairly common. The covariance can be found by taking the expected product of terms with equal subscripts.
A plot of the relationship between the differenced and undifferenced series is shown in the right panel above. The horizontal axis plots the autocorrelation of the original series. The values plotted are the absolute values of the difference between the autocorrelation of the differenced series and that of the original series. The results are similar to those for the AR(1) model.
For most of the range of the autocorrelation of the original series, differencing increases autocorrelation. But, for most of the range of values that are economically meaningful, differencing reduces autocorrelation.
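This can be verified with a short simulation: for an AR(1) series with parameter rho, compare the absolute lag-one autocorrelation of the series with that of its first difference. Differencing helps only when rho is fairly large, roughly above one third.

```python
import numpy as np

def lag1_corr(z):
    return np.corrcoef(z[1:], z[:-1])[0, 1]

rng = np.random.default_rng(8)
T = 50_000
eps = rng.normal(size=T)

for rho in (0.1, 0.3, 0.5, 0.8, 0.95):
    y = np.empty(T)
    y[0] = eps[0]
    for t in range(1, T):
        y[t] = rho * y[t - 1] + eps[t]      # AR(1) series
    dy = np.diff(y)                          # first difference
    print(rho, abs(lag1_corr(y)), abs(lag1_corr(dy)))
# For small rho the differenced series is more autocorrelated (in absolute
# value) than the level; for large rho, differencing reduces autocorrelation.
```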
What parameter is estimated by the regression of the ordinary least squares residuals on their lagged values? Solve the disturbance process in its moving average form. Since the regression contains a lagged dependent variable, we cannot use the Durbin-Watson statistic directly. Using an appropriate alternative statistic, we would reject the hypothesis of no autocorrelation.
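The text does not name the alternative statistic, but Durbin's h is the standard choice when the regression contains a lagged dependent variable; a sketch, with purely illustrative inputs:

```python
import numpy as np

def durbin_h(r, var_c, T):
    """Durbin's h statistic for a regression containing y(t-1).

    r      : lag-one autocorrelation of the least squares residuals
    var_c  : estimated variance of the coefficient on y(t-1)
    T      : number of observations

    Compare h with a standard normal critical value. The statistic is not
    computable when T*var_c >= 1, in which case Durbin's regression-based
    alternative test is used instead.
    """
    return r * np.sqrt(T / (1.0 - T * var_c))

# Purely illustrative numbers:
print(durbin_h(r=0.25, var_c=0.002, T=100))
```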
It is commonly asserted that the Durbin-Watson statistic is only appropriate for testing for first order autoregressive disturbances. In each case, assume that the regression model does not contain a lagged dependent variable.
Comment on the impact on your results of relaxing this assumption. The autocorrelation of the transformed residuals is reported above. The rate of inflation was computed with all observations; then observations 6 onward were used to remove the missing data due to lags.
Least squares results were obtained first. The residuals were then computed and squared. Using the remaining observations, we then computed a regression of the squared residual on a constant and 8 lagged values. The chi-squared statistic, with 8 degrees of freedom, is computed from this regression. We note that the problem this reflects is probably the specific, doubtless unduly restrictive, ARCH structure assumed. Finally, we used these variance estimators to compute a weighted least squares regression accounting for the heteroscedasticity.
This regression is based on the later observations, again because of the lagged values. Once again, this probably results from the restrictive assumption about the lag weights in the ARCH model. For the first model, the mean lag and the impact multiplier are computed from the estimates; for the second, the coefficient on xt is reported with the estimates. We would regress yt on a constant, xt, xt-1, and so on. Multiply both sides of the equation by the lag polynomial (1 - λL), λ denoting the geometric lag coefficient. Matching the remaining terms, involving L2, L3, and so on, provides a recursion for all remaining coefficients. There are two approaches possible.
Nonlinear least squares could be applied to the moving average distributed lag form. This would be fairly complicated, though a method of doing so is described by Maddala and Rao. A much simpler approach would be to estimate the model in the autoregressive form using an instrumental variables estimator. The lagged variables xt-2 and xt-3 can be used for the lagged dependent variables.
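A sketch of that instrumental variables approach, simplified to a single lagged dependent variable and with illustrative parameter values: ordinary 2SLS applied to the autoregressive form, using a further lag of x as the instrument for the lagged dependent variable.

```python
import numpy as np

rng = np.random.default_rng(9)
T = 5_000
x = rng.normal(size=T)
u = rng.normal(size=T)
y = np.zeros(T)
# Illustrative autoregressive form with a moving average disturbance.
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 1.0 * x[t] + u[t] + 0.4 * u[t - 1]

# Regress y(t) on [1, y(t-1), x(t)], with instruments [1, x(t-2), x(t)].
t0 = 2
Y = y[t0:]
Z = np.column_stack([np.ones(T - t0), y[t0 - 1:-1], x[t0:]])    # regressors
W = np.column_stack([np.ones(T - t0), x[t0 - 2:-2], x[t0:]])    # instruments

# 2SLS: b = (Z'Pw Z)^{-1} Z'Pw y, with Pw the projection onto the instruments.
Pw_Z = W @ np.linalg.solve(W.T @ W, W.T @ Z)
b = np.linalg.solve(Pw_Z.T @ Z, Pw_Z.T @ Y)
print(b)   # consistent for (0, 0.5, 1.0) despite the serial correlation
```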
The model can be estimated as an autoregressive or distributed lag equation. Consider, first, the autoregressive form. Clearly, the model cannot be estimated by ordinary least squares, since there is an autocorrelated disturbance and a lagged dependent variable. The parameters can be estimated consistently, but inefficiently by linear instrumental variables. The inefficiency arises from the fact that the parameters are overidentified.
The linear estimator estimates seven functions of the five underlying parameters. One possibility is a GMM estimator. A minimum distance estimator could then be used for estimation. Thus, values of the moving average regressors can be built up recursively. The model is a classical regression, so it can be estimated by ordinary least squares.
The estimator of the long run multiplier would be the sum of the least squares coefficients. If the sixth lag is omitted, then the standard omitted variable result applies, and all the coefficients are biased. The orthogonality result needed to remove the bias explicitly fails here, since xt is an AR 1 process. All the lags are correlated. Since the form of the relationship is, in fact, known, we can derive the omitted variable formula.
In particular, by construction, xt will have mean zero. By implication, yt will also, so we lose nothing by assuming that the constant term is zero.
Note that the column to the right of the inverse matrix is r times the last column of the matrix being inverted. Therefore, the matrix product is r times the last column of an identity matrix. For part d, we will use a similar construction, but now there are five variables in X1, and xt-5 and xt-6 in X2. We used observations 7 onward of the logged real investment and real GDP data, in deviations from the means, for all regressions.
Note that although there are some large changes in the estimated individual parameters, the long run multiplier is almost identical in all cases. Looking at the analytical results we can see why this would be the case. Because the model has both lagged dependent variables and autocorrelated disturbances, ordinary least squares will be inconsistent. Consistent estimates could be obtained by the method of instrumental variables.
We can use xt-1 and xt-2 as the instruments for yt-1 and yt-2. Efficient estimates can be obtained by a two-step procedure. The method of Hatanaka discussed in the text is another possibility. Using the real consumption and real disposable income data in Table F5, the estimated autocorrelation based on the IV estimates is as reported. All three sets of estimates are based on the last observations in the sample. The autocorrelations are simple to obtain just by multiplying out vt^2, vt*vt-1, and so on.
The partial autocorrelations are messy, and can be obtained from the Yule-Walker equations. Alternatively, and much more simply, we can make use of the observation in the section cited in the text, from which the results follow directly. Not very plausible. ADF test: the test statistic for the ADF is reported above, and the relevant critical value is in the lower part of the table. Since our value is larger than this, it follows that the hypothesis of a unit root cannot be rejected.
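For comparison, an ADF test of this kind can be run with statsmodels; the series below is a simulated random walk, so, as in the result above, the unit root hypothesis is not rejected.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(10)
y = np.cumsum(rng.normal(size=500))        # random walk: has a unit root

stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression="c")
print(stat, pvalue)
print(crit)   # 1%, 5%, 10% critical values; stat above these => cannot reject
```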
The model was then reestimated as in the example in the text. These functions can then be inverted to estimate the original parameters. The invariance of maximum likelihood estimators to transformation will justify this approach. One virtue of this approach is that the same procedure is used for both probit and logit models. These are listed in the table below.
This is larger than the tabled critical value. To compute the likelihood ratio statistic, we will require the two log-likelihoods. The restricted log-likelihood for both the probit and logit models is given with the results above. The critical value from the chi-squared distribution with one degree of freedom is 3. We now compute the Hessian for the logit model. Notice that in spite of the quite different coefficients, these are identical to the results for the probit model.
Remember that we originally estimated the probabilities, not the parameters, and these were independent of the distribution. The standard errors are as shown. The chi-squared statistic is 4. For data in which y is a binary variable, we can decompose the numerator somewhat further. First, divide both numerator and denominator by the sample size.
Second, since only one variable need be in deviation form, drop the deviation in x. The denominator is the sample variance of x. Since yi is only 0s and 1s, y is the proportion of 1s in the sample, P. Therefore, the regression is essentially measuring how much the mean of x varies across the two groups of observations.
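Writing the decomposition out, with the group means of x for the y = 1 and y = 0 observations denoted x̄1 and x̄0, and with P the sample proportion of ones:

```latex
b \;=\; \frac{\tfrac{1}{n}\sum_i x_i (y_i-\bar{y})}{s_x^2}
  \;=\; \frac{P(\bar{x}_1-\bar{x})}{s_x^2}
  \;=\; \frac{P(1-P)(\bar{x}_1-\bar{x}_0)}{s_x^2},
```

using x̄ = P x̄1 + (1 - P) x̄0. The slope is proportional to the difference between the two group means of x, which is the sense in which the regression measures how much the mean of x varies across the two groups.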