[SOLVED]

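> Let $x_t$ be the $1 \times (k + 1)$ vector of explanatory variables for observation $t$. Show that the OLS estimator $\hat{\beta}$ can be written as $\hat{\beta} = \left(\sum_{t=1}^{n} x_t' x_t\right)^{-1} \left(\sum_{t=1}^{n} x_t' y_t\right)$. Dividing each summation by $n$ shows that $\hat{\beta}$ is a function of sample averages.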

> Let $a$ be an $n \times 1$ nonrandom vector and let $u$ be an $n \times 1$ random vector with $E(uu') = I_n$. Show that $E[\mathrm{tr}(auu'a')] = \sum_{i=1}^{n} a_i^2$.

> Let A be an n * n symmetric, positive definite matrix. Show that if P is any n * n nonsingular matrix, then P’AP is positive definite.

> (i) Use the definition of inverse to prove the following: if $A$ and $B$ are $n \times n$ nonsingular matrices, then $(AB)^{-1} = B^{-1}A^{-1}$. (ii) If $A$, $B$, and $C$ are all $n \times n$ nonsingular matrices, find $(ABC)^{-1}$ in terms of $A^{-1}$, $B^{-1}$, and $C^{-1}$.
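
A quick numerical sanity check of part (i) can sit alongside the proof. The sketch below is only an illustration, not a substitute for the argument from the definition of the inverse: the two 2 x 2 matrices are made-up examples (not from the exercise), and the helper names `matmul` and `inv2` are hypothetical.

```python
from fractions import Fraction

def matmul(p, q):
    """Multiply two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inv2(m):
    """Invert a 2x2 matrix exactly using the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative nonsingular matrices (assumed, not from the exercise),
# stored as exact rationals so equality checks are not affected by rounding.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
B = [[Fraction(1), Fraction(3)], [Fraction(0), Fraction(2)]]

lhs = inv2(matmul(A, B))          # (AB)^{-1}
rhs = matmul(inv2(B), inv2(A))    # B^{-1} A^{-1}
wrong = matmul(inv2(A), inv2(B))  # A^{-1} B^{-1}: the order matters

print(lhs == rhs, lhs == wrong)
```

For these particular matrices the reversed-order product matches and the same-order product does not, which is the pattern the proof explains in general.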

> Let X be any n * k matrix. Show that X’X is a symmetric matrix.
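
The proof is one line: (X'X)' = X'(X')' = X'X. As a hedged illustration only, the sketch below checks this on a small made-up 3 x 2 matrix; the matrix and the helper names `transpose` and `matmul` are assumptions for the example.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    """Multiply conformable matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)]
            for row in p]

# Illustrative 3x2 matrix (assumed, not from the exercise).
X = [[1, 2],
     [3, 4],
     [5, 6]]

XtX = matmul(transpose(X), X)  # the k x k matrix X'X

print(XtX)
print(XtX == transpose(XtX))
```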

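> Let $X$ be an $n \times k$ matrix partitioned as $X = (X_1 \; X_2)$, where $X_1$ is $n \times k_1$ and $X_2$ is $n \times k_2$. (i) Show that $X'X = \begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix}$. What are the dimensions of each of the matrices? (ii) Let $b$ be a $k \times 1$ vector, partitioned as $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$, where $b_1$ is $k_1 \times 1$ and $b_2$ is $k_2 \times 1$. Show that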

> Use the data in DISCRIM to answer this question. (See also Computer Exercise C8 in Chapter 3.) (i) Use OLS to estimate the model $\log(\mathit{psoda}) = \beta_0 + \beta_1 \mathit{prpblck} + \beta_2 \log(\mathit{income}) + \beta_3 \mathit{prppov} + u$, and report the results in the usual form. Is $\hat{\beta}_1$ statistically di

> (i) Find the product AB using (ii) Does BA exist?

> Suppose that a military dictator in an unnamed country holds a plebiscite (a yes/no vote of confidence) and claims that he was supported by 65% of the voters. A human rights group suspects foul play and hires you to test the validity of the dictator’s cl

> The new management at a bakery claims that workers are now more productive than they were under old management, which is why wages have “generally increased.” Let Wbi be Worker i’s wage under the old

> Let $Y$ denote a Bernoulli($\theta$) random variable with $0 < \theta < 1$. Suppose we are interested in estimating the odds ratio, $\gamma = \theta/(1 - \theta)$, which is the probability of success over the probability of failure. Given a random sample $\{Y_1, \ldots, Y_n\}$, we know that a

> Let $\bar{Y}$ denote the sample average from a random sample with mean $\mu$ and variance $\sigma^2$. Consider two alternative estimators of $\mu$: $W_1 = [(n - 1)/n]\bar{Y}$ and $W_2 = \bar{Y}/2$. (i) Show that $W_1$ and $W_2$ are both biased estimators of $\mu$ and find the biases. What happens to the b

> Suppose that between their first and second years in college, 400 students are randomly selected and given a university grant to purchase a new computer. For student i, yi denotes the change in GPA from the first year to the second year. If the average c

> Let $Y_1$, $Y_2$, $Y_3$, and $Y_4$ be independent, identically distributed random variables from a population with mean $\mu$ and variance $\sigma^2$. Let $\bar{Y} = \frac{1}{4}(Y_1 + Y_2 + Y_3 + Y_4)$ denote the average of these four random variables. (i) What are the expected value and variance of

> (i) Let $X$ be a random variable taking on the values -1 and 1, each with probability 1/2. Find $E(X)$ and $E(X^2)$. (ii) Now let $X$ be a random variable taking on the values 1 and 2, each with probability 1/2. Find $E(X)$ and $E(1/X)$. (iii) Conclude from parts (i
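
Parts (i) and (ii) are direct applications of the definition of expected value for a discrete random variable. The sketch below computes them with exact fractions. The helper `expectation` is a hypothetical name, and the truncated part (iii) is not reconstructed here.

```python
from fractions import Fraction

def expectation(values, probs, g=lambda x: x):
    """E[g(X)] for a discrete X: sum of g(value) times its probability."""
    return sum(p * g(Fraction(v)) for v, p in zip(values, probs))

half = Fraction(1, 2)

# Part (i): X takes the values -1 and 1 with equal probability.
e_x_i = expectation([-1, 1], [half, half])
e_x2_i = expectation([-1, 1], [half, half], lambda x: x ** 2)

# Part (ii): X takes the values 1 and 2 with equal probability.
e_x_ii = expectation([1, 2], [half, half])
e_inv_ii = expectation([1, 2], [half, half], lambda x: 1 / x)

print(e_x_i, e_x2_i, e_x_ii, e_inv_ii)
```

Comparing `e_x2_i` with `e_x_i ** 2`, and `e_inv_ii` with `1 / e_x_ii`, shows that applying a nonlinear function before or after taking the expectation gives different numbers in these two examples.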

> Consider the line y = 0 + 1x. (i) Let (x1, y1) and (x2, y2) be two points on the line. Show that (x, y) is also on the line, where x = (x1 + x2)/2 is the average of the two values and y = (y1 + y2)/2. (ii) Extend the result of part (i) to n points on t

> Let X denote the annual salary of university professors in the United States, measured in thousands of dollars. Suppose that the average salary is 52.3, with a standard deviation of 14.6. Find the mean and standard deviation when salary is measured in do

> In Problem 2 in Chapter 4, we added the return on the firm’s stock, ros, to a model explaining CEO salary; ros turned out to be insignificant. Now, define a dummy variable, rosneg, which is equal to one if ros Log(salary) = $\beta$

> Suppose the yield of a certain crop (in bushels per acre) is related to fertilizer amount (in pounds per acre) as (i) Graph this relationship by plugging in several values for fertilizer. (ii) Describe how the shape of this relationship compares with a l

> If a basketball player is a 74% free throw shooter, then, on average, how many free throws will he or she make in a game with eight free throw attempts?
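
If each attempt is treated as an independent Bernoulli trial with success probability 0.74 (an assumption the question implies but does not state), the number of made free throws is Binomial(8, 0.74) and its mean is n times p. The sketch below computes that product and cross-checks it against the binomial probability mass function; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 8                   # free throw attempts in the game
p = Fraction(74, 100)   # probability of making each attempt

# Expected number of makes from the formula E(X) = n * p.
expected = n * p

# Cross-check: sum k * P(X = k) over the Binomial(n, p) distribution.
expected_from_pmf = sum(
    k * comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)
)

print(float(expected))
```

So the player makes 5.92 free throws on average, even though no single game can produce a fractional count.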

> Suppose the following model describes the relationship between annual salary (salary) and the number of previous years of labor market experience (exper): Log(salary) = 10.6 + .027 exper. (i) What is salary when exper = 0? When exper = 5? (ii) Use equati
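
For part (i), one simple reading is to plug the value of exper into the fitted equation and exponentiate the result. The sketch below does only that. It ignores any error term, the function name `predicted_salary` is hypothetical, and the truncated part (ii) is not addressed.

```python
import math

def predicted_salary(exper):
    """Exponentiate the fitted log(salary) equation from the question."""
    return math.exp(10.6 + 0.027 * exper)

salary_0 = predicted_salary(0)  # salary when exper = 0
salary_5 = predicted_salary(5)  # salary when exper = 5

print(round(salary_0, 2), round(salary_5, 2))
```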

> Just prior to jury selection for O. J. Simpson’s murder trial in 1995, a poll found that about 20% of the adult population believed Simpson was innocent (after much of the physical evidence in the case had been revealed to the public). Ignore the fact th

> In a random effects model, define the composite error $v_{it} = a_i + u_{it}$, where $a_i$ is uncorrelated with $u_{it}$ and the $u_{it}$ have constant variance $\sigma_u^2$ and are serially uncorrelated. Define $e_{it} = v_{it} - \theta\bar{v}_i$, where $\theta$ is given. (i) Show that $E(e_{it}) = 0$. (ii) Show th

> Suppose that the idiosyncratic errors in (14.4), $\{u_{it}: t = 1, 2, \ldots, T\}$, are serially uncorrelated with constant variance, $\sigma_u^2$. Show that the correlation between adjacent differences, $\Delta u_{it}$ and $\Delta u_{i,t+1}$, is $-.5$. Therefore, under the ideal FE assumption

> The following is a simple model to measure the effect of a school choice program on standardized test performance: $\mathit{score} = \beta_0 + \beta_1 \mathit{choice} + \beta_2 \mathit{faminc} + u_1$, where score is the score on a statewide test, choice is a binary variable indicating whether a stude

> Assume that u = x, so that the population variation in the error term is the same as it is in x. Suppose that the instrumental variable, z, is slightly correlated with u: Corr(z, u) = .1. Suppose also that z and x have a somewhat stronger correlation:

> Consider the simple regression model $y = \beta_0 + \beta_1 x + u$ and let $z$ be a binary instrumental variable for $x$. Use (15.10) to show that the IV estimator $\hat{\beta}_1$ can be written as $\hat{\beta}_1 = (\bar{y}_1 - \bar{y}_0)/(\bar{x}_1 - \bar{x}_0)$, where $\bar{y}_0$ and $\bar{x}_0$ are the sample averages of $y_i$ and $x_i$ over

> Consider a simple time series model where the explanatory variable has classical measurement error: $y_t = \beta_0 + \beta_1 x_t^* + u_t$, $x_t = x_t^* + e_t$, where $u_t$ has zero mean and is uncorrelated with $x_t^*$ and $e_t$. We observe $y_t$ and $x_t$ only. Assume that $e_t$ has zero mean an

> (i) The variable phsrank is the person’s high school percentile. (A higher number is better. For example, 90 means you are ranked better than 90 percent of your graduating class.) Find the smallest, largest, and average phsrank in the s

> For a large university, you are asked to estimate the demand for tickets to women’s basketball games. You can collect time series data over 10 seasons, for a total of about 150 observations. One possible model is $\mathit{lATTEND}_t = \beta_0 + \beta_1 \mathit{lPRICE}_t + \beta_2 \mathit{WINPERC}_t$ +

> A simple model to determine the effectiveness of condom usage on reducing sexually transmitted diseases among sexually active high school students is $\mathit{infrate} = \beta_0 + \beta_1 \mathit{conuse} + \beta_2 \mathit{percmale} + \beta_3 \mathit{avginc} + \beta_4 \mathit{city} + u_1$, where infrate = the percentage of sexual

> Use the state-level data on murder rates and executions in MURDER for the following exercise. (i) Consider the unobserved effects model $\mathit{mrdrte}_{it} = \eta_t + \beta_1 \mathit{exec}_{it} + \beta_2 \mathit{unem}_{it} + a_i + u_{it}$, where $\eta_t$ simply denotes different year intercepts and $a_i$ is the unobs

> Use the data in SCHOOL93_98 to answer the following questions. Use the command xtset schid year to set the cross section and time dimensions. (i) How many schools are there. Does each school have a record for each of the six years? Verify that lavgrexpp

> Use the data in COUNTYMURDERS to answer this question. The data set covers murders and executions (capital punishment) for 2,197 counties in the United States. (i) Consider the model $\mathit{murdrate}_{it} = \theta_t + \delta_0 \mathit{execs}_{it} + \delta_1 \mathit{execs}_{i,t-1} + \delta_2 \mathit{execs}_{i,t-2} + \delta_3 \mathit{execs}_{i,t}$

> This question assumes that you have access to a statistical package that computes standard errors robust to arbitrary serial correlation and heteroskedasticity for panel data methods. (i) For the pooled OLS estimates, obtain the standard errors that allo

> Use the data in RENTAL for this exercise. The data on rental prices and other variables for college towns are for the years 1980 and 1990. The idea is to see whether a stronger presence of students affects rental rates. The unobserved effects model is wh

> The purpose of this exercise is to compare the estimates and standard errors obtained by correctly using 2SLS with those obtained using inappropriate procedures. Use the data file WAGE2. (i) Use a 2SLS routine to estimate the equation $\log(\mathit{wage}) = \beta_0 + \beta_1$

> Use the data in PHILLIPS for this exercise. (i) We estimated an expectation augmented Phillips curve of the form $\Delta \mathit{inf}_t = \beta_0 + \beta_1 \mathit{unem}_t + e_t$, where $\Delta \mathit{inf}_t = \mathit{inf}_t - \mathit{inf}_{t-1}$. In estimating this equation by OLS, we assumed that the supply shock, $e_t$, was uncorre

> Use the data in CARD for this exercise. (i) In Table 15.1, the difference between the IV and OLS estimates of the return to education is economically important. Obtain the reduced form residuals, $\hat{v}_2$, from the reduced form regression educ on nearc4, expe

> Consider the analysis in Computer Exercise C11 in Chapter 4 using the data in HTV, where educ is the dependent variable in a regression. (i) How many different values are taken on by educ in the sample? Does educ have a continuous distribution? (ii) Plot

> Use the data in LABSUP to answer the following questions. These are data on almost 32,000 black or Hispanic women. Every woman in the sample is married. It is a subset of the data used in Angrist and Evans (1998). Our interest here is in determining how

> Use the data in WAGE2 for this exercise. (i) If sibs is used as an instrument for educ, the IV estimate of the return to education is .122. To convince yourself that using sibs as an IV for educ is not the same as just plugging sibs in for educ and runni

> For this exercise, use the data in AIRFARE, but only for the year 1997. (i) A simple demand function for airline seats on routes in the United States is $\log(\mathit{passen}) = \beta_{10} + \alpha_1 \log(\mathit{fare}) + \beta_{11} \log(\mathit{dist}) + \beta_{12} [\log(\mathit{dist})]^2 + u_1$, where passen = average passen

> (i) Suppose that, after differencing to remove the unobserved effect, you think $\Delta\log(\mathit{polpc})$ is simultaneously determined with $\Delta\log(\mathit{crmrte})$; in particular, increases in crime are associated with increases in police officers. How does this help to explain

> Use the Economic Report of the President (2005 or later) to update the data in CONSUMP, at least through 2003. Reestimate equation. Do any important conclusions change?

> (i) Because log(pcinc) is insignificant in both (16.22) and the reduced form for open, drop it from the analysis. Estimate by OLS and IV without log(pcinc). Do any important conclusions change? (ii) Still leaving log(pcinc) out of the analysis, is land o

> A common method for estimating Engel curves is to model expenditure shares as a function of total expenditure, and possibly demographic variables. A common specification has the form $\mathit{sgood} = \beta_0 + \beta_1 \mathit{ltotexpend} + \mathit{demographics} + u$, where sgood is the fract

> (i) A model to estimate the effects of smoking on annual income (perhaps through lost work days due to illness, or productivity effects) is $\log(\mathit{income}) = \beta_0 + \beta_1 \mathit{cigs} + \beta_2 \mathit{educ} + \beta_3 \mathit{age} + \beta_4 \mathit{age}^2 + u_1$, where cigs is number of cigarettes smoked per day, on av

> Use the data in APPLE for this exercise. These are telephone survey data attempting to elicit the demand for a (fictional) “ecologically friendly” apple. Each family was (randomly) presented with a set of prices for regular apples and the ecolabeled appl

> Use the data in MLB1 for this exercise. (i) Use the model estimated in equation (4.31) and drop the variable rbisyr. What happens to the statistical significance of hrunsyr? What about the size of the coefficient on hrunsyr? (ii) Add the variables runsyr

> There, we used the data in FERTIL1 to estimate a linear model for kids, the number of children ever born to a woman. (i) Estimate a Poisson regression model for kids, Interpret the coefficient on y82. (ii) What is the estimated percentage difference in

> (i) For what percentage of the workers in the sample is pension equal to zero? What is the range of pension for workers with nonzero pension benefits? Why is a Tobit model appropriate for modeling pension? (ii) Estimate a Tobit model explaining pension i

> Use the data in JTRAIN98 to answer the following questions. Here you will use a Tobit model because the outcome, earn98, sometimes is zero. (i) How many observations (men) in the sample have earn98 = 0? Is it a large percentage of the sample? (ii) Estima

> Use the data set in ALCOHOL, obtained from Terza (2002), to answer this question. The data, on 9,822 men, includes labor market information, whether the man abuses alcohol, and demographic and background variables. In this question you will study the eff

> (i) Using OLS on the full sample, estimate a model for log(wage) using explanatory variables educ, abil, exper, nc, west, south, and urban. Report the estimated return to education and its standard error. (ii) Now estimate the equation from part (i) usin

> Use the data in CPS91 for this exercise. These data are for married women, where we also have information on each husband’s income and demographics. (i) What fraction of the women report being in the labor force? (ii) Using only the data for working wo

> (i) The variable favwin is a binary variable if the team favored by the Las Vegas point spread wins. A linear probability model to estimate the probability that the favored team wins is $P(\mathit{favwin} = 1|\mathit{spread}) = \beta_0 + \beta_1 \mathit{spread}$. Explain why, if the spread inc

> (i) Let $y_t$ be real per capita disposable income. Use the data through 1989 to estimate the model $y_t = \alpha + \beta t + \rho y_{t-1} + u_t$ and report the results in the usual form. (ii) Use the estimated equation from part (i) to forecast $y$ in 1990. What is the forecast

> (i) Graph gfr against time. Does it contain a clear upward or downward trend over the entire sample period? (ii) Using the data through 1979, estimate a cubic time trend model for gfr (that is, regress gfr on $t$, $t^2$, and $t^3$, along with an intercept). Comm

> (i) Estimate the linear trend model $\mathit{chnimp}_t = \alpha + \beta t + u_t$, using the first 119 observations (this excludes the last 12 months of observations for 1988). What is the standard error of the regression? (ii) Now, estimate an AR(1) model for chnimp, again usi

> (i) It may be that the expected value of the return at time $t$, given past returns, is a quadratic function of $\mathit{return}_{t-1}$. To check this possibility, use the data in NYSE to estimate $\mathit{return}_t = \beta_0 + \beta_1 \mathit{return}_{t-1} + \beta_2 \mathit{return}_{t-1}^2 + u_t$; report the results in sta

> Use the data in PHILLIPS for this exercise. (i) Estimate the models represented in equations using the data through 2015. (ii) Use the new equations to forecast unem2016; round to two decimal places. Which equation produces a better forecast? (iii) Use t

> (i) In Example 18.7, we estimated an error correction model for the holding yield on six-month T-bills, where one lag of the holding yield on three-month T-bills is the explanatory variable. We assumed that the cointegration parameter was one in the equa

> In testing for cointegration between gfr and pe, add t2 to equation to obtain the OLS residuals. Include one lag in the augmented DF test. The 5% critical value for the test is -4.15.

> (i) Estimate an AR(3) model for pcip. Now, add a fourth lag and verify that it is very insignificant. (ii) To the AR(3) model from part (i), add three lags of pcsp to test whether pcsp Granger causes pcip. Carefully, state your conclusion. (iii) To the m

> (i) Using all of the years, through 2017, run the regression $\Delta \mathit{inf}_t$ on $\mathit{inf}_{t-1}$ (and an intercept) and test the null hypothesis that $\{\mathit{inf}_t\}$ is I(1) against the alternative that it is I(0). At what significance level do you reject the null hypothesis? (ii) Wha

> This question asks you to study the so-called Beveridge Curve from the perspective of cointegration analysis. The U.S. monthly data from December 2000 through February 2012 are in BEVERIDGE. (i) Test for a unit root in urate using the usual Dickey-Fuller

> Use the data in MINWAGE.DTA for sector 232 to answer the following questions. (i) Confirm that lwage232t and lemp232t are best characterized as I(1) processes. Use the augmented DF test with one lag of gwage232 and gemp232, respectively, and a linear tim

> Use the data in TRAFFIC2 for this exercise. These monthly data, on traffic accidents in California over the years 1981 to 1989, were used in Computer Exercise C11 in Chapter 10. (i) Using the standard Dickey-Fuller regression, test whether ltotacct has a

> This exercise also uses the data from VOLAT. Here, you will study the question of Granger causality using the percentage changes. (i) Estimate an AR(3) model for pcipt, the percentage change in industrial production (reported at an annualized rate). Show

> Use the data in VOLAT for this exercise. (i) Confirm that lsp500 = log(sp500) and lip = log(ip) appear to contain unit roots. Use Dickey Fuller tests with four lagged changes and do the tests with and without a linear time trend. (ii) Run a simple regres

> In equation (4.42) of Chapter 4, using the data set BWGHT, compute the LM statistic for testing whether motheduc and fatheduc are jointly significant. In obtaining the residuals for the restricted model, be sure that the restricted model is estimated usi

> (i) Using the data from all but the last four years (16 quarters), estimate an AR(1) model for $\Delta r6_t$. (We use the difference because it appears that $r6_t$ has a unit root.) Find the RMSE of the one-step-ahead forecasts for $\Delta r6$, using the last 16 quarters. (

> Use the data in WAGEPRC for this exercise. Problem 5 in Chapter 11 gave estimates of a finite distributed lag model of gprice on gwage, where 12 lags of gwage are used. (i) Estimate a simple geometric DL model of gprice on gwage. In particular, estimate

> Use the data in RENTAL for this exercise. The data for the years 1980 and 1990 include rental prices and other variables for college towns. The idea is to see whether a stronger presence of students affects rental rates. The unobserved effects model is L

> Use the data in KIELMC for this exercise. (i) The variable dist is the distance from each home to the incinerator site, in feet. Consider the model $\log(\mathit{price}) = \beta_0 + \delta_0 \mathit{y81} + \beta_1 \log(\mathit{dist}) + \delta_1 \mathit{y81} \cdot \log(\mathit{dist}) + u$. If building the incinerator reduces the value

> Consider the version of Fair’s model in Example 10.6. Now, rather than predicting the proportion of the two-party vote received by the Democrat, estimate a linear probability model for whether or not the Democrat wins. (i) Use the binary variable demwins

> (i) In part (i) of Computer Exercise C6 in Chapter 11, you were asked to estimate the accelerator model for inventory investment. Test this equation for AR(1) serial correlation. (ii) If you find evidence of serial correlation, re-estimate the equation b

> Use the data in OKUN to answer this question. (i) Estimate the equation $\mathit{pcrgdp}_t = \beta_0 + \beta_1 \mathit{cunem}_t + u_t$ and test the errors for AR(1) serial correlation, without assuming $\{\mathit{cunem}_t: t = 1, 2, \ldots\}$ is strictly exogenous. What do you conclude? (ii) Regress th

> Use the data in NYSE to answer these questions. (i) Estimate the model in equation (12.47) and obtain the squared OLS residuals. Find the average, minimum, and maximum values of $\hat{u}_t^2$ over the sample. (ii) Use the squared OLS residuals to estimate the fol

> Use CONSUMP for this exercise. One version of the permanent income hypothesis (PIH) of consumption is that the growth in consumption is unpredictable. [Another version is that the change in consumption itself is unpredictable; see Mankiw (1994, Chapter 1

> Okun’s Law—see, for example, Mankiw (1994, Chapter 2)—implies the following relationship between the annual percentage change in real GDP, pcrgdp, and the change in the annual unemployment rate, cunem: pcrgdp = 3 - 2 * cunem. If the unemployment rate is

> (i) Test for a unit root in log(invpc), including a linear time trend and two lags of $\Delta\log(\mathit{invpc}_t)$. Use a 5% significance level. (ii) Use the approach from part (i) to test for a unit root in log(price). (iii) Given the outcomes in parts (i) and (ii), do

> In this exercise, you are to compare OLS and LAD estimates of the effects of 401(k) plan eligibility on net financial assets. The model is $\mathit{nettfa} = \beta_0 + \beta_1 \mathit{inc} + \beta_2 \mathit{inc}^2 + \beta_3 \mathit{age} + \beta_4 \mathit{age}^2 + \beta_5 \mathit{male} + \beta_6 \mathit{e401k} + u$. (i) Use the data in 401KSUBS to estimate the

> Use the data in MURDER only for the year 1993 for this question, although you will need to first obtain the lagged murder rate, say $\mathit{mrdrte}_{-1}$. (i) Run the regression of mrdrte on exec, unem. What are the coefficient and t statistic on exec? Does this regr

> Use the data in JTRAIN98 to answer this question. The variable unem98 is a binary variable indicating whether a worker was unemployed in 1998. It can be used to measure the effectiveness of the job training program in reducing the probability of being un

> We computed the OLS and a set of WLS estimates in a cigarette demand equation. (i) Obtain the OLS estimates in equation (8.35). (ii) Obtain the $\hat{h}_i$ used in the WLS estimation of equation (8.36) and reproduce equation (8.36). From this equation, obtain th

> (i) Estimate the model $\mathit{children} = \beta_0 + \beta_1 \mathit{age} + \beta_2 \mathit{age}^2 + \beta_3 \mathit{educ} + \beta_4 \mathit{electric} + \beta_5 \mathit{urban} + u$ and report the usual and heteroskedasticity-robust standard errors. Are the robust standard errors always bigger than the nonrobust ones? (ii) Add the three religio

> Suppose that the return from holding a particular firm’s stock goes from 15% in one year to 18% in the following year. The majority shareholder claims that “the stock return only increased by 3%,” while the chief executive officer claims that “the return

> Much is made of the fact that certain mutual funds outperform the market year after year (that is, the return from holding shares in the mutual fund is higher than the return from holding a portfolio such as the S&P 500). For concreteness, consider a 10-

> In Example, quantity of compact discs was related to price and income by quantity = 120 - 9.8 price + .03 income. What is the demand for CDs if price = 15 and income = 200? What does this suggest about using linear functions to describe demand curves?
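
The arithmetic can be checked by direct substitution. The sketch below assumes the demand equation is quantity = 120 - 9.8 price + .03 income, reading the stray "1" before ".03" in the scraped text as a plus sign. The function name `cd_demand` is hypothetical.

```python
from fractions import Fraction

def cd_demand(price, income):
    """Linear demand, assuming quantity = 120 - 9.8*price + 0.03*income."""
    return 120 - Fraction("9.8") * price + Fraction("0.03") * income

quantity = cd_demand(price=15, income=200)

print(quantity)
```

The predicted quantity is negative, which is impossible for a quantity demanded. That is the point of the second question: a linear function can give nonsensical predictions outside the range of prices and incomes it was meant to describe.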

> Use the data set GPA1 to answer this question. It was used in Computer Exercise C13 in Chapter 3 to estimate the effect of PC ownership on college GPA. (i) Run the regression colGPA on PC, hsGPA, and ACT and obtain a 95% confidence interval for PC. Is t

> Suppose that a high school student is preparing to take the SAT exam. Explain why his or her eventual SAT score is properly viewed as a random variable.

> The following table contains monthly housing expenditures for 10 families. (i) Find the average monthly housing expenditure. (ii) Find the median monthly housing expenditure. (iii) If monthly housing expenditures were measured in hundreds of dollars, rat

> we estimated an equation to test for a tradeoff between minutes per week spent sleeping (sleep) and minutes per week spent working (totwrk) for a random sample of individuals. We also included education and age in the equation. Because sleep and totwrk a

> Write a two-equation system in “supply and demand form,” that is, with the same variable $y_1$ (typically, “quantity”) appearing on the left-hand side: $y_1 = \alpha_1 y_2 + \beta_1 z_1 + u_1$, $y_1 = \alpha_2 y_2 + \beta_2 z_2 + u_2$. (i) If $\alpha_1 = 0$ or $\alpha_2 = 0$, explain why a reduced form exists f

> Suppose you are hired by a university to study the factors that determine whether students admitted to the university actually come to the university. You are given a large random sample of students who were admitted the previous year. You have informati

> Let patents be the number of patents applied for by a firm during a given year. Assume that the conditional expectation of patents given sales and RD is $E(\mathit{patents}|\mathit{sales}, \mathit{RD}) = \exp[\beta_0 + \beta_1 \log(\mathit{sales}) + \beta_2 \mathit{RD} + \beta_3 \mathit{RD}^2]$, where sales is annual firm sales and RD

> (i) Suppose in the Tobit model that $x_1 = \log(z_1)$, and this is the only place $z_1$ appears in $x$. Show that where $\beta_1$ is the coefficient on $\log(z_1)$. (ii) If $x_1 = z_1$, and $x_2 = z_1^2$, show that where $\beta_1$ is the coefficient on $z_1$ a

> (i) For a binary response $y$, let $\bar{y}$ be the proportion of ones in the sample (which is equal to the sample average of the $y_j$). Let $\hat{q}_0$ be the percent correctly predicted for the outcome $y = 0$ and let $\hat{q}_1$ be the percent correctly predicted for the outcome y

Question: