## Intro to Econometrics

This is a take-home exam on Oct. 21. You have 24 hours to finish the exam. In the assignment tab, there are
and table outputs in a pdf file.
Before you start, make sure you have installed the packages below and loaded them with library(): ggplot2,
dplyr, tidyverse, foreign, stargazer.
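
One way to handle this setup step is sketched below. It installs only the listed packages that are not yet present and then attaches all of them. The variable names `pkgs` and `to_install` are illustrative and not part of the exam.

```r
# Packages named in the exam instructions
pkgs <- c("ggplot2", "dplyr", "tidyverse", "foreign", "stargazer")

# Install only the packages that are not already on this machine
to_install <- setdiff(pkgs, rownames(installed.packages()))
if (length(to_install) > 0) install.packages(to_install)

# Attach every package; character.only = TRUE lets library() accept a string
invisible(lapply(pkgs, library, character.only = TRUE))
```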

### Question 1: ggplot and descriptive statistics [20 points]

In this question, generate samples of random variables, then plot each sample or produce a table of
descriptive statistics as directed.
(a) [5 points] Generate a sample of 100 coin flips where the flips follow a Bernoulli distribution with p=0.5.
(b) [5 points] Plot the histogram of the data you got from (a).
(c) [5 points] Generate a sample of N = 100 observations from a normally distributed population with
mean = 2 and sd = 0.6. Hint: rnorm()
(d) [5 points] Create a formatted table (using stargazer) to show descriptive statistics of the sample you
generated in (c).
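
As a rough illustration of the stargazer call pattern for descriptive statistics, the sketch below uses R's built-in `attitude` data frame rather than the exam sample. The title string and the name `tab` are assumptions made for the example. When stargazer is given a data frame, it reports summary statistics by default.

```r
library(stargazer)

# attitude is a data frame that ships with base R; it stands in for your own sample here
tab <- stargazer(attitude,
                 type  = "text",                    # plain-text table printed to the console
                 title = "Descriptive statistics")  # illustrative title

# stargazer also returns the printed lines invisibly as a character vector
head(tab, 3)
```

To summarize a single vector the same way, it would first need to be wrapped in a data frame, for example with `data.frame(x = x)`.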

### Question 2: Data [50 points]

Download the 2016 CPS data (the March CPS) from NYU Classes (the dataset is named morg16). You should be
familiar with morg16, since it has been used several times in the homework. You will be asked to run an OLS
regression on specific variables, report the results, and interpret the coefficients.
(a) [5 points] Keep only the observations:
    (i) on weekly earnings, sex, race, age and education (the corresponding variable names are earnwke,
    (ii) for respondents aged 25-64.
    (iii) from New York State (hint: stfips==36)
(b) [5 points] Data cleaning: Replace every value equal to 0 with NA, then drop all observations that contain NA.
(c) [5 points] Plot individuals' weekly earnings against age.
(d) [5 points] Take the log of individuals' weekly earnings and store it in a new column/vector called
logincome.
(e) [5 points] Plot the log of individuals' weekly earnings against age.
(f) [10 points] Run a regression with the log of individuals' weekly earnings as the dependent variable and
age as the independent variable.
(g) [5 points] Report your OLS results in a formatted table (using stargazer).
(h) [5 points] Interpret the slope using the elasticity method. Note: To receive full credit, you must both
calculate the result and state it in words.
(i) [5 points] Interpret the slope using the standardized method. Note: To receive full credit, you must both
calculate the result and state it in words.

### Question 3: Simulate OLS Coefficients [30 points]

In this question, you will simulate OLS coefficients step by step. Besides the homework, the TA videos
may help you answer this question.
Suppose the true relationship between $Y_i$ and $X_i$ is captured by:

$$Y_i = 0.8 + 0.9 X_i + \epsilon_i$$

where $E[\epsilon_i \mid X_i] = 0$; $\epsilon_i$ is drawn from a normal distribution with mean=0 and sd=1, and $X_i$ is drawn from a normal
distribution with mean=2 and sd=sqrt(3).
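
For reference, the OLS slope and intercept estimates in a simple regression of Y on X take the standard textbook form below. The hat and bar notation is an assumption here, since the exam itself does not fix one; $N$ is the sample size.

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$

In the model stated above, the corresponding population values are an intercept of 0.8 and a slope of 0.9.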
(a) [5 points] Generate a sample of N = 100 called X from a normally distributed population with mean=1 and
sd=sqrt(3). Hint: rnorm()
(b) [5 points] Plot your sample data in a scatterplot.
(c) [5 points] Generate a sample of N = 100 draws of epsilon and use $Y_i = 0.8 + 0.9 X_i + \epsilon_i$ to obtain a
sample of N = 100 values of Y.
(d) [5 points] Run an OLS regression of Y on X. Then use a formatted table to show the results
(using stargazer).
(e) [10 points] Use a for loop to simulate the OLS regression from part (d) 1000 times (generating a new
sample of X and Y each time). Save the slope and intercept coefficients from each iteration, then plot
a histogram of the estimates (one histogram for the slope estimates and one for the intercept
estimates). Hint: TA video 6: Simulation OLS. https://youtu.be/16eFPlJsxl4
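
The general loop pattern behind a simulation like this (preallocate storage, draw a fresh sample in each iteration, save one estimate per iteration, then plot the saved estimates) can be sketched on a simpler statistic. The example below simulates the sample mean rather than the regression itself. The names `n_reps`, `n_obs`, and `means`, the seed, and the standard normal population are all assumptions made for illustration.

```r
set.seed(1)                 # arbitrary seed, fixed only so the run is reproducible

n_reps <- 1000              # number of simulated samples
n_obs  <- 100               # observations in each sample
means  <- numeric(n_reps)   # preallocated storage, one slot per iteration

for (r in seq_len(n_reps)) {
  draw     <- rnorm(n_obs, mean = 0, sd = 1)  # fresh sample in every iteration
  means[r] <- mean(draw)                      # save this iteration's estimate
}

# Histogram of the 1000 saved estimates
hist(means, main = "Simulated sample means", xlab = "Sample mean")
```

For a regression, `coef(lm(y ~ x))` returns a named vector holding the intercept and the slope, so the same pattern would need one storage vector for each coefficient.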