## Intro to Econometrics

This is a take-home exam on Oct. 21. You have 24 hours to finish the exam. The assignment tab contains two attachments: the PDF version of the exam and the R answer sheet. Please answer all of the questions in the R answer sheet and rename it “Your name_RExam1”. When submitting your exam, please submit 2 documents: (1) your R answer sheet, and (2) screenshots of your graph and table outputs, all in one PDF file.

Before you start, make sure you have installed and loaded the packages below: ggplot2, dplyr, tidyverse, foreign, stargazer.

Question 1: ggplot and descriptive statistics [20 points]

In this question, please generate samples of random variables, then plot the sample or generate a descriptive statistics table as directed.

(a) [5 points] Generate a sample of 100 coin flips where the flips follow a Bernoulli distribution with p=0.5.

(b) [5 points] Plot the histogram of the data you got from (a).

(c) [5 points] Generate a sample of N=100 observations from a normally distributed population with mean=2 and sd=0.6. Hint: rnorm()

(d) [5 points] Create a formatted table (using stargazer) to show descriptive statistics of the sample you generated in (c).
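The two sampling steps in this question can be sketched in base R as follows. This is a minimal illustration, not a model answer: the use of `rbinom()` with `size = 1` for Bernoulli draws, the seed value, and the object names `flips` and `x` are all assumptions introduced here; only `rnorm()` is named in the exam.

```r
set.seed(42)  # arbitrary seed (an assumption) so the draws are reproducible

# Bernoulli(p = 0.5) flips: a binomial draw with a single trial per flip.
flips <- rbinom(n = 100, size = 1, prob = 0.5)

# 100 draws from a normal distribution with mean 2 and sd 0.6.
x <- rnorm(n = 100, mean = 2, sd = 0.6)

# Quick numerical checks before plotting or tabulating.
table(flips)                     # counts of 0s and 1s
c(mean = mean(x), sd = sd(x))    # sample moments, close to 2 and 0.6
```

The sample mean and sd will differ slightly from 2 and 0.6 because the sample is finite. The plot and the stargazer table requested in the question are left out of this sketch.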

Question 2: Data [50 points]

Download the 2016 CPS data (the March CPS) from NYU Classes (the dataset is named morg16). You should be familiar with morg16 since it has been used several times in the homework. You will be asked to run an OLS regression on specific variables, report the results, and interpret the coefficients.

(a) [5 points] Keep only the observations:

  (i) on weekly earnings, sex, race, age and education (the corresponding variable names are earnwke, sex, race, age, grade92).

  (ii) for respondents aged 25-64.

  (iii) from New York State (hint: stfips==36)

(b) [5 points] Data cleaning: Replace every value equal to 0 with NA, then drop all observations that contain NA.

(c) [5 points] Plot individuals' weekly earnings against age.

(d) [5 points] Take the log of individuals' weekly earnings and store it in a new column/vector called logincome.

(e) [5 points] Plot the log of individuals' weekly earnings against age.

(f) [10 points] Run a regression with the log of individuals' weekly earnings as the dependent variable and age as the independent variable.

(g) [5 points] Report your OLS results in a formatted table (using stargazer).

(h) [5 points] Interpret the slope using the elasticity method. Note: To receive full credit, you need to both calculate the result and state it in words.

(i) [5 points] Interpret the slope using the standardized method. Note: To receive full credit, you need to both calculate the result and state it in words.
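The cleaning, log-transform, regression and interpretation steps above can be sketched on a small made-up data frame. Everything here is hypothetical: the data frame `toy` is simulated and is not morg16, and the two interpretation formulas reflect one common reading of the terms (percent change per unit of the regressor for a log-level model, and the slope rescaled by standard deviations), which may differ from the definitions used in class.

```r
set.seed(1)  # arbitrary seed (an assumption) for reproducibility

# Hypothetical toy data standing in for morg16; the real file is not used here.
n <- 200
toy <- data.frame(
  age     = sample(25:64, n, replace = TRUE),
  earnwke = round(runif(n, min = 200, max = 2000))
)
toy$earnwke[c(3, 10, 25)] <- 0   # plant a few zeros so there is something to clean

# Recode zeros as NA, then drop every row that contains an NA.
toy[toy == 0] <- NA
toy <- na.omit(toy)

# Log of weekly earnings in a new column.
toy$logincome <- log(toy$earnwke)

# Log-level regression: log earnings on age.
fit <- lm(logincome ~ age, data = toy)
b1  <- unname(coef(fit)["age"])

# Percent change in earnings associated with one more year of age.
pct_approx <- 100 * b1               # usual approximation for small slopes
pct_exact  <- 100 * (exp(b1) - 1)    # exact percent change implied by the log model

# Standardized slope: SD change in logincome per one-SD change in age.
b1_std <- b1 * sd(toy$age) / sd(toy$logincome)
```

In a regression with a single regressor, the standardized slope equals the sample correlation between the two variables, which gives a quick check on the calculation. Because `toy` is random noise, its slope is near zero and carries no economic meaning.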

Question 3: Simulate OLS Coefficients [30 points]

In this question, you will simulate OLS coefficients step by step. In addition to the homework, the TA videos might help you answer this question.

Suppose the true relationship between $Y_i$ and $X_i$ is captured by:

$$Y_i = 0.8 + 0.9X_i + \varepsilon_i$$

where $E[\varepsilon_i \mid X_i] = 0$; $\varepsilon_i$ is drawn from a normal distribution with mean=0 and sd=1. And $X_i$ is drawn from a normal distribution with mean=2 and sd=sqrt(3).
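Why a regression of Y on X should recover a slope of 0.9 follows from the stated assumptions alone. The derivation below is a sketch that uses only the linear model and the zero conditional mean of the error term; the symbol $\beta_1$ for the population slope is notation introduced here, not taken from the exam.

```latex
\begin{aligned}
E[Y_i \mid X_i] &= 0.8 + 0.9X_i + E[\varepsilon_i \mid X_i] = 0.8 + 0.9X_i, \\
\operatorname{Cov}(X_i, Y_i) &= 0.9\,\operatorname{Var}(X_i) + \operatorname{Cov}(X_i, \varepsilon_i) = 0.9\,\operatorname{Var}(X_i), \\
\beta_1 &= \frac{\operatorname{Cov}(X_i, Y_i)}{\operatorname{Var}(X_i)} = 0.9 .
\end{aligned}
```

The second line uses the fact that a zero conditional mean of the error implies zero covariance between the error and $X_i$.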

(a) [5 points] Generate a sample of N=100 observations called X from a normally distributed population with mean=1 and sd=sqrt(3). Hint: rnorm()

(b) [5 points] Plot the sample data in a scatterplot.

(c) [5 points] Generate a sample of N=100 draws of epsilon, then use $Y_i = 0.8 + 0.9X_i + \varepsilon_i$ to get a sample of N=100 values of Y.

(d) [5 points] Run an OLS regression, regressing Y on X. Then use a formatted table to show the results (using stargazer).

(e) [10 points] Use a for loop to simulate the OLS regression from part (d) 1000 times (generating a new sample of X, Y each time). Save the slope and intercept coefficients from each iteration, then plot a histogram of the estimates (one histogram for the slope estimates and one for the intercept estimates). Hint: TA video 6: Simulation OLS. https://youtu.be/16eFPlJsxl4
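The loop structure described in this question can be sketched in base R as below. This is one possible layout under stated assumptions, not the approach from the TA video: the seed, the object names (`n_sims`, `n_obs`, `mu_x`, `intercepts`, `slopes`) and the choice of mean=1 for X (the value given in part (a)) are introduced here for illustration, and the regression is assumed to have Y as the dependent variable, as in the stated model.

```r
set.seed(123)  # arbitrary seed (an assumption) for reproducibility

n_sims <- 1000   # number of simulated samples
n_obs  <- 100    # observations per sample
mu_x   <- 1      # mean of X as given in part (a); an assumption, adjust if needed

# Pre-allocate storage for the estimates from each iteration.
intercepts <- numeric(n_sims)
slopes     <- numeric(n_sims)

for (s in seq_len(n_sims)) {
  # Draw a fresh sample of X and epsilon, then build Y from the true model.
  X   <- rnorm(n_obs, mean = mu_x, sd = sqrt(3))
  eps <- rnorm(n_obs, mean = 0, sd = 1)
  Y   <- 0.8 + 0.9 * X + eps

  # Fit OLS and store the two coefficients.
  fit <- lm(Y ~ X)
  intercepts[s] <- coef(fit)[1]
  slopes[s]     <- coef(fit)[2]
}

# Averages across simulations; these should sit near 0.8 and 0.9.
c(mean_intercept = mean(intercepts), mean_slope = mean(slopes))
```

Calling `hist(slopes)` and `hist(intercepts)` afterwards draws the two histograms with base graphics; a ggplot2 version would work equally well. Pre-allocating the result vectors avoids growing them inside the loop.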