Econ 4650 — Assignment 1
Due Monday, Feb. 1
1. (Labor force participation) I created the table below from the March 2015 Current Population
Survey. It shows the joint probability distribution of two variables: F and L. F is whether
the person is female (F = 1 for female), L is whether the person is in the labor force (L = 1 for
being in the labor force). Being in the labor force means you’re either employed or looking for
employment. Also, only civilians of working age (16+) have a chance to be in the labor force.
         L = 0    L = 1
F = 0    0.15     0.33
F = 1    0.22     0.29
Using the table …
(a) Compute the labor force participation rate (the probability of being in the labor force). What
was the official labor force participation rate published by the Bureau of Labor Statistics
for March 2015? Even if you compute it correctly from the table your number won’t exactly
match the BLS because of rounding in the table, and because the BLS number includes a
seasonal adjustment, but it should be within 1 percentage point.
(b) Is the labor force participation rate you computed in (a) equal to $E(L)$? Why or why not?
(c) Compute Pr(L = 1|F = 0) and Pr(L = 1|F = 1). What do these represent?
(d) Show how to use the law of iterated expectations to compute the overall labor force participation rate.
(e) Compute $\sigma_{FL}$, the covariance between labor force participation and whether female.
(f) Is being in the labor force statistically independent of being female? How can you tell?
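As a minimal sketch of the mechanics in R, the block below stores a joint distribution of two 0-1 variables as a matrix and computes marginal probabilities, conditional probabilities, an expectation by iterated expectations, and a covariance. The probabilities are made up for illustration (they are not the CPS table above), and the names `joint`, `pX`, `pY`, and the other variables are hypothetical.

```r
# Hypothetical joint distribution of two 0-1 variables X and Y.
# Rows are X = 0, 1; columns are Y = 0, 1. These numbers are made up.
joint <- matrix(c(0.10, 0.40,
                  0.20, 0.30),
                nrow = 2, byrow = TRUE,
                dimnames = list(X = c("0", "1"), Y = c("0", "1")))

pX <- rowSums(joint)            # marginal distribution of X
pY <- colSums(joint)            # marginal distribution of Y

# Conditional probability Pr(Y = 1 | X = x) for x = 0, 1
pY1_given_X <- joint[, "1"] / pX

# Law of iterated expectations: E(Y) = sum over x of E(Y | X = x) * Pr(X = x)
EY_iterated <- sum(pY1_given_X * pX)

# Covariance of two 0-1 variables: E(XY) - E(X)E(Y), where E(XY) = Pr(X = 1, Y = 1)
cov_XY <- joint["1", "1"] - pX["1"] * pY["1"]

# Independence requires every joint cell to equal the product of its marginals
independent <- isTRUE(all.equal(joint, outer(pX, pY), check.attributes = FALSE))
```

For a 0-1 variable the expected value equals the probability that the variable is 1, so `EY_iterated` matches `pY["1"]` in this sketch.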
2. (Standardizing) The random variable Y has a mean of 1 and a variance of 4. Let $Z = \frac{1}{2}(Y - 1)$.
Show that $\mu_Z = 0$ and $\sigma_Z^2 = 1$.
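For reference, the general rules for the mean and variance of a linear function of a random variable are sketched below. The constants $a$ and $b$ are symbols introduced only for this illustration.

```latex
\begin{align*}
E(a + bY) &= a + b\,\mu_Y, \\
\operatorname{var}(a + bY) &= b^2 \sigma_Y^2 .
\end{align*}
```

Standardizing is the special case in which $a$ and $b$ are chosen so that the transformed variable has mean 0 and variance 1.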
3. (Looking up probabilities) Recollect how to look up normal and t probabilities from a table.
(a) If Y is distributed N(1,4) (mean is 1, standard deviation is 4), find $\Pr(Y \le 3)$. Use Table 1
in the appendix.
(b) If Y is distributed $t_{90}$ (t with 90 degrees of freedom), find $\Pr(-1.99 \le Y \le 1.99)$. Use Table
2 in the appendix.
(c) If Y is N(0,1), find $\Pr(-1.99 \le Y \le 1.99)$.
(d) Why are the answers to 3b and 3c approximately the same?
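Table lookups of this kind can be cross-checked in R with the built-in distribution functions `pnorm` and `pt`. The sketch below uses hypothetical example values, not the ones asked about above. Note that `pnorm` takes the standard deviation, not the variance.

```r
# Hypothetical example values, not the ones asked about in the question.
mu    <- 2      # mean
sigma <- 3      # standard deviation (pnorm takes the sd, not the variance)

# Pr(Y <= 5) when Y is normal with mean mu and standard deviation sigma
p_normal <- pnorm(5, mean = mu, sd = sigma)

# The same probability after standardizing by hand, as with a printed table
p_standardized <- pnorm((5 - mu) / sigma)

# Pr(-1.5 <= Y <= 1.5) when Y is t with 20 degrees of freedom
p_t <- pt(1.5, df = 20) - pt(-1.5, df = 20)
```

The two normal calculations agree because subtracting the mean and dividing by the standard deviation turns any normal variable into a standard normal one.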
4. (Central limit theorem) $Y_i$, $i = 1,\dots,n$, are i.i.d. Bernoulli random variables with $p = 0.5$.
Let $\bar{Y}$ denote the sample mean. Using the central limit theorem, find $\Pr(0.45 \le \bar{Y} \le 0.55)$
when $n = 10$ and when $n = 100$. What is the minimum $n$ such that $\Pr(0.45 \le \bar{Y} \le 0.55) > 0.99$?
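A minimal sketch of the normal approximation for the mean of Bernoulli draws, assuming the usual result that the sample mean has mean p and variance p(1 - p)/n. The helper name `clt_prob` and the input values are hypothetical and chosen only for illustration.

```r
# Normal approximation to Pr(lo <= Ybar <= hi) for the mean of n i.i.d.
# Bernoulli(p) draws. clt_prob is a hypothetical helper name.
clt_prob <- function(lo, hi, p, n) {
  se <- sqrt(p * (1 - p) / n)          # standard deviation of the sample mean
  pnorm((hi - p) / se) - pnorm((lo - p) / se)
}

# Hypothetical inputs, chosen only for illustration
approx_small <- clt_prob(0.2, 0.4, p = 0.3, n = 25)
approx_large <- clt_prob(0.2, 0.4, p = 0.3, n = 400)
```

Because the standard deviation of the sample mean shrinks as n grows, the probability of landing in a fixed interval around p rises with n.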
5. (Chi-squared) Suppose $Y_i$ is distributed i.i.d. $N(0, \sigma^2)$ for $i = 1, 2, \dots, n$.
(a) Show that $E(Y_i^2/\sigma^2) = 1$. Hint: Use what I showed in class about the expected square of
a random variable: $E(Y^2) = \sigma_Y^2 + \mu_Y^2$. Also, realize that $\sigma^2$ is a constant. Think of it as a
(b) Show that $W = \frac{1}{\sigma^2} \sum_{i=1}^{n} Y_i^2$ is distributed $\chi^2_n$ (chi-squared with n degrees of freedom).
Hint: Check the definition of a chi-squared with n degrees of freedom. Why does W fit that definition?
(c) Show that $E(W) = n$. Hint: W is a linear combination. In class we’ve looked at how to
compute the expected value of a linear combination. Also, from part 5a you know the
expected value of each component of the linear combination.
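A simulation can serve as a sanity check on results like these, though it is not a substitute for the proofs. The sketch below draws many samples from a normal distribution with mean 0, forms the sum of squared draws divided by the variance, and compares the simulated values with the chi-squared distribution. The values of `sigma`, `n`, and `reps`, and the seed, are hypothetical choices.

```r
set.seed(1)          # arbitrary seed so the simulation is reproducible
sigma <- 2           # hypothetical standard deviation
n     <- 5           # hypothetical number of terms in the sum
reps  <- 100000      # number of simulated values of W

# Each row holds one sample Y_1, ..., Y_n drawn from N(0, sigma^2)
Y <- matrix(rnorm(reps * n, mean = 0, sd = sigma), nrow = reps)

# W = (1 / sigma^2) * sum of squared Y_i, computed row by row
W <- rowSums(Y^2) / sigma^2

mean_W    <- mean(W)                            # should be close to n
share_q90 <- mean(W <= qchisq(0.9, df = n))     # should be close to 0.9
```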
6. (Benefits of diversification) Suppose you have some money to invest—for simplicity, $1—and
you are planning to put a fraction w into a stock market mutual fund and the rest, 1-w, into a
bond mutual fund. Suppose that $1 invested in a stock fund yields Rs after 1 year and that $1
invested in a bond fund yields Rb. Suppose that Rs is random with mean 0.08 (8%) and standard
deviation 0.07, and that Rb is random with mean 0.05 (5%) and standard deviation
0.04. The correlation between Rs and Rb is 0.25. If you place a fraction w of your money in
the stock fund and the rest, 1 - w, in the bond fund, then the return on your investment is
R = wRs + (1 - w)Rb.
(a) Suppose that w = 0.5. Compute the mean and standard deviation of R.
(b) Suppose that w = 0.75. Compute the mean and standard deviation of R.
(c) What value of w makes the mean of R as large as possible? What is the standard deviation
of R for this value of w?
(d) What is the value of w that minimizes the standard deviation of R?
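The mean and standard deviation of a weighted sum of two random returns follow from the rules for linear combinations, with the covariance equal to the correlation times the two standard deviations. A minimal sketch in R is below. The function name `portfolio_stats`, its argument names, and the input values are hypothetical and are not the values given in the question.

```r
# Mean and standard deviation of R = w * R1 + (1 - w) * R2.
# portfolio_stats and its argument names are hypothetical.
portfolio_stats <- function(w, mu1, mu2, sd1, sd2, rho) {
  mean_R <- w * mu1 + (1 - w) * mu2
  var_R  <- w^2 * sd1^2 + (1 - w)^2 * sd2^2 +
            2 * w * (1 - w) * rho * sd1 * sd2   # cov = rho * sd1 * sd2
  c(mean = mean_R, sd = sqrt(var_R))
}

# Hypothetical inputs, not the values given in the question
example_stats <- portfolio_stats(w = 0.4, mu1 = 0.10, mu2 = 0.03,
                                 sd1 = 0.20, sd2 = 0.05, rho = 0)

# Numerical search over w in [0, 1] for the smallest standard deviation
w_min <- optimize(function(w) portfolio_stats(w, 0.10, 0.03, 0.20, 0.05, 0)["sd"],
                  interval = c(0, 1))$minimum
```

A numerical search like `optimize` is one way to check a minimizing weight found by calculus. Restricting the search to [0, 1] is itself an assumption (no short selling).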
7. (Basic descriptive statistics with R) The dataset cps8.csv (on Canvas) is from the Current
Population Survey, a monthly survey of workers carried out by the U.S. Bureau of Labor
Statistics. This data is from the March 2008 survey. Each row in the data set corresponds to a
person who responded to the survey.
DATA DESCRIPTION: ahe is the average hourly earnings reported by the respondent; yrseduc is
the number of years of education; female is a 0-1 indicator of whether the respondent is female
(0 = no, 1 = yes); age is the respondent’s age; and northeast, midwest, south, and west are all
0-1 indicators of whether the respondent works in the northeastern, midwestern, southern, or
western part of the U.S.
Note: Parts (c) through (k) require R. Please turn in the R code you used, all in one file, to
Canvas. I should be able to reproduce your results from your code, so please check that your
code really works and is consistent with the answers you’ve given before uploading.
(a) Is this a cross-sectional, time series, or panel data set? Or, have I not given you enough
information to tell? Explain your answer.
(b) How many people/respondents are represented in this survey?
(c) What proportion of respondents are female?
(d) What is the average ahe among females? Among males?
(e) What is the variance of ahe among females? Among males?
(f) Plot ahe (y-axis) versus age (x-axis). Does there appear to be an association between ahe
and age? Describe what you see.
(g) Compute the mean ahe for each value of age and add these to the plot.
(h) Compute the covariance and correlation between ahe and age.
(i) Plot ahe (y-axis) versus yrseduc (x-axis). Does there appear to be an association between
ahe and yrseduc? Describe what you see.
(j) Compute the mean ahe for each value of yrseduc and add these to the plot.
(k) Compute the covariance and correlation between ahe and yrseduc.
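The kinds of base R calls that parts (b) through (k) rely on can be sketched as follows. To keep the block self-contained it uses a small toy data frame with made-up values. For the assignment, the data frame would instead come from something like `read.csv("cps8.csv")`. The object names `cps` and `mean_by_age` are hypothetical.

```r
# Toy stand-in for the real data, with made-up values; for the assignment the
# data frame would instead come from something like read.csv("cps8.csv").
cps <- data.frame(
  ahe    = c(12.5, 20.0, 15.0, 30.0, 18.0, 25.0),
  age    = c(25, 25, 30, 30, 35, 35),
  female = c(1, 0, 1, 0, 1, 0)
)

nrow(cps)                                  # number of respondents
mean(cps$female)                           # proportion female
tapply(cps$ahe, cps$female, mean)          # mean ahe by group
tapply(cps$ahe, cps$female, var)           # variance of ahe by group

# Scatter plot, then overlay the mean of ahe at each age
plot(cps$age, cps$ahe, xlab = "age", ylab = "ahe")
mean_by_age <- aggregate(ahe ~ age, data = cps, FUN = mean)
points(mean_by_age$age, mean_by_age$ahe, pch = 19, col = "red")

cov(cps$ahe, cps$age)                      # covariance
cor(cps$ahe, cps$age)                      # correlation
```

The same pattern applies with yrseduc in place of age.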
8. (DataCamp) Complete the DataCamp course “Introduction to R.” Upon completion you’ll get a
certificate. Upload your DataCamp certificate to Canvas to receive full credit for this exercise.