Data management
Research question:
This project examines how a mother’s lifestyle and age affect the birth weight of her infant. The dependent variable is the birth
weight of the infant, and the independent variables are smoker, age, and drinks. It is reasonable to hypothesize that smoking and alcohol consumption affect the baby’s weight negatively, and that the mother’s age may also influence the birth weight of the infant.
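As a rough sketch, the research question can be written as a regression equation. The linear functional form, the coefficient symbols, and the error term u_i are assumptions made here for illustration; the final specification may differ.

```latex
\[
\text{birthweight}_i = \beta_0 + \beta_1\,\text{smoker}_i + \beta_2\,\text{age}_i + \beta_3\,\text{drinks}_i + u_i
\]
```

Under the hypothesis stated above, the expected signs are \beta_1 < 0 and \beta_3 < 0, while the sign of \beta_2 is left open.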
The data set used is Birthweight_Smoking, which comes from the 1989 linked National Natality-Mortality Detail files.
The same data are used in the paper “The Costs of Low Birth Weight” by Professors Douglas Almond, Kenneth Chay, and David Lee.
The data set is taken from Stock and Watson’s textbook “Introduction to Econometrics”, and it contains 3000
observations.
Professor’s opinion: Consider using age squared and age cubed as additional regressors. Mother’s education
may be an important regressor, but there is a difficulty: more than 16 years of schooling is top-coded as 17, so
the value 17 may actually represent 20 or more years of education. You may consider replacing education
with several dummy variables: one for less than 12 years (high-school dropout), one for 12 years (high-school
graduate), etc.
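The suggestions above could be implemented along the following lines. This is only a minimal sketch in plain Python. The function names, the regressor names (age2, age3, educ_lt12 and so on), and the education cut-offs above 12 years are assumptions chosen for illustration, since the suggested list ends with "etc."; they are not taken from the data set's documentation.

```python
def education_dummies(educ_years):
    """Map years of schooling to mutually exclusive 0/1 indicators.

    The categories above 12 years are assumed cut-offs, not part of the
    original suggestion. The value 17 is treated as "more than 16 years"
    because of the top-coding.
    """
    return {
        "educ_lt12": int(educ_years < 12),          # high-school dropout
        "educ_12": int(educ_years == 12),           # high-school graduate
        "educ_13_15": int(13 <= educ_years <= 15),  # assumed: some college
        "educ_16": int(educ_years == 16),           # assumed: college graduate
        "educ_17plus": int(educ_years >= 17),       # top-coded: more than 16 years
    }


def build_regressors(age, educ_years):
    """Return age, its square and cube, and the education dummies for one mother."""
    row = {"age": age, "age2": age ** 2, "age3": age ** 3}
    row.update(education_dummies(educ_years))
    return row


# Hypothetical mother: 25 years old, education recorded as the top-coded value 17.
example = build_regressors(age=25, educ_years=17)
print(example)
```

If the regression includes an intercept, one of the education dummies has to be left out as the reference category; otherwise the dummies sum to one for every observation and are perfectly collinear with the constant.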