Penn State University Economic Statistics Worksheet
Name:
Email ID: @psu.edu
Worked with these other students:
ECON 306 Problem Set 2
INSTRUCTIONS: Solve the following questions to the best of your ability. If you do not
know how to solve any of these questions, ask me before the due date. I will work with
you if you are having trouble solving these.
To receive full credit for this assignment, the problem set needs to be submitted to Canvas
as a single PDF document containing 1) your Stata log file converted to PDF, 2) any figures
(scatterplots, histograms, etc.), and 3) any written explanations and answers. All of these
components need to be attached together in that order.
Late submissions will NOT be accepted. DO NOT email! No assignments will be accepted via email.
First of all, for this problem set, you will have to submit the Stata log file. Stata can
record your session into a file called a log file but does not start a log automatically; you must
tell Stata to record your session. By default, the resulting log file contains what you type
and what Stata produces in response, recorded in a format called Stata Markup and Control
Language (SMCL). The file can be printed or converted to plain text for incorporation
into documents you create with your word processor. You can find more information here:
https://www.stata.com/manuals13/u15.pdf.
So, at the beginning of your Stata .do file, write the following command: log using PSX,
replace (or use a different file name). Then, at the very end of your .do file, include log close
and, on a new line, translate PSX.smcl PSX.pdf. This translates your Stata SMCL log
file directly into a PDF file. You can then use Adobe Acrobat to merge your PDF files together.
You will need to turn in this log file to receive full credit for this assignment.
I strongly suggest generating the log file only after you have completed all
of your code and can run it from start to finish without any errors. That way, your log
file will not contain lines of code that produce no results, or any duplicate
results. Please do your best to include comments in your code (using the * sign in your Stata
.do file) and to mark the solutions to the different problems as clearly as possible.
Otherwise, the graders might have to penalize you if they cannot follow your work, and
I will then have to re-grade it, which makes the whole process highly inefficient.
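As a minimal sketch of the logging steps described above, a .do file could be laid out as follows. The file name PSX comes from the instructions. The replace option on translate and the comment layout are assumptions added for illustration, not requirements of the assignment.

```stata
* ECON 306 Problem Set 2 (sketch of the overall .do file layout)

* Start recording the session; replace overwrites an existing PSX.smcl
log using PSX, replace

* ---- Question 1 ----
* (your commands for question 1 go here)

* ---- Question 2 ----
* (your commands for question 2 go here)

* Stop recording, then convert the SMCL log to PDF
* (replace on translate is an added assumption so that re-runs overwrite PSX.pdf)
log close
translate PSX.smcl PSX.pdf, replace
```

Running the whole file once from the top, after the code works, should give a log with each result appearing only once.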
ECON 306 Problem Set 2, Fall 2022
Earnings and Height Revisited
The previous homework assignment used the same data on earnings and height. You can
use your EarningsHeight.dta file from the previous homework or convert the Excel file from
this assignment into Stata format. This time, we are going to perform multiple regressions
instead of simple regressions like last time.
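If you need to convert the Excel file, one possible approach is sketched below. The Excel file name EarningsHeight.xlsx is hypothetical (use the name of the file posted with this assignment), and the firstrow option assumes the first row of the sheet holds the variable names.

```stata
* Hypothetical file name: replace with the Excel file from this assignment
import excel using "EarningsHeight.xlsx", firstrow clear

* Check that the variables were read in as expected
describe

* Save in Stata format so later steps can load it with -use-
save EarningsHeight.dta, replace
```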
1 Median Height (again)
What is the median value of height in the sample? You should make sure this is the same
value as you found on the previous problem set. That is a pretty good indication that you
are working with the same data set.
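One way to obtain the median is sketched here, assuming the variable is named height in your data set. The detail option of summarize reports percentiles, and the 50th percentile is the median.

```stata
* Load the data (file name from the previous homework)
use EarningsHeight.dta, clear

* The detail option adds percentiles; the 50% row is the median
summarize height, detail

* The median is also stored in r(p50) after summarize, detail
display "Median height: " r(p50)
```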
2 Histogram
Make a histogram for the variable weight. It is OK to leave the default settings here. If you
type the command “histogram weight” then Stata will provide the histogram in a separate
window. If you go to File → Save As, then you can select the .pdf format to save this image.
Comment on anything strange you notice about the histogram.
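As an alternative to the File menu, the figure can be saved from within the .do file. This is a sketch only: the output file name weight_hist.pdf is a hypothetical choice, and the variable name weight is taken from the question.

```stata
* Draw the histogram with default settings
histogram weight

* Save the graph currently displayed as a PDF (hypothetical file name)
graph export weight_hist.pdf, replace
```

Saving the figure in code means it is regenerated every time the .do file is run.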
3 Simple Regressions
a) Run a regression of earnings on height. You should end up with the same results you
did on the previous problem set. That’s a very high-confidence way to know that you’re
working with the same data set.
b) Run a regression of earnings on weight.
c) What is the meaning of the slope coefficient in the context of this regression?
d) Predict the earnings for someone who weighs 170 pounds.
e) You likely noticed some outliers for weight. To show you how regressions can be sensitive
to these extreme values, run another regression of earnings on weight, but this time
only for people who weigh less than 500 pounds. You can accomplish this by adding “if
weight