The experiment must fulfill two goals: (1) to produce a professional report of your
experiment, and (2) to show your understanding of the topics related to least squares
regression as described in Moore & McCabe, Chapter 2. In this experiment, I will
determine whether there is a relationship between the average SAT score of incoming
freshmen and the acceptance rate of applicants at top universities in the country. The
cases are 12 of the very best universities in the country according to US News
& World Report. The average SAT score of incoming freshmen is the explanatory
variable. The response variable is the acceptance rate of the university.
I used the September 16, 1996 issue of US News & World Report as my source. I started out
by choosing the top fourteen "Best National Universities". Next, I graphed the fourteen
schools using a scatterplot and cut the list down to 12 universities by throwing out
the two schools with odd data.
A scatterplot of the data for the 12 universities is on the following page (page 2).
The linear regression equation is:
ACCEPTANCE = 212.5 - 0.134 * SAT_SCORE
R = -0.632 R^2 = 0.399
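The calculator's linear regression can be reproduced by hand. The sketch below is a minimal illustration of the least-squares formulas, not the calculator's actual routine. The function name fit_line and the four sample points are made up for the example; the points are NOT the magazine data and were chosen to lie exactly on a line so the result is easy to check.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (intercept, slope, r) for the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # correlation coefficient
    return intercept, slope, r

# Placeholder points (not from US News): they lie exactly on y = 100 - 0.05 * x
toy_sat = [1200.0, 1300.0, 1400.0, 1500.0]
toy_rate = [40.0, 35.0, 30.0, 25.0]

intercept, slope, r = fit_line(toy_sat, toy_rate)
print(intercept, slope, r)
```

With the real 12 data points in place of the placeholder lists, the same function should give values close to the intercept, slope, and R reported above.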
I entered the data into my calculator and ran the various regressions. I saw that
the power regression had the strongest correlation of the non-linear transformations.
A scatterplot of the transformation can be seen on page 4.
The power regression equation is:
ACCEPTANCE RATE = (2.475 x 10^23) * (SAT SCORE)^(-7.002)
R = -0.683 R^2 = 0.466
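A power model y = A * x^B becomes a straight line after taking logarithms of both variables: ln(y) = ln(A) + B * ln(x). The sketch below assumes the calculator's power regression works this way, by fitting a least-squares line to the logged data, so that the reported R is the correlation between ln(SAT) and ln(rate). That is an assumption about the calculator, not something stated in this report. The function name fit_power and the sample points are made up for the example; the points are NOT the magazine data and follow an exact power law so the result can be checked.

```python
from math import exp, log, sqrt

def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on (ln x, ln y). Returns (a, b, r)."""
    log_x = [log(x) for x in xs]
    log_y = [log(y) for y in ys]
    n = len(xs)
    mean_lx = sum(log_x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((u - mean_lx) ** 2 for u in log_x)
    syy = sum((v - mean_ly) ** 2 for v in log_y)
    sxy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(log_x, log_y))
    b = sxy / sxx                      # slope on the log-log scale is the exponent
    a = exp(mean_ly - b * mean_lx)     # intercept on the log scale is ln(a)
    r = sxy / sqrt(sxx * syy)          # correlation of ln(x) with ln(y)
    return a, b, r

# Placeholder points (not from US News): exactly y = 3 * x**-2
toy_x = [1.0, 2.0, 4.0, 8.0]
toy_y = [3.0 * x ** -2 for x in toy_x]

a, b, r = fit_power(toy_x, toy_y)
print(a, b, r)
```

The very large coefficient 2.475 x 10^23 is less surprising in this form: it is e raised to the intercept of the log-log line, and it is offset by raising SAT scores above 1000 to a power near -7.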
The power regression seems to be the better model for the experiment that I have chosen.
The correlation is stronger in the power transformation than in the linear
regression model: R is -0.632 for the linear model and -0.683 for the power
transformation. R^2 measures the fraction of the variation in the
values of y that is explained by the least-squares regression of y on x, and the power
transformation model has the higher R^2, 0.466 compared to 0.399. The residual plot
for the linear regression is on page 5 and the residual plot for the power regression is
on page 6. The two residual plots look very similar to one another, and neither shows a
pattern that helps in choosing between the models. The outliers were not a factor in
choosing the best model either: in both models, one distinct outlier appeared
in the graphs.
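As a quick consistency check on the figures compared above, R^2 should equal the square of the reported R for each model. The variable names below are made up for the example; the numbers are the ones reported in this paper.

```python
# Reported correlations from the two regressions in this paper
r_linear = -0.632
r_power = -0.683

# R^2 is the square of R; round to three decimals to match the reported values
r2_linear = round(r_linear ** 2, 3)
r2_power = round(r_power ** 2, 3)

print(r2_linear, r2_power)  # prints: 0.399 0.466
```

Both squares match the reported R^2 values, so the two sets of figures agree with each other.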
The one outlier in both models was the University of Chicago. It had an unusually high
acceptance rate among the universities in this experiment. It is a very good
school academically, which means the average SAT score of its incoming freshmen is fairly
high. However, it does not receive as many applicants as the others,
due in part to the many factors besides academics that lead applicants to
choose other schools over the University of Chicago. Although the number of applicants is
relatively low, most of these applicants are very qualified, which results in a
high acceptance rate.
Rate = A*(SAT)^(B)
A=2.475x10^23
B=-7.002
From the model I have chosen, I predicted what the acceptance rate for a school would be
if the average SAT score was a perfect 1600.
SAT = 1600
Rate = A*(SAT)^B = (2.475x10^23) *(1600)^(-7.002) = 9.1%
From the equation found, this "university" would have an acceptance
rate of only 9.1%. This seems like a good prediction because such a school would have a
very low acceptance rate compared to the other top universities. I believe causation does
occur in this experiment: the higher the average SAT score of the students a school
admits, the harder it is to be admitted to that school. However, I think the
equation found is not very accurate when predicting far away from the median, and an
average SAT score of 1600 lies beyond the range of the schools in the data.
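The prediction at 1600 can be checked directly, and comparing it with the linear model shows why predictions far from the data should be treated with caution. The sketch below uses only the coefficients reported in this paper; the function names power_rate and linear_rate are made up for the example.

```python
# Coefficients reported in this paper
A = 2.475e23
B = -7.002

def power_rate(sat):
    """Acceptance rate (%) predicted by the power model Rate = A * SAT^B."""
    return A * sat ** B

def linear_rate(sat):
    """Acceptance rate (%) predicted by the linear model 212.5 - 0.134 * SAT."""
    return 212.5 - 0.134 * sat

# Predictions for an average SAT score of 1600
print(round(power_rate(1600), 1))   # about 9.1, matching the hand calculation
print(round(linear_rate(1600), 1))  # negative, which is impossible for a rate
```

At 1600 the linear equation gives a negative acceptance rate, while the power equation stays positive. Neither value has been checked against a real school, so both are extrapolations.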
I do not believe there were any sources of error in collecting the data. All the data
was taken from the magazine, US News & World Report. I strictly took twelve of the top
14 universities as ranked by this magazine. I believe some lurking variables may be the
type of school, the majors offered, and the number of applicants. The number of applicants
a school has would have some effect on its acceptance rate. If a school had an enormous
number of applicants, then it would have a relatively low acceptance rate. One reason I
think this experiment showed a somewhat weak association is the schools selected.
Two of these schools were technical schools, which meant only certain applicants would
want to apply to them, while the other schools were more general overall.
In conclusion, the data used in this experiment showed a moderate negative association
(R = -0.632 for the linear model and R = -0.683 for the power model). The higher the
average SAT score of the incoming freshmen, the lower the school's acceptance rate
tends to be.