IT 403 Project Part 1
Total Points: 100
This document details what is needed to complete the mini project, as discussed in class.
Use SPSS for your computations, and copy all relevant tables and graphs to the appropriate
places in your document, as discussed in class.
1. Exploratory Data Analysis (EDA): 36 points
1.1. Identify your quantitative variables
1.2. Create a histogram and describe the shape of the distribution. Interpret and explain in
plain English what the shape means in terms of your variables.
1.3. Describe the center of the distribution (what is a good measure of center for your
variables?). Interpret and explain in plain English.
1.4. Interpret and explain in plain English the five-number summary.
1.5. Create a boxplot and a Normal probability plot. Interpret and explain in plain English your
findings. How are they related to your histogram?
1.6. Identify the outliers using the 1.5*IQR rule. Interpret and explain in plain English your
findings. Include a list of possible outliers and the possible reasons for them.
1.7. Explain your overall findings and thoughts.
2. Regression Analysis: 43 points
2.1. Identify your response and predictor variables.
2.2. Make a scatterplot with a regression line. Describe it and explain in plain English the
form, direction, and strength of the relationship between the two variables. Don’t forget to
mention any outliers.
2.3. Calculate the correlation, then interpret and explain it in plain English. Don’t forget to
copy your correlation table here too.
2.4. Calculate the coefficient of determination, R², then interpret and explain it in plain English.
2.5. Calculate your slope and intercept. Write out the regression equation. Interpret and
explain in plain English the slope (Hint: calculate at X = 1) and the intercept (Hint: calculate
at X = 0).
2.6. Use the regression equation (Y-hat) to predict. (Hint: select a value of X from your
observed values and plug it into the regression equation. Compare your Y and Y-hat: are
they close or not? Explain.) Make sure the prediction does not involve extrapolation. Answer
the question: do you think this model predicts well? Explain.
2.7. Calculate the residuals and create the residual plot. Interpret and explain in plain English
the pattern in your residual plot. Does it violate homoscedasticity? Explain.
2.8. In conclusion, would you say your X variable is a good predictor of your Y variable? (Hint:
explain more about the line that best fits the relationship. Your conclusion must reflect
some part of the interpretation of R², explain what percentage of Y is not explained by
X, and give examples of other factors that could help explain Y.) Interpret everything in plain
English.
3. Contingency Table (Two-way Table): 21 points
3.1. Identify your qualitative variables.
3.2. Create a contingency table by using cross tabulation in SPSS.
3.3. Create a clustered bar graph. Interpret and explain in plain English.
3.4. Create a two-way table without percentages. Interpret and explain in plain English.
3.5. Create a two-way table with percentages. Interpret and explain in plain English.
3.6. What are your most interesting findings in general? Explain.
Submission
All graphs and tables must be copied over first, with the interpretations placed below them.
Please follow the format of the Project sample.
Any limitations of your project must be stated. You should also discuss how your project would
be different if you had access to more information or more data.
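
Optional cross-check. The arithmetic behind the 1.5*IQR rule (1.6), the correlation, R², slope, and intercept (2.3 to 2.5), the prediction and residuals (2.6 to 2.7), and the two-way table with percentages (Section 3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a replacement for the SPSS output required above. The function names and all sample numbers are made up for the example. The quartile method used here may also differ from the one SPSS applies, so the fences may not match SPSS exactly.

```python
# Illustrative sketch only: function names and data are hypothetical,
# not taken from the project or from SPSS.
from collections import Counter
from statistics import mean, quantiles


def iqr_outliers(values):
    """Return the 1.5*IQR fences and the values falling outside them."""
    # quantiles() uses Python's default ('exclusive') quartile method;
    # SPSS may compute quartiles differently (assumption to verify).
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return low, high, [v for v in values if v < low or v > high]


def fit_line(x, y):
    """Least-squares slope and intercept, correlation r, and R^2."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r, r * r


def predict(slope, intercept, x_new, x_observed):
    """Y-hat for x_new; refuses to extrapolate beyond the observed X range."""
    if not min(x_observed) <= x_new <= max(x_observed):
        raise ValueError("x_new is outside the observed X range")
    return intercept + slope * x_new


def crosstab(rows, cols):
    """Two-way table of counts, plus each cell as a percentage of its row."""
    counts = Counter(zip(rows, cols))
    row_totals = Counter(rows)
    row_pct = {cell: 100 * n / row_totals[cell[0]] for cell, n in counts.items()}
    return counts, row_pct


# Made-up data, for illustration only.
values = [2, 4, 4, 5, 6, 7, 8, 9, 30]
low, high, outliers = iqr_outliers(values)
print("fences:", low, high, "possible outliers:", outliers)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r, r_squared = fit_line(x, y)
y_hat = predict(slope, intercept, 3, x)
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
print("Y-hat = %.2f + %.2f * X, r = %.3f, R^2 = %.3f" % (intercept, slope, r, r_squared))
print("Y-hat at X = 3:", y_hat, "residuals:", residuals)

group = ["F", "F", "M", "M", "M"]
answer = ["Yes", "No", "Yes", "Yes", "No"]
counts, row_pct = crosstab(group, answer)
print(dict(counts))
print(row_pct)
```

In this sketch, 1 - R² is the share of the variation in Y that X does not explain, which is the quantity item 2.8 asks about. The residuals from a least-squares line always sum to zero (up to rounding), so what matters for item 2.7 is the pattern in their plot, not their total.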