Calculate the descriptive statistics for the data and display them in a table. Be sure to comment on the central tendency, variability and shape of each variable. (1 Mark)

Draw a graph that displays the distribution of life expectancy. (1 Mark)

Create a box-and-whisker plot for the distribution of the share of the population with HIV and describe its shape. Is there evidence of outliers in the data? (1 Mark)

What is the probability that a country has a life expectancy above 65 years if its per capita income is below US\$1000? Is per capita income independent of life expectancy? Use a contingency table. (2 Marks)

Estimate the 95% confidence interval for the population mean literacy rate. (1 Mark)

Your supervisor recently stated that life expectancy in Africa is below the developing-country average of 67 years. Test her claim at the 10% level of significance. (1 Mark)

Run a multiple linear regression using the data and show the output from Excel. Hint: For the multi-level categorical variables, compare the Americas, Europe and AsiaPacific to Africa. (1 Mark)

Is the coefficient estimate for per capita income different from zero at the 5% level of significance? Set up the correct hypothesis test using the regression output from Part (G), applying both the critical value approach and the p-value approach. Interpret the coefficient estimate of the slope. (2 Marks)

Interpret the remaining slope coefficient estimates. Comment on whether the signs are what you would expect. (2 Marks)

Interpret the value of the adjusted R². Is the overall model statistically significant at the 1% level of significance? Use the p-value approach. (1 Mark)

Do the results suggest that the data satisfy the assumptions of a linear regression: linearity, normality of the errors, and homoscedasticity of the errors? Show using scatter diagrams, normal probability plots and/or histograms, and explain. (3 Marks)

Based on the regression results, is it likely that other factors have influenced life expectancy? If so, provide a couple of possible examples and indicate whether these would likely influence the regression results if they were included. (1 Mark)

If a community housing organisation asked for information on the characteristics of housing targeted at households of native-born Australians, explain whether cluster sampling of the CBD would provide an accurate representation of these households. (Note: This question does not use the data) (1 Mark)

