Business Data Analysis

Instructions:
• All numerical calculations and graphs/plots should be done using EXCEL. Marks
will not be awarded if you have not provided the appropriate EXCEL output.
• Your assignment must be typed in a Word document in .doc or .docx format – no
pdfs or Excel files should be submitted!
• When answering questions, wherever required, you should cut and paste all the necessary
Excel outputs such as plots, regression output etc to show your working/output on your
assignment.
• A copy of your completed assignment must be submitted electronically. The Coversheet
is now available electronically so it does not need to be attached to your assignment.
Complete the electronic version and then submit your assignment (available in the
Assessment 2 – Computer Assignment → Computer Assignment section of the unit
website).
• You are required to keep a hard copy and an electronic copy of your submitted assignment
to re-submit, in case the original submission is lost for some reason.
Important Notice:
As this is an individual assessment item, students should work on their own and present their
individual assignment submission, even if you discuss it in a group. If found to have cheated,
all submissions involved would receive a mark of zero for this assessment item.
Discussions related to the assignment will not be allowed on the Discussion Board.
AFE135 (SP2, 2019) Computer Applications Assignment Page 1 of 3
Computing Assignment
Answer all FIVE Questions
Economists often discuss how women earn less than men on average as women often choose
to work less hours. Researchers suggest that the choice of hours worked depends on various
factors such as age, childcare needs, occupation choice and flexibility. In order to investigate
the relationship between the hours worked and the income of Australian men and women, a
researcher plans to survey a sample of individuals across the country.
QUESTION 1 [2 marks]
Briefly explain (using no more than 250 words in total for this question)
(a) What type of survey method the researcher could use and why?
(b) What sampling method could the researcher use to select his/her sample and why?
(c) What are the two main variables the researcher should consider collecting data for the
purpose of the above analysis and why? Identify the data type(s) for the variables.
(d) What kind of issues the researcher may face in this data collection?
Suppose a researcher has collected data form a sample of 60 individuals using the sampling
method you have proposed in (b). For each individual, the hours worked per week and yearly
income (measured in ‘000’s dollars) were recorded. The data are stored in file
HOURSWORKED.XLSX which is available in the “Assessment > Assessment 2 – Computer
Assignment > Assignment Instructions and Questions” in the unit website. Using this data set
and EXCEL, answer the questions below.
QUESTION 2 [4 marks]
First, the researcher is interested in presenting the data on hours worked per week and yearly
income. For this purpose, the researcher categorised the data on worked per week and yearly
income into six descriptive groups and calculated the frequencies given below
Frequency tables
Occupation Yearly Income
Occupation category Frequency Income category Frequency
Occupation group 1 4 Income group 1 4
Occupation group 2 18 Income group 2 9
Occupation group 3 14 Income group 3 12
Occupation group 4 13 Income group 4 18
Occupation group 5 5 Income group 5 13
Occupation group 6 6 Income group 6 4
AFE135 (SP2, 2019) Computer Applications Assignment Page 2 of 3
Using the data in the tables above and using EXCEL, answer the following questions.
(a) Which graphical technique or chart should be used if the researcher is interested in
comparing the number of individuals in each occupation category? Explain the reason
for the selection of this graphical chart. Construct the chart and describe what you can
observe about the number of individuals belong to each occupation category.
(b) Which graphical technique or chart should be used if the researcher is interested in
describing the proportion of the individuals in each yearly income group? Explain the
reason for the selection of this graphical chart. Construct the chart and describe what
you can observe about the proportion of individuals belong to each income category.
QUESTION 3 [4 marks]
Second, researcher wishes to use graphical descriptive methods to present a summary of the
data.
(a) The number of observations (N) is 60 individuals. The researcher suggests using 7
class intervals to construct a histogram for each variable. Explain how the
researcher would have decided on the number of class intervals (K) as 7.
(b) The researcher suggests using class intervals as 10-15, 15-20, …, 35-40 for the
hours per week variable and class intervals 50-55, 55-60, …, 75-80 for the yearly
income variable. Explain how the researcher would have decided the width of the
above class intervals (or class width).
(c) Draw a histogram for each variable using appropriate BIN values from part (b) and
comment on the shape of the two distributions.
QUESTION 4 [4 marks]

Third, the researcher wishes to use numerical descriptive measures to summarize the data.
(a) Prepare a numerical summary report for the two variables including summary
measures such as mean, median, range, variance, standard deviation, smallest and
largest values and the three quartiles, for each variable.
Notes: Use QUARTILE.EXC command to generate the three quartiles.
(b) Compute the correlation coefficient using the relevant Excel function to measure
the direction and strength of the linear relationship between the two variables.
Interpret this value.
AFE135 (SP2, 2019) Computer Applications Assignment Page 3 of 3
QUESTION 5 [6 marks]
Finally, the researcher considers using regression analysis to establish a linear relationship
between the two variables – hours worked per week and yearly income.
(a) What is the dependent variable and independent variable for this analysis? Why?
(b) Use an appropriate plot to investigate the relationship between the two variables. On
the same plot, fit a linear trend line including the equation and the coefficient of
determination R2
.
(c) Estimate a simple linear regression model and present the estimated linear equation.
Display the regression summary table and interpret the intercept and slope
coefficient estimates of the linear model.
(d) Find and interpret the value of the coefficient of determination, R-squared (R2

find the cost of your paper

This question has been answered.

Get Answer