DAT565 Data Analysis and Business Analytics
American Public University System (APUS)
DAT565 Week 1 Assignment - Statistics Analysis
Week 1 Assignment - Statistics Analysis
Resource: Pastas R Us, Inc. Database
Review the Wk 2 - Apply: Statistical
Report assignment.
In preparation for writing your
report to senior management next week, conduct the following descriptive
statistics analyses with Excel®. Answer the questions below in your Excel sheet
or in a separate Word document:
Insert a new column in the database
that corresponds to "Annual Sales." Annual Sales is the result of
multiplying a restaurant's "SqFt." by "Sales/SqFt."
Calculate the mean, standard
deviation, skew, 5-number summary, and interquartile range (IQR) for each of
the variables.
Create a box plot for the
"Annual Sales" variable. Does it look symmetric? Would you prefer the
IQR instead of the standard deviation to describe this variable's dispersion?
Why?
Create a histogram for the
"Sales/SqFt" variable. Is the distribution symmetric? If not, what is
the skew? Are there any outliers? If so, which one(s)? What is the
"SqFt" area of the outlier(s)? Are the outlier(s) smaller or larger
than the average restaurant in the database? What can you conclude from this
observation?
Which measure of central tendency is
more appropriate to describe "Sales/SqFt"? Why?
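The Excel calculations above can be cross-checked with a short Python sketch. The numbers below are made up for illustration and are not the Pastas R Us data, and the helper names are hypothetical. The quartiles use the inclusive method (the one Excel's QUARTILE.INC applies), the skew uses the adjusted sample formula behind Excel's SKEW function, and the 1.5 x IQR outlier fence is a common convention that the assignment itself does not prescribe.

```python
import statistics

def annual_sales(sqft, sales_per_sqft):
    """Annual Sales = SqFt x Sales/SqFt, row by row."""
    return [a * s for a, s in zip(sqft, sales_per_sqft)]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

def sample_skew(values):
    """Adjusted sample skewness (needs at least 3 values)."""
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

def iqr_outliers(values):
    """Values outside the 1.5 x IQR fences (a convention, assumed here)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low or x > high]

# Hypothetical restaurants, for illustration only.
sqft = [2000, 2500, 3000, 1200, 2800]
sales_per_sqft = [400, 380, 350, 900, 360]

sales = annual_sales(sqft, sales_per_sqft)
print("Annual sales:", sales)
print("Mean:", statistics.mean(sales))
print("Std dev:", statistics.stdev(sales))
print("Five-number summary:", five_number_summary(sales))
print("Skew of Sales/SqFt:", sample_skew(sales_per_sqft))
print("Sales/SqFt outliers:", iqr_outliers(sales_per_sqft))
```

Because a single extreme value pulls the mean and standard deviation but barely moves the quartiles, comparing the two sets of measures on the same column is a quick way to see why the median and IQR are often preferred for skewed variables.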
DAT565 Week 2 Assignment Apply Signature Assignment
Statistical Report
Wk 2 - Apply: Signature Assignment:
Statistical Report
Resources: Pastas R Us, Inc.
Database & Microsoft
Excel®, Wk 1: Descriptive Statistics Analysis Assignment
Scenario:
Pastas R Us, Inc. is a fast-casual restaurant chain
specializing in noodle-based dishes, soups, and salads. Since its inception,
the business development team has favored opening new restaurants in areas
(within a 3-mile radius) that satisfy the following demographic conditions:
· Median age between 25 and 45 years old
· Household median income above national average
· At least 15% college educated adult population
Last year, the marketing department rolled out
a Loyalty Card strategy to increase sales. Under this program, customers
present their Loyalty Card when paying for their orders and receive some free
food after making 10 purchases.
The company has collected data from its 74
restaurants to track important variables such as average sales per customer,
year-on-year sales growth, sales per sq. ft., Loyalty Card usage as a
percentage of sales, and others. A key metric of financial performance in the
restaurant industry is annual sales per sq. ft. For example, if a 1200 sq. ft.
restaurant recorded $2 million in sales last year, then it sold $1,667 per sq.
ft.
Executive management wants to know whether the
current expansion criteria can be improved. They want to evaluate the
effectiveness of the Loyalty Card marketing strategy and identify feasible,
actionable opportunities for improvement. As a member of the analytics
department, you've been assigned the responsibility of conducting a thorough
statistical analysis of the company's available database to answer executive
management's questions.
Report:
Write a 750-word statistical report that includes the following
sections:
· Section 1: Scope and descriptive statistics
· Section 2: Analysis
· Section 3: Recommendations and Implementation
Section 1 - Scope and descriptive statistics
· State the report's objective.
· Discuss the nature of the current database.
What variables were analyzed?
· Summarize your descriptive statistics findings
from Excel. Use a table and insert appropriate graphs.
Section 2 - Analysis
· Using Excel, create scatter plots and display
the regression equations for the following pairs of variables:
· "BachDeg%" versus
"Sales/SqFt"
· "MedIncome" versus
"Sales/SqFt"
· "MedAge" versus
"Sales/SqFt"
· "LoyaltyCard(%)" versus
"SalesGrowth(%)"
· In your report, include the scatter plots. For
each scatter plot, designate the type of relationship observed (increasing/positive,
decreasing/negative, or no relationship) and determine what you can conclude
from these relationships.
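The regression equation and correlation that Excel displays on a scatter plot can be reproduced with a minimal least-squares sketch. The data values are hypothetical, the function names are illustrative, the demographic variable is assumed to be the independent (x) variable, and the 0.2 cutoff used to label a relationship as "no clear relationship" is an arbitrary assumption rather than a rule from the assignment.

```python
from math import sqrt

def linear_fit(x, y):
    """Least-squares slope, intercept, and Pearson r for paired data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

def describe_relationship(r, threshold=0.2):
    """Label the direction; the 0.2 cutoff is an arbitrary assumption."""
    if r >= threshold:
        return "increasing/positive"
    if r <= -threshold:
        return "decreasing/negative"
    return "no clear relationship"

# Hypothetical values, not taken from the Pastas R Us database.
bach_deg_pct = [20, 25, 30, 35, 40]
sales_per_sqft = [300, 340, 410, 430, 520]

slope, intercept, r = linear_fit(bach_deg_pct, sales_per_sqft)
print(f"Sales/SqFt = {intercept:.2f} + {slope:.2f} x BachDeg%")
print(f"r = {r:.3f}, r^2 = {r * r:.3f} -> {describe_relationship(r)}")
```

The same function can be called once per variable pair, which keeps the four scatter-plot interpretations on a common footing.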
Section 3: Recommendations and implementation
· Based on your findings above, assess which
expansion criteria seem to be more effective. Could any expansion criterion be
changed or eliminated? If so, which one and why?
· Based on your findings above, does it appear
as if the Loyalty Card is positively correlated with sales growth? Would you
recommend changing this marketing strategy?
· Based on your previous findings, recommend
marketing positioning that targets a specific demographic. (Hint: Are younger
people patronizing the restaurants more than older people?)
· Indicate what information should be collected
to track and evaluate the effectiveness of your recommendations. How can this
data be collected? (Hint: Would you use survey/samples or census?)
Cite references to support your assignment.
DAT565 Week 3 Market Analysis Research
Wk 3 Market Analysis Research
Section 1 - Business overview, mission and vision
Describe the
proposed business. Address the following in your summary:
· What type of
product/service will it offer?
· What is the intended
market?
· What is the business
model?
Articulate the
business's mission and vision statements.
Section 2 - Market analysis
Based on your intended product
or service, describe the characteristics of your customer base.
Investigate and
list your current competitors. For example, if you're manufacturing and selling
exercise equipment, current competitors would be companies like
NordicTrack or Nautilus, Inc. To simplify the process, limit yourself to
businesses you are competing directly against. If your business is a local
bistro, then your competitors are other local similar restaurants.
Research and
estimate the size of your intended market. Market size is the number of
potential customers or unit sales for your products/services. Consider the
nature of your business when researching market size. For instance, if your
business is a local bistro, then your market size is determined by the
population within a reasonable radius of the restaurant, say, 5-15 miles
maximum. On the other hand, if your business intends to sell a low-weight
mountain bicycle online, then the market size is the average number of
low-weight mountain bicycles sold nationwide annually.
Estimate the
value of your market. Market value is the potential revenues the market has to
offer. For instance, suppose low-weight mountain bicycles have a market size of
300,000 units a year with a $500 average price. Then the market value would be
$150,000,000.
It can be difficult to
estimate market value as you must make assumptions related to market
size and average unit price. Use the expected value concept introduced in
Chapter 6 of the textbook and the chart below to do the estimation.
Expected Market Value: Mountain Bicycle Scenario
| Assumptions | Probability p(x) | Units ('000) | Avg. Unit Price ($) | Market Value ($'000) |
| Pessimistic | 0.30 | 200 | 450 | 90,000 |
| Most Likely | 0.50 | 300 | 500 | 150,000 |
| Optimistic | 0.20 | 375 | 550 | 206,250 |
| Expected Market Value ($'000) | | | | 143,250 |
The expected market value is the probability-weighted sum of the three
scenarios: 0.30 * 90,000 + 0.50 * 150,000 + 0.20 * 206,250 = 143,250.
Estimate the
total addressable market or TAM. This is the fraction of the total market you
realistically estimate to get. Most businesses have a relatively modest market
share, well under 20%. For example, if we expect to get a 5% share of the
mountain bicycle market, then our TAM would be: 0.05 * $143,250,000 =
$7,162,500 or approximately $7.2 million.
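The expected value calculation can be sketched in Python as a probability-weighted sum over the three scenario rows of the table. The function names are hypothetical, and the 5% share is the example figure used in the text.

```python
# Scenario rows from the mountain bicycle table:
# (label, probability, units in thousands, average unit price in dollars)
scenarios = [
    ("Pessimistic", 0.30, 200, 450),
    ("Most Likely", 0.50, 300, 500),
    ("Optimistic", 0.20, 375, 550),
]

def expected_market_value(rows):
    """Probability-weighted sum of market values, in thousands of dollars."""
    return sum(p * units * price for _, p, units, price in rows)

def addressable_value(expected_value, share):
    """Portion of the expected market value captured at a given share."""
    return share * expected_value

# The scenario probabilities must cover all cases, so they should sum to 1.
total_probability = sum(p for _, p, _, _ in scenarios)
assert abs(total_probability - 1.0) < 1e-9, "probabilities must sum to 1"

ev = expected_market_value(scenarios)
print(f"Expected market value ($'000): {ev:,.0f}")
print(f"Value at a 5% share ($'000): {addressable_value(ev, 0.05):,.1f}")
```

Swapping in your own scenario rows and share gives the corresponding estimate for your proposed business.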
Section 3: Recommendation
Based on the information
collected, do you feel it is a good idea to continue with the implementation of
the business? Explain why or why not.
Cite references
to support your assignment.
Format your
citations according to APA guidelines.
DAT565 Week 4 Apply Signature Assignment Globalization and
Information Research
Wk 4 - Apply: Signature Assignment:
Globalization and Information Research
The assignment has two parts: one focused on
information research and analysis, and the other on applied
analytics.
Resources:
· Microsoft Excel®
· "How Netflix Expanded
to 190 Countries in 7 Years"
from Harvard Business Review
· CallCenterWaitingTime.xlsx
file
Part 1: Globalization and Information Research
Context: Companies that perform well in their country of origin usually
consider expanding operations in new international markets. Deciding where,
how, and when to expand is not an easy task, though.
Many issues need to be considered before
crafting an expansion strategy and investing significant resources to this end,
including:
· the level of demand to be expected for the
company's products/services
· presence of local competitors
· the regulatory, economic, demographic, and
political environments
Carefully researching and analyzing these and
other factors can help mitigate the inherent risk associated with an overseas
expansion strategy, thus increasing the likelihood of success.
As a data analyst in your company's business
development department, you've been tasked with the responsibility of
recommending countries for international expansion. You'll write a report to
the company's executive team with your research, analysis, and
recommendations.
Instructions:
Write a 525-word summary covering the following items:
· According to the article listed above, what
were the most important strategic moves that propelled Netflix's successful
international expansion?
· The article mentions investments in big data
and analytics as one of the elements accompanying the second phase of overseas
expansion. Why was this investment important? What type of information did
Netflix derive from the data collected?
· According to the article, what is exponential
globalization?
· Not all international expansion strategies are
a resounding success, however. Research an article or video that discusses an
instance in which an American company's expansion efforts in another country
failed. According to the article/video you selected, what were the main reasons
for this failure? Do you agree with this assessment?
· Explain some of the reasons why certain
companies' expansion plans have failed in the past.
Part 2: Hypothesis testing
Context: Your organization is evaluating the quality of its call center
operations. One of the most important metrics in a call center is Time in Queue
(TiQ), which is the time a customer has to wait before he/she is
serviced by a Customer Service Representative (CSR). If a customer has
to wait for too long, he/she is more likely to get discouraged and hang
up. Furthermore, customers who have to wait too long in the queue
typically report a negative overall experience with the call. You've conducted
an exhaustive literature review and found that the average TiQ in
your industry is 2.5 minutes (150 seconds).
Another important metric is Service Time (ST),
also known as Handle Time, which is the time a CSR spends servicing the
customer. CSRs with more experience and deeper knowledge tend to resolve
customer calls faster. Companies can improve average ST by providing more
training to their CSRs or even by channeling calls according to area of
expertise. Last month your company had an average ST of approximately 3.5
minutes (210 seconds). In an effort to improve this metric, the company has
implemented a new protocol that channels calls to CSRs based on area
of expertise. The new protocol (PE) is being tested side-by-side with the
traditional (PT) protocol.
Instructions:
Access the CallCenterWaitingTime.xlsx file. Each row in the database
corresponds to a different call. The column variables are as follows:
· ProtocolType: indicates protocol type, either PT or PE
· QueueTime: Time in Queue, in seconds
· ServiceTime: Service Time, in seconds
· Perform a test of hypothesis to determine
whether the average TiQ is lower than the industry standard of 2.5
minutes (150 seconds). Use a significance level of
α=0.05.
· Evaluate if the company should allocate
more resources to improve its average TiQ.
· Perform a test of hypothesis to determine
whether the average ST with service protocol PE is lower than with the PT
protocol. Use a significance level of α=0.05.
· Assess if the new protocol served its
purpose. (Hint: this should be a test of means for 2 independent groups.)
· Submit your calculations
and a 175-word summary of your conclusions.
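Both tests are lower-tail tests, and their structure can be sketched in Python as follows. This is a rough illustration under stated assumptions: the times are hypothetical, the function names are invented for this example, the two-sample test uses the unequal-variance (Welch) standard error, and the p-values come from the standard normal distribution, which only approximates the t distribution well for large samples. Excel's T.TEST function or the Analysis ToolPak gives t-based p-values.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

ALPHA = 0.05

def one_sample_lower_tail(sample, mu0):
    """Test H0: mean >= mu0 against H1: mean < mu0."""
    se = stdev(sample) / sqrt(len(sample))
    stat = (mean(sample) - mu0) / se
    return stat, NormalDist().cdf(stat)  # large-sample p-value

def two_sample_lower_tail(new, old):
    """Test H0: mean(new) >= mean(old) against H1: mean(new) < mean(old).

    Uses the unequal-variance (Welch) standard error.
    """
    se = sqrt(stdev(new) ** 2 / len(new) + stdev(old) ** 2 / len(old))
    stat = (mean(new) - mean(old)) / se
    return stat, NormalDist().cdf(stat)  # large-sample p-value

# Hypothetical times in seconds, not values from CallCenterWaitingTime.xlsx.
# The samples are tiny to keep the sketch short; with so few rows the
# normal approximation understates the true p-value.
queue_time = [120, 135, 160, 110, 145, 150, 128, 138]
service_pe = [180, 195, 200, 170, 185, 190]
service_pt = [210, 205, 220, 198, 215, 225]

for label, (stat, p) in {
    "TiQ vs. 150 s": one_sample_lower_tail(queue_time, 150),
    "ST, PE vs. PT": two_sample_lower_tail(service_pe, service_pt),
}.items():
    decision = "reject H0" if p < ALPHA else "fail to reject H0"
    print(f"{label}: statistic = {stat:.3f}, p = {p:.4f} -> {decision}")
```

Rejecting H0 in the first test would indicate that the average TiQ is below the industry figure; rejecting it in the second would indicate that PE has a lower average ST than PT.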
DAT565 Week 5 Apply - Regression Modeling
Wk 5 - Apply: Regression Modeling
Resources: Microsoft Excel®,
DAT565_v3_Wk5_Data_File
Instructions:
The Excel file for this assignment contains a
database with information about the tax assessment value assigned to medical
office buildings in a city. The following is a list of the variables in the
database:
· FloorArea: square feet of floor space
· Offices: number of offices in the building
· Entrances: number of customer entrances
· Age: age of the building (years)
· AssessedValue: tax assessment value (thousands of dollars)
Use the data to construct a model that predicts the tax assessment
value assigned to medical office buildings with specific characteristics.
· Construct a scatter plot in Excel with FloorArea
as the independent variable and AssessedValue as the dependent
variable. Insert the bivariate linear regression equation and r^2 in your
graph. Do you observe a linear relationship between the 2 variables?
· Use Excel's Analysis ToolPak to conduct a
regression analysis of FloorArea and AssessedValue. Is
FloorArea a significant predictor of AssessedValue?
· Construct a scatter plot in Excel with Age
as the independent variable and AssessedValue as the dependent
variable. Insert the bivariate linear regression equation and r^2 in your
graph. Do you observe a linear relationship between the 2 variables?
· Use Excel's Analysis ToolPak to conduct a
regression analysis of Age and AssessedValue. Is Age a significant
predictor of AssessedValue?
Construct a multiple regression model.
· Use Excel's Analysis ToolPak to conduct a
regression analysis with AssessedValue as the dependent variable and FloorArea,
Offices, Entrances, and Age as independent variables. What
is the overall fit r^2? What is the adjusted r^2?
· Which predictors are considered significant if
we work with α=0.05? Which predictors can be eliminated?
· What is the final model if we only use FloorArea
and Offices as predictors?
· Suppose our final model is:
· AssessedValue = 115.9 + 0.26 x FloorArea + 78.34 x Offices
· What would be the assessed value of a medical
office building with a floor area of 3500 sq. ft., 2 offices, that was built 15
years ago? Is this assessed value consistent with what appears in the database?
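Evaluating the two-predictor model given above is a direct substitution, sketched here in Python. The function name is hypothetical; the coefficients are the ones stated in the assignment, and the result is in thousands of dollars because that is the unit of AssessedValue.

```python
def predict_assessed_value(floor_area, offices):
    """Assessed value in thousands of dollars from the two-predictor model."""
    return 115.9 + 0.26 * floor_area + 78.34 * offices

# Age is not a term in this model, so the 15-year age does not enter.
value = predict_assessed_value(floor_area=3500, offices=2)
print(f"Predicted assessed value: {value:.2f} (thousands of dollars)")
```

Comparing the prediction with the assessed values of buildings of similar floor area and office count in the database is one way to judge whether it is consistent.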
DAT565 Week 6 Apply Signature Assignment Smart Parking Space
App Presentation
Wk 6 - Apply: Signature Assignment: Smart
Parking Space App Presentation
The PowerPoint presentation includes an audio
component in addition to speaker notes.
Resources: Microsoft Excel®,
DAT565_v3_Wk6_Data_File
Scenario: A city's administration isn't driven by the goal of
maximizing revenues or profits but instead looks at improving the quality of
life of its residents. Many American cities are confronted with high traffic
and congestion. Finding parking spaces, whether in the street or a parking lot,
can be time consuming and contribute to congestion. Some cities have rolled out
data-driven parking space management to reduce congestion and make
traffic more fluid.
You're a data analyst working for a mid-size
city that has anticipated significant increments in population and car traffic.
The city is evaluating whether it makes sense to invest in infrastructure to
count and report the number of parking spaces available at the different
parking lots downtown. This data would be collected and processed in real-time,
feeding an app that motorists can access to find parking space
availability in different parking lots throughout the city.
Instructions: Work with the provided Excel database. This
database has the following columns:
· LotCode: A unique code that identifies the
parking lot
· LotCapacity: A number with the respective
parking lot capacity
· LotOccupancy: A number with the current number
of cars in the parking lot
· TimeStamp: A day/time combination indicating
the moment when occupancy was measured
· Day: The day of the week corresponding to the
TimeStamp
· Insert a new column, OccupancyRate, recording
occupancy rate as a percentage with one decimal. For instance, if the current
LotOccupancy is 61 and LotCapacity is 577, then the OccupancyRate would be
reported as 10.6 (or 10.6%).
· Using the OccupancyRate and Day columns,
construct box plots for each day of the week. You can use Insert > Insert
Statistic Chart > Box and Whisker for this purpose. Is the median occupancy
rate approximately the same throughout the week? If not, which days have lower
median occupancy rates? Which days have higher median occupancy rates? Is this
what you expected?
· Using the OccupancyRate and LotCode
columns, construct box plots for each parking lot. You can use Insert >
Insert Statistic Chart > Box and Whisker for this purpose. Do all parking
lots experience approximately equal occupancy rates? Are some parking lots more
frequented than others? Is this what you expected?
· Select any 2 parking lots. For each one,
prepare a scatter plot showing occupancy rate against TimeStamp for the week
11/20/2016 – 11/26/2016. Are occupancy rates time dependent? If so, which times
seem to experience highest occupancy rates? Is this what you expected?
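The OccupancyRate column and the per-day and per-lot medians that the box plots display can be sketched in Python as follows. The rows are hypothetical and only reuse the column names from the assignment; the helper names are illustrative, and TimeStamp is assumed to be available as a date/time value.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

def occupancy_rate(occupancy, capacity):
    """Occupancy as a percentage of capacity, rounded to one decimal."""
    return round(100 * occupancy / capacity, 1)

def median_rate_by(records, key):
    """Median OccupancyRate for each distinct value of `key`."""
    groups = defaultdict(list)
    for row in records:
        groups[row[key]].append(row["OccupancyRate"])
    return {k: median(v) for k, v in groups.items()}

# Hypothetical rows using the column names from the assignment.
records = [
    {"LotCode": "A", "LotCapacity": 577, "LotOccupancy": 61,
     "TimeStamp": datetime(2016, 11, 21, 8, 0), "Day": "Monday"},
    {"LotCode": "A", "LotCapacity": 577, "LotOccupancy": 400,
     "TimeStamp": datetime(2016, 11, 21, 13, 0), "Day": "Monday"},
    {"LotCode": "B", "LotCapacity": 200, "LotOccupancy": 50,
     "TimeStamp": datetime(2016, 11, 26, 13, 0), "Day": "Saturday"},
]
for row in records:
    row["OccupancyRate"] = occupancy_rate(row["LotOccupancy"], row["LotCapacity"])

print(median_rate_by(records, "Day"))
print(median_rate_by(records, "LotCode"))
```

Grouping by Day or LotCode with the same helper mirrors the two sets of box plots; grouping by the hour of TimeStamp would be one way to look for time dependence.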
Presentation:
Create a 10- to 12-slide presentation with speaker notes and audio.
Your audience is the City Council members who are responsible for
deciding whether the city invests in resources to set in motion the smart
parking space app.
Complete the following in your presentation:
· Outline the rationale and goals of the
project.
· Utilize box plots showing the occupancy
rates for each day of the week. Include your interpretation of results.
· Utilize box plots showing the occupancy rates
for each parking lot. Include your interpretation of results.
· Provide scatter plots showing occupancy
rate against time of day for your selected parking lots. Include your
interpretation of results.
· Make a recommendation about continuing
with the implementation of this project.