Short notes & exercises for Tableau
Tableau is a business intelligence (BI) software company that was recently acquired by Salesforce. While many business intelligence and data visualization tools are available today (e.g. Power BI, Qlik), Tableau has consistently maintained its leadership position in the Gartner Magic Quadrant for Analytics & Business Intelligence Platforms for the past 6-7 years. It is widely used in a variety of industries, including banking and financial services, health care and pharma, CPG and retail, and business consulting.
Learning Resources (1) Tableau Online Tutorials, (2) Tableau Public
This note provides a quick overview of Tableau with particular focus on:
We will use data from a variety of contexts to illustrate the main ideas.
The data here is drawn from Zillow and can be downloaded from Zillow Research. The Zillow database includes information on homes for sale and for rent, the Zestimate (Zillow’s estimate of home values), and a variety of other home-related information. This comprehensive data is provided at various geographic levels (State, County, Metro/city, ZIP code, and Neighborhood), allowing for a careful look at the housing market in the US. For the full analysis, refer to the Zillow case study.
The graph below shows time trends in various neighborhoods across the 50 largest metros in the US. The two graphs in the right panel show average values by calendar year and by month, respectively. Use the dropdown menu to pick a city.
For exercise: Who owns Guns in America? Download Data
Working with discrete data (where your variable of interest is discrete, e.g. Buy/No-Buy, Click/No-Click, etc.) requires some modifications for visualization. Here is an interactive that provides a quick look at differences across various demographic groups in their attitudes toward abortion. The data used here is from the General Social Survey (GSS 1972–2016).
Use the “Pick Demographic” menu to select ideology/politics, religiosity, or other demographics of respondents.
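One common way to prepare a discrete variable for plotting is to convert raw responses into shares within each group, so that groups of different sizes can be compared. The sketch below illustrates that idea outside Tableau. The group labels, answers, and rows are invented toy values, not GSS data, and `response_shares` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Toy survey rows: (demographic group, discrete response).
# These values are invented for illustration; they are NOT GSS data.
responses = [
    ("Group A", "Yes"), ("Group A", "Yes"), ("Group A", "No"), ("Group A", "Yes"),
    ("Group B", "No"), ("Group B", "Yes"), ("Group B", "No"), ("Group B", "No"),
]

def response_shares(pairs):
    """Return {group: {answer: share of that group's responses}}."""
    counts = defaultdict(Counter)
    for group, answer in pairs:
        counts[group][answer] += 1
    return {
        group: {answer: n / sum(c.values()) for answer, n in c.items()}
        for group, c in counts.items()
    }

shares = response_shares(responses)
for group, dist in shares.items():
    print(group, dist)
```

Each group's shares sum to 1, which is what a stacked 100% bar chart per demographic group would display.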
Download Kantar Data (Commercial data, NYU Access Only)
The file Kantar_Ad.twbx contains data from Kantar Media, the largest aggregator of advertising expenditures globally. Our data covers the top 10 industries in terms of total ad dollars spent. The goal of this exercise is to understand how to work with long-form data.
Start by examining each of the non-numerical variables. For example, put Industry on Rows and Spend on Columns. Then create the following variable: % Spent on Internet.
Answer the following questions:
Which industry spends the highest proportion of ad dollars on Internet Display?
Which are the top 3 firms in the Financial Sector in terms of total ad dollars spent in 2015?
What % of total ad dollars in 2015 was spent on the Internet by American Express?
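For long-form data, where each row holds one industry, one media type, and one spend value, "% Spent on Internet" comes down to a conditional sum divided by a total sum within each industry. In Tableau this is typically a calculated field along the lines of SUM(IF [Media] = "Internet Display" THEN [Spend] END) / SUM([Spend]), where the field names [Media] and [Spend] are assumptions about the workbook. The Python sketch below shows the same arithmetic on invented toy rows, not Kantar data. The column names and the helper `pct_spent_on` are hypothetical.

```python
from collections import defaultdict

# Toy long-form rows: one row per (industry, media) pair.
# Names and dollar figures are invented; this is NOT the Kantar data.
rows = [
    {"industry": "Retail", "media": "Internet Display", "spend": 30.0},
    {"industry": "Retail", "media": "Network TV", "spend": 70.0},
    {"industry": "Financial", "media": "Internet Display", "spend": 50.0},
    {"industry": "Financial", "media": "Network TV", "spend": 50.0},
]

def pct_spent_on(rows, media):
    """Return {industry: percent of that industry's total spend on `media`}."""
    total = defaultdict(float)      # total spend per industry
    on_media = defaultdict(float)   # spend on the chosen media per industry
    for r in rows:
        total[r["industry"]] += r["spend"]
        if r["media"] == media:
            on_media[r["industry"]] += r["spend"]
    return {
        ind: 100.0 * on_media[ind] / total[ind]
        for ind in total
        if total[ind] > 0
    }

pct_internet = pct_spent_on(rows, "Internet Display")
# Industry with the highest proportion of spend on the chosen media.
top_industry = max(pct_internet, key=pct_internet.get)
print(pct_internet, top_industry)
```

The key point is that the ratio is computed after aggregating, not row by row, which is why long-form data needs a conditional sum rather than a simple column division.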
Analysis by Jessica Battisto, NYU-Stern Fall (2020)
Glance through the following articles for background:
The data used here is from BuzzFeed’s GitHub repository, which provides data from the FBI’s National Instant Criminal Background Check System. From the BuzzFeed repo: “The FBI provides data on the number of firearm checks by month, state, and type — but as a PDF. The code in this GitHub repository downloads that PDF, parses it, and produces a spreadsheet/CSV of the data”, which currently covers November 1998 – August 2019.
The data for this analysis comes from NYC Health. The file used is “data-by-modzcta.csv”. Date updated: Sept 25, 2020.
Note from NYC Health: ‘ZIP Code doesn’t actually refer to an area, but rather a collection of points that make up a mail delivery route. There are some buildings that have their own ZIP Code, and some non-residential areas with ZIP Codes. In this file, the Health Department reports data by modified ZCTA, which combines census blocks with smaller populations to allow more stable estimates of population size for rate calculation.’
Note that modified ZCTAs are not standard Census geographies, so we need to use spatial files to map them.
A look at the historic and current population distribution across age and gender. Asia (led by Japan) and Europe (e.g. Italy, Finland & Portugal) are home to some of the world’s oldest populations. In several emerging economies, such as Peru, Brazil, India, the Philippines, and Malaysia, 24 to 30 percent of the population is 14 years of age or younger, and this number is even higher in many sub-Saharan African countries.
Large gender differences are also emerging in the young population of many Asian countries, most notably India and China. Gender differences observed in Middle Eastern countries (primarily in the working-age bracket) are driven by a large migrant population. A notable pattern also emerges in the older population of Eastern Europe and Russia, where we see a larger proportion of females, driven in large part by disparities in life expectancy.
The mobile phone industry is highly competitive, with several major players such as Samsung and Apple. In a quest to penetrate the smartphone hardware market, Microsoft took a major step by acquiring Nokia in 2014. The $7 billion deal turned out to be a ‘monumental mistake’ by Microsoft CEO Steve Ballmer. Microsoft is in the process of cutting over 8,000 jobs and writing down over $7.5 billion on its Nokia phone-handset unit, wiping out nearly all of the value of a business it acquired just over a year ago. (If you are interested in the industry, this 3-minute Bloomberg clip provides a quick summary.)
Objective: Suppose you are a consultant to Microsoft before its acquisition of Nokia in 2014. You are provided with global sales data from Euromonitor. The data contains unit sales (in thousands) for 4 major players in the industry (“All other brands” covers sales for all brands besides the 4 listed below).
Your objective is to provide a brief summary of this industry with a focus on Nokia.
Step 1: Create a variable for Market share of Nokia.
Step 2: Create a global Map for market share of Nokia.
Step 3: Show a Time Trend of Share of Nokia.
Step 4: Show time trend of Nokia Share in Asia.
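The market share variable in Step 1 is Nokia's unit sales divided by total unit sales across all brands (including "All other brands") within whatever grouping the view uses, such as year or country. The Python sketch below works through that arithmetic on invented toy rows, not the Euromonitor data. The field names (`year`, `region`, `brand`, `units`) and the helper `brand_share` are hypothetical.

```python
from collections import defaultdict

# Toy rows: unit sales (in thousands) by year, region, and brand.
# All figures are invented; this is NOT the Euromonitor data.
sales = [
    {"year": 2012, "region": "Asia", "brand": "Nokia", "units": 200},
    {"year": 2012, "region": "Asia", "brand": "Samsung", "units": 300},
    {"year": 2012, "region": "Asia", "brand": "All other brands", "units": 500},
    {"year": 2013, "region": "Asia", "brand": "Nokia", "units": 100},
    {"year": 2013, "region": "Asia", "brand": "Samsung", "units": 400},
    {"year": 2013, "region": "Asia", "brand": "All other brands", "units": 500},
    {"year": 2013, "region": "Europe", "brand": "Nokia", "units": 50},
    {"year": 2013, "region": "Europe", "brand": "Samsung", "units": 150},
]

def brand_share(rows, brand, keys):
    """Share of `brand` in total units, grouped by the fields in `keys`."""
    total = defaultdict(float)
    brand_units = defaultdict(float)
    for r in rows:
        group = tuple(r[k] for k in keys)
        total[group] += r["units"]
        if r["brand"] == brand:
            brand_units[group] += r["units"]
    return {g: brand_units[g] / total[g] for g in total if total[g] > 0}

# Step 3 analogue: global Nokia share by year.
share_by_year = brand_share(sales, "Nokia", ["year"])

# Step 4 analogue: restrict to Asia first, then compute share by year.
asia_rows = [r for r in sales if r["region"] == "Asia"]
asia_share_by_year = brand_share(asia_rows, "Nokia", ["year"])

print(share_by_year)
print(asia_share_by_year)
```

Note that filtering to a region before computing the share changes the denominator, so the Asia trend is Nokia's share of Asian sales rather than Asia's contribution to Nokia's global share.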
Download Data (Commercial data, NYU Access Only)