How to Scatter Plot with Aggregate Data in Tableau

Julia Yang
Analytics Buddies
Published in
4 min readJan 25, 2021

--

In a recent school project we are asked to predict ‘demand’ of Airbnb market against the change of listing prices with a scatter plot, namely, to show that for every 10% increase in Airbnb prices how much percentage change in demand can be expected.

With the 2009–2019 Airbnb listing data ( each row is a unique listing) at hand, our study group came up with 3 ideas, and one of them is to count the number of listings against each price and then make a scatter plot (#of listings vs. price). This maybe not a perfect representation of demand but rather a market supply as in fact a listing can have multiple occupants. Still we thought it maybe interesting to see the relationship between supply and price. Here are the 2 steps to achieve the goal:

Step 1: Aggregate Table in Tableau Prep

We need a new table to show how many listings correspond to each particular price first, so that we can continue with the scatterplot. This can not be done directly in Tableau Desktop (maybe can be done and idk…), instead, can be achieved by Tableau Prep easily as below:

Firstly, open Tableau Prep and hit the button ‘Connect to Data’. Choose the data set based on which you want to generate the aggregate table.

Secondly, hit the blue + beside the data set, then a drop down list with pop out, hit ‘Aggregate’

Then, pick ‘price’ and ‘room type’ from the setting list (original data set) and drag them to ‘group field’

Change the type before ‘id’ from ‘sum’ to ‘count’, and drag it to the ‘aggregated fields’

The aggregate table is done! The next step is to save the output into a .csv file by hitting the blue ‘+’ besides ‘Aggregate 1’ and choose ‘Output’ from the drop down menu:

Finally we can see the aggregate table on the right and choose the output save location and file type on the right:

Step 2: Make the Scatter Plot in Tableau Desktop:

Firstly we import the output aggregate table from step1 to Tableau Desktop:

Then open a new worksheet, drag id to the measure areas from the dimension area, drag price to column field and drag id to the row filed, at this time the graph looks weird as there is only one point on the graph, we will fix that later:

In order to see ‘for every 10% increase in Airbnb prices how much percentage change in listing numbers can be expected’ , we need to use the log-log linear regression model:

log( # listings) = β0+β1*log(price)

The model shows 1% change in price will relate to β1% change in # of listings. Namely 10% change in price will relate to 10*β1% change in # of listings.

So to find the relationship between log( # listings) and log(price), we need to change the row and column fields from ‘sum’ to ‘log’ by double click the field and adjust it to ‘log’:

Then hit the little triangle on the green field to pop out the dropdown menu, pick ‘dimension’.

Now we can see all the points showed up:

Finally, we can hit the Analytic tab, drag ‘Trend Line’ to the graph, hover over the trend line we will see the regression model as below:

--

--