Wednesday, October 1, 2014

Correlation in Tableau

I will start this post out by saying I am by no means a statistics expert.  I took a lot of math courses while getting my physics degree, but statistics was never my forte.  However, I know enough to be able to translate mathematical formulas into Tableau's calculated field syntax.

Before getting into the nitty gritty of the math, I just want to point out that Tableau will automatically compute the r-squared value if you add a trend line to any scatterplot (right-click the trend line and choose 'Describe Trend Model'):

While you can copy/paste this into an annotation or some other calculation, it wont dynamically update when your data updates.  Therefore if we wanted to compute the correlation so we can use it in other places in Tableau we must compute it in Tableau's calculated fields.

First let's start with the formula for correlation ('r'):

Now let's map those to Tableau calculated field functions:
N = size()
Sigma XY = window_sum(sum(X)*sum(Y))
Sigma X = window_sum(sum(X))
Sigma Y = window_sum(sum(Y))
Sigma X^2 = window_sum(sum(X)^2)

Sigma Y^2 = window_sum(sum(Y)^2)

Putting it all together results in a pretty gnarly calc, but it's do-able:

Once you've written your calculation using whatever your X and Y fields represent then drag your new field onto the Detail shelf to include it in the view.

Note:  since this uses table calculation functions, we must tell Tableau what to compute along. After adding the field to your view, right-click it and choose Compute Using->[dimension that makes each point unique in your scatter].  You could probably also leave it on the default 'Table (Across)' but in case that doesn't work, select the specific dimension that is making all the points in your scatter:

Now, where to display it?  While it wont let you insert this field into Area or Point annotations, you can insert it into the Title or into a Mark annotation.  If you use the Title, just edit the title of the view and use the Insert button in the upper right to stick the Correlation field in there:

If you use the mark annotation, just pick any of the marks in your scatter and choose 'Annotate->Mark'.  You can then delete all the other fields in there and just show the correlation.

Lastly, format it to remove the 'line' and 'line end' so it appears as its floating (like an Area annotation):

Final product:


-The Don Data


  1. Nice work. Do you know if there's a possibility to annotate Trend lines in Tableau ?

  2. Drawing trend lines is one of the few easy techniques that really WORK. Prices respect a trend line, or break through it resulting in a massive move. Drawing good trend lines is the MOST REWARDING skill.

    The problem is, as you may have already experienced, too many false breakouts. You see trend lines everywhere, however not all trend lines should be considered. You have to distinguish between STRONG and WEAK trend lines.

    One good guideline is that a strong trend line should have AT LEAST THREE touching points. Trend lines with more than four touching points are MONSTER trend lines and you should be always prepared for the massive breakout!

    This sophisticated software automatically draws only the strongest trend lines and recognizes the most reliable chart patterns formed by trend lines...


    Chart patterns such as "Triangles, Flags and Wedges" are price formations that will provide you with consistent profits.

    Before the age of computing power, the professionals used to analyze every single chart to search for chart patterns. This kind of analysis was very time consuming, but it was worth it. Now it's time to use powerful dedicated computers that will do the job for you: