Showing posts with label Level Of Detail. Show all posts
Showing posts with label Level Of Detail. Show all posts

Monday, November 30, 2015

Benchmarking using LOD's (replicating values, freezing values, sub-queries)



A great use case for Level Of Detail expressions in Tableau is to do benchmarking or comparisons.  The scenario is that you have some Dimension column with a bunch of values in, and you want to pick one of the values and 'freeze' or 'replicate' those numbers so that no matter what you filter on, they will still be available to make comparisons.

For instance, say you have a dataset with competitor data in it.  Your company is one of those competitors, but you also have a bunch of market data for the others.  A common thing you might want to do is to compare your company's performance to a specific competitor.  The challenge with this is that if you add a filter to just keep the data for one competitor, then your company's data will be filtered out too, so there is nothing to compare to anymore!  What would fix this is if we were able to 'freeze' our company's numbers so that no matter what we filter on we will always have our company's data available to make comparisons to.

LOD expressions offer an elegant way to achieve this (with no data modeling necessary!). Essentially we can 'fix' or 'freeze' our company's numbers using the FIXED type of LOD expression.  Because of the way Tableau's query pipeline works, any calculations that use FIXED will happen before most other filters are applied.  We can use this to our advantage by fixing our numbers so that when we filter on a single competitor, the math done to compute our company's numbers has already been done and is available for all the other rows of data.  See the below image and notice how Dimension Filters happen after Fixed LOD's are calculated.


Therefore you can filter on any competitor you want and still be able to compare them to your company's numbers.

First let's write a Fixed expression that pre-calculates only the numbers from our company (in the example our company is called 'JM Gebs').  We will find our company's rows with an IF statement, then sum up those rows, and Fix those numbers at the entire-dataset-level:

See!  It works!  Our company's numbers are replicated for all the other companies too!
  
And it sill works even if we filter us out!  Booyah!

Something to keep in mind when using Fixed calculations is that you need to specify any dimensions you want to group those numbers by (for those familiar with SQL, you need to specify the GROUP BY part of the query).  For example, if I want to build a view where I show the comparison at a Country level, then I need to specify in my calculated field to fix/group the math at the Country level:

See now the numbers are different per Country

A consequence of this is that you may need to create many versions of this LOD calc depending on what dimensions you are using in your chart.  You might have a dashboard in which you have a monthly trend of the comparison, a comparison by country, another by segment, etc, so you will need a separate LOD calc for each one of these since they will each need to have different dimensions on the left side of the calculated field (each chart will need the math to be 'grouped by' different dimensions).  Your other option is you can make one calculated field with all the dimensions you want to group by in it (just comma-separate them) - this will work so long as your numbers are additive by nature - i.e. this will not work for something like Unique Customers (which would use CountD, count distinct) - I like just making separate ones for each so I am 100% sure the math is working correctly.


Another consequence of the order of how things are applied - that normal filters (by normal I mean stuff you just add to the filter shelf and dont do anything special to it like make it a Context Filter or make it a Data Source Filter) are applied after the Fixed calculation math happens - is that most other filters you want to add should probably happen before the numbers are 'fixed/frozen', therefore you need to make them Context Filters!  This is not a bad thing at all, but it is just something to be aware of.  I find that I add pretty much every filter besides the one on Company (or whatever you are doing the benchmarking against) as a context filter.

For example, say we have a chart where we are doing the comparison per Country, but we want to let the user filter to a certain Segment (or Year or anything else not being used in the view).  We really want to apply that Segment filter before we pre-calculate our company's fixed sales numbers, because we want to see our sales number just in that Segment.  If we dont make the Segment filter a Context Filter, the math will happen across all Segments and then it will just filter out the rows of data after the fact (which often has no effect, at least not the desired effect).  Again, reference the image above which shows the query pipeline order.


This also applies to Action Filters. For example, if you have a dashboard with two views - a monthly comparison and a comparison by country - and we want to allow users to click on a country and filter the other chart (and vice versa, click a month and filter the countries to just that month) then we would use a Filter Action to do that.  Great! but.... remember that the action filter is going to be applied after the math has already been calculated.  What we really want to happen is if I click on one month, I want to see the country breakdown just for that month.  Therefore in order to make that action filter apply before any of the fixed calculations, we need to add it as a Context Filter.




That should get us the desired behavior.  Now we are able to pick any competitor we want to benchmark ourselves against, and we can use any quickfilters we want to limit by Segment etc, and we are able to use our Filter Actions to drill from one chart to the other on the dashboard, all while being able to keep our number's available for performance comparisons - pretty cool!

-The Don

Workbook:  https://app.box.com/s/pnm61jivsaym5ztrjoeztqlornhjfi3u

*Note - you can set the Competitor quickfilter to multi-select, too.  I just happened to pick single-select (this is one reason why this is better than using parameters to achieve this)

*Note - if you want to show a quickfilter to pick different competitors, but you dont want to include your own company in the list, you can create a Set from the Company field and exclude 'JM Gebs', then show the quickfilter on the Set instead.

Thursday, July 30, 2015

Flow Chart / Flow Diagram / Decision Tree in Tableau

There are two ways to go about building a flow chart/diagram or decision tree in Tableau.  You can either treat it as a scatterplot on a single worksheet, or you can think about using a dashboard as your canvas, and then float multiple worksheets on the dashboard to achieve the right shape/flow.

I first started exploring the former.  It is a similar technique to using background images in tableau and creating a lookup table for the x,y coordinates of the marks you want to plot on top of the image.  I quickly noticed two downsides with this:  1. You need to add more data in the form of a new table of x,y coordinates to your underlying data (to be joined or blended to your actual data) - there are obvious downsides to this; 2. You cannot use actions to trigger a drilldown to another worksheet by clicking on one of the numbers/marks in your flow chart (since all the marks are in one chart you cannot create a different filter action per mark (metric) in your chart - i.e. if I clicked on a metric I would want to be taken to another worksheet that has more details about that metric, and if I clicked a different metric I would want to be taken to a different sheet).

That's why I decided to try using the dashboard as my canvas and then make a separate worksheet for each metric, which could then be floated/arranged on the dashboard canvas.  Benefits of doing it this way:  1. You do not need to alter your data at all!; 2. You can create a separate filter action for each metric/worksheet that goes to a different destination worksheet in order to drill down and present more information about that metric.

That approach allowed much more flexibility in terms of how to arrange each mark in the flow chart, but one question remained:  How to get the arrows into the chart?  My first thought was to insert images of arrows into the dashboard.  If you are good with editing images and rotating them and sizing them to then be inserted into the chart then this is a viable approach.  However in my example I wasn't quite sure about how far each mark needed to be from one another and what angle each arrow needed to be at etc.  Since you cannot rotate images in a tableau dashboard you would need to create an image with arrow at exactly the right angle, which is a pain.  Perhaps if you only need straight arrows or already have a bunch of these images laying around then this might not be very painful.

Instead of images, I decided to use another worksheet to be the background or first layer of canvas which I could then add point annotations to to create my arrows.  A little formatting is necessary after you add the annotation (need to remove all the text, then format it to get rid of any borders and change the thickness and color to suit your needs) but they worked quite well.  The only trouble I had with this was even tough there was no text in my annotation, I needed to make the 'text box' quite wide so that once I floated a worksheet with one of the metrics over the top of it that I could still select the text box in order to move it around to get the start of the arrow to be in the right place.

Quick order of how to build this:
1. Build all the worksheets - one metric per worksheet
2. Build one more worksheet for the 'arrows'.  In order to add annotations there needs to be something in the view, so I just added number of records to text, changed transparency to make it clear (via Color shelf), removed the tooltip, and set the fit to 'Entire View'.  Dont add the annotations yet.
3. Create the dashboard: Add the Arrow worksheet first, then 'float' each of the other metrics and arrange them how you want.
4. Add the annotations to create the arrows from the dashboard itself.  I used Point Annotations.
5. Move them around and format them to your liking.

In addition to being a good example of this specific chart type, there are lots of nifty Level Of Detail (LOD) expressions used to calculate each of the metrics.  Things like 'Spend Per Customer', 'Avg Visit Value', 'Items Per Visit' and so on.  Since the granularity of the underlying data is at a per order per line item level, getting these type of metrics would require lots of sub-queries with multiple levels of aggregation to achieve before the LOD expressions were introduced in 9.0.  LOD's rule.

Two workbook's are available for download - one with just the flow chart and one with some filter actions to allow you to drill down to other detailed worksheets (I got lazy and only set up drilldowns for the first 3 metrics, but you will get the point).

Flow Chart: https://app.box.com/s/v06fqkxiufhwpyfhc3h3fba3rk83soqo
Flow Chart with Drill: https://app.box.com/s/mkf3ka80j8hy2cvtqxybo6rvk1918ruq

Any questions or improvements you've found, please comment below.

-The Don