How to build a Stream Graph in Tableau Software

I have been trying to build a Stream Graph in Tableau for a few days now. By coincidence, Alex Jones shared his (beautiful) version a week ago. Alex's visualization relies on the ThemeRiver algorithm, applied upstream in Alteryx, for point placement and data preparation. Today I offer an alternative version, built entirely in Tableau Software.

French Version

2018-09-13_14h14_00
Link to the visualization

 

What is a Stream Graph?

The « Stream Graph » is a type of Stacked Area Graph built around a central axis. It is used to visualize how volumes evolve over time across several categories. It shows, at the same time, how the share of each category changes and how the total volume changes.

streamgraph1.png
Learn more about the Stream Graph (coming soon)

Polygons!

Polygons my dear Tableau friends, it’s all about polygons! Like many other non-native Tableau visualizations, building a Stream Graph requires using and mastering them.

7096wV.gif
Polygon alert!

This tutorial is entirely based on Rody Zakovich's area bump chart, published on his blog datatableauandme. His blog is amazing, take a look! 🙂

Data

We have to perform a cross join between our original dataset and a second one. Because the cross join multiplies the number of rows, I advise you to aggregate your dataset beforehand to avoid performance issues.

I am using an Excel file with two sheets.

2018-09-15_15h46_40.png
« data » sheet

The first sheet (data) contains my values aggregated by time period and category. A technical field (Link) is also added to make the Cartesian product possible in Tableau.

2018-09-15_15h47_12.png
« model » sheet

The second sheet (model) contains the technical fields used to build the polygons. One of them holds the values of the sigmoid function, which gives the transition between consecutive years its smooth, organic appearance. The technical field « Link » is present here too, so that the Cartesian product can be performed.
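To make the structure of the « model » sheet more concrete, here is a minimal Python sketch that generates comparable rows. It rests on several assumptions that are not confirmed by the Excel file: Path runs from 0 to 49 (suggested by the formulas `49-[Path]` and `[Path]/49` used later), Position takes the values 1 and 2, Link is a constant 1, and the sigmoid is a logistic curve rescaled to go from 0 to 1. The `steepness` parameter is purely illustrative.

```python
import math

N_POINTS = 50  # assumption: Path runs 0..49


def sigmoid(path, steepness=6.0):
    """Logistic curve rescaled to be exactly 0 at Path 0 and 1 at Path 49."""
    def raw(t):
        return 1.0 / (1.0 + math.exp(-t))

    t = -steepness + 2.0 * steepness * path / (N_POINTS - 1)
    lo, hi = raw(-steepness), raw(steepness)
    return (raw(t) - lo) / (hi - lo)


def build_model():
    """One row per (Position, Path): 2 edges x 50 points = 100 rows."""
    return [
        {"Link": 1, "Position": position, "Path": path, "Sigmoid": sigmoid(path)}
        for position in (1, 2)   # 1 = top edge, 2 = bottom edge
        for path in range(N_POINTS)
    ]


model = build_model()
print(len(model))  # 100
```

With these assumptions the model holds 100 rows, which matters later: every row of the data sheet is repeated 100 times after the join.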

The Excel file can be downloaded here.

2018-09-15_15h43_52.png
The very famous cartesian product

The last step is to join these two sheets on the « Link » field in Tableau Software.
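For readers who want to see what this join produces, the sketch below reproduces it in plain Python. The sample rows and the constant Link value of 1 are hypothetical; only the field names come from the article.

```python
def cross_join(data_rows, model_rows, key="Link"):
    """Pair every data row with every model row that shares the same key."""
    return [
        {**d, **m}
        for d in data_rows
        for m in model_rows
        if d[key] == m[key]
    ]


# Hypothetical sample of the aggregated "data" sheet.
data = [
    {"Link": 1, "Year": 2016, "Category": "A", "Value": 10},
    {"Link": 1, "Year": 2016, "Category": "B", "Value": 30},
    {"Link": 1, "Year": 2017, "Category": "A", "Value": 20},
]

# Assumed shape of the "model" sheet: 2 positions x 50 path points.
model = [
    {"Link": 1, "Position": position, "Path": path}
    for position in (1, 2)
    for path in range(50)
]

joined = cross_join(data, model)
print(len(joined))  # 300
```

Because every row shares the same Link value, the join is a full Cartesian product: 3 data rows become 300 rows, which is why aggregating first is worthwhile.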

Calculations

Once your data is ready and your model is loaded, the real work can begin. A series of more or less complex calculations must be performed before starting to visualize something.

Please-say-you-are-kidding-GIF.gif
Yes, a Gilmore Girls GIF. And?

These calculations and techniques are entirely based on Rody Zakovich's area bump chart.

1.Partition: creates all the points of each polygon.

IF [Position]=1
THEN [Path]
ELSE 50+(49-[Path])
END
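As a quick check of what this calculation yields, here is the same logic in Python, assuming Path runs from 0 to 49 and Position is 1 for the top edge and 2 for the bottom edge (both are assumptions about the model sheet):

```python
def partition(position, path):
    """Mirror of 1.Partition: a unique index per polygon point."""
    return path if position == 1 else 50 + (49 - path)


top = [partition(1, p) for p in range(50)]     # 0, 1, ..., 49
bottom = [partition(2, p) for p in range(50)]  # 99, 98, ..., 50

print(top[0], top[-1])        # 0 49
print(bottom[0], bottom[-1])  # 99 50
```

Each of the 100 model rows therefore gets its own index between 0 and 99, so Tableau treats every point of a polygon as a separate mark.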

2.YearFake: used as a continuous field on « Columns ». It is the abscissa (x-coordinate) of our points, replacing the Year field.

[Path]/49+[Year]

3.ValueCurrent: the current value, on the left edge of each stream segment.

SUM({FIXED [Category],[Year]: MIN([Value])})

4.ValueNext: the value for the following year, on the right edge of each stream segment.

IFNULL(LOOKUP([3.ValueCurrent],1),0)

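5.ValueX: centers the curve, using the largest yearly total as the reference. This is the new part. The division by 100 presumably compensates for the cross join, which repeats every data row once per model row (2 positions × 50 path points).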

MIN({MAX({FIXED [Year]: SUM([Value])/100})}
-{FIXED [Year]: SUM([Value])/100}/2)
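The centering idea can be sketched in Python as follows. This is an illustration under assumptions, not the Tableau calculation itself: the yearly totals are made-up numbers, and they are taken as already de-duplicated, so the `/100` of the Tableau formula does not appear.

```python
def center_offsets(totals_by_year):
    """Vertical shift per year: largest yearly total minus half of that year's total."""
    peak = max(totals_by_year.values())
    return {year: peak - total / 2 for year, total in totals_by_year.items()}


# Hypothetical yearly totals across all categories.
totals = {2016: 40, 2017: 100, 2018: 60}
offsets = center_offsets(totals)
print(offsets)  # {2016: 80.0, 2017: 50.0, 2018: 70.0}

# The middle of each year's stack (offset + half the total) is the same everywhere.
centers = {year: offsets[year] + totals[year] / 2 for year in totals}
print(centers)  # {2016: 100.0, 2017: 100.0, 2018: 100.0}
```

Since the middle of every stack lands on the same height, the streams spread symmetrically around a horizontal axis, which is what separates a Stream Graph from a regular stacked area chart.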

6.Rank: rank of the current value. This calculation matters for the aesthetics of the area bump chart but not for our Stream Graph, so you can replace it with « 0 ».

// RANK([3.ValueCurrent]) in the area bump chart
0

7.TopCurrent: top position for the current year. It also stacks the streams on top of each other (thanks to RUNNING_SUM). This calculation is important and is reused in the calculations that follow.

RUNNING_SUM([3.ValueCurrent])
+((SIZE()-[6.Rank])/100)+[5.ValueX]

8.TopNext : next top position.

LOOKUP([7.TopCurrent],1)

9.BottomCurrent : initial bottom position.

[7.TopCurrent]-[3.ValueCurrent]

10.BottomNext : next bottom position.

IFNULL([8.TopNext]-[4.ValueNext],0)
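Calculations 7 to 10 boil down to stacking the categories on top of the centering offset. Here is a minimal Python sketch for a single year, with hypothetical values. It deliberately leaves out the small `(SIZE()-[6.Rank])/100` term, which is a constant shift once the rank is replaced by 0.

```python
from itertools import accumulate


def stack(values, offset=0.0):
    """Return (tops, bottoms) of each category's band for one year."""
    tops = [offset + running for running in accumulate(values)]  # like RUNNING_SUM + 5.ValueX
    bottoms = [top - value for top, value in zip(tops, values)]  # like 9.BottomCurrent
    return tops, bottoms


# Hypothetical values for three categories in one year, with an offset of 5.
tops, bottoms = stack([10, 30, 20], offset=5.0)
print(tops)     # [15.0, 45.0, 65.0]
print(bottoms)  # [5.0, 15.0, 45.0]
```

The bottom of each band equals the top of the band below it, so the streams touch without overlapping. The "next" versions (8 and 10) are the same quantities read from the following year.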

11.Curve: adds the organic effect to the intermediate points, thanks to the sigmoid function. This is the ordinate (y-coordinate) of our points.

IF MIN([Position])=1 THEN
[7.TopCurrent]+(([8.TopNext]-[7.TopCurrent])*MIN([Sigmoid]))
ELSEIF MIN([Position])=2 THEN
[9.BottomCurrent]+(([10.BottomNext]-[9.BottomCurrent])*MIN([Sigmoid]))
END
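The interpolation performed by this calculation can be written in a few lines of Python. The numbers are hypothetical, and the sigmoid weight is assumed to go from 0 (current year) to 1 (next year):

```python
def curve(current, nxt, sigmoid_weight):
    """Blend between the current and next positions using the sigmoid weight."""
    return current + (nxt - current) * sigmoid_weight


# Hypothetical top edge moving from 10 (this year) to 20 (next year).
print(curve(10, 20, 0.0))  # 10.0
print(curve(10, 20, 0.5))  # 15.0
print(curve(10, 20, 1.0))  # 20.0
```

Because the sigmoid changes slowly near 0 and 1 and quickly in the middle, the edge leaves the current year flat, bends in between, and arrives flat at the next year.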

12.PathPoint: tells Tableau in which order to connect the points when drawing each polygon.

IF [Position]=1
THEN [2.YearFake]
ELSEIF [Position]=2
THEN {MAX([Year])}+({MAX([Year])}-[2.YearFake])
ELSE [2.YearFake]+{MAX([Year])}
END
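To see why this produces a closed outline, here is a Python sketch of 2.YearFake and 12.PathPoint together. The years are hypothetical, and Path is assumed to run from 0 to 49.

```python
def year_fake(year, path):
    """Mirror of 2.YearFake: spreads the 50 path points between year and year + 1."""
    return year + path / 49


def path_point(position, year, path, max_year):
    """Mirror of 12.PathPoint: sort key that orders the points around the polygon."""
    x = year_fake(year, path)
    if position == 1:
        return x                          # top edge, left to right
    if position == 2:
        return max_year + (max_year - x)  # bottom edge, right to left
    return x + max_year


# One segment, 2016 -> 2017, in a dataset whose last year is 2018.
points = [(pos, path) for pos in (1, 2) for path in range(50)]
ordered = sorted(points, key=lambda p: path_point(p[0], 2016, p[1], 2018))

print(ordered[0], ordered[49])   # (1, 0) (1, 49)
print(ordered[50], ordered[99])  # (2, 49) (2, 0)
```

Sorted by this key, the path walks along the top edge from left to right and then comes back along the bottom edge from right to left, which is the order needed to trace the polygon.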

@excludeLast: used as a filter to exclude the last segment of the polygons.

LOOKUP(MIN([Year]),0)<>MIN({MAX([Year])})

 

Visualization

Now that the most tedious part is done, we can configure the visualization.

2018-09-15_16h48_47

We place the « Category » and « 1.Partition » fields on « Detail » (plus a field on « Color » if needed). The mark type must be « Polygon ». The 12.PathPoint field goes on « Path ».

We use the @excludeLast filter to remove the last segment of the visualization, which is not needed (all the points located after the last year of our Year field).

 

2018-09-15_16h49_14

2.YearFake and 11.Curve go on Columns and Rows, respectively.

 

Finally, we configure the nested calculations of our calculated field 11.Curve:

(Screenshots: nested calculation settings for 11.Curve)

 

 

Aaaand … VOILÀ! A wild Stream Graph appears (geek reference)! I'll let you adjust the format of the YearFake field and style your visualization.

 

2018-09-15_17h06_19.png
Link to the visualization

 

Once again, I thank Rody for his tutorial, which allowed me to build this visualization and to write this article.
