Chapter 11 Tableau Public

11.1 Introduction

In this chapter we will be analysing and visualising patent data using Tableau Public.

Tableau Public is a free version of Tableau Desktop and provides a very good practical introduction to the use of patent data for analysis and visualisation. In many cases Tableau Public will represent the standard that other open source and free tools will need to meet.

This is a practical demonstration of the use of Tableau in patent analytics. We have created a set of cleaned patent data tables on pizza patents using a sample of 10,000 records from WIPO Patentscope that you can download as a .zip file from here to use during the walkthrough. Details of the cleaning process to reach this stage are provided in the codebook that can be viewed here. The Open Refine walkthrough can be used to generate cleaned files very similar to those used in this walkthrough using your own data. You will not need to clean any data if you use our training set files.

This article will take you through the main features of Tableau Public and the types of analysis and visualisation that can be performed using Tableau. In the process you will be creating something very similar to this workbook.

11.2 Installing Tableau

Tableau can be installed for your operating system by visiting the Tableau Public website and entering your email address as in the image below.

Figure 11.1: Tableau Public Front Page

While you are waiting for the app to download it is a good idea to select Sign Up to create a Tableau Public account as displayed in Figure 11.2. This will allow you to upload your workbooks to the web and share them. We will deal with privacy issues in making workbooks public or private below but, as its name suggests, Tableau Public is not for sensitive commercial information.

Figure 11.2: Sign Up for a Free Tableau Account

This will lead you to a profile page. It will be empty if this is a new account. Paul Oldham’s account, which contains useful example workbooks that will help with learning Tableau, can be accessed here and is displayed in Figure 11.3.

Figure 11.3: Paul Oldham’s Tableau Public Account

While you are there you might want to check out the Gallery of other Tableau Public workbooks to get some ideas on what it is possible to achieve with Tableau. You may want to view a Tableau Workbook for scientific literature that accompanied this PLOS ONE article on synthetic biology. While it is now a few years old it gives an idea of the possibilities of Tableau and the feel of an existing profile page.

Figure 11.4: Tableau Public Gallery

11.3 Getting Started

When you first open the application you will see a blank page. Before we load some data, note the helpful How-to-Videos on the right and the link to a visualisation of the day. There are also quite a lot of training videos here and a very useful community forum. If you get stuck, or wonder how somebody produced a cool visualisation, this is the place to go.

Figure 11.5: Tableau Public Start Page

To avoid staring at a blank page we now need to load some data. In Tableau Public this is limited to text or Excel files. To download the data as a single .zip file click here or visit the GitHub repository. Unzip the file and you will see a collection of .csv files. The Excel file and codebook are supplementary and can be ignored.

Figure 11.6: Access the Datasets

As we can see above there are a number of files in this dataset. The core or reference file is pizza.csv. All other files are children of that file, such as applicants, inventors and international patent classification codes. That is, concatenated fields in pizza have been separated out and cleaned up. One file, applicants_ipc, is a child file of applicants that will allow us to access IPC information for individual applicants. This may not make a lot of sense at the moment but don’t worry, it will shortly.
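
To make the parent and child relationship concrete, here is a minimal Python sketch of how a concatenated field can be separated into a child table. The column names (`publication_number`, `applicants`), the semicolon separator and the sample rows are assumptions made for illustration; they are not taken from the pizza files or from the codebook.

```python
import csv
import io

# Hypothetical parent table: one row per patent document, with the
# applicant names concatenated in a single field (separator assumed to be ";").
parent_csv = """publication_number,applicants
US1,Acme Foods; Pizza Co
US2,Pizza Co
US3,Acme Foods;  Oven Works
"""

def split_child_rows(rows, key, field, sep=";"):
    """Return one child row per (key, individual name) pair."""
    child = []
    for row in rows:
        for name in row[field].split(sep):
            name = name.strip()  # remove white space left over from the split
            if name:
                child.append({key: row[key], field: name})
    return child

parent_rows = list(csv.DictReader(io.StringIO(parent_csv)))
child_rows = split_child_rows(parent_rows, "publication_number", "applicants")

for row in child_rows:
    print(row["publication_number"], row["applicants"])
```

Each child row keeps the key of its parent record, which is what allows a child table to be linked back to the core file later on.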

To get started we will select Text file then pizza.csv file for import as in Figure 11.7.

Figure 11.7: Select the pizza csv file for import

We will then see a new screen showing some of the data and the other files in the folder as in Figure 11.8. At the bottom is a flag with Go to Worksheet or Sheet 1, so let’s do that.

Figure 11.8: Data Source View in Tableau Public

We will now see a screen that is divided into Tables on the left, with Measures below. We can see that the dimensions include quite a large number of data fields. Tableau will attempt to guess the type of data (for example, numeric or date information is marked with #, geographic data is marked with a globe and text fields are marked with Abc). Tableau does not always get this right, and it is possible to change a data type by selecting a field and right clicking as we can see below.

Figure 11.9: Exploring Tableau Tables and Measures
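
To see why guessing a data type from the values alone can go wrong, consider the following Python sketch. It is not Tableau's detection logic, which is not described in this chapter; the function `guess_type` and its rules are assumptions made purely for illustration.

```python
from datetime import datetime

def guess_type(values):
    """Guess a column type from its values: 'number', 'date' or 'text'."""
    def is_number(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    def is_date(value):
        try:
            datetime.strptime(value, "%Y-%m-%d")
            return True
        except ValueError:
            return False

    non_empty = [v for v in values if v.strip()]
    if non_empty and all(is_number(v) for v in non_empty):
        return "number"
    if non_empty and all(is_date(v) for v in non_empty):
        return "date"
    return "text"

print(guess_type(["2012", "2013"]))              # a year column looks numeric
print(guess_type(["2012-05-01", "2013-01-15"]))  # ISO dates
print(guess_type(["A21D", "A23L"]))              # classification codes
```

A publication year is a good example: the values look like numbers, so a rule of this kind treats the field as a quantity to be summed rather than as a date, which is the sort of guess that has to be corrected by hand.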

On the right hand side we can see a floating panel menu. This can be hidden as a menu bar by clicking the x. This panel displays the visualisation options that are available for the data field that we have selected. In this case two map options are available because Tableau has automatically recognised the country names as geographic information. Note that persuading Tableau to present the option that you want (for example visualising year on year data as a line graph) can involve changing the settings for the field until the option you want becomes available.

At the bottom of the screen we will see a worksheet number Sheet 1 and then options for adding three types of sheet:

  1. A New Worksheet
  2. A New Dashboard
  3. A New Story

For the moment we will focus on building worksheets with the data and then move into creating Dashboards and then Stories around our pizza data.

11.5 Adding New Data Sources

We will follow the same procedure that we used for applicants to add the remaining files as data sources. We will add the following four files (as they appear in the folder in alphabetical order).

  1. applicants_ipc.csv
  2. inventors.csv
  3. ipc_class.csv
  4. ipc_subclass_detail.csv

To add the data sources either click the Data menu and New Data Source or (faster) the cylinder with a plus sign. Then select Text file, add each file and allow it to load as in Figure 11.18.

Figure 11.18: Add More Files

If all goes well the Data panel will now contain the following files as in Figure 11.19.

Figure 11.19: Check that the Files are Loaded in the Data Panel

Note here that the applicants data displays a blue tick. This is because it was the last data source that we used and is therefore active. The fields we see in Dimensions belong to that data source. Next click in the bottom menu to create a new worksheet and then click inventors in the Data panel. The field names will now change slightly. It is important to keep an eye on the data source that you are using because it is quite easy to drop a field from one data source onto another. In some cases this is a good thing. However, if you receive a warning message it means that you are attempting to drop a field from one data source onto another data source where there is no matching field. We will come back to this when we discuss data blending.
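
Whether two data sources can be linked comes down to whether they share a field. As a rough check outside Tableau, the following Python sketch compares the header rows of two tables. The field names are hypothetical examples rather than the actual headers of the pizza files, and this is not a description of how Tableau itself detects matching fields.

```python
def shared_fields(header_a, header_b):
    """Return the field names that appear in both headers, in header_a order."""
    other = set(header_b)
    return [name for name in header_a if name in other]

# Hypothetical headers for three data sources.
applicants_header = ["publication_number", "applicants_orgs_all", "publication_year"]
inventors_header = ["publication_number", "inventors_all", "publication_year"]
unrelated_header = ["ipc_class", "description"]

print(shared_fields(applicants_header, inventors_header))
print(shared_fields(applicants_header, unrelated_header))
```

An empty result corresponds to the situation that triggers the warning: there is nothing for the two sources to be matched on.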

Next follow the same procedure used for ranking applicants, this time with inventors, using the Inventors All field. Anyone interested in seeing the dramatic impact of concatenated fields can try dropping the Inventors Original field onto the worksheet.

Using Inventors All you should now see the following ranked list of inventors. To change the format to left align, right click in the inventors panel and choose Format > Alignment > left align as in Figure 11.20.

Figure 11.20: Ranked Inventors
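
The difference between ranking on a separated field and ranking on a concatenated field can be sketched in Python. The inventor names below are invented sample values and the `rank` helper is hypothetical; the point is only to show what counting records per value does in each case.

```python
from collections import Counter

# Hypothetical sample: the same three documents, first with inventors
# concatenated per document, then with one row per individual inventor.
inventors_original = [
    "Smith, John; Rossi, Maria",
    "Rossi, Maria",
    "Smith, John; Rossi, Maria",
]
inventors_all = [
    "Smith, John", "Rossi, Maria",
    "Rossi, Maria",
    "Smith, John", "Rossi, Maria",
]

def rank(values):
    """Count records per value and sort in descending order of count."""
    return Counter(values).most_common()

print(rank(inventors_original))
print(rank(inventors_all))
```

With the concatenated field each distinct combination of names is treated as a separate "inventor", so individuals who appear in several combinations are undercounted. With the separated field each person is counted once per document.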

Now repeat this exercise for the remaining data sources by first creating a sheet and then selecting the data source. As you move through this, select the following dimensions to add to the sheet and then drop number of records onto the sheet.

  1. applicants_ipc. Drop Ipc Subclass Detail onto the sheet. Then drop number of records onto the sheet where the field says Abc. Note that a number 6 will appear in the first row. This is an artifact from the separation process. Select that cell, right click and then choose Exclude.

Do not rank this data, but instead drag the field Applicants Orgs All onto the sheet so that it is the first row (tip, it is easiest to do this by dragging the field into the row bar before the IPC field). You will now see a list of company names followed by a list of IPCs. Congratulations, we now have an idea of who is patenting in a particular area of technology using the word pizza at the level of individual applicants.

Add a new sheet. Then click on ipc_subclass_detail. Note that if you click on the data source first, the dimensions panel will go orange. Don’t panic. The reason is that Tableau thinks you are trying to blend data from the ipc_subclass_detail source with applicants_ipc. If this happens, simply click on ipc_subclass_detail again.

  2. ipc_subclass_detail. Drop the Ipc Subclass Detail dimension onto the sheet. Then drop the number of records onto the sheet. Then click on the first cell containing 6 as an artifact and exclude it. Repeat for 7. Then select the bar chart in the floating Show Me panel, then drag ipc_subclass_details.csv onto the Label button. Now rank the column using the descending button in the upper menu as before.

At this point, if we had not trimmed the leading white space the ranked list would display indentations and there would be duplicates of the same IPC code. For that reason it is important to trim leading white space before attempting to visualise data (and this applies to all our separated fields).
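
The effect of leading white space on counts is easy to reproduce. In this short Python sketch the IPC values are invented examples of what a field can look like immediately after a concatenated field has been split on a separator.

```python
from collections import Counter

# Hypothetical IPC values straight after splitting: some carry a leading space.
raw_codes = ["A21D 13/00", " A21D 13/00", "A23L 1/00", " A21D 13/00"]

untrimmed = Counter(raw_codes)
trimmed = Counter(code.strip() for code in raw_codes)

print(untrimmed)  # "A21D 13/00" and " A21D 13/00" are counted separately
print(trimmed)    # after trimming they collapse into a single entry
```

Without trimming, the same code appears twice in the ranked list with its count split across the two versions, which is the duplication described above.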

11.6 Creating an Overview Dashboard

You should now have five worksheets, each of which displays aspects of our core pizza set. We have named the sheets as follows and suggest that you might want to do the same. Note that where there is more than one sheet containing similar but distinct information it will be helpful to give them distinct names (e.g. IPC Subclass and Applicants IPC Subclasses). We might even start using less technical labels by calling the IPC something clearer like Technology Area, to aid communication with non-IP specialists.

Figure 11.21: Add and Format the IPCs as Technology Areas

Let’s get a quick overview of the data so far. Next to the add worksheet button in the worksheets bar is a second icon to create a dashboard. Click on that and we will now see a sheet called Dashboard 1. Dashboards are perhaps Tableau’s best known feature and are rightly very popular. We can fill our dashboard by dragging the worksheets from the Dashboard side menu. The order in which we do this can make later adjustments easier or more difficult. Let’s do it in the following steps:

  1. Drag Trends onto the dashboard and it will now fill the view.
  2. Drag Organisations onto the dashboard.

That is rather messy, but all is not lost. On the left hand side look for Size and click the dropdown menu button. This reveals that Tableau can display dashboards in a range of sizes. Choose Automatic as in Figure 11.22.

Figure 11.22: Choose Size and Automatic to Resize the Dashboard

Now select the top of the organisations box and a small inverted triangle will appear. Click on that and then choose Fit > Fit Width.

Figure 11.23: Resize a column to fit

The bars may now disappear. Click into the box on the line where the bars start and drag them back into view. At this point long names may start to be obscured. If desired, right click on a long name such as Graphic Packaging International, choose Edit alias and edit it down to something sensible such as Graphic Packaging Int.

We now have two panels on the dashboard. Let’s add two more. First drag Technology Areas below the line where Trends and Organisations finish. Grey shaded boxes will appear that show the placement; across the width is fine. This can take some time to get right. When the whole of the bottom area is highlighted, let go of the mouse. If it goes somewhere strange, either select the box and press x in the top right to remove it, or try moving it (in our experience it is often easier to remove it and try again).

Do not try to format this box yet. Instead, grab inventors and drag it into the space before the Technology Areas.

We now have four panels in the dashboard but they need some tidying up. First, in the two boxes we have just edited repeat the Fit Width exercise and then drag the line for the bars around until they are in view and satisfactory. Next, we have names such as Applicants Orgs All that are our internal reference names. Click on them in each of the three panels one at a time and select Hide Field Labels for Rows.

Hmm… our Technology Areas panel is proving troublesome because even the edited version of the IPC is rather long.

Before we do any editing, first experiment with the Size menu in the bottom right. The default dashboard size in Tableau Public is actually quite small. Change the settings until you have something that looks cleaner even if there are still some overlaps. Options such as Desktop, Laptop and Large blog are generally decent sizes but in part the decision depends on where you believe it will be displayed.

To fix the long Technology Areas labels we go back to the original sheet (tip: if you move the mouse to the top right of the panel an arrow with Go to Sheet will appear; it is very useful for large workbooks). Inside the original sheet, try dragging the line separating the text and bars so that the bars now cover some of the longer text. Then switch back to the dashboard. If you are unhappy with the result, right click in the panel in the dashboard and choose Edit alias. This is useful for simply making labels in the view more visible (it does not change the original data).
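
The idea behind an alias can be sketched in a few lines of Python: a lookup table changes the label that is displayed while the underlying value stays the same. This is only an illustration of the concept and not how Tableau implements aliases. The function name `display_label` and the second applicant name are hypothetical; the first name and its shortened form are the ones used earlier in this section.

```python
# Alias table: original value -> shorter label shown in the view.
aliases = {"Graphic Packaging International": "Graphic Packaging Int"}

def display_label(value, alias_table):
    """Return the alias for a value if one exists, otherwise the value itself."""
    return alias_table.get(value, value)

# "Example Foods Ltd" is an invented name with no alias defined.
applicants = ["Graphic Packaging International", "Example Foods Ltd"]
labels = [display_label(name, aliases) for name in applicants]

print(labels)      # shortened labels for display
print(applicants)  # the original data is unchanged
```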

If all goes well you will now have a dashboard that looks more or less like Figure 11.24. Note that depending on the worksheet settings you may want to make the font size consistent (right click and choose Format, then font size). Note also that if you increase the font size (the default is 8 point) then you may need to edit some of the labels again.

Figure 11.24: A Completed Tableau Dashboard

We have now done quite a lot of work and produced an Overview dashboard. It is time to save the workbook to the server before doing anything else.

11.7 Saving, Display and Privacy Settings

The only option for saving a Tableau Public workbook is to save it online. To save the file go to File and Save to Tableau Public. If you want to save the workbook as a new file (after previously saving) then choose Save to Tableau Public As. If you cannot see this option in the File menu it means that you have downloaded a trial of Tableau Desktop by accident rather than Tableau Public. All is not lost. Before pushing to Tableau Public online you will need to right click each of the worksheets in Data, choose Extract Data and press OK. Then go to Server > Tableau Public > Save to Tableau Public.

Figure 11.25: Save to Tableau Public

You will then be asked to enter your username and password (Tableau does not remember the password) and the file will upload. Tableau will then compress the data. As of June 2015 it is possible to store 10GB of data overall and to have up to 10 million rows in a workbook (which is generally more than enough).
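
If you are working with your own, larger datasets it can be worth checking the number of rows before uploading. The Python sketch below counts the data rows in a CSV file and compares the result with the row limit quoted above. The sample content is invented, and the limit is simply the June 2015 figure from the text, so treat it as an assumption that may since have changed.

```python
import csv
import io

ROW_LIMIT = 10_000_000  # figure quoted in the text (as of June 2015)

def count_data_rows(handle):
    """Count the rows in an open CSV file, excluding the header row."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row if there is one
    return sum(1 for _ in reader)

# Invented sample standing in for a real file.
sample = io.StringIO("publication_number,title\nUS1,Pizza oven\nUS2,Pizza box\n")
rows = count_data_rows(sample)
print(rows, "rows:", "within limit" if rows <= ROW_LIMIT else "over limit")
```

For a file on disk the same function can be used with `open("pizza.csv", newline="", encoding="utf-8")` in place of the in-memory sample.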

Tableau will then open a web browser at your profile page and it will look a lot like Figure 11.26.

Figure 11.26: Publish to Tableau Online

Do you notice anything strange? Yes, we can only see the Dashboard and not any of the other sheets. To change this and any other details click on the small pen icon next to Details near the title and some menus will open up as follows.

Figure 11.27: Edit Details in Tableau Public

To make sure the worksheets are visible go to the Settings icon on the top right and select Show Sheets as in Figure 11.28.

Figure 11.28: Show All Sheets

To access this demonstration workbook go here.

11.8 Privacy and Security

As emphasised above, Tableau Public is by definition a place for publicly sharing workbooks and visualisations. It is not for sensitive data. In the past, users such as journalists relied on what might be called ‘security by obscurity’, but the trend towards storing data on a Tableau Public profile (the only option) makes that less of an option. Logically, the answer to any concerns about Tableau Public and sensitive information is not to include sensitive information in the first place.

In the latest edition of Tableau Public it is not possible to stop others from viewing the work that you have made visible. However, as Figure 11.28 demonstrates, the button menu called Allow Access lets you restrict the ability of others to download and share a workbook. Tableau Public is fundamentally about sharing information with others through visualisation, so do not include confidential information in what you share. Here it is briefly worth returning to the completed dashboard above and clicking the share button.

Figure 11.29: Sharing A Tableau Public Workbook

As we can see here, Tableau generates embed codes for use on websites or for emailing as a link, along with options for sharing on Twitter and Facebook.

Finally, note that you can also Edit the workbook online by selecting the Edit icon above your dashboard as in Figure 11.30. When you have finished editing, press Publish to save the edits.

Figure 11.30: Edit a Tableau Workbook Online

11.9 Round Up

In this chapter we have introduced the visualisation of patent data using a set of nearly 10,000 patent documents from WIPO Patentscope that mention pizza. As should by now be clear Tableau Public is a very powerful free tool for data visualisation. It requires attention to detail and care in construction but is one of the best free tools that is out there for visualisation and dashboarding.

To take your work with Tableau on pizza patents forward on your own, here are some tips.

  1. You already know how to use Tableau to create a map of publication countries.
  2. The pizza source file contains a set of publication numbers. Try a) creating a visualisation with the publication numbers, b) looking in the pizza source file for a set of URLs and then exploring what can be done with Worksheet > Action with those URLs.
  3. In dashboards consider using one field as a filter for another field (such as applicant and title). What data source or data sources would you need to do that?
  4. What kinds of stories does the pizza data tell us and how might we visualise them using the information provided on applicants and its subset Applicants IPCs?

If you get stuck, and it does take time to become familiar with Tableau’s potential, perhaps try exploring this workbook on synthetic biology and the use of Tableau images in this PLOS ONE article. As a tip, try clicking on the bars and then the titles to understand Actions. Downloading workbooks prepared by others can be a very good way of learning the tips and tricks of Tableau visualisation and dashboarding.

If you would like to download the pizza workbook it is here.

However, one of the most important issues exposed by working with Tableau is that you must ensure that the fields you want to visualise are tidy (that is, not concatenated) and that they are as clean as it is reasonable to make them. For researchers wishing to work up their own data we suggest the Open Refine article as a good starting point.