Becoming a Tableau Desktop Specialist – Part 1: Create and Save Data Connections
Sep 3, 2019Christine Rietmann
This week we will learn how to create and save data connections. This is part of the first skill measured in the Tableau Desktop Specialist Exam: Connecting to & Preparing Data.
This blog is a part of the “Becoming a Tableau Desktop Specialist Series”. Check out the introduction to learn more about it:
First of all, when you open Tableau, you can see on the left side your options for connecting a data source in the Connect Pane. You can choose between connecting to a file or a server (click “More…” to see all opportunities). On the bottom of the connect pane you can find the saved data sources. (Found on this Tableau Help page.)
Next skill measured is to explain the differences between using live connections versus extracts. For this topic, there are a lot of information available and authors explain it very well. I couldn’t find better words for it so, let me just quote the essence of what I’ve found and show you where you can find more.
“Live connections offer the convenience of real-time updates, with any changes in the data source reflected in Tableau. But live connections also rely on the database for all queries. And unlike extracts, databases are not always optimized for fast performance. With live connections, your data queries are only as fast as the database itself.” – Diego Medrano
“A Tableau data extract is a compressed snapshot of data stored on disk and loaded into memory as required to render a Tableau viz” – Gordon Rose
“When you create an extract from a local file (such as a .csv or an Excel workbook) or an on-premise database, you’re speeding up the workbook through optimization. As a result, Tableau doesn’t need the database to build the visualization. Instead, Tableau’s in-memory data engine queries the extract directly.”- Diego Medrano
The biggest advantage of a data extract is that tableau can work faster especially when the visualization is complex and includes a big dataset, a lot of calculations, filters, etc. In return, you lose the real-time updates. This is why you have to refresh your data.
Zen Master Ryan Sleeper indicates in his book ‘Practical Tableau’ that you don’t need an extract, when your dataset is held on powerful, in-memory databases. In this case, working with live connections is recommended.
The explanations include two words I didn’t understand immediately: ‘On-premise database’ and ‘in-memory data engine’. For those of you who also have no clear understanding of these terms, maybe the further notes will help.
‘on-premise’ or ‘on-prem’ refers to a database which is managed and maintained locally from a company itself. An opposite of an on-premises database is a cloud database. In this case, the access to the database is provided as a service.
When trying to find out what ‘in-memory data engine‘ means, google only gives me explanations for the ‘in-memory database‘ back. Sounds good, but I was not sure if the database and data engine are used synonymously. Luckily, I found a statement that says: A data engine is a “purpose-built analytic database for extracts”.
This is why I conclude that the explanation for in-memory databases will also fit for the in-memory data engine.
The main difference between the in-memory database and a traditional (on-prem) database is that the in-memory database is stored in a computer’s main memory (RAM) contrary to a traditional database stored on disk drives. This is the reason why the extract loads faster: the access to RAM is faster than to a disk drive.
Find my additional sources for this paragraph below:
1. Explanation of ‘on-premises’ by hpe
2. Explanation of ‘cloud database’ by wikipedia
3. ‘In-memory database’ by FlipHTML5
4. Difference between in-memory database and a traditional database by ramia
Now it is time to learn how to create an extract in Tableau. There are two options to create an extract. The first one you can see at the upper right corner on the data editing interface.
For the second one, you have to open the authoring interface. Now you can click on the data source and choose ‘Extract Data…’. This option also offers opportunities to filter your data.
After creating the extract, the icon changed from one cylinder to two cylinders with an arrow on it. Every time you see this symbol you now know that the data has been extracted into a faster data engine.
Find all the information about creating extracts also in this video from the Tableau Website.
Finally our last topic for today is how to save metadata properties in a TDS.
First of all, we need to know what metadata exactly means. In general, metadata defined as “data that provides information about other data”.
Tableau makes a great job and recognizes a lot of details about our data. For example, Tableau often allocates the right datatype like the geographic role automatically.
But there are still some things for us to do to organize our data. I found the following list for the most popular examples:
A TDS file will save all your changes on the metadata.
Sources I used for this paragraph:
1. Definition of metadata by web.archieve
2. List of the examples by Interworks
3. Tableau Metadata Model by Tableau Software
Let’s come to the main topic we want to discuss: How to save the data source with all the metadata. Therefore, you can click on the data source itself and choose ‘Add to Saved Data Sources…’.
Now Tableau will save the connection as a .tds. If you chose the default directory, Tableau would show the data source on the bottom of the connect pane. But you can of course choose any other directory to save your .tds file.
Note: “This is not the data itself, just the information about the connection”. The consequence is that if you change the place where you store the data, the connection won’t work anymore. You have to locate the original data source in your .tds before you can start working with it.
Another option to save your data sources is to publish it to a Tableau Server or Tableau Online. Therefore, you click on server-menu and say publish data-source. Then choose the data source you want to publish.
A detailed description of how to publish a data source you can find here on the Tableau Website.
Additional Information from the Tableau Website: Look here to get a good overview of file types and folders.
I hope you gained a good understanding of how to create and save data connections. Let me know in the comments below (or reach out to me on Twitter) if you have any suggestions for improvements.
Stay tuned – next week we will go on!
new to learn
Great Jpb !! and Thanks a Lot !!
Thanks Christine for your Wonderful Explaination !! Clear right to the point
The way that you explain is easy to understand and very detailed with visual examples. Thank you so much.
Christine Rietmann says:
Thanks for your feedback. I’m glad you liked it.
Great series. Thanks Marian for the shout out in the comments. I am giving a presentation on this exam next week and will include mentioning your series. Thanks!
Christine Rietmann says:
Thanks, Michael. That’s great. I would be happy if some listeners can benefit from my series.
I will definitely watch your video with all the great advice on the exam again before I take my exam.
I just did an internal presentation about the exam at work, and have added a link to your blog series in it.
Right?!? Having this when I was preparing for my exam would have saved me a lot of time. So cool that Christine is not only writing this but sharing with everyone else.
Christine Rietmann says:
Thanks, Marian for your kind words.
I am doing it in the same way you do. So I learned a lot from blogs like yours and hopefully, some people can now learn something from my blogs 🙂