Earlier this year, Tableau, a leader in data visualization and modern business intelligence and analytics platforms, launched a new product called Tableau Prep. Tableau Prep is a data preparation tool used to combine, shape, and clean your data for analysis in Tableau. While the company has made continuous improvements over the years in Tableau Desktop’s ability to connect to, reformat, and integrate data, Tableau Prep combines those capabilities – and more – in a more robust, visual, and easy-to-use package. The tool is designed as a preface to Tableau analytics development, (via either Desktop or Tableau Server) but can be a standalone for users just looking for integrated, cleaned up flat files for other purposes.
Keep in mind Tableau Prep is geared for business analysts and other non-technical audiences, and is not a fully-enabled ETL (Extract, Transform, Load) software platform. Other modern tools such as Talend and WhereScape, among others, are options that can support your needs on that front as well.
To understand how Tableau Prep can help you quickly and confidently combine, shape, and clean your data, let’s walk through a sample use case and study Tableau Prep’s capabilities and limitations. Given it’s back-to-school time across the country, we chose to work with public school data, and in particular from Florida (my homebase). Following the aftermath of the tragic Parkland school shooting in South Florida on February 14th, the state government passed SB 7026 which allocated $400 million in spending to improve safety, security, and mental health in schools across the state’s 67 county-level districts. So while changes are put into effect, we set out to analyze where those dollars are now being spent, in the context of recent school performance, enrollment, staffing, discipline, crime rates, and other factors.
For more information about SB 7026, visit http://www.fldoe.org/safe-schools/.
The publicly available data on this subject came in the form of Excel files and PDFs; none of it formatted in a Tableau-friendly manner. At face value, this is a common challenge faced by many of you and that’s where Tableau Prep comes in handy. For this use case, the focus will be on the Tableau Prep experience and how the tool enabled the below analytics.
Use Case: Florida Schools & Post-Parkland SB 7026 Expenditures
To start, we accessed and downloaded an array of publicly available data from the State of Florida Department of Education; this district-level data was in five Excel files and two PDF files (see below).
Connecting to Excel Files
We then connected to each of the Excel files, using the Data Interpreter option, and unchecked fields which weren’t relevant to our analysis (to narrow the breadth of data and improve performance) or that we would want to calculate in Tableau (e.g., percentages). We also changed the field names to be consistent across sources (e.g., District Number vs. #, District Name vs. District).
Before & After Data Interpreter:
PDF Alternate Approach
At this time, Tableau Prep does not yet connect to PDF files (see below) so we used Tableau Desktop to convert those two files into Hyper Extracts, which we then brought into Prep.
Tableau Prep enables users to quickly perform a wide range of actions to improve data quality and usability. These data files did not require much cleansing, but we did take some action easily with Prep. Case in point: the two PDF files had District names and numbers combined; an automatic split carved the field into two, which we then renamed using our normal convention. We also excluded null and blank rows; all actions are shown in the “changes” field, which makes review quick and easy. One major advantage of Tableau Prep is that it can group, replace, and make other data quality fixes smoothly and much faster. Another sleek feature is the automatic data profiling, which saves the time it previously took us to do the same in Tableau Desktop. Data discovery is at your fingertips.
Most of the files had fields which needed to be pivoted for Tableau analytics. Prep makes this fairly simple in terms of multi-selecting and dropping them into a Pivoted Fields window. However, in the case of the discipline file, once ethnicity was pivoted we had to add a second pivot for gender. Note that Prep can also perform aggregations based on your data’s dimensions. In this case, we chose to keep the data at its finest level of granularity.
We're halfway there...
Once the data is sufficiently cleaned up, Prep allows us to rapidly “drag and drop” objects together to form joins, or unions, if and where appropriate. While these can be sophisticated, our use case and data required straight-forward inner joins based on District Number; together these yielded a data set of over a million rows (see below). The exact row count is not yet apparent as Prep samples the rows resulting from our last join (notice the “Sampled” label). The expansion in row count from the original 72 necessitates the use of Level-of-Detail calculations within our Tableau analytics (more on that next time). Tableau Prep provides a much higher level of visibility into the nature and results of joins and unions than Tableau Desktop – this is one of Prep’s inherent strengths.
Output to Hyper Extracts
After the six joins to tie together the seven data sources are set, we land at this Tableau Prep Flow:
It’s time to export the data into a Hyper extract file for Tableau Desktop or directly publish it to Tableau Server (or to a csv file). A few simple clicks to add the output, run the flow, and voila – the data is ready for action! Almost 1.8 million rows were the result of integrating these seven data sources in minutes.
We’ll discuss the next phase of our Florida Schools analysis through a follow-up post that will hit on the latest features in Tableau Desktop (e.g., Extensions) to tie it all together. As you see, Tableau Prep is a useful tool to directly filter, combine, and clean data—regardless of its size. Feel free to reach out to us with any questions – we’re happy to help you in your Tableau journey!
Don’t want to miss the follow-up? Subscribe to our mailing list.
Data Prep Best Practices: https://www.tableau.com/learn/whitepapers/data-prep-best-practices.
Licensing and Installation: https:/onlinehelp.tableau.com/current/desktopdeploy/en-us/desktop_deploy_tableau_prep.htm
Tableau Prep and Tableau Customer Support: https://www.tableau.com/support/prep.