How many statisticians does it take to build a new data model? According to Tableau Software, none. The company says that the upcoming version of its widely used analytics tool will do all the work itself.
Last week in New Orleans, Tableau presented a new feature called Ask Data, which lets users create visualizations by describing what they want in natural language. The company also introduced new automation functions in its data preparation tool.
It’s part of a growing trend in enterprise software: automating and simplifying tasks that used to require specialized skills. As a result, enterprises can use their data more effectively and reassign qualified employees to less routine work.
AI’s first steps in BI
Advances in AI have made it easier for enterprise software to accept input in natural language, whether spoken or typed. Users no longer have to issue specific commands or manipulate objects on the screen; instead, the software interprets the request and displays the relevant information. AI technologies are becoming increasingly common in key BI tools, encouraging the “democratization” of analytics and data science.
Microsoft Power BI, one of Tableau’s major competitors, introduced its “Ask a question about your data” feature a few years ago. However, even the latest versions appear less forgiving of grammar and spelling than Tableau’s Ask Data. Still, both tools are a step forward compared with Dundas BI and other products that rely solely on drag-and-drop to create visualizations.
With Ask Data, the software figures out how to join database tables, which columns to select, and what operations to perform to produce the required answer. This and other new functions will appear in Tableau 2019.1, due for release in early 2019; a beta version came out in late October 2018.
According to Martha Bennett, a lead analyst at Forrester, data specialists spend up to 80% of their time preparing data. If preparation took less time, people could focus on the BI work that really matters to the business.
One way to fix the problem is to delegate a bigger share of the work to machines. Another is to simplify data processing itself, also referred to as the “democratization of data”: making it accessible to people who previously lacked the specialized skills.
Drawbacks of using AI
“Providing data access to a larger number of employees comes with certain risks. Data can’t replace domain expertise and sound situational analysis,” notes Bennett.
“Before making new automation functions widely available, CIOs must test them to find out whether they’re suitable or not,” says Bennett.
Tools that analyze data without giving precise guidelines may leave users unsure what to do next.
“Unless you provide a person with detailed instructions, don’t expect them to do everything right the first time.”
— Martha Bennett, lead analyst at Forrester
At the same time, you can’t simply push the responsibility to the software.
“Automation is not the same as control. It still requires supervision. In court, it will sound ridiculous if you say that it was the computer, not you, that did the wrong thing,” warns Bennett. This is the problem known as the “AI black box.”
Plus, you need to find out whether your data are fit for the automation tool. For example, machine learning systems need lots of data to work with.
“If you apply machine learning algorithms to data where exceptions outnumber normal records, this won’t work,” says Bennett.
The Ask Data demo
In New Orleans, Andrew Vigneault, Tableau’s visual analytics manager, used a Kickstarter database to showcase Ask Data. He demonstrated that, unlike a compiler, Ask Data doesn’t require perfect punctuation.
The program transformed Vigneault’s request “whats the total funding” into “sum of Funding” and returned the answer. When Vigneault added the words “by year” and “by status”, Ask Data expanded the request to “sum of Funding by Deadline’s year and by Status.” Then, without further input, the program created a colored line chart: green showed the funding of successful projects growing each year, while red, orange, and yellow showed the funding of failed, canceled, and suspended projects staying flat.
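Behind the scenes, a request like “sum of Funding by Deadline’s year and by Status” maps onto an ordinary group-and-aggregate operation. Here is a minimal sketch in Python with pandas, using made-up Kickstarter-style rows; the column names follow the demo but are assumptions, not Tableau’s actual schema:

```python
import pandas as pd

# Made-up Kickstarter-style data, for illustration only.
df = pd.DataFrame({
    "Funding":  [5000, 12000, 800, 3000, 7000],
    "Deadline": pd.to_datetime(["2016-05-01", "2017-03-15", "2016-09-30",
                                "2017-11-20", "2018-01-10"]),
    "Status":   ["successful", "successful", "failed", "canceled", "successful"],
})

# "sum of Funding by Deadline's year and by Status"
result = (
    df.groupby([df["Deadline"].dt.year.rename("Year"), "Status"])["Funding"]
      .sum()
      .reset_index()
)
print(result)
```

Each (Year, Status) pair becomes one row of the result, which is exactly the shape a charting layer needs in order to draw one colored line per status.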
When Vigneault asked the system “which categories were successful”, Ask Data added “by Category, filter Status to successful” to the previous request and drew a histogram that ranked Kickstarter categories by number of successful projects in descending order.
It has long been Tableau’s goal for its software to complete a task even when the user’s request is imprecise, and Vigneault showed the company is close. When he entered “correlate with avg funding”, Ask Data produced a scatter plot of the number of projects against average funding for the tech project subcategories he had viewed earlier.
Some things are still easier to do with a mouse, especially if you type slowly. For example, adding the “fashion” and “games” subcategories to the scatter plot took only four clicks.
New data models
Vigneault’s colleague Tyler Doyle needed only a few clicks to create a new data model that displays the fields Tableau uses for analysis while translating them into SQL queries the underlying database can understand.
“To create a data model, I just click ‘Add related objects.’ You don’t have to think about which tables to use, how they’re related, or whether the join is left or right. The advanced modeling features do all the work for you.”
— Tyler Doyle
But how does the data model know the relations between the tables? It turns out Tableau relies on CIOs, database administrators, and data specialists: for the feature to work, the required relationship information must already be present in the data repository.
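One common form that relationship information takes is foreign-key metadata declared in the database itself. A minimal sketch with Python’s built-in sqlite3 module, using illustrative table names (not Tableau’s), shows how a modeling tool can discover joins from metadata alone:

```python
import sqlite3

# Illustrative schema: the foreign key on projects.category_id is the
# kind of metadata a modeling tool can use to infer joins automatically.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects (
        id INTEGER PRIMARY KEY,
        title TEXT,
        category_id INTEGER REFERENCES categories(id)
    );
""")

# Any client can read the declared relationship without inspecting the data.
fks = con.execute("PRAGMA foreign_key_list(projects)").fetchall()
for _id, _seq, ref_table, col, ref_col, *_rest in fks:
    print(f"projects.{col} -> {ref_table}.{ref_col}")
```

If the relationship had not been declared, the tool would have nothing to go on, which is why the article stresses that administrators must put this information into the repository first.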
Data preparation is another area Tableau is focused on. Zaheera Valani, Tableau’s senior engineering manager, demonstrated how Tableau Prep can automate data cleaning using “roles.” Roles identify fields of a certain kind, such as URLs, email addresses, and geographic data (countries, postal codes). In just a couple of clicks, Tableau Prep can check a field’s contents against the right role, detect invalid entries, and either set them to null or filter them out. The same can be done with user-defined roles, such as enumerated types.
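The “set invalid entries to null” idea can be sketched in a few lines of Python with pandas. The email pattern below is deliberately crude and the whole snippet is only an illustration of the concept; Tableau Prep’s actual validation rules are not public here:

```python
import pandas as pd

# A deliberately simple email check; real-world validation is stricter.
EMAIL_PATTERN = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"

emails = pd.Series(["ada@example.com", "not-an-email", "bob@mail.org", ""])

# Entries that fail the role check are set to null, mirroring the
# "set to null" option; they could equally be filtered out instead.
cleaned = emails.where(emails.str.match(EMAIL_PATTERN), other=pd.NA)
print(cleaned.tolist())
```

Swapping the pattern (or the check function) is all it takes to support another role, such as postal codes or an enumerated type.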
According to Francois Ajenstat, Tableau’s chief product officer, Tableau Prep will be updated on a monthly basis, compared to 3 updates a year for the main Tableau software.
Scheduling is handled by another tool, Tableau Prep Conductor, which is currently in beta. It will let companies automate the preparation of data sources by moving them into Tableau on a chosen schedule. A standalone product, Tableau Prep Conductor will require a separate license and will go on sale in 2019.