Designing BI processes
Effective decision making is impossible without full and detailed information on how your organization is doing. The problem is that most organizations only go as far as annual and quarterly reports, which is not nearly enough.
For effective performance analysis, enterprises are implementing business intelligence (BI) systems. In this article, we’ll share a few useful tips on setting up a BI system for your company. (Truth be told, if we had known a year ago what we know now, it would have saved us a lot of time and effort.)
Store source BI data, not slice data
It’s important to remember that reports and charts aren’t enough. Company managers will need more detailed and elaborate data.
For example, suppose you are asked for last year’s monthly spending of single heterosexual people under 45 from Texas. To answer a question like that, you need not only a table with full user profiles but also a table with their payments.
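To make this concrete, here is a minimal sketch of such a question answered from source tables. The schema (a users table and a payments table), the column names, and the sample rows are our assumptions for illustration, and the filter covers only some of the attributes from the example. A real profile table would more likely store a date of birth than an age.

```python
import sqlite3

def build_demo_db():
    """Create a tiny in-memory database with hypothetical source tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, state TEXT, "
        "age INTEGER, marital_status TEXT)"
    )
    conn.execute(
        "CREATE TABLE payments (user_id INTEGER, paid_at TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
        (1, "TX", 30, "single"),
        (2, "TX", 50, "single"),   # too old for the segment
        (3, "NY", 28, "single"),   # wrong state
    ])
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", [
        (1, "2023-01-15", 10.0),
        (1, "2023-01-20", 5.0),
        (1, "2023-02-03", 7.5),
        (2, "2023-01-10", 99.0),
        (3, "2023-01-11", 42.0),
    ])
    return conn

def monthly_spending(conn, year, state, max_age):
    """Sum payments per month for one segment, straight from source rows."""
    query = """
        SELECT strftime('%m', p.paid_at) AS month, SUM(p.amount) AS total
        FROM payments AS p
        JOIN users AS u ON u.id = p.user_id
        WHERE u.state = ?
          AND u.age < ?
          AND u.marital_status = 'single'
          AND strftime('%Y', p.paid_at) = ?
        GROUP BY month
        ORDER BY month
    """
    return conn.execute(query, (state, max_age, str(year))).fetchall()

if __name__ == "__main__":
    print(monthly_spending(build_demo_db(), 2023, "TX", 45))
```

A pre-built report with monthly totals per state could not answer this, because the age and marital status of each payer are already gone by the time the slice is stored.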
Analyze raw data, not slice data
Avoid pre-aggregation and analyze raw data. Keep in mind that by aggregating data, you’re losing valuable information.
For instance, suppose you need to report how many new contacts people from Chicago establish every day. If you’re working with raw data, you’ll be able to back up your stats with specific records (who made contact with whom and when it happened).
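The difference can be sketched in a few lines. The event fields (city, from_user, to_user, created_at) and the sample events are hypothetical; the point is only that the daily counts are derived from raw rows, which remain available for drill-down.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw contact events, one row per new contact.
SAMPLE_EVENTS = [
    {"city": "Chicago", "from_user": 1, "to_user": 2,
     "created_at": datetime(2024, 3, 1, 9, 0)},
    {"city": "Chicago", "from_user": 3, "to_user": 1,
     "created_at": datetime(2024, 3, 1, 17, 30)},
    {"city": "Boston", "from_user": 4, "to_user": 5,
     "created_at": datetime(2024, 3, 1, 12, 0)},
    {"city": "Chicago", "from_user": 2, "to_user": 3,
     "created_at": datetime(2024, 3, 2, 8, 15)},
]

def contacts_by_day(events, city):
    """Group raw events for one city by ISO date, keeping every row."""
    by_day = defaultdict(list)
    for event in events:
        if event["city"] == city:
            day = event["created_at"].date().isoformat()
            by_day[day].append((event["from_user"], event["to_user"]))
    return dict(by_day)

def daily_counts(by_day):
    """Derive the aggregate from the raw rows instead of storing it."""
    return {day: len(rows) for day, rows in by_day.items()}

if __name__ == "__main__":
    grouped = contacts_by_day(SAMPLE_EVENTS, "Chicago")
    print(daily_counts(grouped))     # the stat for the report
    print(grouped["2024-03-01"])     # the evidence behind one number
```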
Forget the Not-Invented-Here approach
Remember that you’re not the first organization to set up a BI system. For many tasks, there are already smart solutions available. Be sure to make good use of them. As a result, your work will come down to collecting data and configuring analytics programs.
For example, on some projects Iteora implemented the columnar Vectorwise database and the Pentaho analytics tool. As a matter of fact, all we needed to do was import data into the database.
Think about users
The system you’re designing will be used by ordinary managers who may not know sophisticated terms, such as “first time differential.” So make sure your interface is clear and intuitive. Instead of inventing a brand new interface, take a look at the existing solutions and borrow a few ideas.
Many BI tools have demo pages where you can see what the tool can do. Pick a few features and ask the future users of your BI system to test them for user-friendliness.
Don’t wait too long with BI development
Designing, developing and implementing a BI system is, without any doubt, time-consuming and challenging, and you can’t compress the schedule by adding people later, so start early. Remember that 9 women can’t make a baby in 1 month. At Iteora, our BI team includes 7 specialists and 3 consultants.
Forget about normalization
Don’t be afraid to “denormalize” your data. For instance, if you have a table with users and a table with user sign-ins, you can merge them into a single table containing sign-ins for each user. On the one hand, it looks like you’re duplicating data. On the other hand, instead of a sophisticated JOIN operation, you now only have to count unique values, which is a lot simpler.
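Here is a rough sketch of that trade-off, using plain Python structures in place of database tables. The field names (id, city, user_id, signed_in_at) and the sample rows are assumptions for illustration. The merge happens once, when data is loaded; every later question becomes a pass over one wide table.

```python
from collections import defaultdict

# Hypothetical normalized source tables.
USERS = [
    {"id": 1, "city": "Austin"},
    {"id": 2, "city": "Austin"},
    {"id": 3, "city": "Dallas"},
]
SIGN_INS = [
    {"user_id": 1, "signed_in_at": "2024-05-01"},
    {"user_id": 1, "signed_in_at": "2024-05-02"},
    {"user_id": 2, "signed_in_at": "2024-05-02"},
    {"user_id": 3, "signed_in_at": "2024-05-03"},
]

def denormalize(users, sign_ins):
    """Copy user attributes onto every sign-in row (one wide table)."""
    users_by_id = {user["id"]: user for user in users}
    wide = []
    for sign_in in sign_ins:
        user = users_by_id[sign_in["user_id"]]
        wide.append({
            "user_id": user["id"],
            "city": user["city"],  # duplicated on purpose
            "signed_in_at": sign_in["signed_in_at"],
        })
    return wide

def unique_users_by_city(wide_rows):
    """Count distinct users per city without joining anything."""
    seen = defaultdict(set)
    for row in wide_rows:
        seen[row["city"]].add(row["user_id"])
    return {city: len(ids) for city, ids in seen.items()}

if __name__ == "__main__":
    print(unique_users_by_city(denormalize(USERS, SIGN_INS)))
```

Note that user 1 appears twice in the wide table, which is why the count has to be over unique user IDs rather than over rows.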
Collect data asynchronously
When it comes to collecting data on user behavior, our advice is to do it asynchronously. Feel free to use logs or Scribe, whichever you like best. Remember that collecting data on an object must be done without interfering with its behavior. Furthermore, any malfunctions in a BI system must not affect the studied object in any way.
When developing infrastructure for collecting information on user behavior, we knew we’d have to process large amounts of data and collect it all in one database. Any problems with the database had to go unnoticed by the website users. This is why we decided to first register raw data in logs and only then move the data to the database using a separate background script. Later on, logs and parsers were replaced with the Scribe service.
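The log-first approach can be sketched as two independent pieces: a recorder that the website calls, and a loader that runs in the background. This is a simplified illustration, not our production code. The JSON-lines log format, the events table, and all function names are assumptions, and SQLite stands in for the BI database.

```python
import json
import os
import sqlite3
import tempfile

def record_event(log_path, event):
    """Website side: append one JSON line. Never touches the database."""
    try:
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(event) + "\n")
    except OSError:
        # A BI failure must not break the user-facing request.
        pass

def load_log(log_path, conn):
    """Background side: parse the log and insert rows, skipping bad lines."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(user_id INTEGER, action TEXT, ts TEXT)"
    )
    loaded = 0
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            try:
                event = json.loads(line)
                row = (event["user_id"], event["action"], event["ts"])
            except (ValueError, KeyError, TypeError):
                continue  # malformed line: ignore it and keep loading
            conn.execute("INSERT INTO events VALUES (?, ?, ?)", row)
            loaded += 1
    conn.commit()
    return loaded

def run_demo():
    """Write two valid events and one broken line, then load the log."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "events.log")
        record_event(log_path, {"user_id": 1, "action": "login",
                                "ts": "2024-01-01T10:00:00"})
        record_event(log_path, {"user_id": 2, "action": "view",
                                "ts": "2024-01-01T10:00:05"})
        with open(log_path, "a", encoding="utf-8") as log:
            log.write("not json\n")
        conn = sqlite3.connect(":memory:")
        try:
            return load_log(log_path, conn)
        finally:
            conn.close()

if __name__ == "__main__":
    print(run_demo())
```

If the database is down, only load_log fails. The website keeps appending to the log, and the loader can catch up later.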
Track data flows
Create a data flow chart and make sure there are no cycles (feedback loops).
Also make sure that information from the BI system doesn’t leak to the studied objects. For example, suppose that after analyzing data you need to send an email notification to a certain group of users. Don’t use your BI system to create the list of email recipients.
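If the data flow chart is written down as a list of "source feeds target" edges, the check for cycles can be automated. Below is a small sketch using a depth-first search. The node names (website, logs, warehouse, reports, email_list) are made up for illustration. The second graph models the email example: building the recipient list from the BI system closes a loop back to the users being studied.

```python
def find_cycle(flows):
    """Return one cycle as a list of nodes, or None if there is none.

    `flows` maps each node to the nodes it sends data to.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for target in flows.get(node, ()):
            state = color.get(target, WHITE)
            if state == GREY:
                # Reached a node that is already on the current path.
                return path[path.index(target):] + [target]
            if state == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(flows):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# One-way flow: the website is observed, never influenced.
CLEAN_FLOWS = {
    "website": ["logs"],
    "logs": ["warehouse"],
    "warehouse": ["reports"],
}

# The BI warehouse produces a mailing list that changes user behavior.
LEAKY_FLOWS = {
    "website": ["logs"],
    "logs": ["warehouse"],
    "warehouse": ["reports", "email_list"],
    "email_list": ["website"],
}

if __name__ == "__main__":
    print(find_cycle(CLEAN_FLOWS))
    print(find_cycle(LEAKY_FLOWS))
```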
Check your collected data
When implementing a BI system, you need to carefully check incoming data. For example, if you get data on system users, be sure to check the distribution of registration dates, dates of birth, etc. Ideally, you should check the distribution of values in each column and pair of columns.
When you add new data, it’s not uncommon for a column to hold the same value in every row. Most of the time, it’s nothing but a human error: the developer simply forgot to populate the column.
Remember that there is no such thing as excess data. Duplicated data, on the other hand, is very common and needs care. Rather than avoiding duplication from the start, collect the values anyway and compare them with the ones you already have. This helps a lot in detecting system errors.
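Both checks are easy to script. The sketch below uses plain dictionaries as rows. The column names and sample rows are hypothetical, and a real check would run against the database or a data frame instead.

```python
from collections import Counter

# Hypothetical freshly imported rows; "newsletter" was never populated.
ROWS = [
    {"id": 1, "country": "US", "plan": "free", "newsletter": None},
    {"id": 2, "country": "US", "plan": "pro", "newsletter": None},
    {"id": 3, "country": "DE", "plan": "free", "newsletter": None},
]

def column_distribution(rows, column):
    """Count how often each value occurs in one column."""
    return Counter(row.get(column) for row in rows)

def constant_columns(rows):
    """List columns whose value is identical in every row."""
    if len(rows) < 2:
        return []  # a single row says nothing about variation
    return [
        column for column in rows[0]
        if len({row.get(column) for row in rows}) == 1
    ]

if __name__ == "__main__":
    print(column_distribution(ROWS, "country"))
    print(constant_columns(ROWS))
```

A constant column is not always a bug, but it is cheap to flag and worth a question to whoever wrote the import.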
This is how we debugged plenty of errors in user profiles, city data, and financial reports.
Don’t aim for a 100% fit
When comparing data from different sources, don’t aim for a 100% fit. Most of the time, a 95% fit is more than enough. It’s not an accounting system you’re designing, right?
Data discrepancies often have objective causes, e.g. clocks that are not synchronized. Take, for example, the time when a payment is registered in your billing system and in the payment system. On December 31, a difference of just 1 second may put the same payment under different years.
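The December 31 case is easy to reproduce. In the sketch below, the payment IDs, the timestamps, and the 5-second tolerance are all assumptions chosen for illustration. The two sources agree on every payment when timestamps are compared with a tolerance, yet a per-year report would still disagree on one of them.

```python
from datetime import datetime, timedelta

# Hypothetical registration times of the same payments in two systems.
BILLING = {
    "p1": datetime(2023, 12, 31, 23, 59, 59),
    "p2": datetime(2023, 6, 1, 12, 0, 0),
}
GATEWAY = {
    "p1": datetime(2024, 1, 1, 0, 0, 0),   # one second later, next year
    "p2": datetime(2023, 6, 1, 12, 0, 1),
}

def match_rate(billing, gateway, tolerance=timedelta(seconds=5)):
    """Share of billing payments found in the gateway at a close time."""
    if not billing:
        return 1.0
    matched = 0
    for payment_id, billed_at in billing.items():
        paid_at = gateway.get(payment_id)
        if paid_at is not None and abs(billed_at - paid_at) <= tolerance:
            matched += 1
    return matched / len(billing)

def year_mismatches(billing, gateway):
    """Payments present in both sources but filed under different years."""
    return [
        payment_id for payment_id, billed_at in billing.items()
        if payment_id in gateway
        and gateway[payment_id].year != billed_at.year
    ]

if __name__ == "__main__":
    print(match_rate(BILLING, GATEWAY))
    print(year_mismatches(BILLING, GATEWAY))
```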
Feel free to contact Iteora if you have questions about Business Intelligence processes!