Think You Need that Data Warehouse? Think Again.

If you’re involved in data analytics, you’ve probably been exposed to the concept of a data warehouse. A data warehouse brings together a large amount of data from different source systems—think millions and millions of rows—into one centralized place, so it can be systematically synthesized and analyzed while minimizing risk and negative impact on system performance.

But implementing an analytic solution using this traditional data warehouse approach requires you to de-normalize and aggregate the data, primarily to speed up read performance on the underlying database. And it’s as challenging as it probably sounds.

Ask anyone who’s done it, and they’ll tell you building and maintaining a traditional data warehouse can be a long, arduous and expensive project. In many cases, it’s a project that never ends.

Here’s why.

The Long and Winding Road of a Data Warehouse

A data warehouse can take months (if not years) to initially deploy, depending on the size of the source databases and your specific business requirements. Suppose you need to analyze sales trends by various products and product categories. This is the process you’d likely follow to set up the corresponding data warehouse:

  • Acquire ETL and other tools/skillsets to design and build the data warehouse.
  • Design and build a “Sales” fact table by de-normalizing various sales transaction tables.
  • Design and build a “Product” dimension by de-normalizing various product tables.
  • Design and build a “Date” dimension table.
  • Create and resolve dimension IDs in the “Sales” fact table, so that each row references the primary keys of the “Product” and “Date” dimension tables.
  • Build summary or aggregate tables by various dimensional attributes, like Year and Quarter, to improve query and report performance.

Don’t let the above concise, bulleted list fool you—this data warehouse-driven process will probably take months.
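To make the steps in that list concrete, here is a minimal sketch of the same process at toy scale, using Python's built-in sqlite3 module as a stand-in for a real warehouse database. Everything in it is an assumption made for illustration: the source tables (category, product, orders, order_line), the warehouse table names (dim_product, dim_date, fact_sales, agg_sales_by_quarter) and the sample rows are all hypothetical. A real project would use ETL tooling and far larger, messier sources.

```python
import sqlite3


def build_star_schema(conn):
    """Build a toy star schema from hypothetical normalized source tables."""
    conn.executescript("""
        -- Hypothetical normalized source tables with made-up sample rows.
        CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT,
                              category_id INTEGER);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT);
        CREATE TABLE order_line (order_id INTEGER, product_id INTEGER,
                                 quantity INTEGER, unit_price REAL);

        INSERT INTO category VALUES (1, 'Bikes'), (2, 'Helmets');
        INSERT INTO product VALUES (10, 'Road Bike', 1), (11, 'Kids Helmet', 2);
        INSERT INTO orders VALUES (100, '2016-01-15'), (101, '2016-05-02');
        INSERT INTO order_line VALUES (100, 10, 1, 900.0),
                                      (100, 11, 2, 25.0),
                                      (101, 11, 4, 25.0);

        -- "Product" dimension: de-normalize product and category.
        CREATE TABLE dim_product AS
        SELECT p.product_id AS product_key,
               p.name       AS product_name,
               c.name       AS category_name
        FROM product p
        JOIN category c ON c.category_id = p.category_id;

        -- "Date" dimension: one row per distinct order date.
        CREATE TABLE dim_date AS
        SELECT DISTINCT
               CAST(strftime('%Y%m%d', order_date) AS INTEGER) AS date_key,
               order_date                                      AS full_date,
               CAST(strftime('%Y', order_date) AS INTEGER)     AS year,
               (CAST(strftime('%m', order_date) AS INTEGER) + 2) / 3 AS quarter
        FROM orders;

        -- "Sales" fact: de-normalize the transaction tables and resolve
        -- the dimension keys.
        CREATE TABLE fact_sales AS
        SELECT d.date_key,
               ol.product_id               AS product_key,
               ol.quantity,
               ol.quantity * ol.unit_price AS amount
        FROM order_line ol
        JOIN orders o   ON o.order_id = ol.order_id
        JOIN dim_date d ON d.full_date = o.order_date;

        -- Aggregate table by Year, Quarter and product category.
        CREATE TABLE agg_sales_by_quarter AS
        SELECT d.year, d.quarter, p.category_name,
               SUM(f.amount) AS total_amount
        FROM fact_sales f
        JOIN dim_date d    ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY d.year, d.quarter, p.category_name;
    """)


def sales_by_quarter(conn):
    """Read the pre-aggregated table that reports would query."""
    return conn.execute(
        "SELECT year, quarter, category_name, total_amount "
        "FROM agg_sales_by_quarter "
        "ORDER BY year, quarter, category_name"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    for row in sales_by_quarter(connection):
        print(row)
```

Even at this scale, reports end up reading from tables whose shape differs from the source systems, which is why a change in the source tends to ripple through every derived table.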

And, since data in a data warehouse exists in a different shape and form than the original data living in your source systems, even a small change (such as adding a table or adding an attribute to a table) can force you to significantly change your data model. So, once you’ve launched your data warehouse, you still need to update and maintain it to keep pace with your changing business needs.

Until recently, time-intensive data warehouses really were the only way to efficiently aggregate and analyze large volumes of complex data coming from different data sources without dragging down the performance of your business systems. Newer “in-memory” databases might promise better performance, but you still need a data warehouse with them. They don’t eliminate the need for data de-normalization and aggregation, nor do they reduce the time it takes to build or modify an analytic solution.

Data warehouses once were a necessary evil. But not anymore.

The No-Data-Warehouse Approach to Analytics

There’s a new—and better—way to get the insight you demand. All while saving money, speeding time to analysis and freeing yourself from the lengthy, never-ending work associated with data warehouses.

With this new way, you can:

  • Enjoy rich, data warehouse-type analytics.
  • Receive sub-second responses on high-volume, complex data.
  • Launch an analytic dashboard in less than a day.
  • Empower business users to easily create their own reports in real time.
  • Never have to de-normalize or pre-aggregate data.
  • Eliminate the need for special hardware, special skillsets and third-party tools.

My company, Incorta, leads the charge on this no-data-warehouse approach to analytics. And our customers love it. One of our Fortune 500 customers validated our approach, telling us, “It used to take 8-12 weeks to get a report from request to production. With Incorta, business users can do that on their own instantaneously.”

Sound intriguing? A no-data-warehouse approach might be the right one for your company if:

  • You don’t currently have a data warehouse. Skip the lengthy, costly data warehouse project, and get to analytics fast.
  • You need real-time reporting. Achieve near real-time access to data, so the information and insights you receive are up to the minute and accurate.
  • You have new data. You recently acquired a business, or you have a new structured or unstructured data set that needs to be integrated with your existing analytic solution.
  • Your reports suffer from poor performance. You use a data visualization tool such as Tableau, and you’re not happy with report performance.
  • You need to discover deeper insights or make predictions with your data. Incorta provides out-of-the-box integrations with Spark and R that you can leverage for all your advanced analytic needs (watch for my next blog for more details on this topic).

Conversely, a data warehouse still might be a good approach for your company if:

  • You already have a data warehouse. You’ve likely invested a lot of time, money and effort into your existing data warehouse infrastructure. Incorta can work alongside it to turbocharge its performance—the data warehouse essentially becomes a data source for Incorta. Then, you can determine over time if you need to continue your data warehouse investment.
  • You don’t need real-time reporting. For situations where real-time reporting is not needed, working with a data warehouse might still fit your needs.

Want to learn how Incorta’s real-time, no-data-warehouse analytics technology makes all this possible? Contact us at info@incorta.com.