Denormalised data, disaggregated

If you’ve been following these blogs in order (and if not, why not??), you will have read all about:

The time has come to discuss the structure of your data warehouse. What is the best way to build it? Kimball? Immon? Facts? Dimensions? What’s it all about? Grab a coffee and let’s dive in.

As we discussed in the ELT versus ETL blog, the near removal of storage and compute barriers has changed the game for data modelling and design. No longer do we need to worry about minimising data redundancy and repetition. No more careful Kimball modelling of fact and dimension tables with spidered joins between them.

Instead we can flatten, denormalise and disaggregate.

With any of the column store cloud data warehouse technology out there, the thing to focus on is minimising the complexity of the data at rest.

In nearly all cases, the greatest cost to a business in delivering business intelligence, insight and analysis is in the human resource required to actually perform the analysis. The cost of having to wait a week for an answer to a question almost certainly outweighs any infrastructure cost in your data pipeline (see our data driven company blog here).

By denormalising the data into the minimum number of large, flat, disaggregated tables (think: a customer table – single row per customer, an orders table – single row per order; each with potentially hundreds of columns with all of the relevant attributes), you minimise the cognitive load on the analysts.

Analysis is predominantly conducted intra-table rather than using joins to create inter-table queries.

If constructed carefully and designed well, you may even achieve the nirvana of Insight at the Speed of Thought.

Imagine the dream scenario:

You can sit down in a meeting room with your senior stakeholders, a decent BI tool connected to your curated, structured data warehouse, and you can confidently say to them: “ask me the burning questions – what do you want to know?”

  • How are our orders trending versus last week and last year?
    • A few clicks and there is a graph on screen for them to see;
  • Ok, split that out by region:
    • Click – there it is;
  • Throw in our customer segments and split out by value cohorts;
    • Click – the analysis is on screen.

Go as deep as you like – with the data structured properly, available at the fingertips of your analysts, you can become the data heroes your business needs.

Use the transform layer and the power of a cloud data warehouse to create the data structures. Disaggregate the data completely (remember – to aggregate data is to destroy data – don’t do it where you can’t undo it). Use a BI tool (Tableau, Looker, PowerBI – they’re all good) to aggregate, compare, slice and dice for analysis.

Want to know more? We’d love to chat. Please get in touch.