Practical Advice for Building a Real-Time Analytics Business Case

“Heard this Story Before…”

This story came from a conversation I had at SQLPass this year. tl;dr at the bottom!

Ryan has been working as a Data Engineer for his company for a couple of years. He’s had great success delivering analytical models to many within the organization.

Now, Cathy from marketing came to him and asked him to build a real-time dashboard. Cathy wants to show her boss how actively the CRM system is being used. There are requirements for New User counts and Customer Lifetime Spend figures. They plan on purchasing some TVs and displaying these numbers all around the department.

Ryan sits back and starts making a plan!

Seeing the mountain that needs to be climbed

After about an hour of research, Ryan smacks his head in frustration. His company uses Microsoft and has moved most of their infrastructure to Azure.

This is one of the first times he’s had a request for real-time. It’s a cool concept that’s always been on the back burner to learn. The problem is that now he needs to provision more infrastructure and development than he expected. This is going to increase his technical breadth and costs.

With the move to Azure, Ryan migrated most of the data engineering to Azure Synapse. Spark has the ability to stream data into Delta Lake or SQL solutions. From there it can be queried from Power BI using DirectQuery.

Inside Ryan’s mind, he can hear a quote, “Everything breaks… All the time”, but he can’t quite remember who said it first.

Here I come!

Ryan and I go way back. We worked together for a few years, but since I started consulting, we haven’t talked shop all that much. He knows that I’m an expert with Synapse and have implemented solutions like this in the past. So he calls me.

After describing the problem and Cathy’s requirements, all I can think is “Why? What possible reason do they have for doing this?”

It’s not that they can’t do it. Only that it will cause more headaches in the future than either of them can foresee. Ryan is right that stuff is always breaking!

Let’s chat about Why?

I told Ryan a couple of stories.

Real-time solutions are for when you need to make quick decisions.

Think about a car’s dashboard. You need to be able to look down, see the speedometer and fuel gauge, and decide if you’re going the proper speed or need to fill up on gas.

Remember the 1999 John Cusack movie, Pushing Tin? You can see the same thing happen with Air Traffic Control. That looks like such a hectic position. No wonder John’s character has an emotional breakdown. They need to be able to tell where the planes are, decide where they need to go, and communicate back to the planes. All these actions need to take place within a few seconds of each other.

I also worked for a call centre with a couple hundred agents. The control desk needed to be able to see if there were lots of people waiting in the queue. It was important to control how much the queue grew or shrank.

These scenarios all have one major thing in common: the time between seeing the data and taking an action is very short. Notice that I say “action” and not “decision”. That gap is what your business case, for or against a real-time request, should rest on.

I told Ryan to go back to Cathy. They should frame everything around the actions that will come from the dashboards.

Rule of thumb

Whenever a business stakeholder comes to me with a real-time scenario, I frame the work by latency.

If actions are not going to happen for a week, then 24 hours is perfectly reasonable latency for the data. This would be scenarios like scheduling, forecasting, and most financial data uses.

Actions taking place in the next day or so can generally have 4-6 hours of latency. Sales and operational processes need to be a bit more agile, thus falling into this category.

Actions in the next hour could be real-time, but may benefit from a 5-minute latency. Remember that the lower the latency, the higher the cost to put in place and maintain.

Lastly, actions in the next 10 seconds need to be real time. We’ve already talked about some of these cases.
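
For anyone who wants to keep this rule of thumb next to their request intake, here is a minimal Python sketch of the four tiers. The function name `suggested_latency` and the exact cut-offs between tiers (10 seconds, 1 hour, 2 days) are my own illustrative assumptions; the tiers above are guidance, not hard boundaries.

```python
from datetime import timedelta

# Tiers from the rule of thumb, ordered from fastest action to slowest.
# The cut-off values between tiers are illustrative assumptions.
LATENCY_TIERS = [
    (timedelta(seconds=10), "real time"),
    (timedelta(hours=1), "about 5 minutes"),
    (timedelta(days=2), "4-6 hours"),
]
DEFAULT_TIER = "24 hours"  # actions that are roughly a week away


def suggested_latency(time_to_action: timedelta) -> str:
    """Return a suggested data latency for the given time until action."""
    for cutoff, latency in LATENCY_TIERS:
        if time_to_action <= cutoff:
            return latency
    return DEFAULT_TIER


if __name__ == "__main__":
    # A dashboard whose numbers drive a weekly review.
    print(suggested_latency(timedelta(days=7)))
    # A call centre control desk reacting to the queue.
    print(suggested_latency(timedelta(seconds=5)))
```

The point of writing it down this way is that the only input is the time until someone acts, not how exciting the dashboard sounds.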

Back to the Business

I sent Ryan back to Cathy to frame all her requirements around action latency. He came to realize that this wasn’t the right use case for real-time. He can use the nightly loads to fulfil her dashboard requests. It took an hour to build and was ready the next day in production because they already had the data!

This is all to say that there are scenarios in which real-time is the proper solution. It’s important to examine those specific instances. It’s also paramount to determine the least amount of data needed to take those actions. More data than necessary will lead to slower decisions!

TL;DR

I use the 10% rule: I can accept data latency of about 10% of the time between a user seeing the data and when action takes place.
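
As a quick worked example, here is the arithmetic in Python. This is only a sketch: it reads the rule as "data latency can be up to about 10% of the time until action", and the function name `max_acceptable_latency` is hypothetical.

```python
from datetime import timedelta


def max_acceptable_latency(time_to_action: timedelta,
                           fraction: float = 0.10) -> timedelta:
    """Apply the 10% rule: latency budget as a fraction of time to action."""
    return time_to_action * fraction


if __name__ == "__main__":
    # An action an hour away leaves a budget of about six minutes.
    print(max_acceptable_latency(timedelta(hours=1)))
    # An action a week away leaves a budget of well over half a day.
    print(max_acceptable_latency(timedelta(days=7)))
```

Treat the result as a starting point for the conversation with the stakeholder, not a hard limit.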
