Self-Service vs. Guided Analytics

Self-service analytics has been a buzzword for a while. In this blog I want to think about what it means, what it could mean, why we should be interested, and some potential limitations to consider – and, better still, how we can overcome them!

For the sake of clarity, let’s quickly outline exactly what a self-service BI tool is, and what the alternative is.

Traditionally, BI reporting is a centralized function, typically owned by IT, which works across the business to provide analytics that help business users make decisions. This model is usually referred to as “guided analytics”: a data expert builds the visuals, and the end user consumes the report as the expert designed it.

A self-service concept would change how BI works within the organization. These models instead focus on giving more control to the end user, who should have the business-specific knowledge required to interpret the data they receive. The handover point, where IT passes control to the end user, can vary, but the central concept is giving an end user a tool, or set of tools, and allowing them to explore the data themselves.

The idea behind self-service analytics (and other similar ideas like a citizen data scientist, or democratisation of data) is that we should allow everyone to access and consume data, moving from a “guided analytics” model toward a model that empowers users to find their own answers. A critical component of these concepts, especially the citizen data scientist, is that we need to give people who are not data experts the ability to draw their own conclusions based on the data.

*This guy is having fun with it, in an abandoned warehouse. Sorry, this image is not relevant but the blog looked pretty intimidatingly long without pictures.*

A lot of these ideas came about a decade or two ago, when data science was still nascent and there was a worry that there wouldn’t be enough data scientists available to handle all the data science requests. While I don’t think that fear has been realised, there are still good arguments for moving toward these self-service models.

For example, if you could create a product that allows users to do their own basic analysis, perhaps using a restricted dataset and a model that has been designed by an expert, that would alleviate a lot of the routine tasks your data scientist has to perform. This frees them up to start looking at advanced analytics – really unlocking the potential in your data, using advanced techniques that would never be suitable for a citizen data scientist to perform.

The other big plus concerns change requests. In a traditional model, an IT team may produce a BI report, and once it is live someone realises they need just one more visual. They completely forgot about it for the whole six-month project, but now the report is live it absolutely can’t work without it. That means raising a change request and going through the hassle of deciding whether to do it, how to do it, who will pay for it and so on. In a self-service model, our end user can just produce this visual themselves, saving a lot of time and hassle!

A quick look at BI tools will show you that this is a real trend – both Tableau and Power BI are ensuring their platforms can deliver self-service analytics to capitalise on it. So, should developers start looking for new jobs? In our brave new world, will CEOs perform all analytics themselves?

The answer to that obviously facetious question is no – a metaphor for BI I really like is the tip of the iceberg.

*This picture is wholly unnecessary, but it does break the wall of text up nicely.*

The front end visualisations represent the most visible 10% of the process, the only bit a client sees, but underneath that slick dashboard is a mass of data & tables, that someone has had to wrangle into some kind of meaningful conformed dataset. The concept of self-service doesn’t typically descend below the waterline of the iceberg analogy (although a few of the braver proponents of the concept might argue it should!). But the role of developers is likely to change from churning out multiple reports to being architects of self-service user interfaces. So, if you see self-service reporting as a threat to the role of front-end developer, you can probably stop updating your CV, as the role of front-end developers in a self-service world will be more exciting, not less!

So, this all sounds great. Can we just start throwing data at these people?

Again, the obvious answer is no. That’s because, like every other idea in data, there are qualifications to consider. The potential downsides of self-service analytics are stark! Let’s think about a few:

Skill

*This picture actually borders on relevance!*

Even though we want to believe everyone has a budding data scientist in them, we do have to be aware that people will have wildly differing skillsets when it comes to analytics. The citizen data scientist idea is based on picking people who already have a bit of an aptitude for it. But as a slightly exaggerated example of self-service analytics going wrong, imagine a CEO who is happily putting together their own report but doesn’t know how to change the aggregation on a measure, and sees the company’s profits as an AVG instead of a SUM.
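To make the AVG versus SUM slip concrete, here is a minimal Python sketch. The monthly figures and the `summarise` helper are invented purely for illustration; this is not how any particular BI tool works internally, just the arithmetic behind the mistake.

```python
from statistics import mean

# Hypothetical monthly profit figures (in £), invented purely for illustration.
monthly_profit = {"Jan": 120_000, "Feb": 95_000, "Mar": 140_000, "Apr": 105_000}

def summarise(values, aggregation="sum"):
    """Aggregate a measure according to the chosen aggregation setting."""
    values = list(values)
    if aggregation == "sum":
        return sum(values)
    if aggregation == "avg":
        return mean(values)
    raise ValueError(f"Unsupported aggregation: {aggregation}")

# Same data, same visual, different aggregation setting.
total_profit = summarise(monthly_profit.values(), "sum")
average_profit = summarise(monthly_profit.values(), "avg")

print(f"SUM of profit: £{total_profit:,}")
print(f"AVG of profit: £{average_profit:,}")
```

With only four months of data, the averaged figure is already a quarter of the true total, and nothing on the chart itself would flag the problem.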

Feeding into this issue, guided analytics can enforce best practices, which will of course be a little more hit and miss if we throw analytics open to everyone!

The skill barrier is potentially the toughest risk to mitigate. Of course, we can always share knowledge – a workshop where you teach the basics of data analysis and, maybe more importantly, interpretation could definitely help your end users along the path to the holy grail of totally independent self-service. However, you probably won’t stop taking calls from end users after the workshop!

Self-service analytics should not be used as a way to remove the need for IT (I’m not just saying this because it is my job – this is a genuine point!). Rather, it changes the focus of analytics from being wholly owned and administered by IT to being a collaborative partnership in which IT plays the role of data steward, providing and ensuring the quality of the data and tools, as well as giving users analytics guidance. Policies like establishing a “centre of excellence” or having a prominent community on Teams / Yammer – basically having somewhere people know they can go for guidance – can massively reduce the risk of people drawing faulty conclusions.

In addition to this, when we talk about self-service BI, we aren’t expecting CEOs to go and start mining big data. The key is setting up tools that allow for easy, simple analysis, designed with the skill level of your audience in mind. This is where the idea of a curated dataset really comes to the fore, giving users sophisticated, directed tools to help guide them in their exploration. This could be as simple as naming your data properly for humans (e.g. Sales (£) instead of _WDKSLI00094390-GBP), building simple measures to be available (a year-on-year profit calculation, for example) or including helpful hints / tutorials.
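As a rough illustration of what curating a dataset might involve, the Python sketch below renames cryptic source columns and pre-builds a year-on-year measure. Apart from `_WDKSLI00094390-GBP`, every column name, figure and function here is a made-up assumption, and a real BI tool would do this in its own modelling layer rather than in Python.

```python
# Hypothetical mapping from raw source column names to human-friendly labels.
# Only "_WDKSLI00094390-GBP" comes from the text; the rest are invented.
FRIENDLY_NAMES = {
    "FISC_YR": "Year",
    "_WDKSLI00094390-GBP": "Sales (£)",
    "_WDKPRF00011872-GBP": "Profit (£)",
}

# Hypothetical raw rows as they might arrive from the source system.
raw_rows = [
    {"FISC_YR": 2022, "_WDKSLI00094390-GBP": 900_000, "_WDKPRF00011872-GBP": 200_000},
    {"FISC_YR": 2023, "_WDKSLI00094390-GBP": 1_100_000, "_WDKPRF00011872-GBP": 250_000},
]

def curate(rows):
    """Rename every column to its human-friendly label."""
    return [{FRIENDLY_NAMES[name]: value for name, value in row.items()} for row in rows]

def year_on_year_change(rows, measure):
    """Return {year: fractional change versus the previous year} for a measure."""
    ordered = sorted(rows, key=lambda row: row["Year"])
    changes = {}
    for previous, current in zip(ordered, ordered[1:]):
        changes[current["Year"]] = (current[measure] - previous[measure]) / previous[measure]
    return changes

curated = curate(raw_rows)
profit_yoy = year_on_year_change(curated, "Profit (£)")

for year, change in profit_yoy.items():
    print(f"{year}: profit {change:+.1%} year on year")
```

The point of shipping the measure ready-made is that the end user only has to drag it onto a chart, rather than work out the calculation (and its edge cases) for themselves.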

Access

*This one kind of works, although that is a really bad way to store wireless headphones.*

Any traditional IT people who read about data democratization must have mild panic attacks. Making data accessible to everyone, to help empower and guide decision making at all levels of the business, is a noble goal. But if you have ever tried to get access to some data from a conservative IT department, you will see why they would not be fans!

I think most people would agree that while the idea of total data democratization is noble, it probably can’t work in a real organization. Too much data is classified as business critical and there are also legal implications (for example, in Germany salespeople aren’t legally allowed to see other salespeople’s figures).

I think the idea that self-service gives users more access to the data is a bit misguided. We don’t actually have to allow anyone to see more data than they usually can. Just because we share the data model with our users doesn’t mean they have to see everything in it. Instead, we can choose to restrict what they see. For instance, in Power BI a hidden column doesn’t appear in an end user’s field list – you could use this feature to hide almost everything, leaving only the acceptable data to play with. Bear in mind that hiding is about decluttering rather than security, so genuinely sensitive data is better protected with something like row-level security.
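The principle of sharing a model while exposing only part of it can be sketched in a few lines of Python. This is an illustration of the idea only, not of how Power BI implements hidden columns, and the column names, rows and `user_view` helper are all hypothetical.

```python
# Hypothetical allow-list of columns that self-service users may explore.
VISIBLE_COLUMNS = {"Region", "Month", "Sales (£)"}

# Hypothetical rows from the full data model, including fields we want to hold back.
model_rows = [
    {"Region": "North", "Month": "Jan", "Sales (£)": 42_000,
     "Salesperson": "A. Example", "Margin (%)": 31.5},
    {"Region": "South", "Month": "Jan", "Sales (£)": 38_500,
     "Salesperson": "B. Example", "Margin (%)": 28.0},
]

def user_view(rows, visible=VISIBLE_COLUMNS):
    """Return copies of the rows containing only the allow-listed columns."""
    return [{name: value for name, value in row.items() if name in visible}
            for row in rows]

exposed = user_view(model_rows)
for row in exposed:
    print(row)
```

Using an allow-list rather than a block-list means any new column added to the model stays hidden until someone deliberately decides to expose it.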

Redundancy

*I couldn’t find a good picture for redundancy, sorry.*

One of the huge benefits of centralized BI offerings / guided analytics is that often many users will have similar requests – by producing products centrally to meet these demands it is often possible to create a BI report that will fulfil many different business requests. The self-service model means each department will be performing their own analytics which may involve significant overlap, thus wasting effort.

However, this would only really be a concern if the self-service platform is implemented poorly. Ideally the process should be simple enough that, even if work is duplicated, the effort involved is negligible – or at least less than in a traditional guided analytics model.

Let me illustrate my point with an example. Let’s say that we have a report ready to go live. Once it goes live, 3 users from different departments all identify one additional KPI that they all need adding to improve the usefulness of the report.

In our traditional model, one of the users (or more likely all three!) will email the report owner in IT and request that the feature be added. This will then get recorded, probably added to a backlog of change requests and so on. At the end of that, the report designer will build a new visual and reshare the product. Now all three users are happy and the report is perfect, although it might have taken a while to get to that next iteration.

Let’s consider the same request in a self-service analytics model. Instead of contacting anyone, each of the users (after some training, and knowing there is a centre of excellence they can lean on to cover any gaps in their knowledge) can simply add that visual themselves, using our simple tool. Technically the same visual is created three times, which on the surface seems a waste of effort. But realistically, is dropping a couple of measures into a new chart going to take longer than actioning a change request? Quite unlikely.

I accept this is a very specific scenario which I devised to prove my point, which isn’t very fair. Of course, depending on the requirement – especially if it involves advanced analytics – it might be easier to update the report centrally. But consider if you could take all the change requests and only have to action the difficult ones, allowing your users to build the simple stuff themselves, and you’ll see one of the benefits. For example, imagine the last scenario again, but where each of the three users has a different simple ask – in the traditional model, the centralized workload has suddenly tripled.

Of course, looking at it from the business point of view, it’s also a big win, as your end users can get the information they want much faster, which will in turn speed up decision making processes, which is exciting for any business user.

Decentralisation

While the whole point of opening data up is to stop it being the sole remit of IT, decentralization does bring certain risks. The main one is that if you let everyone perform their own analysis on data, you may start seeing “multiple versions of the truth”. Some would argue this isn’t a big concern, as even in heavily centralised environments you still find multiple versions of the truth as a result of “black market” analysis going on under the radar. But it is fair to say that decentralising where the analysis flows from probably does increase the risk of the truth becoming muddied by competing interpretations.

An additional risk is that where there are competing versions of the truth, more senior staff members’ opinions may carry more weight, regardless of the veracity of their conclusions – this would obviously not be a good thing!

But, as with overcoming skill barriers, IT-led initiatives such as a centre of excellence, or a community of data experts that business users have access to, can greatly help in reducing this risk. Again, self-service analytics is not a model where you produce a tool, throw it out and disappear – IT is still responsible for guiding where required. Also, as previously mentioned, multiple versions of the truth arise even with guided analytics models, so at least in the self-service model there is an impetus to have a more visible and accessible means of obtaining help.

So which is better?

At this point, I’m going to cop out slightly and argue that neither approach is ideal. I think by adopting a hybrid approach, and having experts shape the tools and tailor them to the audience, the self-service model can have some huge potential benefits while also minimizing the potential risks. In my ideal hybrid model, IT can still own the data and be responsible for providing guidance and support, while simultaneously giving business users the tools to get the answers they need to run a business without involving IT in every single request. It also doesn’t mean entirely doing away with guided analytics – you could still build a report, but just hit the 80% of content that everybody needs, and let users define their own content for the more specific, involved 20%.

It might be better to think about it as guiding users toward self-service.

Of course, no system is perfect and there probably are some challenges that would be very difficult to overcome. However, searching for a perfect system in any avenue of life is generally a waste of time – and the risk of something not being perfect is no reason not to try it! Also, the longer you stick with the self-service approach, and ingrain it into the corporate culture, the more natural it will seem. Again, if you take a look at the big visualization platforms you can see how new tools and features are pushing toward the self-service model, which is a good indication of what companies are asking for, which in turn means the self-service concept isn’t going to diminish anytime soon.

Thanks for making it through – this was a long one! Unlike my usual blogs, this was very theory based but don’t worry, my next blog will be a part 2 to self-service analytics and I’ll run through how we can turn a basic guided analytics report into something with a little more customisability, while making sure we avoid the main pitfalls highlighted here. Until next time!
