“It’s a dangerous business, Frodo, going out your door. You step onto the road, and if you don’t keep your feet, there’s no knowing where you might be swept off to.”
- J.R.R. Tolkien, The Lord of the Rings
It started with one report.
Someone in finance heard about Power BI and thought it would help them in their work. Then, more people wanted their own reports. Operations got on the bandwagon, too, and so did marketing. All these reports pulled data from various source systems and bogged them down to the point that they were unusable.
So, the company hired a team of data engineers. They built databases, pipelines, and other resources to move, shape, and combine data, but they did so haphazardly. Every night, dozens of data factories ran hundreds of pipelines. Some were business-critical, and others were obsolete, producing data that no user looked at.
Data and code were duplicated (or nearly so): the same data was pulled from the source and transformed multiple times to produce similar results.
Often, the reports that were delivered failed to meet the needs of the business. They saw little to no use, and the company grew skeptical about the value of analytics.
All of this wasted money on cloud resources and developer time, and it was not sustainable. Fixing the situation is possible but expensive. It could have been avoided if the organization had started with a proper data strategy governing how it would manage and use its data.
What is a Data Strategy?
A data strategy is a plan that defines how your organization will manage its data: the processes it follows, how the data is organized, what technology is used, and who is involved and in what roles.
It doesn’t need to define every detail, but a high-level plan with clear guidelines will help you make consistent decisions as new needs come up and your analytics system grows. This post will help you think about what you need to do to create one.
How to Create a Data Strategy
The first step in creating a data strategy is to decide where you want to end up. The entire goal of analytics is to help your organization make decisions, so start by thinking about which decisions data should inform and who will make them. Knowing the destination will shape the path you take to get there, so take the time to work this out before building anything.
The analytics environment of even a modest-sized organization can quickly become very complex. It will consist of many processes that store data in many locations and forms. You need a plan for how your data will be organized and how it will flow through the analytics environment. Some things to think about are:
- Data layers. Raw data will come in from various source systems and often flow through several ‘layers’ or transformations as it is cleaned and organized before showing up in reports. What are the layers you will have?
- Divisions. An organization will have different departments (finance, operations, marketing, etc.), each generating its own data and wanting to see it. They may also need to look at data from other departments. How will you handle these divisions and provide access? How will you combine data from different departments?
- Processes. What are the processes for creating new reports? For adding or enhancing data?
- Laws and Standards. Are there any laws or industry standards that apply to your data?
  - Examples include privacy regulations such as GDPR and rules covering personally identifiable information (PII), primary account numbers (PAN) for payment cards, and health data.
- Technology. What technology will you use? On-premises or cloud? Proprietary software or open-source? Single vendor or multiple?
- Documentation. What platform will you use for documentation? What processes will ensure it is accurate and up to date? Who needs access to the documentation? Data engineers will need one type of documentation, while business users need another.
- The team. What will the development team look like?
  - Remember, beyond technical staff, there should be experts in various business areas working with the development team. They need the capacity to do this, and you will have to move them off business tasks to help with data development.
- How will you know it is working? Is it delivering what you want? Are business users using what you’ve built?
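To make the idea of data layers a little more concrete, here is a minimal sketch in Python. It assumes three layers (raw, cleaned, reporting), which is only one common way to divide things; your strategy may call for more, fewer, or different layers. The invoice data, the `Invoice` class, and the function names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical raw extract from a finance source system (illustrative data only).
RAW_INVOICES = [
    {"invoice_id": "1001", "department": " Finance ", "amount": "250.00"},
    {"invoice_id": "1002", "department": "finance", "amount": "99.50"},
    {"invoice_id": "1002", "department": "finance", "amount": "99.50"},  # duplicate row
    {"invoice_id": "1003", "department": "Marketing", "amount": "not available"},
]


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    department: str
    amount: float


def to_cleaned_layer(raw_rows):
    """Cleaned layer: fix types, normalize text, drop duplicates and bad rows."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        try:
            invoice = Invoice(
                invoice_id=int(row["invoice_id"]),
                department=row["department"].strip().lower(),
                amount=float(row["amount"]),
            )
        except ValueError:
            continue  # a real pipeline would log or quarantine rejected rows
        if invoice.invoice_id not in seen:
            seen.add(invoice.invoice_id)
            cleaned.append(invoice)
    return cleaned


def to_reporting_layer(invoices):
    """Reporting layer: aggregate cleaned rows into the shape a report needs."""
    totals = {}
    for invoice in invoices:
        totals[invoice.department] = totals.get(invoice.department, 0.0) + invoice.amount
    return totals


cleaned = to_cleaned_layer(RAW_INVOICES)
spend_by_department = to_reporting_layer(cleaned)
print(spend_by_department)  # {'finance': 349.5}
```

The point is not the code itself but the separation: each layer has one job, so a report that needs department spending reads from the reporting layer instead of pulling from the source system and repeating the same cleanup.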
Conclusion
A good data strategy lets your organization build its analytics system in a logical way, which makes development and maintenance clearer and, most importantly, produces useful data. Going back to the introduction of this post, had our imaginary company started with a good data strategy, it could have avoided many of its problems and built an analytics system that was an asset instead of a liability.