Powerful analytic tools are at our disposal. These tools provide great insights into data, and those insights allow us to make better decisions. But are we looking at the right data? Don’t better decisions depend on the quality of the data? What data strategy ensures we are considering the right data in our analytics?
Golden Copy Concept
The Golden Copy is the agreed-upon version of the truth to be used for analytics and other important applications. Sometimes known as the single source of truth, the Golden Copy draws on agreed-upon sources of data, delivered with appropriate timeliness, which then pass through a data curation process that often includes traditional extract/transform/load (ETL) processes. The result is data that has been vetted and can be used confidently and consistently for a multitude of purposes.
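As a rough illustration of the "agreed-upon sources, appropriate timeliness" idea, the sketch below admits a record to curation only if it comes from an approved source and is fresh enough. Everything in it is an assumption made for illustration: the source names, the age limits, and the Record, is_eligible and curate names are hypothetical, and a real curation process would involve far more than this one check.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical registry agreed on by the BU and IT:
# approved source name -> maximum acceptable age of a record.
APPROVED_SOURCES = {
    "crm": timedelta(days=1),
    "sales": timedelta(days=7),
}


@dataclass
class Record:
    source: str             # where the record came from
    extracted_at: datetime  # when it was pulled from that source
    payload: dict           # the data itself


def is_eligible(record: Record, now: datetime) -> bool:
    """Return True if the record is from an approved source and is timely."""
    max_age = APPROVED_SOURCES.get(record.source)
    if max_age is None:
        return False  # source has not been agreed on
    return now - record.extracted_at <= max_age


def curate(records: list[Record], now: datetime) -> list[Record]:
    """Keep only the records that may proceed to the rest of the ETL process."""
    return [r for r in records if is_eligible(r, now)]
```

Keeping the approved sources in a single registry means the BU's sourcing decisions live in one place, while IT owns the code that enforces them.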
The process of validating appropriate sources of data is no simple task. This is not just an IT exercise, nor is it just a business unit (BU) process. The BU needs to define the decisions and the sources of data needed to support those decisions. IT and business analysts may be able to help the BU identify additional data sources. For data to become a competitive advantage, it may be necessary to combine traditional and non-traditional sources of data. There are plenty of data sources available, but care must be taken to consider the sources and quality of the data. A common arrangement might be that data ownership resides with the BU, which makes the data source decisions, while IT owns the infrastructure and processes for the data.
Data management comprises several different processes, and there are different considerations for data sources inside and outside the company. Data management typically exists for data from inside the company such as sales data, manufacturing process data and supply chain data. There may be other data from inside the company, often unstructured, for which no management process exists to determine whether it is suitable to be put into the Golden Copy repository. Examples might include sensor data, customer service feedback, process machinery data or other IoT data that needs to be adapted for more general use before it is placed in the Golden Copy repository.
There are also potential data sources from outside the company which may have very different ETL needs compared to internal data. For instance, social media data for the marketing department may require a lot more screening than internally sourced data from the CRM application. There needs to be a process to prepare data for the Golden Copy to make it useful.
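One way to picture source-specific preparation is a screening rule per source, with a stricter rule for external data. The sketch below is only that, a sketch: the field names (customer_id, text, is_bot), the ten-character threshold, and the function names are all assumptions chosen for illustration, not a recommended rule set.

```python
def screen_crm(row: dict) -> bool:
    """Internal CRM data: assumed to be managed already, so only require an ID."""
    return bool(row.get("customer_id"))


def screen_social(row: dict) -> bool:
    """External social media data: assumed to need heavier screening."""
    text = (row.get("text") or "").strip()
    if len(text) < 10:       # too short to be useful (arbitrary threshold)
        return False
    if row.get("is_bot"):    # hypothetical flag set by an upstream step
        return False
    return True


# Hypothetical mapping from source name to its screening rule.
SCREENS = {
    "crm": screen_crm,
    "social_media": screen_social,
}


def prepare(source: str, rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split rows into (accepted, rejected) using the rule for the given source."""
    screen = SCREENS[source]
    accepted, rejected = [], []
    for row in rows:
        (accepted if screen(row) else rejected).append(row)
    return accepted, rejected
```

Returning the rejected rows rather than discarding them leaves room for the BU and IT to review what was screened out and adjust the rules.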
There can be such a thing as being too careful, however. Remember, more data is often more insightful than better algorithms. A competitive advantage may be to incorporate new sources of data to help your analytics even if there is work to be done before it can go into the Golden Copy.
Multiple Versions of the Truth
Now you have a Golden Copy repository/data lake, and you get your first request for data that is outside the Golden Copy. Time to talk about changing business needs and multiple versions of the truth. It is going to be necessary to modify the Golden Copy data to accommodate new requests, but consider using the same process you created to validate the Golden Copy data in the first place. Get the group of IT and BU decision makers together to make sure that, as the data is created, it can be trusted and curated and, if appropriate, added to the Golden Copy or just used for a one-off need.
Using data as a competitive advantage is a great concept. Making data a competitive asset is a lot of work. By putting the right strategy and processes in place you can create a model that will continue to improve over time and create a foundation for a sustainable competitive advantage. As always, data analytics is most likely to succeed when the desired result is firmly in mind before proceeding. In many cases, how you source and treat the data will be critical to that success.