Managing Data or Data Products?
Managing data requires a deep technical understanding of data transformations
Tealfeed Guest Blog
In some product management groups and online conversations, I have seen the term Data Product Manager become more popular over the past couple of years.
Of course, the functions of a data PM existed even before the buzzword did, but just like every other fancy buzzword that beefs up the resumé, Data PM sounds cool and obfuscates understanding at the same time.
The catch is, what you’re managing actually impacts how you manage, so in this case, managing data and managing a data product are not interchangeable. I’ve done both, and there are some significant differences in methodology between being a PM for a data product and being a PM for data.
What is a data product?
Do you need data to make decisions about your product roadmap and strategy? Is some usage/interaction data critical to building your product, and do you need algorithmic models that use copious amounts of data to make your product function?
If your answer is yes, you’ve got a data product on your hands.
If, on the other hand, your product generates data or uses data, but you simply need a way to store and organize the data to ensure a better customer experience, or to create analytics on the generated data, then you are looking at managing data.
As an example, a retail app that sells clothing on mobile devices actually needs to know how many users it gets every day, how many of them use its filters, and which filters they use, in order to determine what functionality to roll out to customers next. It needs statistical and machine learning models to personalize clothing recommendations for returning users. That is a data product.
A news organization, on the other hand, simply produces news and releases it on the app. Usage on their app generates a lot of data on user interaction on pages, engagement, and subscription, but news organizations’ primary role is to report on news, so low user engagement on a story won’t mean they’ll stop doing the news.
They will need the data to run analytics so they can keep track of subscription revenues, how many comments they receive, and perhaps how many times the app was installed. They will also need an efficient way to organize the news stories and media they have collected for archival and future reference. That is simply managing data.
Of course, the line between a data product and data management can get blurred easily.
If the news organization chose to renew or cancel the contract of a columnist based on page engagement, would that count as a data product? Technically, no.
The primary role of the organization is still to deliver the news. They used the data to create analytics on top of it, but didn’t need to use the data to algorithmically alter the user experience.
However, if the news organization made a real-time decision about which users should see which opinion pieces, and showed each user only the opinion pieces personalized for them, the news app/website would then become a data product.
How is managing data different?
Managing data requires a deep technical understanding of the data transformations that your product performs, from the time the user starts interacting with your product to the point where the user exits it.
This data is going to be used to draw significant conclusions about product KPIs. If the PM is not aware of the gaps in data transformations or of how the data is being stored, the analytical interpretation of the data is likely going to be wrong, and you will have reported inflated or deflated KPIs and metrics, or worse, made causal observations based on data that doesn’t have direct causal links at all.
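To make the point about inflated KPIs concrete, here is a minimal sketch in Python. The event records, field names, and the idea of a client retry writing the same event twice are all assumptions made up for illustration; they do not come from any particular product. It shows how a gap in a transformation (no deduplication) changes a simple metric.

```python
# Hypothetical raw interaction events. The second "e2" record stands in for
# a duplicate written by an upstream retry (an assumed failure mode).
raw_events = [
    {"event_id": "e1", "user_id": "u1", "action": "apply_filter"},
    {"event_id": "e2", "user_id": "u2", "action": "apply_filter"},
    {"event_id": "e2", "user_id": "u2", "action": "apply_filter"},
    {"event_id": "e3", "user_id": "u3", "action": "view_item"},
]


def count_filter_uses(events):
    """Count events where a user applied a filter."""
    return sum(1 for event in events if event["action"] == "apply_filter")


def deduplicate(events):
    """Keep only the first occurrence of each event_id, preserving order."""
    seen = set()
    unique = []
    for event in events:
        if event["event_id"] not in seen:
            seen.add(event["event_id"])
            unique.append(event)
    return unique


# Same metric, computed with and without the deduplication step.
naive_count = count_filter_uses(raw_events)
deduped_count = count_filter_uses(deduplicate(raw_events))

print(f"filter uses without dedup: {naive_count}")
print(f"filter uses with dedup:    {deduped_count}")
```

The two numbers differ only because of how the data was transformed, not because users behaved differently, which is exactly the kind of gap a PM managing data needs to know about.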
One other aspect of using data for analytics is keeping an eye on data quality. Poor data quality usually means transformations are not being tracked, and what you think is happening is probably way off from what is actually happening. Once the data has found its way through the different systems and you see the end product in a data warehouse, there is not much you can do short of trying to retrace the point at which the latitude and longitude of a city went out of range, only to find a black box.
Instead, monitoring data transformations in data stores is important for error handling, and maintaining data quality metrics is equally important, particularly when creating automated test coverage for data quality.
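As a rough illustration of an automated data quality check, the sketch below validates the latitude and longitude example mentioned above and reports a simple quality metric. The function names, the record layout, and the sample values are hypothetical; a real pipeline would likely run a check like this after each transformation stage rather than only at the warehouse.

```python
def is_valid_coordinate(lat, lon):
    """Latitude must lie in [-90, 90] and longitude in [-180, 180]."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def coordinate_quality(records):
    """Return the share of valid records and the list of invalid ones."""
    invalid = [
        record for record in records
        if not is_valid_coordinate(record["lat"], record["lon"])
    ]
    valid_ratio = 1.0 - len(invalid) / len(records) if records else 1.0
    return valid_ratio, invalid


# Hypothetical output of one transformation stage. The Oslo latitude is
# deliberately out of range, as if a decimal point had shifted upstream.
records = [
    {"city": "Austin", "lat": 30.27, "lon": -97.74},
    {"city": "Pune", "lat": 18.52, "lon": 73.86},
    {"city": "Oslo", "lat": 599.1, "lon": 10.75},
]

valid_ratio, invalid_records = coordinate_quality(records)
print(f"valid coordinate ratio: {valid_ratio:.2f}")
for record in invalid_records:
    print(f"out of range: {record['city']} ({record['lat']}, {record['lon']})")
```

Tracking a ratio like this per stage over time is one way to notice where in the pipeline a value went bad, instead of discovering it only in the final table.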
There are several techniques that can be used to manage data and data attributes efficiently, as well as ideate on what data you might need for a new product.
In my next post, I’ll write about some frameworks I’ve found useful in managing data.
This article was originally published by Anwesha Bhattacharjee on Medium.