Thursday, September 26, 2013

Enterprise Data Platform: A Reboot and a Reality Check

The last post I wrote on the Enterprise Data Platform was in January. It's September. What happened?

A whole lot, actually. My understanding of what an Enterprise Data Platform is and how it needs to be 'sold' has changed dramatically. My role in this process has also changed. And my understanding of delivering software and running a team has grown, and continues to change for the better.

What I've found out in the past year, among many other things, is that getting people to fund what I've been calling an Enterprise Data Platform is as much about education as it is about technical execution. I'm not talking about educating other people; I'm talking about educating myself. It has been an incredibly educational, humbling, uncomfortable, frustrating, awesome nine months.

When I look back at the posts I wrote early this year, one thing that stays murky throughout an otherwise reasonably well-laid-out argument for data management is the value proposition. I didn't know that when I wrote them, but I found out quickly when I went to ask for money.

Here is the value disconnect: a system to collect, manage, and leverage data solves a secondary problem, one that assumes a primary problem has already been solved. In other words, if I'm trying to get you to fund a data collection, storage, and management platform, I'm assuming that something is already generating the data that needs to be stored. Netflix, Amazon, Google, my bank's website, and Blogspot all solve primary problems. Those problems are easy to explain, regardless of how hard the solutions are to implement. Solutions to primary problems have clear, concise, direct connections to value.

Solutions to secondary problems are optimizations of primary solutions. Decreasing time to insight on operational metrics of a website is an optimization. A great one, to be sure, but not necessary if the website is not getting any traffic.

Any solution to a secondary problem has an indirect value connection at best. Secondary solutions only make sense when the initial value proposition of the primary solution is diluted or reduced by a secondary problem. A system that doesn't scale to support a site whose popularity is exploding is a secondary problem. The secondary problems I see in my current role are operational in nature, and the solutions to them are optimizations. They can deliver huge value when done correctly.

"When done correctly". Three words that are seared into my brain. In the past year several things have happened while I've been trying to explain, again and again, why we should build a solution to a secondary problem, and while I've been trying to build that solution with limited resources:
  1. I've realized that the best way to solve a secondary problem is one primary problem at a time. Building a platform to optimize an undefined set of primary solutions is a risky, 'field of dreams' approach, and there are many ways to go awry. 
  2. I've become less of a technologist and more of a product owner. My last piece of production-facing code will (hopefully) be retired in the next couple of months. There are much better engineers on the team, and I rely on them to deliver working software in the same way that they rely on me to come up with a useful product.
  3. Where I used to think about use cases and requirements and assume that they were valid, I now question and validate product direction up front. That involves use cases, but if the use case is invalid, why spend time extracting requirements from it? I spend more time thinking about how to validate the use case as cheaply as possible. Requirements emerge and solidify as product direction takes shape -- defining them before a use case has been validated seems backwards. This insight radically changes the way software is delivered, and our teams are in the middle of this change process.
  4. The cost and recovery plan for any effort -- infrastructure and resources -- combined with time to recovery, is best defined as soon as you have validated use cases. Those plans need to change as validated features emerge that impact cost and recovery. In previous roles I had been 'sheltered' from that aspect of the business. I'm finding now that financial data is the ultimate data point that helps quantify whether value is being delivered.
This process is far from complete. I am continuing to learn every day, and while it can get very uncomfortable, it has been an amazing education. 

I've tried to write things down several times in the past nine months. I haven't gotten very far because what I was writing didn't feel complete. Writing about the technology is only one side of the story. What I've learned in the past nine months is that there is a much bigger picture. Now that I'm starting to be able to externalize what I've learned, I'm excited to write about it. More soon...