Hack Your Own Enterprise
In ‘Obese Data’, one may be left with the impression that data and programming initiatives outside IT have created an unmanageable mess.
‘Unmanageable’ is, of course, always relative to resources: if one can expand the IT staff sufficiently, one can get the situation under control. If you can’t...
There are a number of vendors offering solutions in this regard, and such ‘solutions’ make certain assumptions. The first is that someone, somewhere, knows where everything is. One vendor has a product that integrates information from the enterprise data warehouse and legacy systems to produce a ‘big picture’ view of market trends. I’ve worked at a site where that was attempted. Issues were discovered.
In a nationally distributed enterprise with 100,000+ employees, one could presume that 10,000 managers and their associated administrative staff are entering raw data at all times, with the objective of gaining some insight on perhaps weekly intervals. Very often these insights are attempts to justify resource allocations - more or fewer people, a new building, equipment expenditures, or marketing campaigns. Often the reason line managers go to this trouble is that they perceive their executive chain of command isn’t paying attention to them. Very often what is most interesting to management is not the data going into the system; it’s the questions being raised in the first place. A data warehousing system can tell you that demand dropped off in a particular town on a particular date - a month or two after the demand dropped off - if someone is looking at demand statistics. Way before this makes it to the warehouse, the executive responsible for revenue in that area is going to be lifting up rocks to see what’s crawling around underneath.
So, with that in mind, one envisions the following scenario: some scanner is out looking at office productivity files, identifies a new report in an Access database, and adds it to a table of known Access reports companywide. The title of the report is a dead giveaway - ‘Rolling Average New Subscriptions and Cancellations’. Why did a line manager need to know this?
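The cataloging step of that scenario can be sketched in a few lines. This is a minimal, hypothetical illustration - the shared-drive layout, the file extensions, and the catalog structure are all assumptions, not any particular vendor's product:

```python
# Hypothetical sketch: walk a shared drive, find Access database files,
# and record each one (path, title, modified time) in a catalog that a
# companywide scanner could merge into its table of known reports.
import os
from dataclasses import dataclass

ACCESS_EXTENSIONS = {".mdb", ".accdb"}  # assumed file types of interest

@dataclass
class CatalogEntry:
    path: str
    title: str       # file name without extension - often a dead giveaway
    modified: float  # mtime, so repeat scans can spot changed files

def scan_for_access_files(root: str) -> list[CatalogEntry]:
    """Walk `root` and return a catalog entry for every Access file found."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            stem, ext = os.path.splitext(name)
            if ext.lower() in ACCESS_EXTENSIONS:
                full = os.path.join(dirpath, name)
                entries.append(CatalogEntry(full, stem, os.path.getmtime(full)))
    return entries
```

A real deployment would persist the catalog and diff it between runs; the point here is only that the scanner records titles and locations without opening the databases themselves.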
In another city, a manager may be focused on damaged freight - ‘Customer Repairs and Returns by Shipper’. Again, people in the lower ranks of management see this before it bubbles up into corporate. Sometimes the target of the report is the shipping companies - internal use is an afterthought.
In such circumstances it isn’t the data, per se, that’s of interest to senior management; it’s the questions that are being asked right now. Similarly, a set of spreadsheets might define an implicit schema (tables and columns). It might be used by a collection of marketing people for a year and then suddenly morph into something else - new spreadsheets and new columns. Who cares what the data is - why did the schema change?
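The ‘why did the schema change?’ signal reduces to a column-set comparison. A minimal sketch, assuming the headers have already been pulled out of two snapshots of the same spreadsheet (a real version would need a reader such as openpyxl to extract them):

```python
# Compare the implicit schema (column headers) of two snapshots of a
# spreadsheet. The interesting output is not the data but the fact that
# the columns changed at all - that's what gets flagged for an analyst.
def schema_diff(old_columns: list[str], new_columns: list[str]) -> dict:
    old, new = set(old_columns), set(new_columns)
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "changed": old != new,  # True is the signal worth surfacing
    }
```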
The opposite side of this coin is the areas of limited activity. The Omaha operation seems to be creating reports every week or so; Duluth, about one every six months. Why are these outliers - at either end of the scale?
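Spotting those outliers is a small statistics exercise once the scanner has per-location report counts. A sketch, with the two-standard-deviation cutoff chosen purely for illustration:

```python
# Flag locations whose report-creation rate sits unusually far from the
# rest of the company. `counts` maps location -> reports per month.
from statistics import mean, stdev

def activity_outliers(counts: dict[str, float], z: float = 2.0) -> list[str]:
    values = list(counts.values())
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # everyone behaves identically - nothing to flag
    return [site for site, c in counts.items() if abs(c - mu) > z * sigma]
```

Either end of the scale surfaces the same way: a site producing far more reports than its peers, or far fewer, is the one worth a phone call.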
Certain parts of the country tend to be regulatory hotbeds, such as New York and California. A lot of data analysis is defensive: ‘are we going to get in trouble with current business practice?’ Some behavior can get you in trouble in Utah and Tennessee, either ‘as well as’ or ‘instead of’ in the more activist states. Some of this could be due to culture; some is due to the fact that overwhelmed regulators in large states don’t pick the nits over small stuff that little states have time to chase. Sometimes companies are targeted by an agency over some political slight - stuff happens.
That certain groups get political over who is looking over their shoulder goes without saying. So just because IT could, in theory, pull data from all Access databases and all Excel spreadsheets doesn’t mean it will actually happen. If necessary, someone keeping something close to the chest can save their file from a laptop to a USB flash drive. Sometimes this is a good idea. Other times it is precisely the behavior that will bankrupt the business.
In short, setting up a scanner to identify every productivity file across the enterprise, and to consolidate all that into a corporate-wide view, requires handshaking and hand-holding. So the ‘big data’ environment has multiple parts. One is a scanner that recognizes new and changed productivity files. Another is an analyst who figures out the materiality of new and changed files. A third is someone with ‘ambassador’ skills who periodically drops in on various locations to see what their actual degree of cooperation is - whether things are actively hidden, or whether the files can be seen in their directories but their contents are protected.
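The first of those parts - recognizing new and changed files - can be sketched as a snapshot-and-diff over content hashes. A minimal, assumed illustration; it also respects the ‘look, but not touch’ distinction, since files whose contents are protected simply fail to hash and would be reported rather than read:

```python
# Sketch: snapshot a directory tree as {relative path: content hash},
# then diff two snapshots to find new, changed, and removed files.
import hashlib
import os

def snapshot(root: str) -> dict[str, str]:
    """Map every readable file under `root` to a SHA-256 digest of its contents."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digests[os.path.relpath(path, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests

def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict:
    return {
        "new": sorted(set(new) - set(old)),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
        "removed": sorted(set(old) - set(new)),
    }
```

The output of the diff is exactly what the analyst in the second role would triage for materiality.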
In this sense, field initiatives turn into experiments corporate can learn from. Rather than ‘take over’ a project a user has started on their own, the IT group can train the end user on best practices. Why go hire a programmer when you’ve got someone who’s almost there? Use whatever is already in-house.
At this point the productivity files are no longer some amorphous mass of things someone has to catalog and convert; they become part of the corporate sensor network. The IT people need to look, but not touch. Certain ideas will filter in from the periphery to be absorbed at the core, but this will be done as resources permit and according to well-understood priorities. Attempting to ‘centralize’ all known user initiatives turns the company into the Soviet Union - central planning didn’t work there, and people shouldn’t have to get permission to keep their information in the form they need to do their best job.