The Platfora blog.

Beyond the Filing Cabinet: a New Workflow for Enterprise Data


One key difference between the traditional model of enterprise analytics and the new one has to do with access to the data. In the old model, data scientists and business analysts (along with business users of all varieties, for that matter) rely on data administrators in IT to round up the data and make it readily available for exploration.

That handoff can occur only after IT spends a good deal of time creating and adjusting data models and performing extensive cleanup on the data. And that’s just for data coming out of transactional systems. Trying to get semi-structured and unstructured data into the analytical mix is even more time-consuming and labor-intensive.

In the old model, dealing with the multiple data structures of transactional, social, and machine-generated data can be a nightmare. And that’s why traditional BI (Business Intelligence) and DW (Data Warehousing) solutions typically provide access to so little of that kind of data, and why it takes a lot of time to get even that little bit.

Think of the old BI / DW model as something like a filing cabinet. A filing cabinet is a great system for holding folders containing documents. Anything you need to know is readily available and easily accessed — just as long as it is found in a document of the sort that fits easily into a folder.

Contrast the filing cabinet with a box of “junk” of the sort that tends to accumulate in garages and basements. There are all kinds of things in there: a photo album, an old radio, some VHS tapes, some paperback books, some programs from high school sports events, and so on. Now these are all sources of information, too, but none of them really works with the filing cabinet model. Their data is all structured differently. If you have to use the filing cabinet for the information they contain, the best you are going to be able to do is a bunch of workarounds. For example, you can make a list of all the pictures in the photo album and then put that list into the filing cabinet. How handy.

Unfortunately, those kinds of workarounds make up the bulk of activity surrounding traditional BI / DW, especially when such techniques are applied to the “junk box” that is the multi-structured data environment that organizations must deal with today. Like the filing cabinet, traditional BI / DW is a high-friction environment that requires a lot of work for relatively little benefit when applied to multi-structured data. All this prep work is done up front. Each time you need to add a question or change how you’re accessing the data, you end up back at square one, having to do even more up-front work. And usually all this work is done by IT.

Platfora’s Big Data Analytics capability changes all that, creating a whole new workflow for data-driven analysis and decision-making. Platfora is native to Hadoop, which provides access to 100% of the organization’s data, rather than the roughly 10% accessible by most traditional solutions. Platfora puts all of this data directly into the hands of data scientists and business users, enabling them to ask new questions iteratively. With Big Data Analytics, you can go from raw data – of any format or structure, and from any physical location – to actionable business insights in a day. Or less. The work is shifted from up-front cycles spent trying to make the information fit the available tools to an organic learning experience focused on the previously hidden drivers and behaviors that can make or break your business.

With Big Data Analytics, data scientists and business analysts (along with other business users) can now focus on business challenges, not technology issues and delays. The workflow of the traditional model was all about getting data into systems; the new workflow is all about getting insight into business.