<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
> <channel><title>Platfora &#187; Ben Werther</title> <atom:link href="http://www.platfora.com/author/bwerther/feed/" rel="self" type="application/rss+xml" /><link>http://www.platfora.com</link> <description>Clarity From Big Data</description> <lastBuildDate>Tue, 21 May 2013 22:21:00 +0000</lastBuildDate> <language>en-US</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.5.1</generator> <item><title>BI is Better on Hadoop</title><link>http://www.platfora.com/bi-is-better-on-hadoop/</link> <comments>http://www.platfora.com/bi-is-better-on-hadoop/#comments</comments> <pubDate>Tue, 26 Mar 2013 07:55:24 +0000</pubDate> <dc:creator>Ben Werther</dc:creator> <category><![CDATA[Uncategorized]]></category> <guid
isPermaLink="false">http://www.platfora.com/?p=5535</guid> <description><![CDATA[I&#8217;m thrilled to announce that after almost two years of intense development, Platfora&#8217;s product is out of beta and GA (generally available). I want to congratulate and thank the entire Platfora team &#8212; brilliant engineers and designers who put their heart and soul into building a product that fundamentally changes what business users can do with data. I&#8217;m extremely proud of the product that we&#8217;ve built, really gratified by the reaction from customers and the industry, and at the same time very aware that we have a five to ten year long roadmap of innovation in front of us to ...]]></description> <content:encoded><![CDATA[<p>I&#8217;m thrilled to announce that after almost two years of intense development, Platfora&#8217;s product is out of beta and GA (generally available). I want to congratulate and thank the entire Platfora team &#8212; brilliant engineers and designers who put their heart and soul into building a product that fundamentally changes what business users can do with data. I&#8217;m extremely proud of the product that we&#8217;ve built, really gratified by the reaction from customers and the industry, and at the same time very aware that we have a five to ten year long roadmap of innovation in front of us to truly realize our greater vision.</p><p>When we started this journey back in 2011, Hadoop was largely seen as the place to land and prepare data for use in relational data warehouses. A few bleeding-edge organizations used Hadoop directly as a poor-man&#8217;s data warehouse, but struggled with Hive queries that were hard to write and would take minutes to hours to complete. The idea of interactive BI (business intelligence) directly against Hadoop seemed like a pipe-dream.</p><p>At Platfora, we made a bet that Hadoop&#8217;s destiny wasn&#8217;t simply to be a cheaper, slower cousin of the relational data warehouse. 
Rather than creating a pale imitation of the status quo, we saw a way to leverage Hadoop&#8217;s strengths and offset its weaknesses to do a better job of exploratory BI and analysis than had been possible before.</p><p>Hadoop is superb at two things &#8212; it provides a near-infinite data reservoir where data of all kinds can be landed without needing to figure out how it will be used ahead of time, and it is a slow, lumbering freight-train of an engine for crunching and aggregating batches of millions or billions of rows.</p><p>Platfora uses Hadoop the way it was intended &#8212; as a batch work engine &#8212; and we built our own scalable in-memory engine designed for interactive (sub-second) response that automatically drives Hadoop to distill raw data into in-memory aggregates, while serving up lightning-fast queries and a complete BI experience built natively for the Hadoop stack. Yin and yang, woven together to allow superb performance with a never-before-seen degree of agility to get at any Hadoop data today without waiting for an IT project.</p><p>Our inspiration was the iPhone &#8212; a beautiful synthesis of design and technology that delights everyday users and just works. That&#8217;s a high bar, and I don&#8217;t want to claim that we&#8217;ve attained Steve Jobs-esque levels of sublime design. But we&#8217;ve done what hasn&#8217;t been possible until now &#8212; business users going from raw datasets to visual insight in a day (not 6 months, 12 months, or more).</p><p>Along the way, we&#8217;ve met hundreds of companies that are committed to Hadoop and the agility that comes with a data reservoir architecture. Fortune 500 organizations, big banks and retailers, web and advertising companies, media and telcos. We&#8217;re overwhelmed by the reactions of users inside customers like <span
style="text-decoration: underline;"><a
href="http://www.riotgames.com/" target="_blank">Riot Games</a></span> and <span
style="text-decoration: underline;"><a
href="http://www.edmunds.com/" target="_blank">Edmunds.com</a></span> that are using data in ways that just weren&#8217;t possible before Platfora.</p><p>We&#8217;re entering a new age. You don&#8217;t have to believe us &#8212; listen to our customers. The relational data warehouses and traditional SQL-based BI products of the past are becoming the legacy mainframes of the data age. With Platfora, BI is better on Hadoop.</p>]]></content:encoded> <wfw:commentRss>http://www.platfora.com/bi-is-better-on-hadoop/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Platfora, Impala, and the Future of Hadoop</title><link>http://www.platfora.com/platfora-impala-and-the-future-of-hadoop/</link> <comments>http://www.platfora.com/platfora-impala-and-the-future-of-hadoop/#comments</comments> <pubDate>Wed, 13 Feb 2013 06:00:17 +0000</pubDate> <dc:creator>Ben Werther</dc:creator> <category><![CDATA[Uncategorized]]></category> <guid
isPermaLink="false">http://www.platfora.com/?p=5453</guid> <description><![CDATA[Roll back a mere six months and put yourself in the shoes of an IT director looking for a platform to store and analyze large volumes of data. You&#8217;d have a stark choice: - <em>Option 1</em> &#8212; buy into the traditional MPP database approach of Teradata, Oracle Exadata, etc., and spend 12+ months modeling and implementing schemas, ETL pipelines, aggregation jobs, maintenance scripts and more. It is expensive to purchase, and once implemented is rigid and hard to evolve, but on the plus side provides standard SQL support and good performance. Let’s call this the “Legacy Database Option.” - <em>Option</em> ...]]></description> <content:encoded><![CDATA[<p>Roll back a mere six months and put yourself in the shoes of an IT director looking for a platform to store and analyze large volumes of data. You&#8217;d have a stark choice:</p><p>- <em>Option 1</em> &#8212; buy into the traditional MPP database approach of Teradata, Oracle Exadata, etc., and spend 12+ months modeling and implementing schemas, ETL pipelines, aggregation jobs, maintenance scripts and more. It is expensive to purchase, and once implemented is rigid and hard to evolve, but on the plus side provides standard SQL support and good performance. Let’s call this the “Legacy Database Option.”</p><p>- <em>Option 2</em> &#8212; store the data in Hadoop, and query it directly using MapReduce, Pig and Hive (Hadoop&#8217;s SQL-ish syntax). This option is low cost, scalable and requires no decisions or modeling up front &#8212; just write raw data files in any format and figure out how you want to use the data later. However, querying it down the road is complex and requires highly specialized IT staff. Keeping track of what data you have in the cluster can be just as hard and performance is inadequate. Each question can take minutes or hours across tens or hundreds of nodes. 
Let’s call this one the “Hadoop-Centric Option.”</p><p>Both options fail to achieve what customers are telling us they want now: more rapid (but secure) self-service exploration and analysis that allows business users to answer questions that span disparate datasets in unanticipated ways. The solution must be consistently fast, responding to user queries in sub-second time frames, and feeding our appetite for today’s powerful data discovery tools.</p><p><strong>Impala solves a part of the problem</strong></p><p>Based on Google&#8217;s F1 architecture, Impala is an open source Cloudera-led project to improve the performance of Hive by bypassing MapReduce and layering an MPP database-style architecture on top of HDFS. In many ways a hybrid of Options 1 and 2, it improves Hive query performance by about 5-10x. It’s a clear shot across the bow of the traditional MPP database vendors. Impala will lag behind them on raw performance, but it has the potential to replace them by making ad hoc queries on smaller datasets nearly interactive (seconds instead of minutes).</p><p>The success of Impala will be great for the Hadoop ecosystem. However, it takes us back into the land of Option 1 (the Legacy Database), with the need for DBAs to manage transformation and maintenance jobs, design and implement aggregations, tune performance, etc.</p><p>Impala is quick at querying small amounts of data, but the laws of physics dictate that querying terabytes or petabytes of data is slow. It is slow even with the highly tuned I/O subsystems of Teradata and Oracle Exadata, and far slower in any Hadoop-centric architecture like Impala that needs to pull unoptimized raw files from HDFS.
This isn’t news to any DBA, and in response, they manually build and maintain ‘aggregate’ tables &#8212; much smaller tables of rolled-up data that can be queried in seconds but lack fine-grained details &#8212; and instruct their data analysts to use them rather than the big slow tables that clog up their systems when queried.</p><p>But the presumption that a DBA knows what data will be important up front is usually wrong, and it creates a never-ending loop of change orders between analysts and the DBAs building the aggregates. This is exactly the challenge that popular BI tools have when they operate against big data. If a desktop BI user finds interesting data that they want to drill into, or use as a dimension to slice a different dataset, they must go back to their DBAs for new aggregations that include the data they want, a tedious process that slows down their work.</p><p>Worse, if a desktop BI user hits the wrong tables (i.e. the raw data), or submits a complex query, they can chew up vast amounts of cluster resources and dramatically impact other users. This is not the scalable big-data architecture of the future, and it is exactly the painful world that every customer we talk to is trying to escape.</p><p><strong>Platfora makes Hadoop consistently fast and self-service</strong></p><p>Here is where Platfora comes into the picture. Our platform instantly turns raw data in Hadoop into interactive in-memory business intelligence. Platfora connects in minutes to any Hadoop distribution and automatically generates MapReduce jobs (with added Impala acceleration on the roadmap) to build and maintain scale-out in-memory aggregates.</p><p>Our scale-out middle tier is simultaneously an ‘aggregate cache’ of the data below, and a lightning-fast in-memory analytical query engine to the users above.
It holds ‘lenses’ &#8212; automatically materialized data marts that roll up raw Hadoop data and can be refined with a click of a button (using our Fractal Cache(TM) technology) to hold whatever level of detail is most interesting to users at any point in time. Performance is consistently sub-second, and the middle tier offloads work from the Hadoop cluster to turn it into a scalable enterprise resource.</p><p>The front end is a completely web-based (HTML5 Canvas) exploratory BI framework, in the spirit of modern data discovery BI tools, but natively built for Hadoop. Now users can interactively explore and visualize, build dashboards, collaborate, and tell stories seamlessly against any volume and diversity of Hadoop datasets. The front end is tied directly into the middle tier, giving analysts the first closed-loop exploratory framework that lets them reshape their aggregations or add additional dimensions in a self-service manner without IT involvement.</p><p><strong>This is the future</strong></p><p>Let’s not pine for Hadoop to emulate Option 1, with DBAs architecting and managing every aspect of the data warehouse &#8212; constantly modeling, tweaking, and maintaining systems for better performance while hopelessly guessing the needs of their business users. Nor should we put up with Option 2, which is slow and complex and provides inconsistent performance. The future can truly be better than the past &#8212; a world of consistently fast, scalable, and modern business analytics and BI for all of your data.</p>]]></content:encoded> <wfw:commentRss>http://www.platfora.com/platfora-impala-and-the-future-of-hadoop/feed/</wfw:commentRss> <slash:comments>6</slash:comments> </item> <item><title>The End of the Data Warehouse</title><link>http://www.platfora.com/the-end-of-the-data-warehouse/</link> <comments>http://www.platfora.com/the-end-of-the-data-warehouse/#comments</comments> <pubDate>Tue, 23 Oct 2012 06:59:44 +0000</pubDate> <dc:creator>Ben Werther</dc:creator> <category><![CDATA[Uncategorized]]></category> <guid
isPermaLink="false">http://www.platfora.com/?p=5391</guid> <description><![CDATA[Today is a major milestone on the Platfora journey. But it is more than that. Today we reach out beyond our early beta customers and share what we know is possible. We&#8217;ve been living in the dark ages of data management. We&#8217;ve been conditioned to believe that it is right and proper to spend a year or more architecting and implementing a data warehouse and business intelligence solution. That you need teams of consultants and IT people to make sense of data. We are living in the status quo of practices developed 30 years ago &#8212; practices that are the ...]]></description> <content:encoded><![CDATA[<p>Today is a major milestone on the Platfora journey. But it is more than that. Today we reach out beyond our early beta customers and share what we know is possible.</p><p>We&#8217;ve been living in the dark ages of data management. We&#8217;ve been conditioned to believe that it is right and proper to spend a year or more architecting and implementing a data warehouse and business intelligence solution. That you need teams of consultants and IT people to make sense of data. We are living in the status quo of practices developed 30 years ago &#8212; practices that are the lifeblood of companies like Oracle, IBM and Teradata.</p><p>When I ran product at Greenplum, we understood this reality. Working with brilliant folks like Joe Hellerstein (UC Berkeley) and Brian Dolan (then at Fox Interactive), the team developed practices to navigate around the outmoded approaches of the past. Joe coined the name <a
href="http://databeta.wordpress.com/2009/03/20/mad-skills/">&#8216;MAD Skills&#8217; (Magnetic, Agile and Deep)</a>.</p><p>But we could only distort reality so far. At the end of the day it was still a big relational database. When the rubber met the road, DBAs were doing what they always do &#8212; designing data models, building ETL jobs, and tuning indexes and aggregates.</p><p>The insight for Platfora came a number of months after I left Greenplum (post EMC acquisition). I&#8217;d been spending a lot of time thinking about Hadoop and why it was gaining so much momentum. Clearly it was cost-effective and scalable, and was intimately linked in people&#8217;s minds to companies like Google, Yahoo and Facebook. But there was more to it. Everywhere I looked, companies were generating more and more data &#8212; interactions, logs, views, purchases, clicks, etc. These were being linked with increasing numbers of new and interesting datasets &#8212; location data, purchased user demographics, Twitter sentiment, etc. The questions that these swirling datasets would one day support couldn&#8217;t be known yet. And yet to build a data warehouse I&#8217;d be expected to perfectly predict what data would be important and how I&#8217;d want to question it, years in advance, or spend months rearchitecting every time I was wrong. This is actually considered &#8216;best practice&#8217;.</p><p>The brilliance of what Hadoop does differently is that it doesn&#8217;t ask for any of these decisions up front. You can land raw data, in any format and at any size, in Hadoop with virtually no friction. You don&#8217;t have to think twice about how you are going to use the data when you write it. No more throwing away data because of cost, friction or politics.</p><p>Which brings us to the insight.</p><p>In the view of the status-quo players, Hadoop is just another data source.
It is a dumping ground, and from there you can pull chunks into their carefully architected data warehouses &#8211; their &#8216;system of record&#8217;. They&#8217;ll even provide you a &#8216;connector&#8217; to make the medicine go down sweet. Sure, you are back in the land of consultants and 12-18 month IT projects, but you can rest easy because you know the &#8216;important&#8217; data is safely being pumped into your multimillion-dollar database box. Just don&#8217;t change your mind about what data or questions are important.</p><p>But let&#8217;s go through the looking glass. The database isn&#8217;t the &#8216;system of record&#8217; &#8212; it is just a shadow of the data in Hadoop. In fact there is nothing more authentic than all of that raw data sitting in Hadoop. With just a bit of metadata to describe the data, it&#8217;d be possible to materialize any &#8216;data warehouse&#8217; from that data in a completely automated way. These ephemeral &#8216;data warehouses&#8217; could be built, maintained, and disposed of with a click of a button.</p><p>Imagine what is possible. Raw data of any kind or type lands in Hadoop with no friction. Everyday business users can interactively explore, visualize and analyze any of that data immediately, with no waiting for an IT project. One question can lead to the next and take them anywhere through the data. And the connective tissue that makes this possible &#8212; bridging between lumbering batch-processing Hadoop and this interactive experience &#8212; is &#8216;software defined&#8217; scale-out in-memory data marts that automatically evolve with users&#8217; questions and interests. <a
title="Platfora Unveils World’s First In-Memory Business Intelligence Platform for Hadoop" href="http://www.platfora.com/press-release-story-3/?namePress=Platfora%20Unveils%20World%E2%80%99s%20First%20In-Memory%20Business%20Intelligence%20Platform%20for%20Hadoop">Enter&#8230; Platfora</a>.</p><p>Through the looking glass, there is no need for a traditional data warehouse. It is an inflexible, expensive relic of a bygone age. It is time to leave the dark ages.</p>]]></content:encoded> <wfw:commentRss>http://www.platfora.com/the-end-of-the-data-warehouse/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> <item><title>Hadoop is Irresistible and Flawed</title><link>http://www.platfora.com/hadoop-is-irresistible-and-flawed/</link> <comments>http://www.platfora.com/hadoop-is-irresistible-and-flawed/#comments</comments> <pubDate>Tue, 25 Sep 2012 12:00:26 +0000</pubDate> <dc:creator>Ben Werther</dc:creator> <category><![CDATA[Uncategorized]]></category> <guid
isPermaLink="false">http://www.platfora.com/?p=5088</guid> <description><![CDATA[We’re in the Big Data age. This isn’t business-as-usual with a new buzzword. We’re going to look back at the early part of this decade as marking a major disruption and rejuvenation in the decades-old way that companies use data to understand their business. The status quo is an intricate and expensive supply chain of data that is progressively stored, manipulated and processed until value is seen by a business user. The steps along this chain include storing data in a big (expensive) relational database such as Oracle or Teradata, loading data via a complex ETL (extraction, transformation, load) solution ...]]></description> <content:encoded><![CDATA[<p>We’re in the Big Data age. This isn’t business-as-usual with a new buzzword. We’re going to look back at the early part of this decade as marking a major disruption and rejuvenation in the decades-old way that companies use data to understand their business.</p><p>The status quo is an intricate and expensive supply chain of data that is progressively stored, manipulated and processed until value is seen by a business user. The steps along this chain include storing data in a big (expensive) relational database such as Oracle or Teradata, loading data via a complex ETL (extraction, transformation, load) solution such as Informatica, and building reports and dashboards using business intelligence products such as Business Objects or MicroStrategy. Companies can spend 12-18 months architecting, planning and implementing all of these pieces. When a user’s questions evolve, or new data sources need to get integrated, count on months of lead time for these change orders.</p><p>These systems are like a very organized filing cabinet. Everything has its place – except when it doesn’t, in which case there’s nowhere to put it without a major reorganization. But this won’t cut it in a Big Data world. 
Companies realize that their data is a critical asset, and they need to get increasingly good at answering questions (about user behavior, fraud, product interactions, etc.) that they cannot anticipate today but that will be critical in the future. Any attempt to build out our “filing cabinet” ahead of time is guaranteed to be wrong and to force us to throw away essential data.</p><p>Enter Hadoop – the open source technology designed by pioneers at Yahoo!, Facebook and other companies based on prior work done at Google. The core of Hadoop is a reliable distributed file system (HDFS) and a parallel data processing framework (MapReduce). In other words, you can store lots of data very cheaply (e.g. Facebook has 100 petabytes in their cluster), and you can write programs or query the data with performance that scales with the number of nodes. To extend the analogy, Hadoop is more like a huge cardboard box – ready to be filled with endless amounts of raw data without any need to agree on data models or make any assumptions about data usage up front.</p><p>Hadoop is an incomplete solution in many ways, and yet it is rapidly gaining adoption in Global 2000 enterprises. We’ve spoken in depth with over 100 large enterprises and web companies about their Hadoop initiatives. What we’ve heard repeatedly is that these projects are driven by mandates to stop throwing away data that may be useful. This data is their lifeblood, and they must not squander it. However, the rapidly increasing variety, rate of change, and size of datasets have swamped their traditional systems, and storing the data in Hadoop is their only viable way of keeping up. For all of its weaknesses, Hadoop is the only credible answer to the Big Data challenge of moving beyond the “filing cabinet.”</p><p>Hadoop is irresistible for this reason, but the big question that remains is how to use the data there once you’ve stored it. The challenge is that Hadoop is a very different architecture from traditional data warehouses.
It is a batch engine &#8212; a lumbering freight train that can process immense amounts of data, but takes a while to get up to speed, so even the simplest question requires minutes of processing.</p><p>Given this, be very skeptical of the legions of business intelligence (BI) and data warehouse vendors who are racing to announce their “Hadoop connector.” Simply treating Hadoop as another “data source” for their legacy product won’t work – the data isn’t structured like a relational “filing cabinet”, and even if it were, no user wants to wait an hour for the next page when they drill down in a report.</p><p>The good news is that a new class of much more intelligent Hadoop-native solutions is on the horizon – leveraging Hadoop correctly for predictive analytics, interactive/exploratory business intelligence, agile ETL, and a multitude of vertical solutions. Here at Platfora we are laser-focused on this next phase of Hadoop. The result won’t just match the status quo, but exceed it in flexibility and the ability to scale and adapt to changing requirements. Exciting times are ahead – stay tuned.</p>]]></content:encoded> <wfw:commentRss>http://www.platfora.com/hadoop-is-irresistible-and-flawed/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>We’re in the Pre-Industrial Age of Big Data</title><link>http://www.platfora.com/pre-industrial-age-of-big-data/</link> <comments>http://www.platfora.com/pre-industrial-age-of-big-data/#comments</comments> <pubDate>Fri, 08 Jun 2012 13:49:18 +0000</pubDate> <dc:creator>Ben Werther</dc:creator> <category><![CDATA[Uncategorized]]></category> <guid
isPermaLink="false">http://www.platfora.com/?p=4957</guid> <description><![CDATA[Everywhere you look today, ‘big data’ is a hot topic. Articles in the media conjure up images of data scientists with PhDs in stats and physics deftly navigating petabytes of data to decipher meaning and insight. Companies of all stripes are pivoting their marketing to embrace the big data moniker. And key enabling technologies such as Hadoop are moving from the lab to mainstream adoption in record time. <strong>THE ‘BIG DATA’ SHIFT</strong> “Big Data analysis is usually iterative: you ask one question or examine one data set, then think of more questions or decide to look at more data. That’s ...]]></description> <content:encoded><![CDATA[<p>Everywhere you look today, ‘big data’ is a hot topic. Articles in the media conjure up images of data scientists with PhDs in stats and physics deftly navigating petabytes of data to decipher meaning and insight. Companies of all stripes are pivoting their marketing to embrace the big data moniker. And key enabling technologies such as Hadoop are moving from the lab to mainstream adoption in record time.</p><p><strong>THE ‘BIG DATA’ SHIFT</strong></p><blockquote><p>“Big Data analysis is usually iterative: you ask one question or examine one data set, then think of more questions or decide to look at more data. That’s different from the “single source of truth” approach to standard BI and data warehousing.” &#8212; <span
style="color: #000000"><span
style="color: #000000"><a
title="PwC 2010 Technology Forecast" href="http://www.pwc.com/us/en/technology-forecast/2010/issue3/features/big-data-pg1.jhtml" target="_blank">PwC 2010 Technology Forecast</a></span></span></p></blockquote><p>You could be forgiven for thinking that ‘big data’ is just a new name and a lot of hype around the same data work that IT has been engaged in for decades. But you’d be mistaken – there is a fundamental change underway that will transform how every industry uses data.</p><p>The way we’ve managed data for the last 25+ years assumes that IT has a complete picture from the get-go of all data sources and the requirements of its end users. Based on this, IT spends 12+ months building a data warehouse that hard-codes these assumptions. Users are trapped in this limited view of the world, and even simple changes can require six months of engineering. All in all not a great solution, but sufficient in an age when data sources rarely changed and users were satisfied with the same canned queries and reports every day.</p><p>What has changed – the heart of the ‘big data’ shift – is only peripherally about the volume of data. Companies are realizing that there is surprising value locked up in their data, but in unanticipated ways that will only emerge down the road. These data sources could span functional silos and organizations, and include external ones such as Twitter and Foursquare, weather, economic sentiment, or pop trends. There is no way that IT can predict up front which data will be needed or how the users will need to combine or distill this data to test hypotheses and converge on answers to each question.</p><p>Before, we only stored the data that we knew we needed. Now we understand that every scrap of data may play a role in answering next year’s unanticipated questions. The cost of keeping this data – in commodity clusters such as Hadoop – is next to nothing.
The cost of throwing it away is untold.</p><p><strong>FIRST STEPS INTO A BRAVE NEW WORLD</strong></p><p>We talk to a lot of companies that are navigating this big data shift today &#8212; across industries such as advertising, finance, media, retail, telecommunications and government. Given the choice of throwing away data or entering a brave new world, they are consistently choosing the latter. They are starting to load all of this data into Hadoop, to keep it, often without any real understanding of how to empower everyday business users to take advantage of that data.</p><p>This was driven home at a couple of big data events that I spoke at over the past month. The first was the Accel Big Data Conference, a superb event run by Ping Li, where we discussed enterprise usage of big data with other panelists from Facebook, JP Morgan, and Cloudera. The second was the Berkeley DataEDGE event, in which Quentin Hardy from the New York Times led a really insightful conversation with our panel about startup innovation in the big data space. In both, the need for simple, democratized access to big data by everyday business users rose up as the most critical issue needing to be solved.</p><p><strong>THE BLACKSMITH AND THE DATA SCIENTIST</strong></p><blockquote><p>“What Big Data is seeing now looks like the classic industrial curve. There is the first discovery of something big, leading to establishing principles like scientific rules. Science moves toward engineering as a means to manufacturing, resulting in mass deployment. Then things really change.” – Quentin Hardy, <span
style="color: #000000"><span
style="color: #000000"><a
title="NYT: How Big Data Gets Real" href="http://bits.blogs.nytimes.com/2012/06/04/how-big-data-gets-real/" target="_blank">How Big Data Gets Real (NYTimes 06/04/12)</a></span></span></p></blockquote><p>Today, there is a fundamental lack of tools and technologies that make big data usable by everyday business users. Even the simplest questions require developers or data scientists to painstakingly wrangle the data into some useful form. In ‘industrial revolution’ terms, we are in the pre-industrial era of artisanship that preceded mass production. It is the equivalent of needing to engage an expert blacksmith to forge the forks and spoons for our dinner table, rather than being able to easily and cheaply buy mass-produced flatware.</p><p>Since every company of any scale is going to need to leverage big data, as an industry we either need to train up hundreds of thousands of expert blacksmiths (aka data scientists) or find a way into the industrialized world (aka better tools and technology that dramatically lower the bar to harnessing big data).</p><p>If we can achieve the latter, then data scientists still have a critical role &#8212; as experts applying their sophisticated techniques and training to questions that demand that attention. And yet, most business questions won’t require that expertise, so with these tools everyday users could use their intuition and domain expertise to work with data in productive and meaningful ways.</p><p>That is my bet &#8212; that it won’t be long before we see the first signs that we are entering the industrial age of big data. We’ll begin to see tools and techniques that truly democratize access and allow business users to serve most of their own data exploration and visualization needs, while allowing data scientists and more sophisticated users to collaborate and impart their unique value. Exciting times ahead.</p><div
style="clear:both;"></div>]]></content:encoded> <wfw:commentRss>http://www.platfora.com/pre-industrial-age-of-big-data/feed/</wfw:commentRss> <slash:comments>6</slash:comments> </item> <item><title>Welcoming our New Board Member &#8211; Mike Speiser from Sutter Hill Ventures</title><link>http://www.platfora.com/welcoming-our-new-board-member-mike-speiser-from-sutter-hill-ventures/</link> <comments>http://www.platfora.com/welcoming-our-new-board-member-mike-speiser-from-sutter-hill-ventures/#comments</comments> <pubDate>Fri, 11 Nov 2011 21:54:27 +0000</pubDate> <dc:creator>Ben Werther</dc:creator> <category><![CDATA[Uncategorized]]></category> <guid
isPermaLink="false">http://www.platfora.com/?p=4765</guid> <description><![CDATA[There are many challenges in building a new company. Possibly the greatest is recruiting a talented, energetic, and world-class team. Smart people are in high demand in Silicon Valley and, let’s face it, we’re in another boom in the tech world. I know this because it takes 45 minutes to go 10 miles on the 101 and because smart people are harder to find than they ever have been. We’ve got a few silver bullets though, and we just added another one. We had wrapped our Series A, and were done thinking about financing for now. But when Mike Speiser ...]]></description> <content:encoded><![CDATA[<p>There are many challenges in building a new company. Possibly the greatest is recruiting a talented, energetic, and world-class team. Smart people are in high demand in Silicon Valley and, let’s face it, we’re in another boom in the tech world. I know this because it takes 45 minutes to go 10 miles on the 101 and because smart people are harder to find than they ever have been.</p><p>We’ve got a few silver bullets though, and we just added another one. We had wrapped our Series A, and were done thinking about financing for now. But when Mike Speiser from Sutter Hill Ventures reached out and wanted to chat, I couldn’t say no. We are huge fans of Mike’s work as a CEO/entrepreneur (most recently at Pure Storage) and on the venture side, and he is one of the handful of investors that we’d go to extreme lengths to get involved at this stage.</p><p>Mike had done his homework on us, and in the blink of an eye we’d added another Series A closing to include Sutter Hill Ventures (bringing our total to $7.2 million) and added Mike to our board of directors.</p><p>One of Mike’s passions and talents is building incredible engineering teams &#8211; e.g. at Bix/Yahoo!, VERITAS/Symantec and Pure Storage.
He has superb instincts and experience assembling remarkable talent that executes on a different plane than other companies. We’ve been living this philosophy since day one of Platfora, so together we’re going to go even faster as we bring together the best of the best in the face of this hyper-competitive market.</p><p>We’re very fortunate to have what we believe is the best group of investors on the planet: Scott Weiss from a16z, Mike from Sutter Hill, T.J. Rylander from In-Q-Tel, and a group of the smartest and most passionate angels around.</p><div
style="clear:both;"></div>]]></content:encoded> <wfw:commentRss>http://www.platfora.com/welcoming-our-new-board-member-mike-speiser-from-sutter-hill-ventures/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Platfora: Bringing Business Intelligence into the 21st Century</title><link>http://www.platfora.com/platfora-bringing-business-intelligence-into-the-21st-century/</link> <comments>http://www.platfora.com/platfora-bringing-business-intelligence-into-the-21st-century/#comments</comments> <pubDate>Thu, 08 Sep 2011 14:58:36 +0000</pubDate> <dc:creator>Ben Werther</dc:creator> <category><![CDATA[Uncategorized]]></category> <guid
isPermaLink="false">http://www.platfora.com/?p=4749</guid> <description><![CDATA[I’m very excited to share that Platfora has closed a Series A of $5.7 million, with Andreessen Horowitz (a16z) as the lead investor. We’ve had a superb experience working with the whole team at a16z, and we are thrilled that Scott Weiss is joining our board. We also received a strategic investment from In-Q-Tel, the strategic investment firm that works to identify, adapt, and deliver innovative technology solutions to support the missions of the U.S. Intelligence Community. And we are fortunate to have a number of prominent angels and seed investors participating in the round. Read more from Scott <a
href="http://scott.a16z.com/2011/09/08/the-big-data-conundrum/">here</a> ...]]></description> <content:encoded><![CDATA[<p>I’m very excited to share that Platfora has closed a Series A of $5.7 million, with Andreessen Horowitz (a16z) as the lead investor. We’ve had a superb experience working with the whole team at a16z, and we are thrilled that Scott Weiss is joining our board. We also received a strategic investment from In-Q-Tel, the strategic investment firm that works to identify, adapt, and deliver innovative technology solutions to support the missions of the U.S. Intelligence Community. And we are fortunate to have a number of prominent angels and seed investors participating in the round. Read more from Scott <a
href="http://scott.a16z.com/2011/09/08/the-big-data-conundrum/">here</a> about a16z’s investment in Platfora and check out today’s announcement <a
href="http://www.platfora.com/press-release-story-3/?namePress=Platfora%20Secures%20%245.7%20Million%20in%20Series%20A%20Funding%20Led%20by%20Andreessen%20Horowitz%20for%20Big%20Data%20Business%20Intelligence">here</a>.</p><p>The common thread that connects us and all of our investors is an understanding that we live in historic times. Marc’s recent op-ed in the Wall Street Journal, “<a
href="http://online.wsj.com/article/SB10001424053111903480904576512250915629460.html">Why Software is Eating the World</a>,” eloquently laid out how even the most traditional industries – from agriculture to health care to national defense &#8212; are at a tipping-point and are being transformed by new software ideas.</p><p>But the counterweight to the power of software is in the data it generates. Software can be used to process and understand data, and software is also a massive generator of data. In fact “machine generated data” – data produced by software systems – is growing overwhelmingly faster than “human generated data.”</p><p>This data is the future. For any company, data holds the ability to understand, predict and better service customers. Data tells the tale of interrelated events related to fraud or threats. Datasets answer questions about health trends and risk factors, financial market sentiment and inefficiencies, and social influence and behavior. We must be mindful of issues like privacy and security, but there is no escaping that data is increasingly capturing the pulse of the world.</p><p>In the same way that ‘software is eating the world’, companies around the globe will either sink or swim with data. But existing data warehousing and business intelligence products are relics from another age. They contemplate carefully organized, structured and curated datasets that grow and change slowly. Every single company I’ve talked to over the past six months has shared tales of how those systems are breaking. My friend Mike Driscoll, CTO of MetaMarkets, recently tweeted, “Prediction: there will be a mass extinction event among BI vendors in the next decade. They won&#8217;t survive the data deluge.”</p><p>If the old is breaking, what is there to replace it? Hadoop is the industry’s best hope. Based on Google’s MapReduce, it is universally in use at web companies (Facebook, Yahoo, Netflix, etc.) 
and is making deep inroads in every one of these industries facing the data deluge. It is flexible, scalable, and has the promise of taming terabytes or petabytes of data. But despite Hadoop&#8217;s amazing potential, it is low-level infrastructure that has similarities to batch-processing systems from the 1960s. Everything takes minutes to hours to run, and jobs need to be carefully submitted by experts. It isn&#8217;t interactive like a traditional database, so just layering a BI product on top is asking for disappointment. Interactive reporting for business users simply doesn&#8217;t exist for Hadoop.</p><p>This is where Platfora comes in. We bring business intelligence into the 21st century, giving business analysts the intuitive and richly interactive tools to explore and produce business insights from massive and rapidly evolving datasets. Whether a company has gigabytes, terabytes or petabytes of data, our platform eliminates the need for traditional data warehouses, ETL tools and the legacy BI products of the past. We replace complexity and scaling pain with simplicity and beauty.</p><p>Platfora&#8217;s breakthrough is a combination of server technology, user experience innovation, and data science. Our platform works with existing Hadoop clusters (Cloudera, MapR, Amazon EMR, etc.), and automatically turns the questions of business users into Hadoop jobs that synthesize and distill Hadoop datasets into dimensional and predictive dashboards, reports and insights. The system intelligently drives Hadoop to create and maintain &#8216;work products&#8217; &#8212; highly compressed partial results that are refined at the click of a button to achieve subsecond report delivery, analytics overlay, and drilldown performance.</p><p>We are hard at work building our product, and working closely with a handful of industry leading companies to get it right.
More importantly, to get there we are assembling a superb team of data and distributed systems architects/engineers, UI and UX developers and data scientists. The team is still small, but growing fast, and we&#8217;re focused on high-energy teamwork and building product that changes the world. Come join us.</p><div
style="clear:both;"></div>]]></content:encoded> <wfw:commentRss>http://www.platfora.com/platfora-bringing-business-intelligence-into-the-21st-century/feed/</wfw:commentRss> <slash:comments>4</slash:comments> </item> <item><title>Introducing Platfora &#8211; Putting a New Face on Hadoop</title><link>http://www.platfora.com/introducing-platfora-putting-a-new-face-on-hadoop/</link> <comments>http://www.platfora.com/introducing-platfora-putting-a-new-face-on-hadoop/#comments</comments> <pubDate>Wed, 29 Jun 2011 14:38:15 +0000</pubDate> <dc:creator>Ben Werther</dc:creator> <category><![CDATA[Uncategorized]]></category> <guid
isPermaLink="false">http://www.platfora.com/?p=4635</guid> <description><![CDATA[The Hope. That Hadoop is the silver bullet that is going to save us from a relentless explosion of data. Data volumes are increasing 100x over the next 5 years, and Hadoop is THE answer to scalably storing and harnessing that data. Forget about relational databases &#8212; Hadoop has it all. The Reality. That one scratch below the surface and you realize that this whole thing is at a much earlier stage than most people appreciate. We&#8217;re still living the first baby steps &#8212; vendors battling to provide the best low-level Hadoop plumbing. Once you get the cluster up and ...]]></description> <content:encoded><![CDATA[<p>The Hope. That Hadoop is the silver bullet that is going to save us from a relentless explosion of data. Data volumes are increasing 100x over the next 5 years, and Hadoop is THE answer to scalably storing and harnessing that data. Forget about relational databases &#8212; Hadoop has it all.</p><p>The Reality. That one scratch below the surface and you realize that this whole thing is at a much earlier stage than most people appreciate. We&#8217;re still living the first baby steps &#8212; vendors battling to provide the best low-level Hadoop plumbing. Once you get the cluster up and running, now what? You start to pump data into it. Then what? That&#8217;s when things start to unravel. You can write some basic Hive or Pig queries, and wait minutes or hours for jobs that take seconds on MPP databases. You can get down to the metal and try your hand at MapReduce programming in Java, or maybe adapt some Mahout machine-learning code. However you cut it though, unless you are willing to get your hands very dirty, it is going to be a rocky ride. This technology is a long way from main street.</p><p>The Future. That&#8217;s where we come in. 
We&#8217;re not ready to talk about what we&#8217;re doing yet, but we believe that the Hadoop that the world needs (or is able to consume) is a far cry from what you see today. The big data explosion is real and imminent, and companies worldwide need a way to get ahead of the onslaught and be ready to put that data to work. The Hadoop that we envision is not just evolutionary enhancements of today&#8217;s stack (although there are plenty of vendors that have that angle covered). It is tangibly more. That&#8217;s all we can say about our plans for now.</p><p>At Platfora, we are assembling an extraordinary team of big data and scalable systems experts, rock-star data scientists, and sublime UX/visualization designers. If this describes you, then we&#8217;d love to talk. It is going to be quite a ride.</p><p>&nbsp;</p><div
style="clear:both;"></div>]]></content:encoded> <wfw:commentRss>http://www.platfora.com/introducing-platfora-putting-a-new-face-on-hadoop/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> </channel> </rss>