StreamSets
What they do: Provide a platform that manages data-in-motion and data operations for enterprises.
Problem they solve: Big Data complexity has led to “dataflow chaos,” according to StreamSets. Most companies manage their data with a combination of traditional systems (e.g., Oracle) and new platforms (Hadoop, Spark), which creates unnecessary complexity.
StreamSets argues that “increasing data volume, velocity, and variety are overwhelming enterprise data architectures and the people responsible for the day-in, day-out delivery of data to decision makers.”
The problem of “dataflow chaos” goes back to the 1990s, when enterprises embarked on a mission to make data universally available for business intelligence through data warehouses. Supplying the warehouse led to a “hairball” of ad hoc data pipelines whose complexity and poor reliability caused a data management crisis. This problem, in turn, drove the creation of the data integration and data quality solutions that are in use in nearly every enterprise today (i.e., the legacy part of today’s management equation).
Now, the rise of Big Data and multi-cloud architectures is creating an order of magnitude greater complexity and data sprawl, which “spawns a growing hairball/tangle of connections between numerous Big Data sources and old and new processing systems.”
Another problem is “data drift.” Because they are loosely structured and often generated by third-party systems – systems that can change often, without notice, and outside of your control – Big Data sources often mutate unexpectedly over time, causing data drift.
Combined, these issues make it nearly impossible for IT departments to keep pace with this ever-changing web of data movement. Instead, they react to each new problem ad hoc, further fueling dataflow chaos. When dataflow chaos takes hold, “confidence in the timeliness and trustworthiness of data erodes and the promise of a data-driven future is lost.”
How they solve it: StreamSets provides a data operations platform that helps enterprises “conquer data flow chaos.” Using StreamSets, businesses can establish a data operations center to manage data flows, a capability that will become essential for any data-driven business.
The platform consists of two products:
1. StreamSets Data Collector is an open-source data plane core. It lets businesses build any-to-any data pipelines that help overcome data drift.
2. StreamSets Dataflow Performance Manager (DPM) is a cloud-based control plane. IT can use it to manage and monitor the enterprise’s end-to-end dataflow operation through a living data map.
Headquarters: San Francisco, CA
CEO: Girish Pancha, who formerly served as chief product officer of Informatica.
Year Founded: 2014
Funding: In May 2017, StreamSets raised $20 million in Series B funding. Accel Partners, Battery Ventures, and New Enterprise Associates (NEA) participated in the round, which brings total funding to date to $32.5M.
Competitors include: According to 451 Research, the “Total Data Market,” which consists of data platforms, data management, analytics, and data mining, will nearly double in size, growing from $60 billion in 2014 to $115 billion in 2019.
In May 2017, 451 revised its forecast upward, predicting that the Total Data market will reach $138.5B by 2021, growing at a CAGR of 11.5% from the end of 2016 to 2021.
The Total Data Market is a relatively uncrowded niche under the Big Data umbrella. Cask and Confluent are two startup competitors, while the more established Hortonworks competes here as well.
Customers include: Cisco, Lithium, and Planet Labs. New customer wins since the last report include CBS Interactive, Cox Automotive, Elastic, RingCentral, and Scripps.
Why they’re in the Big 50-2017: StreamSets is a two-time winner! Since the last Big50, StreamSets has raised a big VC round, rolled out new product features, attracted positive attention from the analysts and media covering this space (Gartner, Fortune), and locked down impressive new customers, including CBS Interactive.