Click here to close now.

Welcome!

Python Authors: Ignacio M. Llorente, Carmen Gonzalez, Elizabeth White, John Wetherill, Trevor Parsons

Blog Feed Post

Cloudera and Cleversafe: A Strategic Combination For Enterprise IT

By

Cloudera and Cleversafe are totally different companies addressing different challenges. But the two firms have quite a bit in common. Here are key commonalities I’ve observed:

  • Both invest in real engineering and deliver enterprise-grade/quality capabilities
  • Both are proven to work at scale (including very large scale when required)
  • Both are led by CEOs that are highly regarded by their peers and the community, and both CEOs are very likeable people (I’ve met and worked with both Mike Olson and Chris Gladwin).
  • Both have used the services of my firm, Crucial Point (and that is most appreciated by me, by the way!).
  • Both are in the In-Q-Tel portfolio and are known to the national security community because of that.
  • Both firms partner with Carahsoft (which is, by the way, another strategic partner of Crucial Point’s).
  • Both are key thought leaders in the domain of Big Data, with Cloudera being known for its open source distribution of Apache Hadoop (CDH4) and their management capabilities over CDH, and Cleversafe being known for their fielding of modern object storage with the lowest cost/TB of any system on the market plus agile access and impressive I/O.
That said, these two firms really address different areas of enterprise data needs, and have built different capabilities that can be used by enterprises to address separate aspects of Big Data challenges.
Which is part of the reason I was excited to learn of cooperation between these two firms. When firms addressing different parts of a hard challenge collaborate it can mean great things for enterprise missions.
Here are a few thoughts on the nature of a well engineered solution that came from their work together:
  • In July 2012, Cleversafe announced that the are now working with Cloudera’s Distribution including Apache Hadoop (CDH) for new capabilities that enable the benefits of Cleversafe’s data storage and security with the power of MapReduce.  With this well engineered combination, the data for an enterprise is not stored in HDFS.  The benefits of HDFS are already provided by other Cleversafe functionality, so there is already fault tolerance and speed, for example. But even greater benefits are provided through this well engineered solution, including the elimination of single points of failure without the need for HDFS’s complete/multiple replication.
  • So basically you can store data using Cleversafe technology and get all the benefits there, and can run MapReduce jobs over the data leveraging Hadoop without using HDFS.
  • This well engineered solution enables data to be stored in conventional format on nodes where it is expected to be used for computation and enables MapReduce operation. This comes with the many other benefits of Cleversafe, including the ability to protect data without the overhead of massive network traffic and costly backup storage. It also removes challenges with Namenode issues since a Cleversafe cluster’s accesser nodes federate and cover for each other.
  • The bottom line result of this Cleversafe leveraging of Cloudera’s CDH:  Incredible cost benefits, fantastic disaster recovery/continuity of operations features, fast access to data from multiple locations, and an ability to run MapReduce jobs and leverage Hadoop-centric applications without using HDFS.
I liked the context provided by Andrew Brust at zdnet.com on this topic. He writes that:

Cleversafe swaps out HDFS
Assuming it works as advertised, Cleversafe’s company name is a fair reflection of its Hadoop architecture.  While other HDFS alternatives exist for Hadoop (for example, MapR‘s Hadoop distro, which can mount HDFS-compatible NFS volumes), Cleversafe’s Slicestor appliance nodes retain HDFS’ distributed nature and maintain fault tolerance too.  Cleversafe does this with “information dispersal” slices: spreading the data around different nodes in the cluster, employing Erasure Coding – a scheme that allows reconstruction of data from a subset of storage nodes, and eliminates single points of failure without the overhead of HDFS’ complete replication.

Meanwhile, the data is also stored in conventional format on the nodes where it is expected to be used for computation.  The conventional storage assures fast MapReduce operations, and the striped storage assures fault tolerance, without the need (and network traffic and management overhead) to keep multiple full copies of the data.

Namenode issues disappear as well, since a Cleversafe cluster’s accesser nodes federate and cover for each other, and the meta data is split up along with the data itself.  Although various high availability namenode technologies are appearing in the major Hadoop distributions now, they nonetheless still use a single central namenode at any given time.  Keeping a warm spare around is not the same thing as having meta data/directory services responsibilities shared among a collection of active nodes.

Although Cleversafe clusters are appliance-based, the appliances nonetheless use commodity processors and  storage.  The added value comes from tuning and optimization, and the unique storage software subsystem.  Cleversafe storage runs about $500 per Terabyte, and can be less depending on total storage size.  On the MapReduce side, Cleversafe uses Cloudera’s Distribution Including Apache Hadoop (CDH).

For more information see this July 2012 press release from Cleversafe:

Cleversafe First to Deliver Breakthrough Capabilities for Combined Storage and Massive Computation

First System to Support Storage and Analysis of Datasets at Previously Unattainable Scale with Unparalleled Reliability and Efficiency

Chicago, July 10, 2012 – Cleversafe Inc., the solution for limitless data storage, today announced plans to build the first Dispersed Compute Storage solution by combining the power of Hadoop MapReduce with Cleversafe’s highly scalable Object-based Dispersed Storage System. This solution will significantly alter the Big Data landscape by decreasing infrastructure costs for separate servers dedicated to analytical processes, reducing required storage capacity, and simultaneously improving data integrity. In addition, the company’s solution will reduce network bottlenecks by bringing together computation and storage at any scale, petabytes to exabytes and beyond.

Traditional storage systems are not designed for large-scale distributed computation and data analysis. Present implementations treat data storage and analysis of that data separately, transferring data from Storage Area Networks (SANs) or Network Attached Storage (NASs) across the network to perform the computations used to gather insight. In this manner the network quickly becomes the bottleneck, making multi-site computation over the WAN particularly challenging. Cleversafe solves this problem by combining Hadoop MapReduce alongside its Dispersed Storage Network (dsNet) system on the same platform and replacing the Hadoop Distributed File System (HDFS) which relies on 3 copies to protect data with Information Dispersal Algorithms thereby significantly improving reliability and allowing analytics at a scale previously unattainable through traditional HDFS configurations.

“For any company, the movement, management and storage of massive data stores for analytical purposes is already unmanageable,” said Chris Gladwin, CEO and President of Cleversafe. “Many companies have had to invest significant resources in both CAPEX and OPEX to manage the challenge of Big Data and to try and capitalize on the opportunity to gather insights from that data,” said Gladwin. “The key to reducing both cost and complexity is to combine computation with dispersed storage,” said Gladwin. “Cleversafe’s solution will provide infinitely scalable, reliable, and cost effective storage for data to support massive computation while enhancing the analysis workflow.”

Hadoop MapReduce, which is already being used broadly throughout the industry, represents only a partial solution to this problem. While it lends itself naturally to enabling computations where the data exists rather than transferring data to computation nodes, it has inherent scalability and reliability limitations. Current HDFS deployments utilize a single server for all metadata operations and 3 copies of the data for protection. Failure of the single metadata node could render stored data inaccessible or result in a permanent loss of data. Maintaining 3 copies of data at massive scale for protection leads to skyrocketing overhead and management costs.

Cleversafe’s dsNet system protects both data and metadata equally and is inherently more reliable. By applying the company’s unique Information Dispersal technology to slice and disperse data, single points of failure are eliminated. As data is distributed evenly across all Slicestor nodes metadata can scale linearly and infinitely as new nodes are added, thus reducing any scalability bottlenecks and increasing performance. Cleversafe’s unique approach delivers the powerful combination of analytics and storage in a geographically distributed single system allowing organizations to efficiently scale their Big Data environments to hundreds of petabytes and even exabytes today.

“There isn’t an industry today that’s untouched by Big Data or a company that wouldn’t benefit from the intrinsic value of that data if they could collect, organize, store and analyze it in a cost-effective manner,” said John Webster, Senior Partner at Evaluator Group. “Cleversafe’s approach to combining dispersed storage and Hadoop for analytics is a groundbreaking step for the industry and for any company to effectively bridge storage and large-scale computation,” said Webster.

No market segment has a more critical need to harness Big Data than the Government sector. Lockheed Martin is partnering with Cleversafe to develop a federal version of the Cleversafe Dispersed Compute Storage solution designed for the unique needs of federal government agencies.

“By combining the power of Hadoop analytics with Cleversafe’s Object-based Dispersed Storage solution, government entities will be able to significantly reduce their total cost of infrastructure as the amount of their mission critical data grows,” said Tom Gordon, CTO & VP of Engineering of Lockheed Martin’s Information Systems and Global Solutions-National. “The Federal community has been out in front of Big Data, well ahead of many other market segments, and needs technology solutions today that are well suited for Exabyte scale storage as well as massive computation,” said Gordon. “Taken Cleversafe’s approach with Hadoop across commodity hardware, these features deliver a new approach to bring the true potential of Big Data analytics into reach.”

Cleversafe’s object-based storage solution is 100 million times more reliable than traditional RAID-based systems and it doesn’t rely on replication to protect information. Its information dispersal capabilities reduce storage costs up to 90 percent while meeting compliance requirements and ensuring protection against data loss, whether it’s latent hardware errors, data corruption or malicious threats. With the combination of limitless scale, highly reliable storage and efficient analytics in the same platform, Cleversafe is solving the most challenging Big Data problems for customers in a very efficient manner.

Tweet This: @Cleversafe to build first storage-based compute solution based on its dsNet solution and Hadoop MapReduce.

About Cleversafe Inc.

Cleversafe has created a breakthrough technology that solves petabyte and beyond big data storage problems. This solution drives up to 90 percent of the storage cost out of the business while enabling secure and reliable global access and collaboration. The world’s largest data repositories rely on Cleversafe. To learn more about Cleversafe and its solutions, please visit www.cleversafe.com, call 312-423-6640 or email us at [email protected].

 

 

 

Read the original blog entry...

More Stories By Bob Gourley

Bob Gourley, former CTO of the Defense Intelligence Agency (DIA), is Founder and CTO of Crucial Point LLC, a technology research and advisory firm providing fact based technology reviews in support of venture capital, private equity and emerging technology firms. He has extensive industry experience in intelligence and security and was awarded an intelligence community meritorious achievement award by AFCEA in 2008, and has also been recognized as an Infoworld Top 25 CTO and as one of the most fascinating communicators in Government IT by GovFresh.

@ThingsExpo Stories
BroadSoft on Tuesday announced that it is a recipient of the 2014 Frost & Sullivan Market Leadership Award in the Hosted/Cloud Internet Protocol (IP) Telephony market for Latin America. According to Frost & Sullivan market research, the Latin America (LATAM) hosted/cloud Internet Protocol (IP) telephony market, including integrated unified communications and collaboration (UC&C) applications, is currently experiencing a rapid growth trajectory and is expected to exhibit a tenfold rise in annual revenues in the 2013-2020 period. With more than 600 cloud deployments internationally, BroadSoft w...
SYS-CON Events announced today that StorPool Storage will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. StorPool is distributed storage software that allows service providers, enterprises and other cloud builders to run data storage on standard x86 servers, instead of using expensive and inefficient storage arrays (SAN).
The best mobile applications are augmented by dedicated servers, the Internet and Cloud services. Mobile developers should focus on one thing: writing the next socially disruptive viral app. Thanks to the cloud, they can focus on the overall solution, not the underlying plumbing. From iOS to Android and Windows, developers can leverage cloud services to create a common cross-platform backend to persist user settings, app data, broadcast notifications, run jobs, etc. This session provides a high level technical overview of many cloud services available to mobile app developers, includi...
Temasys has announced senior management additions to its team. Joining are David Holloway as Vice President of Commercial and Nadine Yap as Vice President of Product. Over the past 12 months Temasys has doubled in size as it adds new customers and expands the development of its Skylink platform. Skylink leads the charge to move WebRTC, traditionally seen as a desktop, browser based technology, to become a ubiquitous web communications technology on web and mobile, as well as Internet of Things compatible devices.
GENBAND has announced that SageNet is leveraging the Nuvia platform to deliver Unified Communications as a Service (UCaaS) to its large base of retail and enterprise customers. Nuvia’s cloud-based solution provides SageNet’s customers with a full suite of business communications and collaboration tools. Two large national SageNet retail customers have recently signed up to deploy the Nuvia platform and the company will continue to sell the service to new and existing customers. Nuvia’s capabilities include HD voice, video, multimedia messaging, mobility, conferencing, Web collaboration, deskt...
VoxImplant has announced full WebRTC support in the newest versions of its Android SDK and iOS SDK. The updated SDKs, which enable audio and video calls on mobile devices, are now compatible with the WebRTC standard to allow any mobile app to communicate with WebRTC-enabled browsers, including Google Chrome, Mozilla Firefox, Opera, and, when available, Microsoft Spartan. The WebRTC-updated SDKs represent VoxImplant's continued leadership in simplifying the development of real-time communications (RTC) services for app developers. VoxImplant (built by Zingaya, the real-time communication servi...
SYS-CON Events announced today that Site24x7, the cloud infrastructure monitoring service, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Site24x7 is a cloud infrastructure monitoring service that helps monitor the uptime and performance of websites, online applications, servers, mobile websites and custom APIs. The monitoring is done from 50+ locations across the world and from various wireless carriers, thus providing a global perspective of the end-user experience. Site24x7 supports monitoring H...
SYS-CON Events announced today that Intelligent Systems Services will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Established in 1994, Intelligent Systems Services Inc. is located near Washington, DC, with representatives and partners nationwide. ISS’s well-established track record is based on the continuous pursuit of excellence in designing, implementing and supporting nationwide clients’ mission-critical systems. ISS has completed many successful projects in Healthcare, Commercial, Manufacturing, ...
Sonus Networks introduced the Sonus WebRTC Services Solution, a virtualized Web Real-Time Communications (WebRTC) offer, purpose-built for the Cloud. The WebRTC Services Solution provides signaling from WebRTC-to-WebRTC applications and interworking from WebRTC-to-Session Initiation Protocol (SIP), delivering advanced real-time communications capabilities on mobile applications and on websites, which are accessible via a browser.
SYS-CON Events announced today that B2Cloud, a provider of enterprise resource planning software, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. B2cloud develops the software you need. They have the ideal tools to help you work with your clients. B2Cloud’s main solutions include AGIS – ERP, CLOHC, AGIS – Invoice, and IZUM
SYS-CON Events announced today that Tufin, the market-leading provider of Security Policy Orchestration Solutions, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. As the market leader of Security Policy Orchestration, Tufin automates and accelerates network configuration changes while maintaining security and compliance. Tufin's award-winning Orchestration Suite™ gives IT organizations the power and agility to enforce security policy across complex, multi-vendor enterprise networks. With more than 1...
SYS-CON Events announced today that Cloudian, Inc., the leading provider of hybrid cloud storage solutions, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Cloudian, Inc., is a Foster City, California - based software company specializing in cloud storage software. The main product is Cloudian, an Amazon S3-compliant cloud object storage platform, the bedrock of cloud computing systems, that enables cloud service providers and enterprises to build reliable, affordable and scalable cloud storage solu...
“With easy-to-use SDKs for Atmel’s platforms, IoT developers can now reap the benefits of realtime communication, and bypass the security pitfalls and configuration complexities that put IoT deployments at risk,” said Todd Greene, founder & CEO of PubNub. PubNub will team with Atmel at CES 2015 to launch full SDK support for Atmel’s MCU, MPU, and Wireless SoC platforms. Atmel developers now have access to PubNub’s secure Publish/Subscribe messaging with guaranteed ¼ second latencies across PubNub’s 14 global points-of-presence. PubNub delivers secure communication through firewalls, proxy ser...
SYS-CON Events announced today that Gridstore™, the leader in hyper-converged infrastructure purpose-built to optimize Microsoft workloads, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Gridstore™ is the leader in hyper-converged infrastructure purpose-built for Microsoft workloads and designed to accelerate applications in virtualized environments. Gridstore’s hyper-converged infrastructure is the industry’s first all flash version of HyperConverged Appliances that include both compute and storag...
SYS-CON Events announced today that IDenticard will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. IDenticard™ is the security division of Brady Corp (NYSE: BRC), a $1.5 billion manufacturer of identification products. We have small-company values with the strength and stability of a major corporation. IDenticard offers local sales, support and service to our customers across the United States and Canada. Our partner network encompasses some 300 of the world's leading systems integrators and security s...
Containers and microservices have become topics of intense interest throughout the cloud developer and enterprise IT communities. Accordingly, attendees at the upcoming 16th Cloud Expo at the Javits Center in New York June 9-11 will find fresh new content in a new track called PaaS | Containers & Microservices Containers are not being considered for the first time by the cloud community, but a current era of re-consideration has pushed them to the top of the cloud agenda. With the launch of Docker's initial release in March of 2013, interest was revved up several notches. Then late last...
So I guess we’ve officially entered a new era of lean and mean. I say this with the announcement of Ubuntu Snappy Core, “designed for lightweight cloud container hosts running Docker and for smart devices,” according to Canonical. “Snappy Ubuntu Core is the smallest Ubuntu available, designed for security and efficiency in devices or on the cloud.” This first version of Snappy Ubuntu Core features secure app containment and Docker 1.6 (1.5 in main release), is available on public clouds, and for ARM and x86 devices on several IoT boards. It’s a Trend! This announcement comes just as...
SYS-CON Events announced today the IoT Bootcamp – Jumpstart Your IoT Strategy, being held June 9–10, 2015, in conjunction with 16th Cloud Expo and Internet of @ThingsExpo at the Javits Center in New York City. This is your chance to jumpstart your IoT strategy. Combined with real-world scenarios and use cases, the IoT Bootcamp is not just based on presentations but includes hands-on demos and walkthroughs. We will introduce you to a variety of Do-It-Yourself IoT platforms including Arduino, Raspberry Pi, BeagleBone, Spark and Intel Edison. You will also get an overview of cloud technologies s...
“In the past year we've seen a lot of stabilization of WebRTC. You can now use it in production with a far greater degree of certainty. A lot of the real developments in the past year have been in things like the data channel, which will enable a whole new type of application," explained Peter Dunkley, Technical Director at Acision, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Health care systems across the globe are under enormous strain, as facilities reach capacity and costs continue to rise. M2M and the Internet of Things have the potential to transform the industry through connected health solutions that can make care more efficient while reducing costs. In fact, Vodafone's annual M2M Barometer Report forecasts M2M applications rising to 57 percent in health care and life sciences by 2016. Lively is one of Vodafone's health care partners, whose solutions enable older adults to live independent lives while staying connected to loved ones. M2M will continue to gr...