Big Data as Core, Big Data as Context, and Big Data as Buzzword Bingo

It’s neither particularly newsworthy nor insightful to suggest that ‘Big Data’ gets everywhere these days, but two recent items reminded me of the gulf between credible execution of a big data play and the more questionable tacking of the big data meme onto an otherwise useful product.

Christmas is coming. Which means skating, and pantomimes (Captain Jack! And the Krankies!), and surprisingly expensive daughter shops, and pie with chicken and banana. But in amongst that lot, the weekend’s email and RSS brought news of

an ideal solution to store, manage and archive big data

and a

service built specifically for Fortune 1000 enterprises who want to rapidly explore how big data technology can unlock revenue from their data.

(both with my emphasis)

Infochimps has been around since 2009, and I’ve been following them with interest. CTO and Co-Founder Flip Kromer and I recorded podcasts in 2009 and early 2012, and we continue to meet up from time to time. From humble beginnings, the company grew to become one of a handful of credible Data Market offerings, before moving on to contribute key pieces of code to projects such as VMware’s Serengeti. Earlier this year, Infochimps’ broader ambitions began to become public as the Infochimps Platform rolled out. In August, the Platform gained streaming capabilities that helped propel it beyond any early reliance upon Hadoop. Then, this month, things got really interesting with the arrival of the Infochimps Enterprise Cloud. As Alex Williams reported for TechCrunch on Monday,

Infochimps data scientists and engineers developed the platform so they could collect lots of data and perform complex analytics along the way. A customer can pull in data from CRM systems and any of the other app silos where data pools then combine it with the data from Facebook, Twitter, and other services. The data flows into Infochimps’ data-delivery service and is cleaned up along the way. Data gets enriched, as needed, with other pieces of information such as demographic data.

The service works with any kind of database. Infochimps can implement any combination, including relational for SQL-like queries, and NoSQL for Hadoop jobs and big data storage. Analysis tools on the back-end provide the capability to create visuals and reports.

The company is setting itself some bold targets, seeking to speed up system deployments, make it easier for existing staff to do new things with data they already own, and free users to deploy a wide range of big data tools beyond the default of the cuddly elephant. And they’re targeting this directly at the Fortune 1000: companies with huge IT operations, demanding requirements, and an expectation of support, service and quality, all day, every day. For a small company of around 30 employees, which raised $1.55 million back in 2010 and hasn’t reported an investment since, that’s a big ask.

If even a fraction of what the Enterprise Cloud promises is available today, or demonstrably around the corner, then that team of 30 must be spending most of their time fending off a swarm of investors and acquirers. A nice problem to have, but a problem all the same.

I look forward to seeing real examples of the uses to which enterprise customers begin putting the Enterprise Cloud. I’ll also be watching with interest for rumours of acquisition or investment, both of which are bound to come.

The other piece of news also came from an established company. This time, consumer and small business backup provider Genie9. The company has a new backup product out, called Zoolz, and is making much of the integral “Cold Storage™ Technology” (Ugh!) that gives users reasonably straightforward access to Amazon’s very cheap Glacier storage service.

Personally, I meet my backup and archival needs through a combination of Dropbox, Google Drive, Spanning Backup, a Time Capsule and Arq (complete with its own non-™ hooks into Glacier). But that’s me. A one-man band, with a particular set of devices and workflows, and it’s an arrangement that has grown up rather organically.

Zoolz makes perfect sense as a backup solution, and from a brief play with the tool it appears intuitive, capable, and affordable. The Glacier integration is also good, for those things you want to keep, but which you don’t need to access regularly. I have no problem with the tool at all, but what did (and does) bemuse me was the emphasis upon its role in meeting big data requirements.

Zoolz is designed with big data support in mind and will be a game changer to help companies move all their data to the cloud in a secure and fast way that is cheaper than tapes and traditional solutions.

Huh?

The web site devotes a whole page to the big data capabilities of Zoolz, but I’m singularly unconvinced. The whole point about big data, surely, is that you work with it? You pour it into very capable tools that allow you to hold it in (or close to) memory, and you chop and change it in a variety of ways whilst seeking insight? You don’t park it 3-5 hours away in an Amazon cold storage facility and think “job done,” just because Zoolz offers “photo preview”!

Zoolz (through Glacier) offers a place to park large volumes of data that you no longer wish to work with, but it does nothing at all to help people ingest, process, analyse or understand big data. Moving large volumes of data around is slow and expensive. Processes to work with data are often scripted or otherwise automated, and tied into workflows that make sense within the context of the analytic tools (like Hadoop, say) to be used. It’s wholly unclear that Zoolz’s pretty UI and consumer/small business workflows make any sense in that context whatsoever.
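To put rough numbers on "slow", here is a minimal back-of-envelope sketch. The 3-5 hour retrieval delay is the figure quoted above; the 10 TB data set and the 100 Mbit/s sustained link are purely illustrative assumptions, and the function names are hypothetical rather than anything Zoolz or Amazon provides.

```python
def transfer_hours(size_bytes: float, bandwidth_bits_per_s: float) -> float:
    """Hours needed to move size_bytes over a link with the given sustained bandwidth."""
    return size_bytes * 8 / bandwidth_bits_per_s / 3600


def hours_until_usable(size_bytes: float,
                       bandwidth_bits_per_s: float,
                       retrieval_delay_hours: float) -> float:
    """Cold-storage retrieval delay plus transfer time, in hours."""
    return retrieval_delay_hours + transfer_hours(size_bytes, bandwidth_bits_per_s)


if __name__ == "__main__":
    size = 10e12   # assumed: 10 TB data set
    link = 100e6   # assumed: 100 Mbit/s sustained throughput
    for delay in (3, 5):  # retrieval delay range mentioned in the post
        total = hours_until_usable(size, link, delay)
        print(f"retrieval delay {delay} h: about {total:.0f} h before analysis can start")
```

Under those assumptions the transfer alone runs to several days, which is why the retrieval delay matters less than the simple fact that the data is parked somewhere other than where the analysis happens.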

Personally, Genie9, I would be proud of what I’ve made in Zoolz. But I’d drop the ‘big data’ stuff. It doesn’t fit.

Bingo card image by Flickr user Sara


More Stories By Paul Miller

Paul Miller works at the interface between the worlds of Cloud Computing and the Semantic Web, providing the insights that enable you to exploit the next wave as we approach the World Wide Database.

He blogs at www.cloudofdata.com.
