By Tony Shan | November 11, 2012 11:11 AM EST
The US election is now over. What differs most from the last presidential election in 2008 is the impact of new technologies such as blogs and social media. Notably, Nate Silver made a strikingly accurate prediction of the election results on his well-known FiveThirtyEight blog. His near-perfect forecast underscores the relevance and significance of big data solutions. In my opinion, four key factors are the critical enablers for unlocking the value of big data: Modeling, Algorithm, Statistics, and Semantics (MASS).
- Modeling: First and foremost, a good model must be established to represent, capture, and ingest large amounts of data in structured, unstructured, or semi-structured formats. The nature of the data elements largely determines the appropriate data store.
- Algorithm: A sophisticated algorithm has to be developed to process the data in an optimized way. An easy-to-use programming method is needed to balance local processing and global computation in a distributed fashion. For example, historical data can be mined to generate valuable recommendations from a user profile, using metrics such as click-through rate and interest match.
- Statistics: Statistical data analysis is becoming increasingly important, and open source packages like R make data mining more transparent. Growing commercial support for R from major vendors fuels its adoption, integration, and distribution.
- Semantics: Context-awareness is a must. Simple analysis is no longer sufficient for today's business. Complex analytics requires advanced techniques such as pattern recognition and probabilistic reasoning. Vagueness is inevitable and has to be dealt with properly, so that insights can be extracted from massive data in a fuzzy way.
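To make the recommendation example under Algorithm more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a production method: the `Item` type, the sample catalog, the use of Jaccard overlap as the "interest match" metric, and the equal weighting of the two metrics are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    tags: frozenset      # topics the item covers (hypothetical field)
    impressions: int     # times the item was shown, from historical data
    clicks: int          # times the item was clicked, from historical data


def click_through_rate(item):
    """Historical clicks divided by impressions; 0.0 if never shown."""
    return item.clicks / item.impressions if item.impressions else 0.0


def interest_match(user_interests, item):
    """Jaccard overlap between user interests and item tags (an assumed metric)."""
    union = user_interests | item.tags
    return len(user_interests & item.tags) / len(union) if union else 0.0


def recommend(user_interests, items, ctr_weight=0.5, top_n=2):
    """Rank items by a weighted blend of the two metrics (weights are assumed)."""
    def score(item):
        return (ctr_weight * click_through_rate(item)
                + (1 - ctr_weight) * interest_match(user_interests, item))
    ranked = sorted(items, key=score, reverse=True)
    return [item.name for item in ranked[:top_n]]


# Made-up sample data, for illustration only.
catalog = [
    Item("hadoop-guide", frozenset({"hadoop", "big data"}), 1000, 50),
    Item("r-cookbook", frozenset({"r", "statistics"}), 500, 100),
    Item("soa-primer", frozenset({"soa"}), 200, 2),
]
profile = frozenset({"big data", "statistics", "r"})
print(recommend(profile, catalog))  # ['r-cookbook', 'hadoop-guide']
```

In a real deployment the per-item click and impression counts would typically be aggregated locally on each node before the global ranking step, and the blend weight would be tuned against observed user behavior rather than fixed at 0.5.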
Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Tony Shan
Tony Shan works as a senior consultant/advisor at a global applications and infrastructure solutions firm, helping clients realize the greatest value from their IT. Shan is a renowned thought leader and technology visionary with a number of years of field experience and guru-level expertise in cloud computing, Big Data, Hadoop, NoSQL, social, mobile, SOA, BI, technology strategy, IT roadmapping, systems design, architecture engineering, portfolio rationalization, product development, asset management, strategic planning, process standardization, and Web 2.0. He has directed the lifecycle R&D and buildout of large-scale, award-winning distributed systems on diverse platforms in Fortune 100 companies and the public sector, including IBM, Bank of America, Wells Fargo, Cisco, Honeywell, Abbott, etc.
Shan is an inventive expert with a proven track record of influential innovations such as Cloud Engineering. He has authored dozens of top-notch technical papers on next-generation technologies and over ten books that won multiple awards. He is a frequent keynote speaker and a chair, panelist, advisor, judge, and organizing committee member at prominent conferences and workshops, an editor and editorial advisory board member of IT research journals and books, and a founder of several user groups, forums, and centers of excellence (CoE).