Month: November 2012

Leading real-time Big Data processors in the market

Solution | Vendor | Type | Description
Storm | Twitter | Streaming | Twitter’s new streaming big-data analytics solution
S4 | Yahoo! | Streaming | Distributed stream computing platform from Yahoo!
Hadoop | Apache | Batch | First open source implementation of the MapReduce paradigm
Spark | UC Berkeley AMPLab | Batch | Recent analytics platform that supports in-memory data sets and resiliency
Disco | Nokia | Batch | Nokia’s distributed MapReduce framework
HPCC | LexisNexis | Batch | HPC cluster for big data
Druid | Metamarkets | Streaming | Analytics and streaming

Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data

Brief Background (what we are seeing all the time)

Big Data applies to information that can’t be processed with traditional processes and tools, and those traditional tools should not be confused with Big Data solutions. Increasingly, enterprises have access to a wealth of information but don’t know how to get value out of it, because it sits in its rawest form, in unstructured or semi-structured format.

Why think of a Big Data solution?

Big Data can be interpreted in many different ways, which is why it is usually characterized by the three Vs: Volume, Velocity and Variety.

  1. Big Data solutions analyze raw unstructured and semi-structured data from a wide variety of sources. They can work on structured content too.
  2. Big Data solutions are ideal for iterative and exploratory analysis where the business has no predefined formula, unlike traditional BI solutions, where providers supply proprietary fixed formulas for specific industries such as retail.
  3. Big Data solutions are ideal when the analysis has to be done on the whole set of data rather than on a sample, because a sample would not be as effective.

Traditional BI tools have always worked on data that is pre-processed, while Big Data solutions mostly analyze data while it is in motion, not at rest. That is why low-latency stream processing over high volumes of data is a key factor in Big Data technologies.
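
To make the "data in motion" idea concrete, here is a minimal Python sketch of a sliding-window count that is updated as each event arrives, instead of being computed later over stored data. It is not taken from Storm, S4 or any other product listed above; the class and function names and the sample timestamps are assumptions made up for illustration, and it assumes events arrive in non-decreasing time order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen within the last `window_seconds` of event time."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()  # timestamps of events still inside the window

    def add(self, timestamp):
        """Record one event and return how many events are now in the window."""
        self.timestamps.append(timestamp)
        # Evict events that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= timestamp - self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps)


def process_stream(events, window_seconds=10):
    """Yield (timestamp, count_in_window) per event, without storing the whole stream."""
    counter = SlidingWindowCounter(window_seconds)
    for timestamp in events:
        yield timestamp, counter.add(timestamp)


if __name__ == "__main__":
    # Hypothetical event times in seconds, e.g. log lines arriving from a server.
    sample_events = [1, 2, 3, 11, 12, 25]
    for ts, count in process_stream(sample_events):
        print(ts, count)
```

Only the events inside the current window are kept in memory, which is what lets this style of processing keep latency low even when the overall volume is high.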

Broader use cases for Big Data solutions

  1. IT log analytics
  2. Fraud detection patterns
  3. Social media patterns
  4. Energy sector
  5. Health care
  6. Retail sector
  7. Patterns for modeling and management

Big Data platforms available in the market

Broader Technical Spectrum/Stack in Big Data provided by different vendors

Leading Research Investment in Big data

XML Based Workflow Engine

In this article, I would like to explore the power of XQuery, XML and XSLT for workflow-based applications. Workflow is meant here in a very general way: one can think of documents moving between different people, each of whom performs a different task. Processing an insurance claim, a claim for government money spent on a minimum job guarantee program, or a bug report system are examples that come to mind.

XML fits workflow applications very well because it makes them easy to adapt and integrate with any external application.

We will talk about highly configurable enterprise applications, where reusability and loosely coupled services are basic requirements. Until recently XQuery has been regarded as just a query language, but it is a full server-side language, like JSP and PHP, with query capability.
This blog is about how to create your own custom, very generic workflow engine. The concept is based on the JBoss jBPM workflow engine, but this is a very lightweight and minimal implementation. XQuery and an XML database like eXist are enough; no Java code is required. This post is the first in a series that will come with running sample examples… Please drop me a mail for the code.
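
Until the XQuery samples are available, the general idea of an XML-defined workflow can be sketched in a few lines of Python. This is not the XQuery/eXist implementation described above, only an illustration of the concept: the workflow lives in an XML document, and a small generic engine reads the allowed transitions from it. The element names, attribute names and the insurance-claim states are all invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow definition for an insurance claim; element and
# attribute names are invented for this sketch.
WORKFLOW_XML = """
<workflow name="insurance-claim" start="submitted">
  <state name="submitted">
    <transition action="review" to="under-review"/>
  </state>
  <state name="under-review">
    <transition action="approve" to="approved"/>
    <transition action="reject" to="rejected"/>
  </state>
  <state name="approved"/>
  <state name="rejected"/>
</workflow>
"""


def load_workflow(xml_text):
    """Parse the XML and return (start_state, {state: {action: next_state}})."""
    root = ET.fromstring(xml_text)
    transitions = {}
    for state in root.findall("state"):
        transitions[state.get("name")] = {
            t.get("action"): t.get("to") for t in state.findall("transition")
        }
    return root.get("start"), transitions


def advance(transitions, current_state, action):
    """Return the next state, or raise ValueError if the action is not allowed."""
    allowed = transitions[current_state]
    if action not in allowed:
        raise ValueError(f"action {action!r} not allowed in state {current_state!r}")
    return allowed[action]


if __name__ == "__main__":
    state, transitions = load_workflow(WORKFLOW_XML)
    for action in ["review", "approve"]:
        state = advance(transitions, state, action)
        print(action, "->", state)
```

Because the engine knows nothing about insurance claims, changing the workflow means editing only the XML document, which is the kind of configurability and loose coupling discussed above.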