Big Data Pitfalls

Avoid Simpson’s paradox:
This paradox refers to a phenomenon in which the association between a pair of variables (X, Y) reverses sign upon conditioning on a third variable, Z, regardless of the value taken by Z. If we partition the data into subpopulations, each representing a specific value of Z, the phenomenon appears as a sign reversal between the associations measured in the disaggregated subpopulations and the association measured in the aggregated data, which describes the population as a whole.
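The reversal is easier to see with numbers. The sketch below is a minimal illustration, not a general test for the paradox: X is a treatment (A or B), Y is success, and Z is a subgroup label. The counts follow the widely quoted kidney-stone treatment example and should be read as illustrative; the names `counts`, `rate` and `aggregate` are made up for this example.

```python
# Illustrative counts: (successes, trials) per subgroup Z and treatment X.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, trials):
    """Success rate as a fraction."""
    return successes / trials

def aggregate(table, treatment):
    """Pool successes and trials for one treatment across all subgroups."""
    successes = sum(group[treatment][0] for group in table.values())
    trials = sum(group[treatment][1] for group in table.values())
    return successes, trials

# Difference in success rate (A minus B) inside each subgroup.
within = {z: rate(*group["A"]) - rate(*group["B"]) for z, group in counts.items()}

# The same difference after pooling the subgroups together.
overall = rate(*aggregate(counts, "A")) - rate(*aggregate(counts, "B"))

for z, diff in within.items():
    print(f"Z={z}: A - B = {diff:+.3f}")
print(f"aggregated: A - B = {overall:+.3f}")
```

Treatment A comes out ahead inside every subgroup, yet B comes out ahead once the subgroups are pooled, because A was applied mostly to the harder (large) cases. Before trusting an aggregated comparison, check whether the groups being compared differ in their mix of some third variable.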

Use the right ML algorithm: match the machine learning algorithm to the specific problem instead of applying one approach everywhere. For example, if you need a numeric prediction quickly, use decision trees or linear regression; logistic regression is suited to classification rather than numeric prediction.
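As a rough illustration of a quick numeric predictor, here is a minimal sketch of a one-variable least-squares line fit in plain Python. The function name `fit_line` and the toy data are assumptions made for this example; a real project would more likely use an established library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data lying exactly on y = 2x + 1 (made up for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # numeric prediction for an unseen x
print(slope, intercept, prediction)
```

The fit needs a single pass over the data and has no tuning parameters, which is why a linear model is a common first baseline when the target is a number.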

Keep in mind the Prisoner’s Dilemma: parties that each act in their own interest can end up worse off than if they had cooperated. A commonly cited example: “cigarette manufacturers endorsed the making of laws banning cigarette advertising, understanding that this would reduce ad costs for parties and increase profits across the industry”. The same dynamic shows up in business strategy and, further down, in big data processing.
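The advertising example can be sketched as a two-player payoff table. The profit figures below are hypothetical, chosen only so that the game has the Prisoner's Dilemma structure; `payoff` and `best_response_a` are illustrative names.

```python
CHOICES = ("advertise", "abstain")

# payoff[(choice_a, choice_b)] = (profit_a, profit_b), in hypothetical units.
payoff = {
    ("advertise", "advertise"): (3, 3),
    ("advertise", "abstain"): (5, 1),
    ("abstain", "advertise"): (1, 5),
    ("abstain", "abstain"): (4, 4),
}

def best_response_a(choice_b):
    """Choice that maximizes firm A's profit, given firm B's choice."""
    return max(CHOICES, key=lambda choice_a: payoff[(choice_a, choice_b)][0])

# Firm A's best response to each possible move by firm B.
responses = {choice_b: best_response_a(choice_b) for choice_b in CHOICES}
print(responses)

# Compare the individually rational outcome with the cooperative one.
print("both advertise:", payoff[("advertise", "advertise")])
print("both abstain:  ", payoff[("abstain", "abstain")])
```

With these figures, advertising is the best response whatever the rival does, so both firms advertise and each earns less than if both had abstained. An industry-wide ban enforces the cooperative outcome that neither firm would choose alone.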

Consider Gödel’s incompleteness theorems: any consistent formal system expressive enough to describe arithmetic (number theory, for example) contains true statements that cannot be proved from the rules within that system. In that sense the system transcends itself, a point sometimes raised in discussions of the path to strong AI.

Keep in mind the quantum computers of the future, which are expected to be dramatically faster than classical machines on certain problems. For example, build cryptographic algorithms that are designed to resist attacks by such machines.

Cognitive Computing

The aim of cognitive computing is to mimic human thought processes in a computerized model. With self-learning cognitive algorithms that draw on data mining, machine learning, pattern recognition, and natural language processing, the computer can imitate the way the human brain works.
Cognitive systems analyze the huge amounts of data created by connected devices (not just the Internet of Things) with diagnostic, predictive and prescriptive analytics tools that observe, learn, and offer insights, suggestions and even automated actions.
Cognitive computing and machine learning address the challenge of moving past the limits of traditional data analytics algorithms, which puts the spotlight on developing fast, efficient cognitive algorithms.
These cognitive or machine learning algorithms enable real-time processing of huge volumes of data and deliver precise predictions for tasks such as product recommendation, customer segmentation, fraud and risk detection, and customer retention. Cognitive computing and machine learning support these functions through a set of algorithms that differ from traditional statistical techniques. The emphasis is on real-time, highly scalable predictive and cognitive models, built with fully automated methods that make data scientists' tasks easier.