Big Data Pitfalls

Avoid Simpson’s paradox:
This paradox refers to a phenomenon in which the association between a pair of variables (X, Y) reverses sign upon conditioning on a third variable Z, regardless of the value taken by Z. If we partition the data into subpopulations, each representing a specific value of the third variable, the phenomenon appears as a sign reversal between the associations measured in the disaggregated subpopulations and the association measured in the aggregated data, which describes the population as a whole.
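The reversal is easier to see with numbers. The following is a minimal sketch, assuming the counts from the widely cited kidney-stone treatment comparison: X is the treatment (A or B), Y is success, and Z is stone size. The function names are illustrative only.

```python
# (successes, trials) for each treatment X within each stratum Z.
# Assumption: counts taken from the well-known kidney-stone example.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate_difference(table):
    """Success rate of treatment A minus success rate of treatment B."""
    (s_a, n_a), (s_b, n_b) = table["A"], table["B"]
    return s_a / n_a - s_b / n_b


def aggregate(strata):
    """Sum successes and trials over all strata, ignoring Z."""
    totals = {}
    for table in strata.values():
        for treatment, (s, n) in table.items():
            s0, n0 = totals.get(treatment, (0, 0))
            totals[treatment] = (s0 + s, n0 + n)
    return totals


# Association measured inside each subpopulation (conditioning on Z).
per_stratum = {z: rate_difference(t) for z, t in counts.items()}
# Association measured on the population as a whole.
overall = rate_difference(aggregate(counts))

for z, diff in per_stratum.items():
    print(f"{z} stones: A - B = {diff:+.3f}")
print(f"aggregated:   A - B = {overall:+.3f}")
```

Treatment A looks better within every stratum, yet B looks better once the strata are pooled, because A was applied far more often to the harder (large stone) cases.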

Right ML algorithms usage: match the machine learning algorithm to your specific problem rather than defaulting to a familiar one. For example, if you need a numeric prediction quickly, use decision trees or linear regression; logistic regression is the comparable quick choice when the target is a class rather than a number.

Keep in mind the Prisoner’s Dilemma: parties that each act in their own interest can end up worse off than if they had coordinated. For example, “cigarette manufacturers endorsed the making of laws banning cigarette advertising, understanding that this would reduce ad costs for parties and increase profits across the industry”. The same logic applies to business strategy and, further down, to big data processing.
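The structure of the dilemma can be sketched with a small payoff table. The profit figures below are hypothetical, chosen only so that advertising is each firm's best individual choice while mutual abstention pays more for both; the names PAYOFFS and best_response are illustrative.

```python
# Hypothetical profits (firm_1, firm_2) for each pair of choices.
PAYOFFS = {
    ("advertise", "advertise"): (3, 3),
    ("advertise", "abstain"): (5, 1),
    ("abstain", "advertise"): (1, 5),
    ("abstain", "abstain"): (4, 4),
}
ACTIONS = ("advertise", "abstain")


def best_response(opponent_action):
    """Action that maximizes firm 1's profit against a fixed opponent action."""
    return max(ACTIONS, key=lambda a: PAYOFFS[(a, opponent_action)][0])


# If the best response is the same whatever the opponent does,
# that action is a dominant strategy.
responses = {best_response(o) for o in ACTIONS}

selfish_outcome = PAYOFFS[("advertise", "advertise")]
coordinated_outcome = PAYOFFS[("abstain", "abstain")]

print("best responses:", responses)
print("both advertise:", selfish_outcome)
print("both abstain:  ", coordinated_outcome)
```

Under these assumed numbers each firm advertises no matter what the other does, so both land on the lower joint profit. An enforced ban removes the temptation and moves the industry to the better outcome.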

Consider Gödel’s incompleteness theorem: any consistent formal system expressive enough to describe arithmetic (number theory, etc.) contains true statements that cannot be proved from the rules of that system. In a sense, the system transcends itself. Some see in this a hint about the road to strong AI.

Keep in mind the quantum computers of the future, which are expected to be exponentially faster on certain problems. For example, build different cryptographic algorithms that resist attacks by machines with large numbers of qubits.

Job Breakthroughs

Startup vs. Larger Company:
The advantage of working for a smaller company is that you get to make more of an impact. A larger corporation might offer more benefits or a higher salary, but a startup is where you can really make a difference and see the influence your work has on the business. You’re heavily involved in each stage of production, and your opinion is more likely to carry weight than at a larger, more structured operation. Big companies may eventually be decentralized through tokenization, with shares issued through ICOs.
Jobs in IT:
The jobs of tomorrow are in artificial intelligence, the Internet of Things, data security, virtual reality and augmented reality: big data engineer, full-stack developer, security engineer, IoT architect and VR/AR engineer. The skills needed to succeed in them revolve around security certifications, programming and application development, proficiency with cloud and mobile technologies, and other specialized skill sets. These also give way to hybrid IT roles that bind the business to IT.
Data Scientists: it is essential for data scientists to work with languages and tools like R, Python, SAS, Hadoop and Netezza, in which they apply their knowledge of statistics and mathematics (algebra, matrices, multivariable calculus). They should also know platforms like MapReduce, GridGain, HPCC, Storm, Hive, Pig and Amazon S3.

The user as a valuable “in the network” resource: their actions should be monetized and generate income. We are producing valuable data even now just by browsing FB, Google and other social networks, and the systems themselves use it to become better (the long-term plan is building the future AI systems together). The “Internaut” will be one of the nicest jobs of the future.