Big Data Pitfalls

Avoid Simpson’s paradox:
This paradox refers to a phenomenon in which the association between a pair of variables (X, Y) reverses sign upon conditioning on a third variable, Z, regardless of the value taken by Z. If we partition the data into subpopulations, each representing a specific value of the third variable, the phenomenon appears as a sign reversal between the associations measured in the disaggregated subpopulations and the association measured in the aggregated data, which describes the population as a whole.
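
The reversal is easy to reproduce with a small table of counts. The sketch below is only an illustration: the numbers are the commonly quoted kidney-stone treatment figures, not data from this text, and the names `counts`, `rate_difference` and `aggregate` are made up for the example. X is the treatment (A or B), Y is success, Z is stone size, and the "association" is taken to be the difference in success rates.

```python
# Illustrative counts (assumption, not from this text): (successes, trials)
# for each treatment X within each stratum of the third variable Z.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate_difference(a, b):
    """Association between X and Y: success rate of A minus success rate of B."""
    return a[0] / a[1] - b[0] / b[1]


def aggregate(table, treatment):
    """Pool successes and trials over all strata of Z for one treatment."""
    successes = sum(stratum[treatment][0] for stratum in table.values())
    trials = sum(stratum[treatment][1] for stratum in table.values())
    return successes, trials


# Association measured inside each subpopulation (conditioned on Z).
per_stratum = {z: rate_difference(g["A"], g["B"]) for z, g in counts.items()}

# Association measured on the aggregated data (Z ignored).
overall = rate_difference(aggregate(counts, "A"), aggregate(counts, "B"))

for z, diff in per_stratum.items():
    print(f"Z={z}: rate(A) - rate(B) = {diff:+.3f}")
print(f"aggregated: rate(A) - rate(B) = {overall:+.3f}")
```

With these counts, A does better in every stratum but worse in the pooled data. That happens because the two treatments were applied to very different mixes of small and large stones.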

Use the right ML algorithms: match the algorithm to your specific problem. For example, if you need a numeric prediction quickly, use decision (regression) trees or linear regression; logistic regression is the counterpart for classification, where the output is a class probability rather than a number.
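
When the target is a number, one of the quickest baselines is an ordinary least-squares line. The sketch below is a minimal pure-Python version under that assumption; the function names and the toy data are hypothetical and only illustrate the idea of matching the method to the type of output.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Return the numeric prediction of a fitted line for input x."""
    slope, intercept = model
    return slope * x + intercept


# Toy data (assumption): points lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

model = fit_line(xs, ys)
print(model, predict(model, 5))
```

A classification problem would call for a different model and a different loss. Checking the type of the target first is the cheapest way to avoid picking the wrong tool.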

Keep in mind the Prisoner’s Dilemma. A classic example: “cigarette manufacturers endorsed the making of laws banning cigarette advertising, understanding that this would reduce ad costs for parties and increase profits across the industry”. The same dynamic, in which individually rational choices leave every party worse off unless they coordinate, applies to business strategy and, further down, to big data processing.
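
The advertising case can be written as a two-player payoff table. The sketch below uses hypothetical profit figures, chosen only so that the game has the Prisoner's Dilemma structure. It checks which strategy profiles are stable when each firm responds selfishly.

```python
from itertools import product

ADVERTISE, ABSTAIN = "advertise", "abstain"
ACTIONS = (ADVERTISE, ABSTAIN)

# Hypothetical payoffs (assumption, not from this text):
# payoff[(action of firm 0, action of firm 1)] = (profit of firm 0, profit of firm 1)
payoff = {
    (ADVERTISE, ADVERTISE): (3, 3),
    (ADVERTISE, ABSTAIN): (6, 1),
    (ABSTAIN, ADVERTISE): (1, 6),
    (ABSTAIN, ABSTAIN): (5, 5),
}


def best_response(player, other_action):
    """Action that maximizes this player's profit given the other firm's action."""
    def profit(own_action):
        if player == 0:
            profile = (own_action, other_action)
        else:
            profile = (other_action, own_action)
        return payoff[profile][player]
    return max(ACTIONS, key=profit)


def nash_equilibria():
    """Profiles in which neither firm gains by changing its own action."""
    return [
        (a, b)
        for a, b in product(ACTIONS, repeat=2)
        if best_response(0, b) == a and best_response(1, a) == b
    ]


print("equilibria:", nash_equilibria())
print("both abstain:", payoff[(ABSTAIN, ABSTAIN)])
```

Under these payoffs advertising is each firm's best response whatever the rival does, yet both firms earn more when neither advertises. An enforced ban removes the temptation to defect, which is why the industry could rationally support it.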

Consider Gödel’s incompleteness theorems: any consistent formal system expressive enough to describe arithmetic (number theory etc.) contains statements that are true but cannot be proved from the rules within that system. The system in a way transcends itself. This limit is often raised, for example, in debates about the path to strong AI.

Keep in mind the quantum computers of the future, which are expected to be exponentially faster on certain problems. For example, build cryptographic algorithms that can resist attacks from future qubit-based machines.

Cognitive Computing

The aim of cognitive computing is to mimic human thought processes in a computerized model. Using self-learning algorithms that draw on data mining, machine learning, pattern recognition, and natural language processing, the computer can imitate the way the human brain works.
Cognitive systems analyze the huge amounts of data created by connected devices (not just the Internet of Things) with diagnostic, predictive and prescriptive analytics tools which observe, learn and offer insights, suggestions and even automated actions.
Cognitive computing and machine learning address the challenge of moving past the limits of traditional data analytics algorithms, which puts the spotlight on developing fast, efficient cognitive algorithms.
These cognitive or machine learning algorithms enable real-time processing of huge volumes of data and deliver precise predictions of various types, such as product recommendations, customer segmentation, fraud and risk detection, and customer retention. Cognitive computing and machine learning support these functions with a set of algorithms that differ from traditional statistical techniques. The emphasis is on real-time, highly scalable predictive/cognitive models, built with fully automated methods that make data scientists' tasks easier.
http://www.cognub.com/

IIoT Platforms

GE’s Predix, Siemens’ MindSphere, and the recently announced Honeywell Sentience are likely to be on any short list of industrial cloud platforms. But they aren’t the only ones in this space. Cisco’s Jasper, IBM’s Watson IoT, Meshify, Uptake, and at least 20 others are competing to manage the billions of sensors that are expected to make up the Industrial Internet of Things (IIoT).

Sample providers: Amazon AWS, AT&T M2X, Bosch IoT, Carriots, Cumulocity, GE Predix, IBM Watson IoT, Google Cloud IoT Core, Intel IoT, Cisco Jasper, Losant IoT, Microsoft Azure, PTC ThingWorx (connected to Windchill/PDMLink), SAP HANA Cloud, Thethings.io, C3 IoT, Uptake, Amplia IoT, XMPRO, Meshify, TempoIQ, Bitstew Systems, Siemens MindSphere, AirVantage, Honeywell Sentience, Schneider Electric’s EcoStruxure, Alibaba Cloud (which is rolling out its big-data service, “MaxCompute”), and Parker Hannifin’s Voice of the Machine IoT platform.

GE Predix is a platform-as-a-service (PaaS) specifically designed for industrial data and analytics. It can capture and analyze the unique volume, velocity and variety of machine data within a highly secure, industrial-strength cloud environment. GE Predix is designed to handle data types that consumer cloud services are not built to handle.

Siemens MindSphere is an open platform, based on the SAP HANA (PaaS) cloud, which allows developers to build, extend, and operate cloud-based applications. OEMs and application developers can access the platform via open interfaces and use it for services and analysis such as the online monitoring of globally distributed machine tools, industrial robots, or industrial equipment such as compressors and pumps. MindSphere also allows customers to create digital models of their factories with real data from the production process.

Honeywell Sentience is the recently announced cloud infrastructure by Honeywell Process Solutions. It is a secure, scalable, standards-based “One Honeywell” IoT platform that will be able to accelerate time-to-market of connected solutions, lower the cost-to-market, and enable new, innovative SaaS business models. It will have global security standards embedded throughout the solution and will make applications plug-and-play and scalable.

C3 IoT is a PaaS that enables organizations to leverage data – telemetry from sensors and devices, data from diverse enterprise information systems, and data from external sources (such as social media, weather, traffic, and commodity prices) – and employ advanced analytics and machine learning at scale, in real time, to capture business insights for improved operations, enhanced customer engagement, and differentiated products and services. C3 IoT is led by Silicon Valley entrepreneur Thomas Siebel. It has closed deals with the U.S. State Department and the French utility ENGIE SA, based on C3 IoT’s focus on machine-generated data.

Uptake is a predictive analytics SaaS platform provider that offers industrial companies the ability to optimize performance, reduce asset failures, and enhance safety. Uptake integrates data science and workflow connectivity to provide high-value solutions using massive data sets. In 2015, it entered into a partnership with heavy construction equipment manufacturer Caterpillar to jointly develop an end-to-end platform for predictive diagnostics in order to help Caterpillar customers monitor and optimize their fleets more effectively.

Meshify is an Industrial IoT platform for tracking, monitoring, and analyzing devices. The Meshify suite of tools provides all the features needed to deploy, monitor, control, and analyze the results of an IoT solution. Despite being a young technology business, it has a growing portfolio of industrial-oriented clients, including Henry Pump, Sierra Resources, Stallion Oilfield Services, Gems Sensors & Controls and MistAway Systems.

http://www.iotcentral.io/