Predictive Analytics is the branch of machine learning that is putting all the data to work. It takes large data sets and uses mathematical algorithms to form predictive models. Then statistical methods such as regression analysis are used to find the variables that influence the models. Finally, machine learning platforms use those predictive models to find patterns in the past that can allow predictions of patterns in the future.
Human behavior can be seen as a series of patterns that repeat, both individually and as a group, over time. This statistical fact doesn’t negate the possibility of free will; it allows us the use of our free will to repeat patterns of behavior that are most comfortable to us, considering the social, cultural, and family pressures that also influence us.
The analysis of patterns of behavior in humans as a group has been the work of historians, who can look back at great sweeps of time and see patterns that repeat. With the amount of data being collected now through online interactions and geotracking tools such as GPS, machine learning platforms are engaged in how to mine that huge amount of data for the very specific data needed to answer questions.
Predicting the future has been an art in which intuition based on expert knowledge and experience was used to make a predictive analysis. Those who are considered masters in their work combine experience with knowledge, and can see patterns from the past and predict patterns into the future. But we are constrained by the depth and breadth of experience and knowledge we can acquire; we are further constrained by unconscious bias and other human attributes.
We can build unconscious bias into our data collection, but with large enough data sets, machine learning platforms can see beyond that into purely human patterns of historical behavior. Once those predictive models have been developed, using clustering, decision trees, and other models, patterns can be identified. Those patterns of human behavior form the basis of predictive analytics.
Once the data has been mined and patterns of human behavior in the past have been identified, one further step needs to be taken to ensure that those past behaviors can be predicted into the future. Regression analysis is the statistical method used to find the relationship between variables and how those variables affect the pattern.
Consider a subset of men with lung cancer who live in Utah. Researchers were expecting to find that a high percentage of them smoked cigarettes in their youth. But regression analysis of all demographic variables found other significant patterns: 28% of them smoked as adults, but 80% of them lived within ten miles of an abandoned uranium mine.
Once regression analysis identifies the relationship between variables in the models, predictive analytics can provide very accurate, and very detailed, information about what patterns of behavior to expect. From this information, business decisions can become specific and precise.
An example would be an analysis of hem lines in fashion over the last two hundred years. Using good data, the AI identifies clusters and patterns. Regression analysis identifies the effects of cultural variables. Predictive analytics tells a designer working on a new line that the market for mini skirts is close to the end of its run. The designer contacts her factory and orders the current run of mini skirts to be cut in half, and a longer model developed. Next fall, there will not be 200,000 mini-skirts sitting in stores on clearance sale.
The data is the critical piece in the success of predictive analytics. In the above example, the data that is considered is not just a history of fashion and hem lines. The data sets also include age demographics–we are, across the world, getting older. It looks at the number of women heads of state and CEOs of multinational corporations. It looks at the population change in religious and cultural groups that use more conservative clothing. It looks at every bit of data we can think to give it, and uses these large data sets to form models.
The ability to access and analyze data from many cultures and geographic regions is one way the current machine learning platforms can produce models that deal with global human behavior. In the past, when we depended on subject matter experts, predictions were limited by geography and culture.