When we think about Machine Learning/Artificial Intelligence (ML/AI), we often think about algorithms that learn to achieve human-like performance on tasks while starting with no prior assumptions. For example, language models like BERT can learn to achieve human-like performance on tasks ranging from reading comprehension to text generation. But prior to being trained, BERT makes no assumptions about the structure of the language it must learn, and there is no human intervention in its training process.
Despite the impressive performance of such models on some tasks, there is a growing consensus in the machine-learning community that this so-called “training from scratch” approach is not a viable path towards “general artificial intelligence”. Just as a child's intelligence is a result of both nature and nurture, the latter in the form of intervention from parents and caregivers, it seems likely that some form of human intellectual intervention will also be necessary for machines to achieve human-level intelligence. This realization has given rise to two significant trends related to nature and nurture:
- Human-in-the-loop Training Protocols: Numerous protocols that incorporate human verification/validation into the training of machine learning algorithms, sometimes referred to as Human-And-Model-in-the-Loop Enabled Training (HAMLET), have recently been developed for a wide variety of purposes. My favorite example is the adversarial HAMLET, which has produced some of the most robust models for natural language understanding to date. In this example, human annotators are tasked with writing increasingly challenging training examples that specifically target the weaknesses of a given language model, so that the model grows stronger with each iteration.
- “Explainable” Neural Network Architectures: Many of the recent advances in machine learning have been the product of neural network architectures that utilize human-interpretable “graphs”. Sometimes the structures of these graphs are constrained by knowledge from human experts, thus explicitly imbuing the ML/AI model with intelligence from those experts (as well as their biases). This awesome paper (Relational inductive biases, deep learning, and graph networks), from researchers at DeepMind, Google Brain, MIT, and the University of Edinburgh, is chock full of examples of ways you can use human intelligence to design neural networks that respect the physics of the problem you are trying to solve.
In attention-based networks such as transformers, the graph structure is computed on the fly as a function of the input data, thus imbuing the machine intelligence with “context-dependence”. These attention graphs may be included in the model output, giving human experts a window into which parts of the input weighed most heavily on a prediction. In English-to-German translation, for example, the transformer can tell us which English word received the most attention when predicting each German word of the translation.
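To make the idea of an attention graph computed from the input a little more concrete, here is a minimal sketch of scaled dot-product attention in plain Python. Everything in it is an assumption made for illustration: the function names are hypothetical and not taken from any library, and the two-dimensional query and key vectors are hand-picked toy values, whereas a real transformer would learn them during training.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Scaled dot-product attention: one row per query, one column per key."""
    dim = len(keys[0])
    rows = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        rows.append(softmax(scores))
    return rows


def most_attended(weights, source_tokens, target_tokens):
    """Pair each target token with the source token given the largest weight."""
    alignment = {}
    for target, row in zip(target_tokens, weights):
        best = max(range(len(row)), key=row.__getitem__)
        alignment[target] = source_tokens[best]
    return alignment


# Toy, hand-picked vectors (an assumption; a trained model would learn these).
english = ["the", "cat", "sits"]
german = ["die", "Katze", "sitzt"]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]        # one per English token
queries = [[4.0, -4.0], [-4.0, 4.0], [4.0, 4.0]]   # one per German token

weights = attention_weights(queries, keys)
alignment = most_attended(weights, english, german)
print(alignment)
```

Each row of `weights` sums to one and can be read as the edges from one German token to every English token. Because the weights are recomputed for every input, the resulting graph changes with the sentence being translated.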
Summary: If you are being presented with an ML/AI solution whose developers claim to achieve state-of-the-art performance using untrained, out-of-the-box algorithms, be very skeptical. It is critical that solution providers can demonstrate performance on a wide set of data and that the results can be validated. As ML/AI continues to grow in popularity, there will be an onslaught of people with little or no technical knowledge of the field pitching their wares. For the foreseeable future, real state-of-the-art machine learning requires customization and validation, both of which can be time-consuming and painstaking.
Michael Park is a researcher and developer in the field of applied machine learning with a background in theoretical particle physics. When he's not writing code, you'll probably find him pondering the origins of the universe while jamming out to some true-school underground Hip Hop.