An Executive Primer for Artificial Intelligence

Artificial Intelligence (AI) and Machine Learning (ML) stand to be among the leading digital transformation technologies. AI has been around for over 50 years as a research discipline. More recently, these technologies have attracted significant interest thanks to cloud computing, open-source models, and big data. While still nascent in some respects, AI is now more accessible to companies than ever. Much of the technology to date has focused on model development. As the discipline evolves, technologies that support models in operation (MLOps) and deployment, which have traditionally been custom efforts, will accelerate acceptance.

Many companies report having AI in production by way of AI embedded in their off-the-shelf software. They gain the benefits, yet it is not a differentiated capability. Differentiated capability originates in companies with a data-driven culture combined with a mature data science capability. There are still challenges that prevent many companies from leveraging AI at scale. Analyses indicate a common set of themes: lack of vision, strategy, and support by leadership. Almost any major project lacking these elements will fail. As a transformational capability, AI has great potential for Intelligent Automation: driving automated decisions in processes to improve efficiency at scale.

Challenges remain in adoption. One of the most persistent is poor executive understanding of what AI/ML can do. Without this understanding, AI becomes a series of unaligned projects without vision, clear goals, or value for investment. Poor understanding of business value leads to a loss of faith. It also produces a collection of data and solution silos that do not work well together and add cost. Poor collaboration and coordination erode momentum and support.

A recent O’Reilly survey report, “AI Adoption in the Enterprise 2020”, highlights this point. For many organizations, “… the lack of ML and AI skills isn’t the biggest impediment to AI adoption”. Respondents identified a lack of institutional support as the most significant issue. The next biggest challenge is difficulty in identifying appropriate business use cases. Rounding out the list are a lack of skilled people (#3) and a lack of data combined with poor data quality (#4).

McKinsey’s “Global Survey: The State of AI in 2020” indicates “Respondents at AI high performers rate their C-suite as very effective more often than other respondents do. They also are much more likely than others to say that their AI initiatives have an engaged and knowledgeable champion in the C-suite.”

The “2021 Enterprise Trends in Machine Learning” report by Algorithmia further supports this point. Governance is by far the top challenge for AI/ML deployment with more than half ranking it as a concern. Organizational alignment is one of the biggest gaps in AI/ML maturity.

We will review what AI can do, and the types of problems it can solve. Aligned with the appropriate business context, this aids in identifying where AI can add value.

Executive Summary

AI/ML solves specific types of problems. For executives, understanding these solution sets is key to defining the business problems applicable to AI/ML models.
Understand the challenges and mitigate their risk. Build a data-driven culture across the organization.
Develop an approach for identifying opportunities. Start with a well-defined business challenge and identify hypotheses. Once the problem is well defined, focus on the data and test the hypothesis before full-scale implementation.
Gain inspiration from other organizations and their successes. The path is not always straightforward. These organizations are continually improving and experimenting with solutions.

What Problems Does AI Solve?

Artificial intelligence is the science of using data to train computers to perform tasks that typically require human intelligence. The “machine” learns from the data it receives by identifying patterns and relationships within the data itself. Thus, the model correlates input data to an inferred answer (i.e., a probability-based decision). Fundamentally, AI/ML is automation technology. Focusing on decisions rather than predictions enables automation in the enterprise by leveraging one’s own data.

The state of the art today is that a basic AI/ML model provides solutions to a specific set of problem types. More complex models coordinate multiple models (e.g., an ensemble model), each answering a specific question, to derive a more complex answer.

AI Solution Types

The key to leveraging AI/ML in the business is to define the problem. To properly define the problem, it helps to understand what types of questions a model can answer. These simplified solution scenarios form the basis for the business problems to which they apply.

Predict a Value (Regression)
This is a type of model that predicts a specific numerical value. This is especially useful in Intelligent Automation where this value can be used to automate processes such as ordering, forecasting, pricing, warranty reserve, or demand. For automation, this facilitates data entry or process decisions.

One example is forecasting the sales demand for a product, based on a set of input data such as previous sales figures, consumer sentiment, and weather. Another example is predicting the price of real estate, such as a building, using data describing the property, location, and other attributes.
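To make the idea of predicting a value concrete, the following is a minimal sketch of a one-variable regression fitted by ordinary least squares. The temperature and sales figures are made up for illustration, and the names `fit_line` and `predict` are assumptions of this sketch rather than part of any particular product. A real demand forecast would use many more inputs and a dedicated ML library.

```python
# Hypothetical daily data: outside temperature (Celsius) and units sold.
temperature_c = [10.0, 15.0, 20.0, 25.0, 30.0]
units_sold = [25.0, 35.0, 45.0, 55.0, 65.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict a numerical value for a new input."""
    return slope * x + intercept


slope, intercept = fit_line(temperature_c, units_sold)
forecast = predict(35.0, slope, intercept)
print(f"Forecast units at 35 C: {forecast:.1f}")
```

The forecast is a number that a downstream process, such as an ordering system, could consume directly.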

Classification
This type of model categorizes new data (input) as belonging to one of a fixed number of categories. One analogy is a sorting exercise: imagine placing an object into one of a set of boxes, where each box holds items of a specific type. Binary classification is where there are only two categories, such as “spam” or “not spam”. Multi-class classification is where there are more than two categories, determined by the business problem. Computer vision is a type of classification system (see below).

Examples of binary classification include email spam (spam/not spam), churn prediction (a customer will churn/not churn), and conversion (buy/not buy). Examples of multi-class classification include credit scoring, where a person is classified into a fixed set of rating categories (high, medium, low risk), and document classification for privacy or confidentiality (public, internal-only, confidential, top-secret). For customer service, this could be identifying email intent (product support, account issue, general inquiry, etc.).
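The sorting analogy can be sketched with a very simple binary classifier for churn. This sketch uses a nearest-centroid rule, chosen here only because it is short; it is one of many possible techniques. The customer features, labels, and function names are hypothetical.

```python
import math

# Hypothetical labeled history: (logins per month, support tickets) -> label.
training_data = [
    ((20.0, 0.0), "stay"),
    ((18.0, 1.0), "stay"),
    ((22.0, 1.0), "stay"),
    ((3.0, 4.0), "churn"),
    ((5.0, 5.0), "churn"),
    ((4.0, 6.0), "churn"),
]


def train_centroids(examples):
    """Compute the average feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in total)
        for label, total in sums.items()
    }


def classify(features, centroids):
    """Assign the label whose centroid is closest to the new data point."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


centroids = train_centroids(training_data)
print(classify((19.0, 1.0), centroids))
print(classify((2.0, 5.0), centroids))
```

Each "box" in the analogy corresponds to one label, and a new customer is placed in the box whose typical member it most resembles.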

Anomaly Detection
This type of model determines whether specific inputs are out of the ordinary, i.e., an anomaly or outlier. This is very much like binary classification, except that the examples are unevenly distributed between the two classes. Most of the examples belong to a “normal” class, and a smaller number represent the exception class (i.e., anomalies). This is useful in Intelligent Automation where detection of a potential outlier initiates a process, action, or notification.

For instance, a system could be trained on a set of historical vibration data associated with operating a piece of machinery. This model determines whether a new vibration reading suggests that the machine is not operating normally (e.g., predictive maintenance). Another example, from the financial sector, is fraud detection.
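The vibration example can be sketched with a basic statistical rule: learn what "normal" looks like from historical readings, then flag new readings that fall far outside it. The readings, the three-standard-deviation threshold, and the function name `is_anomaly` are assumptions for illustration; production systems typically use richer models.

```python
import statistics

# Hypothetical vibration amplitudes (mm/s) recorded during normal operation.
normal_readings = [2.0, 2.1, 1.9, 2.2, 2.0, 1.8, 2.1, 1.9, 2.0, 2.0]

baseline_mean = statistics.mean(normal_readings)
baseline_stdev = statistics.stdev(normal_readings)


def is_anomaly(reading, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the baseline mean."""
    z_score = abs(reading - baseline_mean) / baseline_stdev
    return z_score > threshold


for reading in (2.1, 3.5):
    status = "ANOMALY: schedule inspection" if is_anomaly(reading) else "normal"
    print(f"{reading} mm/s -> {status}")
```

In an Intelligent Automation setting, the anomaly flag would trigger a process, such as opening a maintenance ticket.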

Clustering (Cluster analysis)
These models create a set of categories from data, where a “cluster” is a group of data points sharing a set of common or similar characteristics. These models interpret data to find natural groupings. Clustering is a common data analysis activity, often used with other techniques.

Cluster analysis is widely used in market research when working with multivariate data from surveys and test panels. Market researchers use it to partition the general population of consumers into market segments, to better understand the relationships between different groups of consumers and potential customers, and to support product positioning, new product development, and the selection of test markets. Clustering may also be used to group documents that have similar traits. Multiple clustering schemes, each based on a different type of attribute, can be used to cross-associate documents (e.g., documents with specific legal clauses).
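As a rough illustration of finding natural groupings, the sketch below runs k-means (one common clustering technique, chosen here as an assumption) on a single made-up attribute, annual customer spend. Note that no labels are supplied: the segments emerge from the data. The starting centroids are fixed to the lowest and highest values so the result is repeatable.

```python
def assign(values, centroids):
    """Place each value in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for value in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
        clusters[nearest].append(value)
    return clusters


def k_means_1d(values, initial_centroids, iterations=10):
    """Alternate between assigning values to centroids and recomputing centroids."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        clusters = assign(values, centroids)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, assign(values, centroids)


# Hypothetical annual spend per customer, in dollars.
annual_spend = [120.0, 150.0, 130.0, 900.0, 950.0, 1000.0, 140.0, 980.0]

centroids, segments = k_means_1d(annual_spend, [min(annual_spend), max(annual_spend)])
for center, members in zip(centroids, segments):
    print(f"segment around {center:.1f}: {members}")
```

With real survey or transaction data there would be many attributes per customer, and the number of segments would itself be a choice to test.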

Recommendations (Association Rule Learning)
These models make recommendations based on a set of training data. Put another way, they associate data with other data.

A common example is a system that suggests the “next product to purchase” for a customer, based on the buying patterns of similar individuals and the observed behavior of that specific person. A simple rule of this kind: if I buy hot dogs, then I’m likely to buy hot dog buns (or ketchup, mustard, or relish). Combined with natural language processing (see below), this approach is used to classify support tickets and recommend responses.
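The hot dog rule can be quantified with two standard association-rule measures: support (how often items appear together) and confidence (how often the second item appears given the first). The baskets below are invented for illustration, and the function names are assumptions of this sketch.

```python
# Hypothetical shopping baskets.
baskets = [
    {"hot dogs", "buns", "ketchup"},
    {"hot dogs", "buns"},
    {"hot dogs", "mustard"},
    {"buns", "milk"},
    {"hot dogs", "buns", "mustard"},
]


def support(items, baskets):
    """Fraction of baskets that contain all of the given items."""
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


rule_confidence = confidence({"hot dogs"}, {"buns"}, baskets)
print(f"hot dogs -> buns confidence: {rule_confidence:.2f}")
```

A recommendation engine would surface the rules with the highest confidence (subject to a minimum support) as "customers who bought this also bought" suggestions.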

Computer Vision (Pattern Recognition)
These models identify patterns in the data (images/video, speech patterns, text) and assign them a label. The label depends on the situation. It could be general (person, house, dog, tree, traffic sign), or it could be specific (speed sign vs. stop sign, a specific person’s name via facial recognition).

One example is evaluating the quality of products (detecting defects) coming off a manufacturing line. In medical imaging, it is similarly used for identifying disease (e.g., cancer cells) or injuries (e.g., a broken bone). For a document image, this may be used for extracting words/values from the document (e.g., optical character recognition).

Natural Language Processing (NLP)
This is a diverse space, with sub-disciplines targeting specific capabilities. NLP is a very dynamic area that continues to evolve with research. These are some of the most common types:

Processing text in images (optical character recognition, or OCR) uses ML to extract text from images. An example is extracting data from an invoice.
Sentiment analysis identifies and extracts subjective information from text, for instance, feelings, thoughts, judgments, or assessments about a particular topic, event, or company.
Speech recognition translates a sound clip of someone speaking into a textual representation (speech to text).
Speech generation is the reverse of recognition, translating text to a spoken representation (text to speech).
Named entity recognition maps elements in text to proper names (e.g., people or places), as well as the type of the name (e.g., person, location, organization). This is useful in document processing.

AI Challenges

Winning Hearts and Minds
In the O’Reilly “AI Adoption in the Enterprise 2020” report, the top bottleneck cited is “Company culture does not yet recognize needs for AI”. Many AI initiatives stall due to a lack of support or understanding from the executive suite. Leading organizations establish a data-driven culture. Data science is a business capability that drives value over time. It needs support, as well as budget, at a strategic level to reinforce the importance of data-driven decisions in the organization.

Collaboration is Key
Another common scenario is distrust by business managers. Employees may fear what they do not understand. They may also fear an impact on their jobs. One common comment is “How can this software be as good as our people?” Business involvement is important in finding and defining the problem to be solved. Consider involving the business and IT in building the solutions at every stage.

Benchmark and Track Metrics
Gaining support from the organization requires building trust. Benchmark the decision process that the model will improve. Test the hypothesis before operationalization. Track the benefits in production. Think about how you communicate your success. Success will enable trust in future projects.
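As a simple sketch of benchmarking, the example below compares the accuracy of an existing manual decision process with that of a candidate model on the same historical cases. All of the data is hypothetical, and accuracy is only one possible metric; the right measure depends on the business process.

```python
# Hypothetical historical cases: 1 = the customer defaulted, 0 = did not.
actual_outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
# What the current manual process decided, and what the candidate model decided.
manual_decisions = [1, 0, 0, 1, 1, 0, 0, 0]
model_decisions = [1, 0, 1, 1, 0, 0, 0, 0]


def accuracy(decisions, outcomes):
    """Fraction of cases where the decision matched what actually happened."""
    matches = sum(1 for d, o in zip(decisions, outcomes) if d == o)
    return matches / len(outcomes)


baseline = accuracy(manual_decisions, actual_outcomes)
candidate = accuracy(model_decisions, actual_outcomes)
uplift = candidate - baseline

print(f"manual process: {baseline:.1%}")
print(f"candidate model: {candidate:.1%}")
print(f"uplift: {uplift:+.1%}")
```

Recording the baseline before deployment gives a reference point for the benefits later tracked in production.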

Mind the Data
Models are built on data. Data is a key factor that determines the quality of the output of the model. A good data management strategy goes a long way. Creating “data agility” is an important strategy to get access to the right data at the right time.

Data must be discoverable and accessible to the data scientists. Model builders spend much of their time determining what data is needed, finding the data, and evaluating it.
Data in organizational silos must be rationalized (e.g., different systems may use different names or different units for the same data fields).
Data must be of high quality. The old saying “garbage in, garbage out” is amplified in AI.
Data provenance must be known, so that you understand how data changes as it moves across the systems in your enterprise.
Establish security and privacy policies for your data.
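The point about rationalizing siloed data can be illustrated with a small sketch. Two hypothetical systems describe the same item with different field names and different units; a shared mapping brings both into one schema. The system names, field names, and the `normalize` function are all invented for this example.

```python
# Exact conversion factor: 1 pound = 0.45359237 kilograms.
LBS_TO_KG = 0.45359237

# Hypothetical records for the same item from two different systems.
warehouse_record = {"sku": "A-100", "wt_lbs": 22.0}
erp_record = {"item_code": "A-100", "weight_kg": 9.98}

# For each source system: source field -> (shared field name, unit multiplier).
# A multiplier of None means the value is copied unchanged.
FIELD_MAP = {
    "warehouse": {"sku": ("item_id", None), "wt_lbs": ("weight_kg", LBS_TO_KG)},
    "erp": {"item_code": ("item_id", None), "weight_kg": ("weight_kg", 1.0)},
}


def normalize(record, source):
    """Rename fields and convert units so records from different silos are comparable."""
    normalized = {}
    for field, value in record.items():
        shared_name, factor = FIELD_MAP[source][field]
        normalized[shared_name] = value if factor is None else round(value * factor, 2)
    return normalized


print(normalize(warehouse_record, "warehouse"))
print(normalize(erp_record, "erp"))
```

Once both records share a schema, they can be compared, joined, and used together as model inputs.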

Discipline via Governance
Governance helps manage risk, creates focus, and enables the team.

In the O’Reilly “AI Adoption in the Enterprise 2020” report, companies ranked fairness, bias, and ethics as the #2 challenge. Sometimes the bias already exists in the decision process. Sometimes data selection introduces bias. Creating policies and guidelines helps avoid embedding bias in a model.
Models have a lifecycle and must be updated. The same report identified model degradation as the #4 risk. Data is dynamic and changes over time, so models must be updated on some schedule. For each model, help teams establish how and when it should be refreshed.
Make model inputs and outputs auditable, so that if something does go awry you can understand where and, if possible, why.
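One possible way to decide when a model is due for a refresh is to watch whether incoming data still resembles the data the model was trained on. The sketch below is a deliberately simple heuristic, not a standard: it measures how far the recent average of one input has moved from the training average. The order values, the threshold of 2.0, and the function names are assumptions.

```python
import statistics


def drift_score(training_values, recent_values):
    """Shift of the recent mean from the training mean, in training standard deviations."""
    baseline_mean = statistics.mean(training_values)
    baseline_stdev = statistics.stdev(training_values)
    return abs(statistics.mean(recent_values) - baseline_mean) / baseline_stdev


def needs_refresh(training_values, recent_values, threshold=2.0):
    """Flag the model for review when the input data has shifted noticeably."""
    return drift_score(training_values, recent_values) > threshold


# Hypothetical average order values (dollars) seen at training time and in production.
training_orders = [50.0, 52.0, 48.0, 51.0, 49.0]
recent_stable = [49.0, 51.0, 50.0, 52.0]
recent_shifted = [60.0, 62.0, 61.0, 59.0]

print(needs_refresh(training_orders, recent_stable))
print(needs_refresh(training_orders, recent_shifted))
```

A check like this could run on a schedule, with the result logged alongside the model's inputs and outputs to support auditing.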

Identifying Opportunities for Machine Learning

Identify a Clear Use Case
Start with the problem you want to solve. Focus on problems that would be difficult to solve with traditional programming. ML is good at finding patterns that are too complicated to capture with traditional programming. According to the “2021 Enterprise Trends in Machine Learning” report, one common reason AI/ML projects fail is that they solve the wrong problem: one that does not address a business need or fits poorly into the business context.

Know the Problem Before Focusing on the Data
We often arrive with our own assumptions and even biases. State your hypothesis first. The process that follows is largely about running experiments with data to show that a data-based solution to the problem exists.

Know your Data
ML makes a decision based on historical data by finding correlations. If you do not have the data, ML will not help. In that case, you may start with a rule-based or heuristic system and collect data. ML does not naturally determine causation, which is a more difficult problem.

Understand the Predictive Power
Is there a correlation between the data and the prediction? This is where you determine whether your features (i.e., data columns) are predictive of the hypothesis. You may need to test several solutions, and you may need to remove or add data. Do not assume that a model that performs well on your training data will perform well in production. The model needs to generalize well to new (unseen) data (e.g., production data).
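The warning about training performance can be demonstrated with a small sketch. The "model" here simply memorizes its training examples (a one-nearest-neighbor rule, used only for illustration), so it looks perfect on the data it was trained on and noticeably worse on held-out data. All numbers are made up.

```python
def nearest_neighbor_predict(x, training_pairs):
    """Predict by copying the target of the closest training example (memorization)."""
    _, closest_y = min(training_pairs, key=lambda pair: abs(pair[0] - x))
    return closest_y


def mean_absolute_error(pairs, training_pairs):
    """Average absolute gap between the true targets and the model's predictions."""
    errors = [abs(y - nearest_neighbor_predict(x, training_pairs)) for x, y in pairs]
    return sum(errors) / len(errors)


# Hypothetical (feature, target) pairs, split into training and held-out sets.
train = [(1.0, 10.0), (3.0, 32.0), (5.0, 49.0), (7.0, 71.0)]
holdout = [(2.5, 25.0), (4.5, 45.0), (6.5, 65.0)]

train_error = mean_absolute_error(train, train)
holdout_error = mean_absolute_error(holdout, train)

print(f"error on training data: {train_error:.2f}")
print(f"error on held-out data: {holdout_error:.2f}")
```

The held-out error is the more honest estimate of how the model would behave on production data, which is why data is set aside before training.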

Focus on Decisions (vs. Predictions)
A decision is an action taken on the output of the model. Predictions, as in traditional Business Intelligence (BI), are better at finding interesting things in your data, but they stop short of automating processes as a differentiator. Although the two are sometimes similar, they differ in how the problem is framed. AI/ML is best suited for making decisions.
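The difference between a prediction and a decision can be sketched in a few lines. The prediction is a probability produced by a model; the decision is the business action attached to it. The fraud scenario, the threshold values, and the action names below are hypothetical and would be set by the business, not by the model.

```python
def decide(fraud_probability, block_above=0.9, review_above=0.6):
    """Map a model's predicted probability to a business action."""
    if fraud_probability >= block_above:
        return "block transaction"
    if fraud_probability >= review_above:
        return "route to manual review"
    return "approve"


# Hypothetical model outputs for three transactions.
for probability in (0.95, 0.70, 0.10):
    print(f"predicted fraud probability {probability:.2f} -> {decide(probability)}")
```

Framing the problem this way forces an early conversation about which actions are possible and what each threshold costs the business.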

Examples from AI Leaders

There are some great leaders in data science and AI already out there. Many have technology blogs that discuss the problem they sought to solve, and how they solved it. This is a great place to learn from the best.

Netflix

Monitors its video streaming for poor performance (anomalies) to meet customer expectations.
The Netflix app personalizes everything on the page, such as video recommendations and artwork, offering movies and TV shows relevant to each member.
The search feature in the app leverages natural language models as well as collaborative filtering (recommendations).
Netflix marketing uses personalization algorithms to send recommendations via email and notification. The model(s) determine what to send and to whom. Their budget allocation algorithms decide what to advertise, to whom, and for how much.

Internally, ML performs a security function: identifying anomalous access to corporate applications by employees and acting if the access is malicious.

Uber

Marketplace forecasting to predict user supply and demand in a particular location/time to direct drivers.
Hardware capacity planning to determine the “right” number of servers to avoid costly over-provisioning vs. under-provisioning (performance).
Uber Eats provides a recommendation engine for foods (or restaurants) most likely to appeal to an individual user.

The Customer Obsession Ticket Assistant (COTA) uses ML to help drive customer support across 5 channels. It routes support tickets to the appropriate area or recommends resolutions (to the user or agent).

Google

Uses AI to manage data center energy and reduce cooling costs by 40%.
Clustering models are used for data generalization, compression, and privacy in products such as YouTube videos, Google Play apps, and Music tracks.

Spotify

Personalization in real time for its more than 248 million monthly active users (MAUs), using machine learning that combines individual listening history, musical choices, duration of play, and willingness to act on recommendations.