Artificial intelligence (AI) is the simulation of human intelligence by machines, especially computer systems. AI, robotic process automation (RPA), the Internet of Things (IoT) and blockchain are among the next-generation technologies considered disruptive in nature. AI in particular is a step ahead of other emerging technologies and is changing how enterprises do business. It enables systems to make human-like decisions involving judgment and to adapt to new environments.
Seventy-two percent of business leaders believe that AI is the technology of the future.1 Some enterprises already use AI to drive their business processes. Examples include:
- A multinational conglomerate and manufacturer of electronic systems and equipment has applied AI-based scheduling systems to warehouse management, resulting in an 8 percent increase in productivity through order prioritization and picking efficiency, a 15 percent boost in sales, and a 27 percent increase in order rates. The conglomerate is also applying AI to other areas such as finance, transportation and utilities across more than 50 projects.2
- One of the leading ride-share companies relies on AI for daily operations, including the calculation of fares.3, 4
- One of the leading credit card companies relies on machine learning (ML) to prevent an estimated US$25 billion in fraud annually. Its AI techniques power more than 100 applications, allowing the real-time examination of transactions for indicators of fraud.5
When such tools are used in business processes and internal controls, auditors must evaluate their impact on audit procedures. In the first example in the preceding list, how would an audit professional test the validity of purchase orders created by a system based on the AI model’s decision to buy? In the third example, when management relies on the AI model to identify fraud, how can it ascertain that fraudulent transactions are detected appropriately?
Regulators and professional bodies have yet to provide frameworks for using these technologies or guidance on how to assess them. These technologies are evolving too fast for audit professionals and regulators to keep pace. Further, considering the complexity of these technologies, it is an ongoing challenge for audit professionals to provide assessment services.
There are some specific risk factors and controls related to the AI process that audit professionals should consider. Figure 1 illustrates the basic ML process.
To begin the process, an AI program or algorithm is built and tested with test data. These test data shape the logic within the AI program as it learns from various types of data scenarios. Once it has been tested, the AI program processes live data based on the logic learned from the test data, providing a result. The feedback from each result is analyzed by the AI program as its logic evolves to better handle the next live data scenario. At each of the three steps in the ML process, certain risk factors must be addressed.
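For illustration only, the following Python sketch mirrors the three steps described above with a toy threshold-based model; the function names and the decision logic are invented for the example and do not represent any particular AI platform.

```python
# Minimal sketch of the three-step ML process described above, using a toy
# threshold "model". All names (train, score, incorporate_feedback) are
# illustrative only.

def train(test_data):
    """Learn a simple approval threshold from labeled test (training) data."""
    approved = [amount for amount, label in test_data if label == "approve"]
    return sum(approved) / len(approved)  # learned threshold

def score(threshold, live_amount):
    """Apply the learned logic to a live transaction."""
    return "approve" if live_amount <= threshold else "review"

def incorporate_feedback(threshold, live_amount, reviewer_decision):
    """Adjust the logic based on feedback from each processed result."""
    if reviewer_decision == "approve" and live_amount > threshold:
        threshold = (threshold + live_amount) / 2  # relax the threshold slightly
    return threshold

# 1. Build and test the model with test data
test_data = [(100, "approve"), (250, "approve"), (900, "reject")]
threshold = train(test_data)

# 2. Process live data with the learned logic
decision = score(threshold, 300)

# 3. Feed the reviewed result back so the logic evolves
threshold = incorporate_feedback(threshold, 300, reviewer_decision="approve")
print(decision, threshold)
```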
Data Input Process
In the data input process, potential risk factors include data bias, incomplete test data, and inappropriate or unauthorized collection of data.
Data Bias
Seventy-six percent of chief executive officers (CEOs) are concerned about unintended bias creeping into AI algorithms or decision-making models.6 Data bias occurs when two data sets are not treated equally. For example, if the program developer has a personal bias toward a certain ethnic group that is reflected in the test data fed to the AI model, the AI program will function in a biased manner toward that ethnic group.
It is important to have controls in place to prevent bias from affecting the AI model. This may include having independent users review the test data for bias or periodically reviewing the results of the AI model to ensure that bias has not developed.
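One possible form such a periodic review could take is sketched below in Python: it compares the AI model's approval rates across groups and flags large gaps. The field names and the tolerance value are assumptions for illustration.

```python
# Illustrative sketch of a periodic bias review: compare the AI model's
# approval rates across groups and flag large gaps. The field names and the
# 5-percentage-point tolerance are assumptions for the example.

from collections import defaultdict

def approval_rates(results):
    """results: list of dicts with 'group' and 'decision' keys."""
    totals, approvals = defaultdict(int), defaultdict(int)
    for r in results:
        totals[r["group"]] += 1
        if r["decision"] == "approve":
            approvals[r["group"]] += 1
    return {g: approvals[g] / totals[g] for g in totals}

def flag_bias(results, tolerance=0.05):
    """Return (bias_flag, per-group approval rates)."""
    rates = approval_rates(results)
    spread = max(rates.values()) - min(rates.values())
    return spread > tolerance, rates

results = [
    {"group": "A", "decision": "approve"},
    {"group": "A", "decision": "approve"},
    {"group": "B", "decision": "approve"},
    {"group": "B", "decision": "reject"},
]
biased, rates = flag_bias(results)
print(biased, rates)  # True, {'A': 1.0, 'B': 0.5}
```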
Incomplete Test Data
An AI model cannot work effectively if the test data set is incomplete and does not cover all possible scenarios. In the absence of real-world characteristics in the test data, the AI algorithm cannot learn and develop accurate logic to handle live data.
It is also important for the test data to have the right mix of a training data set and a validation data set. The training data set is used to teach the AI model to carry out the decision-making process and data processing. The validation data set is then used to test or validate the AI model’s accuracy. The appropriate mix of training data and validation data depends on factors such as the volume of data, the complexity of decision-making and the number of possible scenarios. A deliberate, documented process should be in place to confirm that there is an appropriate mix of training and validation data sets. Figure 2 shows one possible mix.
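As one illustration, the following minimal Python sketch shows a documented, reproducible split of a labeled data set into training and validation portions; the 80/20 ratio and the function name are assumptions for the example, not a prescribed mix.

```python
# A minimal sketch of a documented training/validation split. The 80/20
# ratio is only an example; the right mix depends on data volume and
# complexity, as noted above.

import random

def split_dataset(records, validation_share=0.2, seed=42):
    """Shuffle reproducibly and split into training and validation sets."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_share))
    return shuffled[:cut], shuffled[cut:]

records = list(range(100))  # placeholder for labeled transactions
training, validation = split_dataset(records)
print(len(training), len(validation))  # 80 20
```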
Inappropriate or Unauthorized Collection of Data
Several data input points may be required to design an effective decision-making process as part of an AI model. With the prevalence of social media, e-commerce and IoT, it may be difficult to implement an effective data-collection process that screens data points for appropriate and authorized use. If data are captured in an unauthorized manner without the individual's knowledge, this may violate local data privacy laws. Controls need to be in place to validate each data source and ensure that no unauthorized data are captured as part of either test data or live data.
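A simple version of such a control might look like the following Python sketch, which accepts only records from a documented list of approved sources; the source names are hypothetical.

```python
# Hedged sketch of a data-source control: only records from an approved,
# documented list of sources are accepted into test or live data. The
# source names are hypothetical.

AUTHORIZED_SOURCES = {"erp_orders", "pos_terminals", "consent_web_forms"}

def filter_authorized(records):
    """Split incoming records into accepted and rejected, keeping an audit trail."""
    accepted, rejected = [], []
    for record in records:
        (accepted if record.get("source") in AUTHORIZED_SOURCES else rejected).append(record)
    return accepted, rejected

incoming = [
    {"source": "erp_orders", "value": 120},
    {"source": "scraped_social_profile", "value": 7},  # unauthorized capture
]
accepted, rejected = filter_authorized(incoming)
print(len(accepted), len(rejected))  # 1 1
```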
AI Algorithm Process
The potential risk factors in this step include inaccurate AI algorithm logic, insufficient documentation and auditability, system override of manual controls, ineffective security, and unclear accountability.
Inaccurate AI Algorithm Logic
The AI algorithm logic may not be accurate in some scenarios, or the initial design may need to be revised over time. When small AI models are outsourced to service providers for development, users may not thoroughly validate the accuracy of the logic due to their reliance on the service provider. This may have a direct impact on the results generated by the AI model.
It is important that the AI model be aligned with current business objectives. Enterprises often alter their processes to adapt to the changing business landscape. A recent example is the COVID-19 pandemic, which necessitated significant changes in the supply chain, among other operations. If such changes to the business model are not reflected in the AI models in a timely manner, business disruption may result.
On a periodic basis, management should test and review the AI algorithm logic to ensure that it fulfills its intended purpose. An inventory or listing of AI models, along with their current algorithm logic, can be helpful when making process changes to ensure that AI models remain aligned with business objectives.
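One way such a periodic review might be implemented is sketched below: a benchmark set of transactions with business-agreed expected outcomes is replayed through the model and the agreement rate is compared with a tolerance. The benchmark values and the 95 percent threshold are assumptions for the example.

```python
# Illustrative sketch of a periodic logic review: replay a benchmark set of
# transactions with known expected outcomes through the model and report the
# agreement rate. The 95 percent threshold is an assumed tolerance.

def review_model_logic(model, benchmark, minimum_agreement=0.95):
    """benchmark: list of (input, expected_outcome) pairs agreed by the business."""
    matches = sum(1 for x, expected in benchmark if model(x) == expected)
    agreement = matches / len(benchmark)
    return agreement >= minimum_agreement, agreement

# Example with a trivial stand-in model
model = lambda amount: "approve" if amount < 500 else "reject"
benchmark = [(100, "approve"), (450, "approve"), (900, "reject"), (600, "reject")]
print(review_model_logic(model, benchmark))  # (True, 1.0)
```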
Insufficient Documentation and Auditability of the AI Algorithm
Even if an algorithm is working accurately and fulfilling its intended purpose, without documented algorithm logic, it is difficult to understand the decision-making process of the AI model. AI algorithms evolve quickly and, in the absence of algorithm auditability, it is difficult to understand how the algorithm model is designed and how results are reached. In such cases, it may be difficult to explain the results of the AI model to external stakeholders. Also, the EU General Data Protection Regulation (GDPR) states that an individual has a right to understand how a decision was reached by an automated decision-making system and to contest that decision.7
As a control, AI algorithm logic should be documented and explained appropriately to cover all aspects of the program. AI algorithm auditability needs to be enabled at inception to understand how the algorithm reached a certain judgment.
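As a rough illustration of enabling auditability, the following Python sketch logs each decision with its inputs, the model version and the rationale so that a judgment can later be explained or contested; all field names and values are invented for the example.

```python
# Minimal sketch of decision logging for auditability: every decision is
# recorded with its inputs, the model version and the factors behind it.
# Field names and values are illustrative assumptions.

import json
import datetime

def log_decision(logfile, model_version, inputs, decision, rationale):
    """Append one decision record, with context, to a JSON-lines audit log."""
    entry = {
        "timestamp": datetime.datetime.utcnow().isoformat(),
        "model_version": model_version,
        "inputs": inputs,
        "decision": decision,
        "rationale": rationale,  # e.g., top factors behind the judgment
    }
    logfile.write(json.dumps(entry) + "\n")

with open("ai_decision_log.jsonl", "a") as f:
    log_decision(f, "credit-model-1.4", {"income": 52000, "history_months": 9},
                 "reject", ["history_months below 12"])
```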
System Override of Manual Controls
Similar to the risk of management override of controls, with AI there is a risk of the system overriding manual controls. Some decisions may be so complex or their results so impactful that manual intervention is necessary. The risk is that an AI model may evolve over time to determine that human intervention is not required. Recently, the chief justice of India said, "Artificial intelligence should never be allowed to substitute human discretion in judicial function."8
A system of manual intervention should be designed to prevent AI system override. Periodic manual reviews of transactions can be set up to ensure that those requiring manual intervention are processed as intended.
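A minimal sketch of such a routing control, assuming a simple impact threshold, follows; the threshold and field names are illustrative only.

```python
# Hedged sketch of a control preventing system override of manual review:
# decisions above an impact threshold, or flagged as complex, are routed to a
# human queue rather than auto-processed. The threshold is an assumption.

MANUAL_REVIEW_THRESHOLD = 10_000  # currency units; illustrative

def route_decision(transaction, model_decision):
    """Return ('auto', decision) or ('manual', decision) based on impact."""
    if transaction["amount"] >= MANUAL_REVIEW_THRESHOLD or transaction.get("complex"):
        return "manual", model_decision  # queued for human intervention
    return "auto", model_decision

print(route_decision({"amount": 25_000}, "approve"))  # ('manual', 'approve')
print(route_decision({"amount": 1_200}, "approve"))   # ('auto', 'approve')
```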
Ineffective AI Model Security
AI models are vulnerable to external attacks. An attack might take the form of false-positive or false-negative data inputs, which can cause the AI model to evolve in an undesirable direction. This is known as "adversarial machine learning." Recently, a leading systems security company demonstrated how a self-driving car can be fooled into driving over the speed limit by making a tiny sticker-based modification to a speed limit sign.9
Security of the AI model should be part of the AI governance framework. Appropriate controls to identify duplicate data inputs, fake inputs and the like should be in place. Further, the risk factors arising from cybersecurity and general IT controls (i.e., access management, change management, program development and computer operations) surrounding the AI model and infrastructure should be addressed appropriately.
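For example, an input-integrity control along the following lines could screen out exact duplicates and implausible records before they reach the model; the plausibility rules shown are assumptions for the example.

```python
# Illustrative sketch of an input-integrity control: reject exact duplicates
# and records that fail basic plausibility checks before they reach the model.
# The plausibility rules are examples only.

def screen_inputs(records, seen_hashes=None):
    """Split records into clean inputs and suspicious ones, with reasons."""
    seen_hashes = set() if seen_hashes is None else seen_hashes
    clean, suspicious = [], []
    for r in records:
        fingerprint = hash(tuple(sorted(r.items())))
        if fingerprint in seen_hashes:
            suspicious.append((r, "duplicate input"))
        elif not (0 < r.get("amount", 0) < 1_000_000):
            suspicious.append((r, "implausible amount"))
        else:
            seen_hashes.add(fingerprint)
            clean.append(r)
    return clean, suspicious

records = [{"id": 1, "amount": 50}, {"id": 1, "amount": 50}, {"id": 2, "amount": -10}]
clean, suspicious = screen_inputs(records)
print(len(clean), suspicious)  # 1 clean record, 2 flagged
```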
Unclear Accountability for AI Algorithm Results
In the age of AI, where decisions and judgments are increasingly driven by the system, it is common for management to rely on the system for various controls. Thus, there may not be clearly defined responsibilities for the actions taken by the AI algorithm. Accountability for the AI model's decisions or results can be difficult to ascertain. As such, a clearly documented responsibility framework, including periodic reviews of the AI model, should be in place. The framework should encompass control owners for all phases of the AI process.
Results/Feedback Process
Potential risk factors in this step include inadequate updates to the model or algorithm, inadequate review of results produced by the AI model, and recurring incorrect feedback loops.
Inadequate Update of the Model or Algorithm
The feedback process is a means to update the AI model and logic to deal effectively with new scenarios. If the AI algorithm is designed to read only selected data points in the feedback process, it may not be updated to reflect all changes.
Controls need to be in place to ensure that the AI model considers all relevant data points during the feedback process. For example, if an AI model is used to calculate a ride-share fare, along with the estimated time and distance, it is important to consider the drop-off location (i.e., whether it is in a remote area), so that the fare reflects the driver’s return trip to an urban area.
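To make the ride-share example concrete, the following simplified Python sketch includes the drop-off location's remoteness as a fare factor; all rates and the surcharge are invented for illustration and do not reflect any actual pricing model.

```python
# Simplified sketch of the ride-share example: the fare logic considers the
# drop-off location's remoteness in addition to time and distance, so a
# relevant data point is not ignored. All rates are invented for illustration.

def estimate_fare(distance_km, duration_min, dropoff_is_remote,
                  base=2.5, per_km=1.2, per_min=0.3, remote_factor=1.4):
    fare = base + per_km * distance_km + per_min * duration_min
    if dropoff_is_remote:
        fare *= remote_factor  # compensates the driver's empty return trip
    return round(fare, 2)

print(estimate_fare(12, 25, dropoff_is_remote=False))  # 24.4
print(estimate_fare(12, 25, dropoff_is_remote=True))   # 34.16
```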
Inadequate Review of Results Produced by the AI Model
Over time, there may be a tendency to become more reliant on the AI model for transaction processing, and there may not be appropriate oversight of the results produced by the AI model. Without a regular review of results, the evolving AI model can deviate from its intended purpose and, in some cases, cause problems. A leading software company had to issue an apology after its AI-based Twitter account started to tweet racist comments.10 Repercussions can be even more serious if an AI model used for business operations goes rogue.
Manual reviews of the results produced by the AI model must be conducted. These reviews can be performed at an aggregate level or an entity level, as appropriate. When required, manual changes can be made to the AI model by following appropriate change management procedures.
Recurring Incorrect Feedback Loops
An AI model can get stuck in a recurring feedback loop that reinforces a conclusion that is no longer true. For example, an AI model denies a customer's loan application based on past rejections. Each rejection is then used to justify rejecting future applications, and so on, even though the customer's creditworthiness may have changed.
The controls to address incorrect feedback loops are the same as those used to review results, along with some specific data analytics on the results produced to identify recurring incorrect feedback loops.
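One such analytic is sketched below: it flags customers who have been rejected repeatedly for the same stated reason so those cases can be re-evaluated on current data; the threshold of three repeats is an assumption.

```python
# Illustrative analytic for recurring incorrect feedback loops: flag customers
# whose applications have been rejected repeatedly for the same stated reason.
# The threshold of three repeats is an assumption.

from collections import Counter

def flag_recurring_rejections(decision_history, repeat_threshold=3):
    """decision_history: list of (customer_id, decision, reason) tuples."""
    counts = Counter((cust, reason) for cust, decision, reason in decision_history
                     if decision == "reject")
    return [key for key, n in counts.items() if n >= repeat_threshold]

history = [
    ("C001", "reject", "prior rejection"),
    ("C001", "reject", "prior rejection"),
    ("C001", "reject", "prior rejection"),
    ("C002", "reject", "low income"),
]
print(flag_recurring_rejections(history))  # [('C001', 'prior rejection')]
```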
Other Testing Procedures for AI Models: Confusion Matrix
One of the most effective ways to test an AI model is to use a confusion matrix.11, 12 This method tests the AI model’s logic using a black-box approach with a sample set of transactions; the sample depends on the population, the different classes of transactions and the possible scenarios. A confusion matrix classifies each transaction into one of four categories: true positives, true negatives, false positives and false negatives. When assessing an AI model, the auditor wants to find that most values are either true positives or true negatives. Figure 3 provides an example of a confusion matrix. The results of the sample in figure 3 are as follows:
- True positives (49)—The auditor’s prediction is yes, and the AI model output is yes.
- True negatives (48)—The auditor’s prediction is no, and the AI model output is no.
- False positives (1)—The auditor’s prediction is no, but the AI model output is yes.
- False negatives (2)—The auditor’s prediction is yes, but the AI model output is no.
The accuracy of the AI model can also be computed using the following formula:
Accuracy = (True Positives + True Negatives)/Number of Transactions
In the example, the accuracy is 97 percent: (49 + 48)/100. An accuracy close to 100 percent is desirable.
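The worked example can be reproduced with a short Python sketch that tallies the four categories and applies the same accuracy formula; the "yes"/"no" labels and the helper names are illustrative, with the auditor's expected outcome treated as the reference against which the model output is scored.

```python
# Sketch that tallies the confusion matrix from the worked example above and
# computes accuracy = (TP + TN) / total transactions.

def confusion_matrix(pairs):
    """pairs: list of (auditor_expected, model_output) with 'yes'/'no' values."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for expected, output in pairs:
        if expected == "yes" and output == "yes":
            counts["TP"] += 1
        elif expected == "no" and output == "no":
            counts["TN"] += 1
        elif expected == "no" and output == "yes":
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts

def accuracy(counts):
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total

# Reproduce the figure 3 totals: 49 TP, 48 TN and 3 misclassifications
sample = ([("yes", "yes")] * 49 + [("no", "no")] * 48 +
          [("yes", "no")] * 2 + [("no", "yes")] * 1)
counts = confusion_matrix(sample)
print(counts, accuracy(counts))  # accuracy = 0.97
```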
Conclusion
The technology landscape is changing faster than ever, and it will only get more complex. There will be even less transparency when it comes to understanding how systems operate. Emerging technologies bring new audit challenges, and it is essential that audit practitioners keep up to date on these developments and their impact on the audit function. The risk factors that are specific to AI can be addressed with the recommended procedures and controls examined here. These procedures will continue to evolve as these technologies become better understood.
Endnotes
1 PricewaterhouseCoopers (PwC), “PwC Releases Report on Global Impact and Adoption of AI,” 25 April 2017, http://www.pwc.com/us/en/press-releases/2017/report-on-global-impact-and-adoption-of-ai.html
2 Hitachi, “Take on This Unpredictable Business Age Together With Hitachi AI Technology/H,” 2017, http://social-innovation.hitachi/-/media/project/hitachi/sib/en/solutions/ai/pdf/ai_en_170310.pdf
3 Hermann, J.; M. Del Balso; “Meet Michelangelo: Uber’s Machine Learning Platform,” Uber Engineering, 5 September 2017, http://eng.uber.com/michelangelo-machine-learning-platform/
4 Desai, V. J.; “Uber Hitches Ride With Machine Learning for Better CX,” ETCIO, 5 March 2020, http://cio.economictimes.indiatimes.com/news/strategy-and-management/uber-hitches-ride-with-machine-learning-for-better-cx/74485521
5 Pahuja, R.; “Here’s How ML Helps Visa in Preventing $25 Billion Fraud,” ETCIO, 22 January 2020, http://cio.economictimes.indiatimes.com/news/digital-security/heres-how-ml-helps-visa-in-preventing-25-billion-fraud/73498545
6 PricewaterhouseCoopers, “Artificial Intelligence Is Coming: Is Your Business Ready?” 2017, http://www.pwc.ch/en/publications/2017/pwc_artificial_intelligence_is_coming_2017_en.pdf
7 Information Commissioner’s Office (ICO), “Fully Automated Decision Making AI Systems: The Right to Human Intervention and Other Safeguards,” United Kingdom, 5 August 2019
8 ANI, “AI Should Never Be Allowed to Substitute Human Discretion in Judicial Functioning: CJI,” ETCIO, 27 January 2020
9 Povolny, S.; S. Trivedi; “Model Hacking ADAS to Pave Safer Roads for Autonomous Vehicles,” McAfee, 19 February 2020, http://www.mcafee.com/blogs/other-blogs/mcafee-labs/model-hacking-adas-to-pave-safer-roads-for-autonomous-vehicles
10 Lee, D.; “Tay: Microsoft Issues Apology Over Racist Chatbot Fiasco,” BBC, 25 March 2016, http://www.bbc.com/news/technology-35902104
11 Stehman, S. V.; “Selecting and Interpreting Measures of Thematic Classification Accuracy,” Remote Sensing of Environment, vol. 62, iss. 1, 1997, p. 77–89, http://doi.org/10.1016/S0034-4257(97)00083-7
12 Powers, D. M. W.; “Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness and Correlation,” Journal of Machine Learning Technologies, vol. 2, iss. 1, 2011, p. 37–63, http://bioinfopublication.org/files/articles/2_1_1_JMLT.pdf
Niral Sutaria, CISA
Is a manager at a Big Four professional services firm and has eight years of experience in IS audits, internal controls assessment and business process controls review. He is a member of ISACA® and the Institute of Chartered Accountants of India (ICAI). He can be reached at sutarianiral@gmail.com.