DECISION INFORMATION FOR AUDITORS TO ASSESS LITIGATION RISK: APPLICATION OF MACHINE LEARNING TECHNIQUES

How to cite this paper: Lu, Y.-H., Lin, Y.-C., & Gu, F.-C. (2022). Decision information for auditors to assess litigation risk: Application of machine learning techniques. Corporate Ownership & Control, 19(3), 133–146.

Fraud cases have become more common in recent years, highlighting the role of auditors' legal liability. The competent authorities have called for stricter control and disciplinary measures for auditors, increasing auditors' legal liability and litigation risk. This study used machine learning (ML) techniques to construct a litigation warning model for auditors to assess audit risk when they evaluate whether to accept or terminate an engagement, thus improving audit quality and preventing losses due to litigation. A sample matching method comprising 64 litigated companies and 128 non-litigated companies was used in this study. First, feature selection technology was used to extract six important influencing factors from the many variables affecting auditors' litigation risk. A decision tree was then used to establish a litigation warning model and a decision table for auditors' reference. The results indicated that the eight outcomes provided by the decision table could effectively distinguish levels of litigation risk with an accuracy rate of 92.708%. These results can provide useful information to aid auditors in making engagement decisions.


INTRODUCTION
In recent years, serious fraud cases around the world have caused massive losses for investors and creditors and shaken public confidence in the capital market. The integrity of management, as well as the professionalism and ethics of auditors, was also called into question. In 2004, Taiwan saw a quick succession of fraud cases involving Procomp Informatics, Infodisc Technology, and Summit Computer Technology, which led the Financial Supervisory Commission (FSC) of Taiwan to issue warnings regarding, or cancel the certification of, auditors. Moreover, according to deep pocket theory (Calabresi, 1970), when a company is charged with fraud, creditors and investors often pursue litigation against well-paid certified public accountants (CPAs), despite a lack of audit failure, to obtain more compensation for losses (Carcello & Palmrose, 1994). Fraud cases not only directly put auditors at risk of litigation or sanction but also come with large legal costs and can cause substantial harm to reputations (Bonner, Palmrose, & Young, 1998).
In Taiwan, before the Enron case, litigation against auditors was rare, and investors could not confront large companies or auditors alone given the huge litigation expenses involved. In 2002, the government announced the Securities Investor and Futures Trader Protection Act, and the Securities and Futures Investors Protection Center was established at the same time. Since then, the Securities and Futures Investors Protection Center has helped investors to sue many illegal companies and their auditors. Under this stricter legal environment, large audit firms have begun to emphasize client screening and risk management. Recently, Deloitte Taiwan established a Reputation and Risk Department to assess clients' industry status and level of risk; KPMG Taiwan established an independent assessment team and a client risk screening team to determine whether new clients should be accepted; Ernst & Young (EY) Taiwan also established a risk management committee to investigate clients (Liu, Wang, & Lai, 2009). However, this decision-making process is extremely complex, as underestimating the risk may lead to future litigation and reputational damage. Therefore, investigating the factors influencing litigation against auditors and providing auditors with risk evaluation information is important for both practice and academics. In particular, developing a user-friendly litigation warning model that can be used in everyday auditing is crucial for auditors. In the early phases of audit work, machine learning (ML) enables auditors to access unbiased and more accurate information by collecting data using rules developed with machine learning algorithms (Cho, Vasarhelyi, Sun, & Zhang, 2020).
Machine learning techniques, such as decision trees and artificial neural networks (NN), are superior to traditional statistical methods, such as logistic regression (LR) and discriminant analysis, in constructing detection models (Varetto, 1998; Cristianini & Shawe-Taylor, 2000; Min & Lee, 2005). Mitchell (as cited in Cho et al., 2020, p. 1) provided a widely referenced definition of machine learning: "The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience". The main contribution of machine learning, and its primary difference from other algorithms, is its predictive power, which arises from the processes of training and testing datasets (Cho et al., 2020). A common strategy is to discover a pattern in a training dataset; this pattern is then used to classify and/or predict the behavior of new samples. First, unlike prior research such as Kaplan and Williams (2013), which examines the associations between auditors' litigation and audit firm characteristics or abnormal accruals, this study employed the feature selection approach to extract six critical factors to create our litigation warning model. Compared to audit firm characteristics and abnormal accruals, the credit risk index is the most important factor influencing the litigation risk for auditors in Taiwan and serves as a reference for other developing countries. Second, the prediction performance of our machine learning model is superior to that of logistic regression and discriminant analysis. This result underscores the value of applying ML to assist auditors in assessing their litigation risk. Finally, while extant litigation warning models focus on improving prediction accuracy, our study focused on constructing a classification model and a decision table comprising eight classification rules from which auditors can better assess litigation likelihood.
The auditors can use this model to screen out potentially risky clients and decide which audit engagements can be accepted. Section 2 of this study reviews the literature on the litigation risk against auditors and data mining techniques. Section 3 describes the steps taken to construct the warning model, variable measurements, sample selection, and sources of data. Section 4 summarizes the results and analysis, and Section 5 provides a conclusion and suggestions.

Affecting factors of auditors' litigation risk
Assessing auditors' litigation risk is a complex procedure with many affecting factors. Arens, Elder, and Beasley (2014) indicated that engagement risk analysis can provide a framework. Engagement risk is the risk that the auditor or audit firm will suffer harm after the audit is finished, even though the audit report was correct. For example, if a client declares bankruptcy after an audit is complete, the likelihood of a lawsuit against the CPA firm is reasonably high. When auditors modify audit evidence for engagement risk, they do so by controlling acceptable audit risk. Acceptable audit risk is a measure of how willing the auditor is to accept that the financial statements may be materially misstated after the audit is completed and an unmodified opinion has been issued. According to the Statement on Auditing Standards of Taiwan (hereafter, SAS of Taiwan) No. 51, audit risk is affected by inherent risk, control risk, and detection risk. Moreover, not only audit risk but also the deep pocket theory and the stricter legal environment increase the litigation risk against auditors. Therefore, this study explored the affecting factors of litigation against auditors in the prior literature and classified them into the following types: inherent risk, control risk, and detection risk within audit risk, plus the legal environment. All of the factors and related literature are shown in Table 1. This risk is correlated with the auditor's or audit firm's audit quality and characteristics. DeAngelo (1981) defined the quality of audit services as the market-assessed joint probability that a given auditor will both 1) discover a breach in the client's accounting system, and 2) report the breach. The discovery of a breach depends on the professional competency of the auditor, which is usually measured by audit firm size or industry expertise (Kim et al.).

Affecting factors of litigation risk: Legal environment
The 2001 Enron case not only impacted the US capital market but also increased the attention paid to capital market regulatory and supervisory systems in other countries. Auditing standards require auditors to identify fraud risks during the planning stages of their audits and then design audit procedures to investigate the identified risks (American Institute of Certified Public Accountants, AICPA, 2002). In Taiwan, a similar requirement for fraud risk assessment is provided by SAS of Taiwan No. 43 and increases auditors' liability. Moreover, the fraud cases of 2004 led the FSC to issue warnings about, or cancel the certification of, several CPAs. The FSC made major revisions to the Certified Public Accountant Act, which increased auditors' civil and criminal liabilities. In 2008, the court issued criminal sentences to the two CPAs who were involved in the China Rebar case. Lin and Lin (2010) warned the accounting and auditing profession that the criminal liability of auditors will increase in the future when cases similar to China Rebar happen again.

Use of data mining techniques in the formation of the warning model
Studies on prediction and detection models developed rapidly after Beaver (1966) first used univariate discriminant analysis, with sample matching, to predict financial crises in US companies. For example, Altman (1968) used multiple discriminant analysis (MDA) to construct a bankruptcy detection model; starting from 22 candidate financial ratios, he developed the Z-score model often used in later studies. Frawley, Piatetsky-Shapiro, and Matheus (1992) stated that machine learning techniques uncover previously unknown valuable information hidden in data. Simply put, machine learning techniques efficiently search databases for useful knowledge and principles by finding patterns and relationships. Bose and Mahapatra (2001) introduced machine learning techniques used to deal with four problem types in the business area. The first type consists of prediction problems, which examine past observed values of an attribute to infer a future value of the attribute; examples include a stock return prediction model (Tsai, Lin, Yen, & Chen, 2011) and a credit rating prediction model (Tsai & Chen, 2010). The second type consists of classification problems, which define analyzed attributes and create classes; for example, Ravisankar, Ravi, Raghava Rao, and Bose (2011) used machine learning techniques such as multilayer feed-forward neural networks (MLFF), support vector machines (SVM), genetic programming (GP), group method of data handling (GMDH), logistic regression, and probabilistic neural networks (PNN) to identify companies that resort to financial statement fraud. Tsai, Lu, and Yen (2012) used feature selection in data mining to screen important variables affecting intangible assets, creating an intangible asset assessment and classification model to aid investors in determining whether companies have intangible assets. Kuzey, Uyar, and Delen (2014) used a decision tree and a neural network to create a corporate value classification model.
The third type consists of association problems, which determine which related items should be grouped; a commonly used technology of this type is association rules. For example, Lu, Tsai, and Yen (2010) used association rules to find six factors that influenced corporate values for Taiwanese businesses. The fourth type consists of detection problems, which combine prediction and classification functions: machine learning infers future values of attributes from past values and then classifies them. Coats and Fant (1993) used neural networks and MDA to create financial distress models based on five ratios: working capital/total assets, retained earnings/total assets, earnings before interest and taxes/total assets, market value of equity/book value of total debt, and sales/total assets. Their study examined 282 firms in operation over the period 1970-1989. Half of the sample firms were used to develop the NN and MDA models, and the rest served as a test sample. The test results suggest that the NN approach is more effective than MDA for the early detection of financial distress. Chaveesuk, Srivaree-Ratana, and Smith (1999) explored three of the most well-known supervised neural network paradigms (backpropagation, radial basis function, and learning vector quantization) for the task of rating US corporate bonds. Using generally available historic data, bonds were assigned to ratings based on a classification scheme. Comparisons were made with logistic regression and multiple regression models, both on the dataset used to create the predictive models and on new data. The results indicated that back-propagation neural networks (BPNs) were the superior method. Min and Lee (2005) used 1,888 firms, including bankruptcy and non-bankruptcy cases, and applied SVM to the bankruptcy prediction problem in an attempt to suggest a new model with better explanatory power and stability.
The study used a grid-search technique with 5-fold cross-validation to find the optimal parameter values of the SVM kernel function and compared its performance with those of MDA, LR, and three-layer fully connected back-propagation neural networks. The experimental results show that SVM outperforms the other methods. In summary, prior research has found that the warning model prediction accuracy of machine learning techniques is superior to that of traditional statistical methods.
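The grid-search procedure described above can be sketched as follows. This is a minimal illustration using scikit-learn and synthetic data, both assumptions on my part; the original study's software and dataset are not specified here.

```python
# Sketch: grid search over RBF-kernel SVM parameters, each candidate
# scored by 5-fold cross-validation (as in Min & Lee, 2005).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for a bankruptcy dataset (hypothetical data).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
    cv=5,  # 5-fold cross-validation for each parameter combination
)
grid.fit(X, y)
print(grid.best_params_)        # parameter pair with the best CV accuracy
print(round(grid.best_score_, 3))
```

The search simply trains one model per parameter combination per fold and keeps the combination with the highest average validation accuracy.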

Research sample
The sample for this study consisted of 64 litigated companies listed in Taiwan. As noted above, the Securities and Futures Investors Protection Center was established and has helped investors to sue many illegal companies and their auditors. However, litigation cases are limited, and there are not many samples in Taiwan in which auditors were sued together with their clients. Moreover, major fraud cases in recent years were discovered only after many years (e.g., Wirecard and Ya Hsin Industrial Co., Ltd.). This study tried to identify the characteristics of high litigation risk companies by using matching samples.
Therefore, we confirmed that the non-litigation companies were still litigation-free after many years. Financial industries were not included in the sample, since the risk assessment and industry characteristics of these industries differ greatly from others. Sample matching was conducted following Beaver (1966): non-litigated companies from the same period, in similar industries, and with similar asset scales acted as the control sample. A 1:2 sample matching method (Coats & Fant, 1993) comprising 64 litigated companies and 128 non-litigated companies was used in this study. Table 2 shows that the litigated companies covered 13 industries; 56.3% of this sample came from the electronics industry (28 companies) and the construction industry (8 companies). Many litigation cases happened in 2007 and 2008, the period after the China Rebar case.

Table 2. Distribution of litigated companies by industry and year (2002-2012)
Industry totals: Food 3; Rubber 2; Textile 6; Electrical equipment 4; Wire and cable 2; Biochemical 2; Steel 4; Glass and ceramics 1; Electronics 28; Construction 8; Aviation 1; Tourism 1; Trade and consumer goods 2 (total 64).
Yearly totals, 2002 through 2012: 10, 9, 7, 1, 5, 9, 11, 2, 3, 4, 3.
(The per-industry, per-year cell counts could not be reliably recovered from the extracted text.)

Moreover, the cross-validation method was used to construct the prediction models, to avoid sample variability and minimize any biasing effect (Tam & Kiang, 1992). Specifically, this study used 10-fold cross-validation, since this is the most commonly used strategy for examining the performance of classifiers. The whole dataset is divided into 10 equal parts; in each round, 90% of the dataset is used for model training and the remaining 10% for model testing. Every subset is therefore used for training 9 times and for testing once, and the average prediction performance is obtained.
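The 10-fold cross-validation scheme above can be sketched in a few lines. scikit-learn and the synthetic data are illustrative assumptions; the class weights merely mimic the 64:128 litigated/non-litigated split.

```python
# Sketch: 10-fold cross-validation. Each fold is used once for testing
# while the other 9 folds train the model; the 10 accuracies are averaged.
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# 192 observations mimic the 64 litigated + 128 non-litigated sample.
X, y = make_classification(n_samples=192, n_features=6,
                           weights=[2/3, 1/3], random_state=1)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
scores = cross_val_score(DecisionTreeClassifier(random_state=1), X, y, cv=cv)
print(len(scores))              # 10 folds -> 10 accuracy estimates
print(round(scores.mean(), 3))  # average prediction performance
```

Stratified folds keep the litigated/non-litigated ratio roughly constant across folds, which matters with an imbalanced matched sample like this one.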

Variable definition and measurement
This study examined the factors influencing auditor litigation. Influencing factors were divided into inherent risk, control risk, detection risk, and legal environment. The recoverable portion of the variable definitions (Table 3) includes:
- Inventory growth rate = (average inventory for current period / average inventory for previous period) - 1
- Sales growth rate = (sales revenue for current period / sales revenue for previous period) - 1
- Operating profit ratio = operating profit / sales revenue
- Operating profit growth rate = (operating profit for current period / operating profit for previous period) - 1
- Return on operating assets = operating profit for the past four quarters / average total assets
Decision trees were first proposed by Quinlan (1986) as an applied machine learning algorithm for dimension reduction and categorical data. Decision trees are comprised of roots, nodes, branches, and leaf nodes. While creating the decision tree, attribute selection measures screen for variables suitable for classifying data; the selected variables can be seen as key influencing factors for data sorting. The advantages of creating a decision tree are that parameters do not need to be set and that it is applicable for exploring knowledge and finding key variables. Association rules are also known as market basket analysis; these rules extract intercorrelated knowledge hidden within the data to help find important factors more closely associated with dependent variables. Two key measures are used to calculate the strength of the associations between variables. The first is support, which is the percentage of the data in which an itemset appears. The second is confidence, which is the prediction strength of the rule. For example, the support of A for B is the percentage of A ∪ B, and the confidence of A for B is the ratio of A ∪ B to A. The variables included in the rules that meet the minimum support and confidence are considered key influencing factors.
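The support and confidence measures just defined can be computed directly. The transactions and item names below are hypothetical; the point is only the arithmetic: support(A → B) is the share of records containing A ∪ B, and confidence divides that by the share containing A.

```python
# Hypothetical "transactions": each set records which attributes hold
# for one company-year observation.
transactions = [
    {"high_credit_risk", "litigated"},
    {"high_credit_risk", "litigated"},
    {"high_credit_risk"},
    {"low_credit_risk"},
]

def support(itemset, transactions):
    # fraction of transactions containing every item in `itemset`
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    # support(A ∪ B) / support(A)
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))

print(support({"high_credit_risk", "litigated"}, transactions))        # 0.5
print(confidence({"high_credit_risk"}, {"litigated"}, transactions))
```

Rules are kept only when both measures clear their minimum thresholds, which is exactly how the key influencing factors are screened.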
To assess the performance of the decision tree and association rules in feature selection, the extracted features were then input into a multilayer perceptron (MLP) neural network, which is the most widely used technique in many prediction and forecasting domains (Tsai & Wu, 2008). Three steps were taken in the evaluation of feature selection performance in this study. First, the dataset including all features was used to train and test the MLP model as the basis for the evaluation. Second, the features extracted by the decision tree and by the association rules were separately used to train and test MLP models for comparison. Third, the performances of each model, including prediction accuracy, type I and type II error rates, and the feature extraction rate, were compared.
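The three evaluation steps above can be sketched as a small experiment: train an MLP on the full feature set as the benchmark, then retrain it on an extracted subset and compare. scikit-learn, the synthetic data, and the choice of the first six columns as the "extracted" subset are all assumptions for illustration.

```python
# Sketch: compare MLP accuracy with all features vs. a selected subset.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=192, n_features=20, n_informative=6,
                           random_state=2)

def mlp_accuracy(features, labels):
    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                        random_state=2)
    return cross_val_score(clf, features, labels, cv=10).mean()

benchmark = mlp_accuracy(X, y)              # step 1: all features
subset = mlp_accuracy(X[:, :6], y)          # step 2: extracted features only
print(round(benchmark, 3), round(subset, 3))  # step 3: compare
```

When the subset model matches or beats the benchmark while using far fewer inputs, the feature selection tool has removed interference rather than information.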

Classification
Classification is one of the most important techniques in machine learning and is used to categorize the data to be processed according to attributes. A classification technique commonly used in prior research, the decision tree, was chosen for this study and compared with the traditional statistical methods of logistic regression and discriminant analysis. As a classification technique, the decision tree uses known examples to create a tree-shaped structure and induce rules from them. Advantages of a decision tree not provided by logistic regression and discriminant analysis include the creation of a decision table and the easy interpretation of the extracted rules. This efficient data processing aligns with the objective of this study: to create an easily understood and convenient litigation early warning model and decision table.
A decision table was used to present and analyze decision situations. The columns in Table 4 can be seen as conditions and actions, whereas the rows are test items. The conditions are factors related to the decision, and the actions are possible outcomes for the decision (for example, whether or not litigation will be taken against the auditor). The value for the corresponding subset is presented under each condition; each action input is distributed to the corresponding action. Therefore, each row in the decision table is a classification rule (Martens et al., 2008). The decision rules produced by the decision tree in this study can help auditors assess the risk of litigation and decide whether to accept an engagement or expand audit procedures to reduce the risk of litigation.
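A fitted decision tree yields exactly the kind of human-readable rules that fill such a decision table: each root-to-leaf path is one rule. The sketch below uses scikit-learn's `export_text` on synthetic data as a stand-in for the study's software; the six feature names are hypothetical labels matching the extracted factors.

```python
# Sketch: print the classification rules encoded by a fitted decision tree.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=192, n_features=6, random_state=3)
feature_names = [  # hypothetical names for the six extracted factors
    "credit_risk_index", "stock_price_fluctuation", "client_importance_firm",
    "accounts_receivable_ratio", "audit_report_lag", "cpa_tenure",
]

tree = DecisionTreeClassifier(max_depth=3, random_state=3).fit(X, y)
rules = export_text(tree, feature_names=feature_names)
print(rules)  # each root-to-leaf path corresponds to one decision table row
```

Limiting the depth keeps the rule set small enough to be tabulated and read by an auditor, at some cost in fit.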

Model performance evaluation
For feature selection performance evaluations, the prediction accuracies, type I and type II error rates, and feature extraction rates of the two models created using feature selection tools were compared with the benchmark model made without feature selection. The method used to calculate prediction accuracy is shown in Table 5: the ratio of correctly predicted data to total data (equation (1)). Type I errors are the incorrect rejection of a true null hypothesis; in the context of this study, this was the probability that a case in which litigation would not be taken against the auditor was mistakenly classified as a case in which litigation would be taken (equation (2)). Type II errors are the incorrect acceptance of a false null hypothesis: the probability that a case in which litigation would be taken against the auditor was mistakenly classified as a case in which litigation would not be taken (equation (3)). Analysis of variance (ANOVA) was also used to determine whether the differences in performance between the three models were significant. For classification technique performance evaluations, the prediction accuracies of the early warning models created using the decision tree, logistic regression, and discriminant analysis were compared using the same evaluation methods as for the feature selection processes. The area under the receiver operating characteristic (ROC) curve (AUC) (equation (4)) was also used to determine the predictive accuracy of the models. With TP, TN, FP, and FN denoting true positives, true negatives, false positives, and false negatives (litigation treated as the positive class), the measures are:

Accuracy = (TP + TN) / (TP + TN + FP + FN)   (1)
Type I error rate = FP / (FP + TN)   (2)
Type II error rate = FN / (FN + TP)   (3)
AUC = area under the curve of the true positive rate, TP / (TP + FN), plotted against the false positive rate, FP / (FP + TN)   (4)

Note: Sokolova and Lapalme (2009) pointed out that the ROC curve indicates the trade-off between the true positive rate and false positive rate for the performance evaluation of a binary classifier. The ROC curves for each classification model can be drawn for comparison, where the AUC serves as the indicator of model performance.
The AUC values range from 0 to 1. An AUC of 1 indicates a perfect model, an AUC between 0.5 and 1 indicates that the model is better than random guessing and has predictive value, and an AUC less than or equal to 0.5 indicates that the model is equivalent to random guessing and has no predictive value.
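The evaluation measures described above can be computed from a confusion matrix in a few lines. The labels and scores below are made up for illustration, and treating litigation as the positive class is an assumption consistent with the error definitions in the text.

```python
# Sketch: accuracy, type I / type II error rates, and AUC from predictions.
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true  = [1, 1, 1, 0, 0, 0, 0, 0]   # 1 = litigation, 0 = no litigation
y_pred  = [1, 1, 0, 0, 0, 0, 1, 0]   # model's class predictions
y_score = [.9, .8, .4, .3, .2, .1, .6, .2]  # model's litigation probability

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy     = (tp + tn) / (tp + tn + fp + fn)  # share of correct predictions
type_i_rate  = fp / (fp + tn)   # non-litigated cases wrongly flagged
type_ii_rate = fn / (fn + tp)   # litigated cases missed by the model
auc          = roc_auc_score(y_true, y_score)   # area under the ROC curve

print(accuracy, type_i_rate, type_ii_rate, auc)
```

For an auditor, the type II rate is the costly one: a litigated client classified as safe means an engagement accepted that later draws a lawsuit.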

Feature selection results
The results in Table 6 show that for the benchmark model without a feature selection tool, the average training and testing times were the longest; moreover, the large number of input variables caused interference, leading to the poorest performance among the three models: prediction accuracy of 74.167%, a type I error rate of 19.375%, and a type II error rate of 38.750%. The 10 variables chosen using the association rules effectively reduced the average training and testing times, but this model still performed poorly, with prediction accuracy of 71.292%, a type I error rate of 18.094%, and a type II error rate of 49.938%. The 6 variables chosen using the decision tree effectively reduced the average training and testing times and yielded the best prediction accuracy (85.625%), type I error rate (7.250%), and type II error rate (28.625%) among the three models. In addition, ANOVA was used for a comparative analysis of the performances. Table 7 shows that the performances of the decision tree, association rules, and benchmark models were significantly different; the decision tree model performed best, followed by the association rules and benchmark models. Accordingly, feature selection reduced both the training and testing times as well as the interference from an excessive number of variables, improving the model's accuracy, efficiency, and effectiveness (Questier et al., 2005; Sugumaran et al., 2007; Lin, Lu, & Tsai, 2019). Moreover, the accuracy rate of 85.625% confirms that the six factors extracted by the decision tree (credit risk index, stock price fluctuation, client importance_firm, accounts receivable ratio, audit report lag, and CPA tenure) can be seen as key factors for determining auditors' litigation risk.
The first factor was the credit risk index, which had the greatest information gain in the decision tree. The credit risk index mainly measures a company's risk of bankruptcy. Prior studies have found that the main reason for litigation against auditors is related to client bankruptcy or financial distress (Pierre & Anderson, 1984; Palmrose, 1987; Lys & Watts, 1994). The second key factor was stock price fluctuation; when a company's stock price fluctuates greatly, the possibility of litigation against the auditor increases. Because stock prices reflect the company's financial situation and negative information, shareholders with failed investments often pursue litigation against auditors (Carcello & Palmrose, 1994). The third factor was client importance_firm.
This study found that the importance of each client was a significant factor affecting litigation against auditors. The distribution of profits in Taiwanese audit firms is correlated with each auditor's fee contribution; therefore, auditors may accept audit cases with high litigation risk in order to contribute more to the firm (Lee & Chen, 2004). The fourth key factor was the accounts receivable ratio. The uncertainty of accruals may result in potential errors in asset valuation or doubts about operations; for example, underestimation of the allowance for doubtful accounts, or manipulation of accruals to cover up financial difficulties (Francis & Krishnan, 1999). The fifth factor was audit report lag, defined as the time between the last day of the fiscal year and the date of the audit report. The possibility of fraud and manipulation increases as a company's financial situation worsens. To reduce the risk of litigation, auditors expand audit procedures, which lengthens the audit period (Bamber, Bamber, & Schoderbek, 1993). The final key factor was CPA tenure. Chen, Lin, and Lin (2008) and Lee and Lin (2005) found that the lengths of tenure of the firm and of the individual CPA both positively influence audit quality. CPA tenure helps maintain the quality of financial statements; in consideration of litigation risk, auditors retain better clients to reduce risk.

Classification results
After analysis of the feature selection results, the six extracted features were used to create an early warning model for litigation against auditors and a decision table using a decision tree. These were then compared with logistic regression and discriminant analysis models built with the same six extracted features, to provide auditors with a reference when selecting clients. The performance assessment results are shown in Table 8. Inspection of the AUC values for the three models revealed that all AUCs were greater than 0.5, indicating that all three models have predictive value. The decision tree model developed in this study had an accuracy of 92.708%, and the type I and type II error rates were 1.563% and 18.750%, respectively. The performance of this model was superior to those created using logistic regression and discriminant analysis. Table 9 shows the decision table with eight key rules for decision-makers to assess litigation risk. Rule 1 in the decision table indicates that when the credit risk index is 9, auditors have a high litigation risk; when a company faces worsening operations and is labeled a high credit risk, a financial crisis stemming from a weak financial structure and poor liquidity may lead to a lawsuit.

Note: The TCRI rates companies on a scale from 1 to 9 rather than following the international practice of using the English alphabet. Grading is relative; i.e., 1 is better than 2, 2 is better than 3, etc., and 9 is the worst. Scores of 7-9 mark the high-risk group: these companies usually have had long-term losses, have broken even but have poor-quality accounting information, or have broken even but have a weak financial structure and poor liquidity. Therefore, scores of 7-9 indicate high risk and high financial stress. Scores of 5-6 mark the moderate-risk group: these companies usually have a stable financial structure but poor or unstable profits, or have good profits but a weak financial structure; they are less able to withstand financial downturns than companies with the top four scores. Scores of 1-4 mark the low-risk group: these companies usually have stable profits and financial structures, maintain moderate to high liquidity, and are able to withstand financial downturns. Therefore, scores of 1-4 indicate low risk.

Warning model decision information
A high credit risk and worsening operations may prompt lawsuits from the investors against the management and auditors for compensation (Pierre & Anderson, 1984; Palmrose, 1987; Lys & Watts, 1994). However, Rule 8 indicates that when the credit risk index is less than or equal to 6 (i.e., a company labeled a low or moderate credit risk), the company's financials are more stable and both the credit risk and the risk of a financial crisis are low; thus, auditors have a low litigation risk. Rules 2-7 in the decision table consider multiple variables in determining whether a client carries a high or low litigation risk. Rule 2 indicates that when the credit risk index is 7 or 8 and the stock price fluctuation is less than 0.37, the company has stable performance in the securities market, as its stock price has not fluctuated greatly; therefore, auditors have a low litigation risk. Rule 3 indicates that when the credit risk index is 7 or 8 and the stock price fluctuation is greater than 0.37, if the client importance_firm is greater than 0.52 (the client's fees account for no less than 52% of the firm's entire income), then auditors have a high litigation risk. The main reason is that auditors may rely on economic factors and lose their independence (Reynolds & Francis, 2000); auditors should avoid economic dependence on a single client, which may affect their audit quality. Rules 4-7 indicate that when the credit risk index is 7 or 8, the stock price fluctuation is greater than 0.37, and the client importance_firm is less than 0.52, the risk of facing litigation depends on the accounts receivable ratio, audit report lag, and CPA tenure. Rule 5 indicates that, under the conditions above, if the accounts receivable ratio is greater than 16.18 (i.e., accounts receivable account for over 16.18% of total assets) and the audit report lag is over 84.5 days, then auditors have a high litigation risk.
Rule 7 indicates that, under the conditions above, if the audit report lag is less than 84.5 days but the auditor has been appointed for less than 5 years, then auditors also have a high litigation risk. In addition, Rules 4 and 6 describe conditions under which auditors have a low litigation risk. The main discrepancy between these rules lies in the influences of CPA tenure and audit report lag: when the client has a higher accounts receivable ratio, a newly appointed CPA who requires more time to complete the audit has little knowledge of the company, which increases the risk of litigation (Pierre & Anderson, 1984). Thus, auditors should broaden the scope of their audits to reduce this risk.
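The eight rules above can be encoded as a simple screening function. Note the exact nesting of Rules 4-7 is an assumption inferred from the text (in particular, that Rule 4 corresponds to a low accounts receivable ratio and Rule 6 to a longer-tenured CPA); the thresholds come from the decision table discussion.

```python
# Sketch: one plausible reading of the eight-rule decision table,
# encoded as a litigation-risk screening function (rule mapping assumed).
def litigation_risk(credit_risk_index, stock_price_fluctuation,
                    client_importance_firm, accounts_receivable_ratio,
                    audit_report_lag, cpa_tenure):
    if credit_risk_index == 9:                   # Rule 1: worst TCRI grade
        return "high"
    if credit_risk_index <= 6:                   # Rule 8: low/moderate credit risk
        return "low"
    # Below here the credit risk index is 7 or 8.
    if stock_price_fluctuation < 0.37:           # Rule 2: stable stock price
        return "low"
    if client_importance_firm > 0.52:            # Rule 3: economic dependence
        return "high"
    if accounts_receivable_ratio <= 16.18:       # Rule 4 (assumed mapping)
        return "low"
    if audit_report_lag > 84.5:                  # Rule 5: long report lag
        return "high"
    if cpa_tenure < 5:                           # Rule 7: newly appointed CPA
        return "high"
    return "low"                                 # Rule 6 (assumed mapping)

print(litigation_risk(9, 0.1, 0.1, 5, 60, 10))   # Rule 1 -> high
print(litigation_risk(7, 0.5, 0.6, 5, 60, 10))   # Rule 3 -> high
print(litigation_risk(5, 0.5, 0.6, 20, 90, 2))   # Rule 8 -> low
```

A function like this makes the decision table directly usable during client acceptance: feed in the six factors and read off the screening outcome.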

Results discussion
According to the above results, in addition to having better accuracy than logistic regression and discriminant analysis, the decision tree also provides a decision table that illustrates the associations and rules between important factors and the litigation risk.
Recent years have seen a growing trend of applying artificial intelligence (AI), and especially machine learning, in auditing (Perols et al., 2017; Bao et al., 2020). The global Big 4 public accounting firms are actively exploring the adoption of AI and machine learning techniques in their audit services. For example, KPMG is constructing an intelligent audit platform, Clara, which embodies cognitive and predictive technologies. EY is embedding AI technologies in its audit process, especially applying AI to document reading and interpretation, adopting automation to improve audit efficiency, and using drones to assist in inventory examination. PricewaterhouseCoopers (PwC) is building AI platforms to detect abnormal transactions in the general ledger, especially for cash-related accounts. The decision table created in this research can also serve as a supplementary tool when auditors evaluate litigation risk, reducing the probability of risk and preventing damage to the reputations of both the auditor and the audit firm.

CONCLUSION
After the recent series of fraud cases, investor protection mechanisms have urged the competent authority to revise the legal system to increase the legal liability of auditors, thus preventing fraud and litigation due to audit failure and improving audit quality and confidence in financial statements. Changes in the legal environment and laws have made it more difficult for auditors to assess whether to accept clients and to avoid indemnification and a damaged reputation due to litigation. The warning model and decision table for auditor litigation developed in this study can serve as a reference for auditors. Prior literature on litigation risk against auditors was reviewed to collect influencing factors related to both audit risk and the legal environment. Feature selection was then used to extract critical variables, after which classification techniques were used to construct a representative and convenient warning model and decision table for auditors' reference.
Two procedures were used to construct the warning model. First, 39 influencing variables were collected from prior literature, and a decision tree was used to select six key factors: credit risk index, stock price fluctuation, client importance_firm, accounts receivable ratio, audit report lag, and CPA tenure. Compared to variables extracted using other selection tools, the variables chosen using the decision tree yielded higher accuracy and lower type I and type II error rates; therefore, these variables can be regarded as key influencing factors for auditors' litigation risk. Second, the six extracted factors were used in a decision tree to construct a litigation warning model. The results showed that the accuracy rate was 92.708% and the type I and type II error rates were below 10%; eight classification rules were extracted and compiled into a decision table.
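The two-step procedure can be sketched with standard tooling. The snippet below is a minimal illustration, assuming scikit-learn as the implementation library and using synthetic data in place of the study's matched sample of 64 litigated and 128 non-litigated firms; the study's actual software, hyperparameters, and variable definitions are not specified here:

```python
# Hypothetical sketch of the two-step procedure: (1) rank the 39 candidate
# variables by decision-tree feature importance and keep the top six;
# (2) fit a decision tree on the selected features. Data is synthetic.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(192, 39))        # 192 firms, 39 candidate variables
y = rng.integers(0, 2, size=192)      # 1 = litigated, 0 = non-litigated

# Step 1: feature selection via tree-based importance scores
selector = DecisionTreeClassifier(random_state=0).fit(X, y)
top6 = np.argsort(selector.feature_importances_)[-6:]

# Step 2: build the warning model on the six selected factors
model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X[:, top6], y)
print(f"training accuracy: {model.score(X[:, top6], y):.3f}")
```

A shallow tree (here `max_depth=4`, an assumed setting) keeps the number of leaf rules small enough to compile into a human-readable decision table like the eight-rule table reported in the study.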
This study contributed a new concept regarding the use of machine learning to determine the factors influencing auditors' litigation risk. This study reviewed relevant literature and collected all influencing factors. Then, key factors were extracted using feature selection methods, effectively increasing the prediction accuracy of the model. The accuracy of the warning model established using machine learning techniques in this study was higher than that of models created using logistic regression and discriminant analysis, demonstrating the value of applying machine learning techniques in other relevant areas.
Because the sample used in this study consisted of companies listed on the Taiwan Stock Exchange Corporation, the legal liability of Taiwanese auditors was weaker than that of auditors in developed countries. However, the complexity of auditor litigation is increasing. The factors extracted during feature selection can be seen as the key factors influencing litigation risk for auditors in Taiwan and can serve as a reference for other developing countries. Additionally, the objective of this study was to create a warning model for auditor litigation. The difference between this and prior studies predicting auditor litigation risk is that this study did not aim to improve model accuracy but to create an understandable, rule-based classification model for auditors to use when assessing audit-related risk. Following audit planning strategies informed by the warning model may improve audit quality and lower litigation risk.
Finally, while this study collected the influencing factors for auditors' litigation risk from previous literature, some other factors may have been left out or measured using different methods. Therefore, future studies can include other variables to construct a more complete and accurate warning model. The factors affecting litigation risk may also vary considerably across industries; auditor litigation risk in specific industries is therefore an interesting topic for future studies. Moreover, litigation cases are limited and there are few samples of auditors being sued in Taiwan. Future studies could extend the research period and increase the sample size to confirm these results.