REVENUE FORECASTING FOR EUROPEAN CAPITAL MARKET- ORIENTED FIRMS: A COMPARATIVE PREDICTION STUDY BETWEEN FINANCIAL ANALYSTS AND MACHINE LEARNING MODELS

How to cite this paper: Kureljusic, M., & Reisch, L. (2022). Revenue forecasting for European capital market-oriented firms: A comparative prediction study between financial analysts and machine learning models. Corporate Ownership & Control, 19(2), 159–178. https://doi.org/10.22495/cocv19i2art13 Copyright © 2022 The Authors This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). https://creativecommons.org/licenses/by/


INTRODUCTION
Digitalization and technological advances are substantially affecting and transforming capital markets. Benefits for investors include new investing opportunities in globalized markets, as well as improved access to real-time information and trading, resulting in cost and time savings (Gomber, Koch, & Siering, 2017). At the same time, however, the requirements for investors to identify, review, and assess all investment-relevant information are also increasing. In this context, sell-side financial analysts act as information intermediaries between firms and capital market participants, and in this role, they reduce information asymmetry and contribute to market efficiency (Healy & Palepu, 2001; Schipper, 1991). Using information from publicly available and private sources, they prepare reports in which they issue various estimates such as expected earnings, expected revenues, or target prices (Ramnath, Rock, & Shane, 2008). This information helps investors to assess and adjust their current and potential investment decisions (Palepu, Healy, & Peek, 2016).
However, analysts cover only a limited number of firms, and their valuation process is often characterized as a black box (Bradshaw, 2011; Ramnath et al., 2008). Furthermore, previous research identifies conflicting interests between analysts, investment banks, and covered firms that influence analyst forecast properties, suggesting a lack of financial analysts' independence (Lim, 2001; Lin & McNichols, 1998). For instance, financial analysts tend to bias their forecasts optimistically in order to improve their access to management and to increase broker-trading volumes (Jackson, 2005; Lim, 2001). Moreover, reputational issues seem to influence analysts in their decision to publish certain forecast figures (Ertimur, Mayew, & Stubben, 2011; Ramnath et al., 2008), and various determinants appear to affect analyst forecast properties (Clement, 1999). Although expected earnings attract the most attention from investors and are regularly covered by analysts, expected revenues also play an important role in investors' valuation of a firm's value and prospects (Keung, 2010; Lorenz & Homburg, 2018). Analysts attempt to satisfy this need for information and have increasingly issued supplementary revenue forecasts in the past (Ertimur et al., 2011). In previous research, however, there are only a few studies related to analysts' revenue forecasts. These studies provide evidence of both the value and contract relevance of analysts' revenue forecasts (Edmonds, Leece, & Maher, 2013; Keung, 2010; Rees & Sivaramakrishnan, 2007), and also of determinants that influence analysts' forecast accuracy and disclosure behavior (Bilinski & Eames, 2019; Ertimur et al., 2011; Lorenz & Homburg, 2018). Therefore, forecasted revenues are value-relevant on the one hand, but are not entirely independent on the other hand, and hence are limited in terms of accuracy, transparency, and objectivity.
Accordingly, the question arises as to whether technological advances in the form of machine learning and predictive analytics are suitable for overcoming these limitations in analysts' revenue forecasts.
Predictive analytics is a term that has become increasingly present in the academic world as well as in practice (Cockcroft & Russell, 2018). The basic idea behind predictive analytics is to use actual and historical data to determine correlations that can be generalized and applied to future occurrences. In contrast to conventional, manual forecasts, predictive analytics automatically determines the influencing variables (Siegel, 2016). Therefore, it is particularly suitable for forecasts whose drivers are so far unclear. Another reason for the increasing interest in predictive analytics is the rapid development of processing power, which has doubled on average every eighteen months over the past decades (Mack, 2011). At the same time, the cost of data storage has halved, making it possible to process and analyze large and complex volumes of data (Dutta & Hasan, 2013). This development, known as "Moore's law", is accompanied by an accelerating pace of technological change, which enables searching for complex patterns in ever-larger data sets. Therefore, it is conceivable that more sophisticated patterns could not be identified so far due to limited technical possibilities, but can now be discovered and conclusions can be drawn from them (LeCun, 2019).
A variety of different algorithms and methods are subsumed under the term predictive analytics. In general, a distinction is made between supervised learning and unsupervised learning, each representing different learning methods. If the prediction model is trained based on historical values and the actual output is used as the target variable, it is classified as supervised learning. This learning method is suitable for recurring prediction problems that involve substantial data with high information quality (Jiang, Gradus, & Rosellini, 2020). By contrast, unsupervised learning does not include any historical results that are integrated into the forecast model during the training phase. This mainly applies to new, previously unknown prediction problems (Sutskever et al., 2015). Both learning methods are applicable in the context of machine learning or deep learning, to generate accurate predictions (Géron, 2019). Since a firm's financial statements provide a historical and publicly available database, supervised learning is particularly suitable for revenue prediction models. This enables the identification of complicated relationships in historical accounting data that can be used for predicting future developments.
Based on the weaknesses of analysts' revenue forecasts in terms of transparency, accuracy, and objectivity, combined with the theoretical advantages of predictive analytics models, the following research questions arise:
RQ1: Can predictive analytics provide comparable or even better one-year-ahead revenue forecasts than financial analysts?
RQ2: Can predictive analytics provide consistent and accurate results across industries and over time?
We address these research questions by obtaining consensus analyst revenue forecasts from the Institutional Brokers' Estimate System (I/B/E/S) and merging them with firm and macroeconomic data from Refinitiv Eikon and Eurostat for firms listed in EU blue-chip indices. The sample selection procedure yields a final sample of 3,000 firm-year observations from 2010 to 2019. By applying various predictive analytics models trained exclusively on publicly available real-world data, we aim to ensure practical relevance, as our results are replicable. Furthermore, we intend to show that, through the use of predictive analytics, reliable one-year-ahead revenue forecasts are possible even without insider information on the firms and can compete with the corresponding consensus I/B/E/S analyst forecasts.
The remainder of this paper is organized as follows. Section 2 reviews the literature that deals with analysts' revenue forecasts and the basics of predictive analytics. Section 3 presents the research methodology including sample selection, prediction quality measures, and model selection criteria. The main results are presented in Section 4, while Section 5 discusses the practical and scientific significance of the results. Finally, Section 6 provides conclusions.

Analysts' revenue forecasts
In recent decades, most research has focused on analysts' earnings forecasts (Beyer, Cohen, Lys, & Walther, 2010; Ramnath et al., 2008). However, analysts provide the capital market with additional forecast values. In particular, revenue has become the second most common forecast value after earnings (Ertimur et al., 2011; Lorenz & Homburg, 2018). Nevertheless, only a few studies deal with the topic of analysts' revenue forecasts.
Since revenue flows are a key element in investors' fundamental analysis (Keung, 2010; Penman, 2013) and are often included in valuation models, it is not surprising that these studies reveal the value relevance of analyst revenue forecasts. Swaminathan and Weintrop (1991) examine market expectations using Value Line forecasts of revenues and earnings, and they find, for a small sample of companies, that revenues embody incremental information content beyond earnings. In an extension, Ertimur, Livnat, and Martikainen (2003) observe that investors value a surprise in revenues more than a surprise in expenses around preliminary earnings announcements. They explain this result with the higher persistence of revenues in contrast to expenses. Further, Rees and Sivaramakrishnan (2007) also state that revenue forecasts are important in the investor's valuation process, since the capital market rewards a firm that meets analysts' revenue forecasts separately from meeting earnings forecasts. Additionally, the related work of Keung (2010) documents that earnings forecast revisions accompanied by revenue forecast revisions cause greater capital market reactions than stand-alone revisions of earnings forecasts. Thus, regarding these results, analyst revenue forecasts provide supplementary information beyond other forecast values, which is incorporated by financial market actors, particularly investors.
Besides the question of value relevance, Edmonds et al. (2013) examine whether analysts' revenue forecasts are also contract-relevant in terms of management compensation. They confirm this link between value and contract relevance, as they find that CEOs receive smaller bonus payments when they fail to meet analysts' revenue expectations. Furthermore, research frequently raises the question of why financial analysts publish revenue forecasts as a supplement to earnings forecasts. In this context, Mest and Plummer (2003) detect, in an early study, that analysts' optimistic bias is smaller for revenue forecasts than for earnings forecasts.
They conclude that optimistically forecasted revenues are less suitable for analysts to gain or improve access to a firm's management and private information. In line with these results, Hunt, Sinha, and Yin (2012) state that analysts' optimistic bias and forecast errors decrease when issued earnings forecasts are disaggregated into revenues and expenses, and furthermore, this effect depends on the persistence of disaggregation. Ertimur et al. (2011) identify reputational reasons for an analyst's decision to accompany earnings forecasts with forecasted revenues, revealing that more reputable analysts are less willing to publish supplementary revenue forecasts. They argue that lesser-known analysts benefit from publishing disaggregated earnings forecasts since they can highlight their abilities in order to build up their reputation and improve their career prospects. By contrast, established analysts face greater reputation costs than gains from disaggregated earnings forecasts, as inaccuracies in the forecasts are easier to identify. He and Lu (2018) support these findings. Furthermore, they document that mandatory IFRS adoption improves the information environment of analysts, resulting in more accurate forecasts, less reputation-damage risk, and thus, more supplementary revenue forecasts. Bilinski and Eames (2019) expand these results and show that the decision to issue disaggregated earnings forecasts also depends on the underlying revenues and expenses quality.
In a different study, Lorenz and Homburg (2018) examine the determining factors of analysts' revenue forecast accuracy. They note that accuracy depends primarily on forecast and analyst characteristics, including forecast frequency and analysts' forecasting experience. In addition, they find that analysts with poor forecasting performance are more likely to stop forecasting revenues because they attempt to avoid negative consequences for their future career.
Moreover, there is a limited amount of research focusing on the development of revenue prediction models. In an early work, Nissim and Penman (2001) incorporate a mean reversion effect in their forecast of percentage revenue growth. Fairfield et al. (2009) further develop this approach by including industry-specific mean reversion effects. In a different model, Curtis et al. (2014) estimate future revenues of retail companies, distinguishing between revenue growth in sales-generating units and growth in revenue per unit. Their model provides revenue estimates for a sample of 87 firms that are, firstly, more accurate compared to mean reversion models and, secondly, almost as good as analysts' revenue forecasts.
To summarize the previous literature, revenue forecasts seem to be value-relevant for the capital market, particularly for investors, and contractrelevant in terms of management compensation. However, the decision of analysts to issue supplementary revenue forecasts depends on reputation incentives at the analyst level. Furthermore, various determinants seem to influence the accuracy of revenue forecasts published by analysts. Despite these identified limitations in analysts' revenue forecasts, only a few studies attempt to develop alternatives. These alternatives mainly focus on either mean reversion effects or specific industries.

Using predictive analytics for performance measures
Previous research in the field of predictive analytics focuses mainly on predicting earnings rather than revenues. Although these studies indicate that machine learning models can provide accurate earnings predictions (Binz et al.), revenue prediction has received far less attention, even though there are many facts in favor of forecasting revenues rather than earnings. On the one hand, revenues are less susceptible to window dressing, since earnings are influenced by significantly more accounting standards (Lin et al., 2014), which offer firms many implicit and explicit accounting options to manage their earnings. On the other hand, current studies show that the relevance of revenues for investors has significantly increased in recent years, while the relevance of earnings has decreased (Barth et al., 2021; Chandra & Ro, 2008). This could be explained by the increasing mismatch between revenues and costs, which is due to the non-capitalization of intangibles, resulting in a lower quality of reported earnings (Lev, 2018; Srivastava, 2014). Therefore, earnings forecasts alone cannot predict firms' long-term development and should rather be supplemented by revenue forecasts.
The forecasting of performance measures can be considered as part of empirical accounting research. In contrast to descriptive or explanatory studies, forecasting studies analyze whether input variables can be used to predict a particular output variable. Forecasting studies are usually an iterative process, with the aim of finding the best possible forecast model for the given prediction problem (Ding, Lev, Peng, Sun, & Vasarhelyi, 2020). This form of application- and design-oriented research has its origins in information systems and is known as design science research (Hevner, March, Park, & Ram, 2004). The aim of this research is to develop an artifact that is capable of effectively and efficiently solving an existing problem (Peffers, Tuunanen, Rothenberger, & Chatterjee, 2007). In the context of forecasting studies, the development of prediction models can be considered as an IT artifact creation (Gregor & Hevner, 2013). This research process has numerous iterations, as multiple IT artifacts (prediction models) are created and evaluated in order to find the best possible one (Kogan, Mayhew, & Vasarhelyi, 2019). In the context of predictive analytics, there are plenty of forecasting models that are basically applicable and comparable for predicting performance measures. Figure 1 illustrates how revenue forecasts can be categorized as design science research and which environmental factors and foundations must be taken into account. The objective of predictive analytics is to determine relationships between input variables, based on patterns in data sets, that can be used to forecast future developments. For this purpose, machine learning as well as deep learning methods can be considered (Ongsulee, 2017). Machine learning subsumes self-learning algorithms that are capable of learning correlations without the decision rules being programmed explicitly.
The current and potential areas of application of machine learning include:
- Tasks that have previously been handled by a multitude of rules.
- Complex problems that conventional methods can only solve inadequately.
- Tasks that require considerable adaptability.
- Extraction of knowledge from large data sets (Géron, 2019).
Deep learning represents a current further development of machine learning and can be used for similar tasks. With the increasing complexity of data sets, the requirements for reliable prediction increase. Multi-layer neural networks, known as deep learning, are often used for pattern recognition and forecasting of complex correlations (Goodfellow, Bengio, & Courville, 2016). Neural networks are based on the structure of the human brain and consist of several interconnected units, the neurons. These are modelled on the biological arrangement of the human brain and can recognize complex patterns independently (Dongare, Kharde, & Kachare, 2012). Due to their connectionist architecture, especially non-linear and nonmonotonic relationships can be identified. Deep learning has had its major research breakthroughs in recent years in image, text, and speech recognition (Schmidhuber, 2015).
Both machine learning and deep learning have a wide range of applications and can be used for classification or regression tasks. If historical data with high data quality are available, supervised learning can be applied (Ghassami, Khodadadian, & Kiyavash, 2018). By contrast, unsupervised learning is used for new problems for which historical data are not available, or for which the circumstances have changed in such a way that they are no longer comparable (Raschka & Mirjalili, 2019). With regard to the data in accounting systems, it can be noted that they are mostly rule-oriented and structured, including a good history (Borthick & Pennington, 2017; Hopwood, 1972). Due to the sufficient data availability in accounting systems, supervised learning methods are particularly suitable. As part of this learning method, a variety of different input features are provided to the prediction model, so that the model can learn correlations relating to the actual output. Afterwards, the robustness of the model is tested on unseen data, for which the prediction model does not know the actual output. Depending on whether the problem is one of regression or classification, different metrics need to be applied to evaluate robustness. Only if the prediction model performs well on both the training and the test data can it be considered robust.
In general, any performance measure can be forecasted if data with high information quality are available and if these data correlate with the forecasting problem. If the forecast aims to predict only a rise or fall, classification models are suitable. By contrast, regression models can be used for point estimations (Géron, 2019). Many prediction models can be used for both classification and regression problems. For example, neural networks can provide point predictions or be augmented with a softmax function to classify the output into certain classes, based on the probability distribution (Goodfellow et al., 2016).
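The regression/classification duality described above can be illustrated with the softmax function, which maps a network's raw output scores to class probabilities. A minimal sketch; the three classes and the raw scores are invented for illustration:

```python
import numpy as np

def softmax(z):
    """Map raw network output scores to a probability distribution."""
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical raw scores of an output layer for three classes,
# e.g. "revenue falls", "revenue flat", "revenue rises"
raw_scores = np.array([0.2, 1.1, 3.0])

probs = softmax(raw_scores)              # classification: class probabilities
predicted_class = int(np.argmax(probs))  # most likely class (here: index 2)

# For a point estimate (regression), the final neuron's raw output would be
# used directly instead of being passed through the softmax.
```

The same underlying network can thus serve either task; only the output layer changes.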

Sample selection and variable selection
Our sample consists of firms that were listed in an EU-15 country blue-chip index for at least one year between 2010 and 2019. We focus on blue-chip firms, as they typically represent leading companies in their country. They attract the attention of both investors and analysts and therefore should be covered by an adequate number of analysts. The sample period is chosen since it reflects a period of relative economic stability in the EU. Thus, we assume that a firm's business development and corresponding analyst estimates are less affected by economic volatility. Finally, the 15 EU countries included are those that adopted IFRS for the preparation of consolidated financial statements on a mandatory basis at the time of its introduction in the EU in 2005.
Starting with 699 firms, in a first step, cross-listed firms are assigned to their country of domicile. Next, firms with non-December 31 fiscal year ends are excluded. Firms that were first listed on a stock exchange after 2010, or whose listing was discontinued during the sample period, are also omitted because they do not provide data for the entire sample period. Furthermore, we match the remaining firms with the consensus mean revenue forecasts 1 from I/B/E/S. We only consider revenue forecasts submitted by April 30 of the respective fiscal year, in order to establish a temporally comparable information basis between analysts and our models, since the models are based only on data from the previous fiscal year. In the last step, we merge the firms with the corresponding revenue values and additional financial statement information using Refinitiv Eikon. The additional firm-specific variables comprise all Refinitiv Eikon items concerning the consolidated income and cash flow statement, as well as the consolidated balance sheet, shifted back by one year. Excluding firms with missing data, the final sample consists of 300 individual firms providing consensus revenue forecasts and financial statement information for each sample year, resulting in a total of 3,000 firm-year observations. Table 1 reports on the sample selection process in detail and lists the 15 EU countries considered. Besides the data obtained at the firm level, we add several macroeconomic variables at the country and EU level from Eurostat, also shifted back by one year. These variables include, for example, information on the gross domestic product (GDP), unemployment rate, and inflation trend at the country level, or interest rates at the EU level. These macroeconomic variables are used within the study as they measure economic developments in European domestic markets that impact firm-specific revenues but are not directly reflected in firms' fundamentals.
Furthermore, we categorize the firms according to the 12-industry classification scheme by Fama and French (2021), so as to incorporate industry-fixed effects.
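The merging and one-year back-shifting described above can be sketched with pandas. The toy frames and column names below are hypothetical stand-ins for the I/B/E/S, Refinitiv Eikon, and Eurostat items; the actual variable lists are far larger:

```python
import pandas as pd

# Toy stand-ins for the three data sources (all values invented)
ibes = pd.DataFrame({"firm": ["A", "A", "B"], "year": [2018, 2019, 2019],
                     "rev_forecast": [105.0, 118.0, 52.0]})
fundamentals = pd.DataFrame({"firm": ["A", "A", "B"], "year": [2017, 2018, 2018],
                             "revenue": [100.0, 110.0, 50.0],
                             "total_assets": [300.0, 320.0, 90.0]})
macro = pd.DataFrame({"year": [2017, 2018], "gdp_growth": [2.4, 1.9]})

# Shift fundamentals and macro data back by one year: the forecast for
# fiscal year t may only use information from t-1
fundamentals["year"] += 1
macro["year"] += 1

# Inner joins drop firm-years with missing data, as in the sample selection
panel = ibes.merge(fundamentals, on=["firm", "year"]).merge(macro, on="year")
```

Each row of `panel` then pairs a fiscal year's consensus forecast with only prior-year firm and macro information.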

Prediction quality measures
This study includes four quality measures in order to assess and compare the prediction quality of revenue prediction models with the prediction quality of analysts' revenue forecasts. The focus thus lies on prediction accuracy.
Mean absolute percentage error (MAPE): One of the most common measures for evaluating prediction accuracy is the MAPE (Gneiting, 2011; McKenzie, 2011). Using the absolute percentage errors between predicted values and observed values, it is calculated as follows:

MAPE = (1/N) Σ_i |(Rev_i − Rev̂_i) / Rev_i| × 100 (1)

where Rev_i is a firm's actual revenue figure in a given fiscal year and Rev̂_i is the corresponding prediction value.

Median absolute percentage error (MAPẼ): Since the mean in equation (1) is sensitive to outliers, the median of the absolute percentage errors is reported as a complementary measure. With the N absolute percentage errors sorted in ascending order, the median is the ((N+1)/2)th variate when N is odd, and the mean of the (N/2)th and ((N/2)+1)th variates when N is even:

MAPẼ = Mdn_i |(Rev_i − Rev̂_i) / Rev_i| × 100 (2)

Median symmetric accuracy (ζ): Since the measurement of absolute percentage errors in equation (1) and equation (2) is asymmetric with respect to over-forecasting and under-forecasting (Makridakis, 1993; Tofallis, 2015), the median symmetric accuracy in accordance with Morley, Brito, and Welling (2018) is additionally applied:

ζ = 100 (exp(Mdn_i |ln(Rev̂_i / Rev_i)|) − 1) (3)

Coefficient of determination (R²): Finally, R² measures the proportion of the variance in the actual revenues that is explained by the predictions (Legates & McCabe, 1999). Therefore, it is a useful measure for analyzing how well a model fits the observed data. Its values range between 0 and 1, with values close to 1 indicating a better fit:

R² = 1 − Σ_i (Rev_i − Rev̂_i)² / Σ_i (Rev_i − Rev̄)² (4)

where Rev̄ is the mean of the actual revenues. All four prediction quality measures are applied to the analysts' revenue forecasts (Rev_Analyst) obtained from I/B/E/S, and the results are shown in Table 2. In addition, Table 2 lists the descriptive statistics of the actual revenue values (Rev) and Rev_Analyst for the sample period. The latter reveals a mean of €18.6 bn. for Rev (Rev_Analyst = €17.6 bn.) and a median of €6.4 bn. for both the actual and the predicted revenues. Moreover, the lowest value in the actual data is €2.5 mil. (Rev_Analyst = €3 mil.) and the highest €363.2 bn. (Rev_Analyst = €359.7 bn.).
Considering prediction accuracy, MAPE_Analyst is 59.48%, while MAPẼ_Analyst is only 5.21%. The large gap between the mean and the median measure, together with a median symmetric accuracy that deviates only slightly from MAPẼ_Analyst, suggests the presence of outliers in the analysts' revenue forecasts compared to the actual revenues. The final measure, R²_Analyst, is 95.42%, indicating a good fit between forecasts and actual revenues. In the following sections, these computed outcomes are taken as reference values for evaluating the prediction quality of the revenue prediction models included in this study.
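The four quality measures can be computed in a few lines of numpy; this is a minimal sketch of the standard definitions, with function names of our own choosing:

```python
import numpy as np

def mape(actual, pred):
    """Mean absolute percentage error (equation 1), in percent."""
    actual, pred = np.asarray(actual, float), np.asarray(pred, float)
    return np.mean(np.abs((actual - pred) / actual)) * 100

def median_ape(actual, pred):
    """Median absolute percentage error (equation 2), in percent."""
    actual, pred = np.asarray(actual, float), np.asarray(pred, float)
    return np.median(np.abs((actual - pred) / actual)) * 100

def median_symmetric_accuracy(actual, pred):
    """Median symmetric accuracy after Morley et al. (2018), in percent.
    Log ratios penalize over- and under-forecasting symmetrically."""
    actual, pred = np.asarray(actual, float), np.asarray(pred, float)
    return 100 * (np.exp(np.median(np.abs(np.log(pred / actual)))) - 1)

def r_squared(actual, pred):
    """Coefficient of determination: explained share of revenue variance."""
    actual, pred = np.asarray(actual, float), np.asarray(pred, float)
    ss_res = np.sum((actual - pred) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1 - ss_res / ss_tot
```

Note how a forecast that is 10% high and one that is 10% low produce the same symmetric-accuracy contribution, which is exactly the asymmetry correction motivating equation (3).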

Model specifications
The revenue forecast requires an artificial split of the data set to ensure that the prediction models can learn correlations in the data and be tested on previously unseen data. For this purpose, the data are divided into a training and a test data set. Previous studies have shown that the best results are often achieved when 80% of the entire data set is used for training purposes and 20% for testing purposes (Mohanty, Hughes, & Salathé, 2016; Rácz, Bajusz, & Héberger, 2021). However, the appropriate split remains context-dependent, and correlations in the data need to be checked beforehand (Xu & Goodacre, 2018). In our study, we compared k-fold cross-validation with different train/test splits (80/20, 70/30, 60/40, 50/50). The most accurate revenue predictions were obtained by using 80% of the data for training purposes. Since revenues are recalculated each year and are independent of the previous year in terms of their accounting, a rolling forecast was not used in this study. Instead, a random split is used to find patterns in the data that are independent of the time series.
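The random, non-chronological 80/20 split described above can be sketched as follows; the helper name, seed, and the use of the full 3,000-observation count are illustrative:

```python
import numpy as np

def random_split(n_obs, test_share=0.2, seed=0):
    """Randomly assign observation indices to train/test sets, ignoring
    time order, since revenues are accounted for independently each year
    and no rolling forecast is needed."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_obs)            # shuffle all observation indices
    n_test = int(round(n_obs * test_share)) # size of the held-out test set
    return idx[n_test:], idx[:n_test]       # (train indices, test indices)

# 80/20 split of the 3,000 firm-year observations
train_idx, test_idx = random_split(3000, test_share=0.2)
```

Fixing the seed makes the partition reproducible, which matters when many candidate models are compared on the same held-out data.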
Before the input data can be passed to the forecast models, further adjustments are necessary. These adjustments, also known as data preprocessing, aim to optimally prepare the data set for making predictions (García, Luengo, & Herrera, 2015; Kotsiantis, Kanellopoulos, & Pintelas, 2006). The data preprocessing steps are shown in Figure 2, based on relevant practical manuals for machine learning, including Géron (2019) and Chollet (2018). In addition to the general description of the individual steps, Figure 2 also contains the specific actions that we conducted and addresses potential risks.
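The four Figure 2 steps (vectorization, imputation, scaling, feature extraction) can be sketched in numpy. The function, the toy data, and the 0.95 collinearity threshold are our illustrative assumptions; the study's actual implementation may differ in detail:

```python
import numpy as np

def preprocess(X, industry_codes):
    """Sketch of the four preprocessing steps for a numeric feature matrix X
    (rows: firm-years) plus one categorical industry label per row."""
    # Step 1: data vectorization -- one-hot encode the categorical variable
    cats = sorted(set(industry_codes))
    onehot = np.array([[1.0 if c == cat else 0.0 for cat in cats]
                       for c in industry_codes])

    # Step 2: handle missing values -- impute NaNs with the column mean
    col_mean = np.nanmean(X, axis=0)
    X = np.where(np.isnan(X), col_mean, X)

    # Step 3: standardization -- min-max scale each column to [0, 1]
    lo, hi = X.min(axis=0), X.max(axis=0)
    X = (X - lo) / np.where(hi > lo, hi - lo, 1.0)

    # Step 4: feature extraction -- drop one of each highly collinear pair
    # (Pearson correlation above an assumed 0.95 threshold)
    keep = list(range(X.shape[1]))
    corr = np.corrcoef(X, rowvar=False)
    for i in range(X.shape[1]):
        for j in range(i + 1, X.shape[1]):
            if i in keep and j in keep and abs(corr[i, j]) > 0.95:
                keep.remove(j)
    return np.hstack([X[:, keep], onehot])

# Toy matrix: three firm-years, three numeric features, one missing value
X = np.array([[1.0, 2.0, 10.0], [2.0, np.nan, 20.0], [3.0, 6.0, 35.0]])
result = preprocess(X, ["Tech", "Retail", "Tech"])
```

Each risk noted in Figure 2 maps to one step here, e.g. an inappropriate imputation in Step 2 would inject spurious patterns before training.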
After data preprocessing, the question arises as to which forecasting models should be applied. In principle, regression models that provide point estimates are suitable for predicting revenues in the following financial years. In such cases, linear regression is suitable as a baseline model, for generating forecasts based on previous revenues and representing a minimum benchmark for forecasting accuracy. In addition, different machine learning and deep learning models can be used to identify the best possible prediction model. According to Jiang et al. (2020), decision trees, random forests and neural networks are commonly used for supervised regression problems. Therefore, these prediction models are also considered for our forecasting problem. Furthermore, other forecasting models are applied, which have already been successfully used in research. These include lasso (Tibshirani, 1996) and ridge regression (Hoerl & Kennard, 1970), as well as gradient boosting regression (Friedman, 2001), CatBoost (Dorogush, Ershov, & Gulin, 2018), LightGBM (Ke et al., 2017) and XGBoost (Chen & Guestrin, 2016). This leads to ten prediction models, which are compared with each other in this study. Except for linear regression, all prediction models have hyperparameters that can be tuned to improve robustness and prediction accuracy. By using grid search, we aim to identify the best possible configuration for each prediction model within the predefined grid. This approach is consistent with other scientific studies in machine learning but has the disadvantage that a better optimum may exist outside the specified grids (Bergstra & Bengio, 2012). To minimize the likelihood of such circumstances, grids are used that have already been applied in previous studies and have shown good results 2 .
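Grid search itself is model-agnostic: it exhaustively scores every hyperparameter combination within the predefined grid and keeps the best one. A minimal sketch, with a toy scoring function standing in for "train one candidate model and return its validation score" (all names and the toy grid are ours):

```python
import itertools
import numpy as np

def grid_search(fit_score, grid):
    """Exhaustively evaluate every hyperparameter combination in `grid`
    and return the combination with the highest validation score."""
    best_params, best_score = None, -np.inf
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = fit_score(params)  # train + validate one candidate model
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy scoring function whose optimum inside the grid is alpha=0.1, depth=5
def toy_score(p):
    return -(p["alpha"] - 0.1) ** 2 - (p["depth"] - 5) ** 2

best, _ = grid_search(toy_score, {"alpha": [0.01, 0.1, 1.0],
                                  "depth": [3, 5, 8]})
```

As the text notes, the true optimum may lie outside the grid; random or Bayesian search (Bergstra & Bengio, 2012) trades exhaustiveness for broader coverage.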

RESEARCH RESULTS
Identifying the most appropriate model for revenue forecasting is an iterative process that ranges from model selection and optimization to model rejection (Cawley & Talbot, 2010). As mentioned above, this empirical accounting research is part of design science research. The use of machine learning in accounting is still a new area of research, with little evidence on the suitability of predictive models for specific problems (Bertomeu, 2020). In particular, it remains unclear whether one predictive model outperforms the others under all circumstances (Amani & Fadlalla, 2017). By comparing models based on numerous different algorithms on the same training and test data, our study provides insights into their general applicability for revenue forecasting. This comparative approach is further supported by the advantage of IFRS in providing comparable financial information across industries and over time (Yip & Young, 2012).
The results of the selected prediction models regarding the prediction quality measures from subsection 3.2 are presented in Table 3. For comparability, Table 3 also includes the prediction quality measures of the financial analysts.
Starting with a multiple linear regression as a baseline model, this model assumes a linear relationship between the dependent variable and the explanatory variables. It uses the ordinary least squares (OLS) approach to estimate the model parameters. The results show a MAPE (57.51%) and R² (95.26%) comparable to the financial analysts' predictions, but a considerably worse MAPẼ (12.53%) and median symmetric accuracy ζ (12.53%). Therefore, the model appears to produce outlier estimates that deviate considerably from the actual revenues, similar to the financial analysts. In addition, ridge and lasso regressions are performed to enhance the prediction quality of the multiple regression model. As the (untabulated) results for the quality measures even deteriorate, the application of a linear regression model to the training and test data set does not seem to improve prediction quality in comparison to financial analysts.
In the next step, decision-tree-based models are used to predict firms' revenues. We focus on two models, starting with a single decision tree. This model follows a hierarchical structure and starts with an initial root node, from which the tree splits into several branches that terminate in new nodes. Each node represents a subset of the initial data set, and the split follows the underlying algorithm; the applied decision tree model uses the classification and regression trees (CART) algorithm. The terminal nodes at the end of a branch, also called leaves, represent the predicted outcomes. The results in Table 3 document significant improvements for most of the prediction quality measures compared to the linear regression model. MAPE, MAPẼ, and ζ decrease to 17.86%, 9.86%, and 10.25%, respectively. In comparison to the analysts' predictions, the decision tree yields a much smaller MAPE. However, the improved median measures are still worse than those of the analysts. Considering R², the value of 90.7% is still good, but lower than for the linear regression and for the financial analysts.
Based on these mostly improved results, a random forest is performed as an extension of the decision tree model. A random forest differs from a decision tree insofar as it ensembles a large number of individual decision trees. The terminal prediction value then corresponds to the most frequently predicted value in these individual decision trees. Thus, it benefits from this large number of decision trees and their predictions. The random forest model applied improves all prediction quality measures. The mean MAPE is 13.29% and the median measures are 6.69% and 6.85%, indicating that the predictions are affected less by outliers. Furthermore, the mean MAPE is considerably lower than in all other models and, in particular, lower than the mean MAPE of the financial analysts. The R² of 96.19% is also slightly greater than the analysts' R². However, the median measures deviate about 1.5 percentage points from the corresponding analysts' values, but they are the lowest values of all prediction models applied. Considering these results, the random forest model provides competitive revenue forecasts.

The data preprocessing steps underlying all models are as follows:

Step 1: Data vectorization. With correct mathematical coding of categorical variables, machine readability is ensured without reduction of information quality. Implementation in our study: expressing data by binary vectors. Risks: incorrect coding can significantly reduce the information quality of the data set.

Step 2: Handling missing values. Imputation procedures that replace missing values can increase the information quality of data sets. Implementation in our study: checking whether missing values can be supplemented by estimates. Risks: identification of incorrect patterns due to inappropriate imputation procedures.

Step 3: Standardization. The standardization and scaling of input variables allows a homogeneous consideration of all input features, as well as efficient learning of patterns. Implementation in our study: scaling numbers using a MinMaxScaler. Risks: downstream problems in the interpretation of the results.

Step 4: Feature extraction. By using statistical methods for dimensionality reduction and extraction of new input variables, the computation time can be decreased, whereas the information quality of input features can be increased. Implementation in our study: using the Pearson coefficient for multicollinearity and feature-target correlation checks. Risks: reduced information quality of the data set due to excessively high dimension reduction.
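The preprocessing steps (binary vectorization, imputation, MinMax scaling) and the random forest can be combined in a single scikit-learn pipeline. The following is a hedged sketch on fabricated firm data, not the study's implementation; all column names and parameter choices are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

rng = np.random.default_rng(2)
n = 600
df = pd.DataFrame({
    "revenue_last_year": rng.uniform(10, 500, n),   # hypothetical features
    "total_assets": rng.uniform(20, 1000, n),
    "gdp_growth": rng.normal(1.5, 1.0, n),
    "industry": rng.choice(["manufacturing", "finance", "retail"], n),
})
# Inject missing values to exercise the imputation step.
df.loc[rng.choice(n, 30, replace=False), "total_assets"] = np.nan
y = 1.04 * df["revenue_last_year"] + 0.05 * df["total_assets"].fillna(0)

numeric = ["revenue_last_year", "total_assets", "gdp_growth"]
categorical = ["industry"]

preprocess = ColumnTransformer([
    # Handling missing values + standardization: impute, then scale to [0, 1].
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", MinMaxScaler())]), numeric),
    # Data vectorization: binary (one-hot) vectors for categorical variables.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

model = Pipeline([("prep", preprocess),
                  ("rf", RandomForestRegressor(n_estimators=200, random_state=0))])

X_train, X_test, y_train, y_test = df[:480], df[480:], y[:480], y[480:]
model.fit(X_train, y_train)
ape = np.abs((y_test - model.predict(X_test)) / y_test) * 100
print(f"random forest: mean APE {ape.mean():.2f}%, median APE {np.median(ape):.2f}%")
```

Wrapping the preprocessing in the pipeline ensures the same transformations fitted on the training data are applied to the test data, avoiding leakage.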
The next model uses deep learning techniques to forecast firm revenues. As described in Section 3, multi-layer neural networks are inspired by the structure of the human brain, with the network consisting of artificial neurons; they are particularly good at recognizing patterns in big data sets. Using these advantages of neural networks, the results show a mean MAPE of 31.4%, median measures of 12.37% and 12.85%, and an R² of 95.44%. Thus, the mean MAPE is better than the analysts' and the linear regression's mean MAPE, but much worse than in the decision-tree-based models. The median measures, in turn, are at the same level as those of the linear regression and thus worse than the corresponding measures of the decision-tree-based models and the analysts. However, the R² is slightly higher than for the linear regression as well as for the financial analysts, but lower than for the random forest. Derived from these results, the neural network provides better results than the linear regression model, but its predictions are not as accurate as those of the financial analysts and the decision-tree-based models, in particular those of the random forest. For the sake of transparency, the hyperparameters used in the respective predictive analytics models for the revenue prediction are listed in Table A.2 (see Appendix). In addition, Table A.3 includes the feature importance of the models, showing that they use varying, but mostly firm-specific, variables, with last year's revenue being the most important factor in all models.

The last group of prediction models in the analyses uses boosting algorithms, namely gradient boosting, XGBoost, LightGBM, and, as the most recent algorithm, CatBoost. These boosting algorithms have in common that they rely on ensemble learning methods, combining individual models sequentially and incorporating in each iteration the error of the last iteration (Bentéjac, Csörgő, & Martínez-Muñoz, 2021).
Although these boosting algorithms differ in their techniques, they optimize the prediction by learning from mistakes, which, theoretically, should reduce the error between predicted and actual revenue values (Mayr, Binder, Gefeller, & Schmid, 2014). Comparing the median measures in Table 3 with those of the linear regression and the neural network, the boosting models perform better, with XGBoost having the lowest median measures (7.67% and 8.03%) of all boosting models. Regarding the mean MAPE, the results are more heterogeneous, e.g., it is 53.35% in the gradient boosting model, but decreases to 17.17% in the XGBoost model. Considering the final prediction quality measure, R², the results in Table 3 indicate a good fit of all boosting algorithm models, with values ranging from 95.42% (XGBoost) to 96.76% (LightGBM). Thus, although the R² is higher than the analysts' R², all other prediction quality measures are lower than those of the analysts. Furthermore, the boosting models perform partly better than the other models, but their results cannot compete with the prediction quality of the random forest.
Therefore, and across all prediction models used to forecast firm revenues, the random forest performs best with regard to the four prediction quality measures. Furthermore, it is partially superior to the financial analysts; in particular, its mean MAPE is considerably lower. However, the other prediction models also yield prediction quality measures that are at or above the level of the financial analysts. With regard to our first research question, we conclude that the application of predictive analytics methods provides revenue predictions for a large number of firms that are as accurate as, or even more accurate than, the financial analysts' predictions.
Moreover, Table A.4 provides supplementary analyses that examine the prediction power of the models at the industry level. The results show comparable or better prediction quality measures than the analysts for the majority of industry sectors, e.g., for finance & insurance companies. Except for a few industry sectors, e.g., manufacturing, the findings confirm those found for the complete data set and therefore demonstrate the industry-independent applicability of machine learning models. In addition, Table A.5 shows that robust forecasts can also be provided over annual time periods. For each year from 2010 to 2019, the random forest provides comparable or even better forecasts than financial analysts. Even with changes in accounting standards (mandatory application of IFRS 15 in 2018), it still provides reliable forecasts that are only slightly less accurate than those of financial analysts. Regarding our second research question, we thus conclude that predictive analytics can be successfully applied across different industries and over a longer period of time.
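Such industry-level or year-level comparisons amount to grouping the absolute percentage errors and aggregating per group. A minimal pandas sketch with fabricated forecasts (the column names and error profiles are purely illustrative assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 300
eval_df = pd.DataFrame({
    "industry": rng.choice(["manufacturing", "finance & insurance", "retail"], n),
    "year": rng.integers(2010, 2020, n),
    "actual": rng.uniform(50, 500, n),
})
# Hypothetical model and analyst forecasts with different error dispersion.
eval_df["model_pred"] = eval_df["actual"] * rng.normal(1.0, 0.10, n)
eval_df["analyst_pred"] = eval_df["actual"] * rng.normal(1.0, 0.15, n)

def ape(actual, pred):
    """Absolute percentage error, in percent."""
    return (pred - actual).abs() / actual * 100

eval_df["model_ape"] = ape(eval_df["actual"], eval_df["model_pred"])
eval_df["analyst_ape"] = ape(eval_df["actual"], eval_df["analyst_pred"])

# Mean and median APE per industry; grouping by "year" instead yields
# the analogous per-year comparison.
by_industry = eval_df.groupby("industry")[["model_ape", "analyst_ape"]].agg(
    ["mean", "median"]).round(2)
print(by_industry)
```

Reading such a table side by side per group makes it immediately visible in which industries (or years) the model beats the analyst benchmark and where it falls behind.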
Our results are supported by previous studies that also use accounting data for different forecasting purposes and in which the random forest likewise outperforms other algorithms. These studies deal, among other things, with cash flow forecasting.

DISCUSSION OF THE RESULTS
Several implications for financial analysts, investors, and researchers emerge from these results. As previous studies point out the value relevance of revenue forecasts for the capital market (Keung, 2010; Rees & Sivaramakrishnan, 2007), investors demand reliable revenue forecasts to review and make their investment decisions (Palepu et al., 2016). For this purpose, they usually rely on financial analysts' predictions, but, among other things, the number of revenue forecasts is limited by reputational concerns of the analyst (Ertimur et al., 2011; He & Lu, 2018), and the forecasts may be biased by individual analyst characteristics (Lorenz & Homburg, 2018). Furthermore, the forecasting process of financial analysts is, due to its black-box character, to some extent opaque (Bradshaw, 2011; Ramnath et al., 2008). The results show that machine learning is useful for overcoming these constraints. As the employed models incorporate only publicly available information, they increase the transparency and objectivity of the forecasting process. This is accompanied by greater independence of the forecasting process, since the models do not rely on access to private information from management. Notably, the absence of private information does not impair prediction quality, in particular in the random forest forecast. Thus, this study demonstrates that machine learning is able to compensate for this advantage of financial analysts. Separating the forecasting process from the analyst's characteristics, such as forecasting experience, further enhances independence. The study also shows that machine learning embodies the ability to increase the number of predictions, as well as to provide investors with accurate predictions for a large number of firms across all industries at the same time. Therefore, investors benefit from predictive analytics methods and machine learning in terms of improved transparency, objectivity, and quantity of revenue forecasts.
They are a time-saving alternative to traditional financial analysts and are helpful for reducing information asymmetry between firms and potential or current investors, thus promoting capital market efficiency.
Besides investors, financial analysts also gain from the findings in several ways. Since this study demonstrates that machine learning predicts revenues at a level comparable to financial analysts, analysts can use machine learning forecasts as baseline predictions. They can then either use these as a starting point for their own prediction process or compare their forecasted values with those of machine learning to validate or revise the initial prediction.
Ideally, they incorporate different predictive analytics methods into their prediction model to benefit from their predictive power. Moreover, it can be expected that the prediction models achieve performance improvements due to the additional private information available to the analysts. Disclosing the predictive analytics methods used within the analyst report further increases the transparency of the analyst's estimation process. Similar to investors, financial analysts may also benefit from time savings by using machine learning, as it can reduce the preparation time of their forecasts. This would additionally enable them to increase their coverage. Both improvements in prediction quality and an increased number of covered firms represent potential factors that contribute positively to an analyst's reputation and credibility, which eventually improves their career prospects. However, analysts need to develop or improve their skill set to take advantage of these benefits. These skills include understanding predictive analytics methods and models, interpreting and validating the obtained results, and, as crucial points, programming the models and managing the database. Furthermore, it is important to keep an eye on current developments at both the hardware and software levels, which can positively impact the prediction process.
Moreover, our results show that predictive analytics is an essential part of empirical accounting research. The application of machine learning algorithms to accounting datasets enables researchers to verify and disprove assumed correlations, as well as to discover new ones. In contrast to descriptive or explanatory accounting studies, forecasting studies focus on future developments (Bertomeu, 2020; Shmueli, 2010). Forecasting revenues can be interpreted as a starting point for exploring more complex issues in accounting with modern algorithms. Earnings, for example, are influenced by many more factors and offer significantly more scope for accounting policy, making them much more challenging to predict (Lev, Li, & Sougiannis, 2010). Since revenues are a more granular figure than earnings, revenue forecasts can in turn serve as an input feature for a prediction model that forecasts companies' annual results. Furthermore, our results can be used to verify whether the random forest also provides the best predictions for comparable research questions. However, the number of available algorithms is continuously increasing, so a new algorithm may disprove the superiority of a given prediction model within a relatively short time. Therefore, future research should include novel algorithms in prediction studies as they become available.

CONCLUSION
This study examines the potential of recent developments in the field of predictive analytics and machine learning to improve annual revenue forecasts. Based only on publicly available firm-specific and macroeconomic data, the comparative analysis of various prediction models shows that the applied models simultaneously provide comparable or even more accurate forecasts than those of sell-side analysts. In particular, the random forest performs best of the 10 models included in the analysis, resulting in a much lower mean MAPE of 13.29% compared to the analysts' mean MAPE of 59.48%. These results indicate the predictive power of machine learning, and several implications can be derived for the work of analysts and researchers, as well as for the decision-making process of investors. Analysts are encouraged to take advantage of this power to improve their prediction processes, leading to more and better predictions. Investors, therefore, benefit from more transparent forecasts that are less biased by analyst characteristics, e.g., forecasting experience or access to a firm's management, which facilitate their investment decisions and reduce information asymmetry. For both analysts and investors, this results in time and cost savings and may increase capital market efficiency. Moreover, this study points out the positive effects that predictive analytics may have on empirical accounting research. In addition to the prediction of future revenues, these effects mainly include new approaches and methods for processing and analyzing large datasets in order to identify patterns in accounting data. This allows researchers to examine new research questions, as well as to readdress existing research, e.g., earnings management.
However, this comparative analysis is subject to some limitations. First, the underlying sample consists only of constituents of the blue-chip indices from 15 EU first-time adopter countries of IFRS. Therefore, it does not cover the full variability of firms within these countries or the EU, which impedes the generalization of results and the drawing of inferences. Second, the study covers the years 2010 to 2019, an economically stable sample period, and thus does not yield conclusions on the performance of predictive analytics models in times of challenging economic events, e.g., financial crises. Third, although the analysis considers a broad range of different predictive analytics models and machine learning techniques, new algorithms may already exist or are being developed that generate even more accurate predictions. Furthermore, it cannot be excluded that better results could be achieved with other hyperparameters beyond our specified grids. Consequently, future research should focus on expanding the underlying dataset in terms of firms, countries, and time periods, as well as incorporating recent developments in the field of predictive analytics and machine learning.
In summary, this comparative prediction study demonstrates that predictive analytics and machine learning provide powerful models for predicting future revenues and indicates their benefits for various users of accounting data. Furthermore, these are expected to increase over time, due to anticipated improvements and developments at the software and hardware levels.