Volume 31, Issue 3 pp. 658-680
Original Article

How Do Auditors Address Control Deficiencies that Bias Accounting Estimates?

Elaine G. Mauldin

University of Missouri

Christopher J. Wolfe

Texas A&M University

First published: 07 August 2013
Accepted by Alan Webb. We sincerely thank the participating firm for providing participants and expert insights into the audit process. We are grateful for the comments of Alan Webb and the two anonymous reviewers. We also thank Vairam Arunachalam, Michelle Diaz, Dave Farber, Erin Hamilton, Scott Jackson, Steve Kaplan, Brian Fitzgerald, James Hunton, Tom Omer, and Texas A&M workshop participants. Christopher Wolfe gratefully acknowledges the Mays Business School for providing financial support while completing this research.

Abstract

Auditors commonly rely on reviewing management's estimation process to audit accounting estimates. When control deficiencies bias the estimation process by creating omissions of critical inputs, standards require that auditors replace or supplement review of management's estimation process with tests that can identify the omissions. Importantly, overreliance on reviewing management's estimation process when it has been biased by a control deficiency can result in auditor acceptance of an inappropriate accounting estimate. We use an experiment to examine whether auditors recognize the insufficiency of increased sampling of a biased estimation process and how they select alternative tests to replace or supplement review of the biased estimation process. We find that a significant minority (33 percent) of Big 4 senior auditors erroneously increase tests of management's biased estimation process. We also find that auditors have difficulty selecting alternative tests to replace or supplement review of management's biased estimation process, frequently choosing tests that are either ineffective or inefficient. Our findings suggest that auditors often reach inappropriate judgments about the capability of audit evidence to address control deficiencies and that nonsampling risk (judgment risk) may be a larger risk than auditors realize.

1 Introduction

The well-established relation between internal control deficiencies and audit evidence appears straightforward. When auditors identify significant internal control deficiencies, they modify substantive tests to address the increased risk of material misstatement (PCAOB 2007, ¶B6; IAASB 2008a, ¶A46; AICPA 2006b, ¶121; AICPA 2006c, ¶¶70–74). Yet, audit inspection reports indicate that auditors often fail to appropriately modify substantive tests when ineffective controls are discovered (PCAOB 2008), and audit partners we interviewed agree that auditors often have difficulty modifying substantive tests when responding to identified control deficiencies. To shed light on the underlying reasons for this difficulty, we design a contextually rich experimental case and examine how auditors map a control deficiency into modifications of substantive tests.

Audits of accounting estimates provide the context for our study. Accounting estimates comprise much of the quantitative information in financial statements and represent an important component of auditor judgment and decision making (Griffith, Hammersley, and Kadous 2013; Peecher, Solomon, and Trotman 2010). In addition, audit inspections frequently attribute audit errors in accounting estimates to overreliance on incomplete or inaccurate management processes, suggesting a practical context where auditors fail to appropriately modify substantive tests for control deficiencies (PCAOB 2010b, 2008). We examine control deficiencies that cause errors of omission in an estimation process, resulting in an incomplete and biased estimation process. Our focus is on whether auditors recognize the insufficiency of reviewing the biased estimation process and how they select alternative tests to replace or supplement such review.

We first analyze auditor judgments about the insufficiency of increased review of management's biased estimation process (increasing sample size) following a control deficiency. When control deficiencies cause omissions in management's estimation process, standards and research indicate that testing within the process cannot identify omissions from the process (AICPA 1980, ¶17; Bell, Peecher, and Solomon 2005, 28; Griffith et al. 2013). Since required internal control deficiency documentation should identify the bias in the estimation process and auditors have well-developed knowledge structures for internal control errors, we expect that most auditors will recognize that increasing sample size is insufficient (PCAOB 2007; Zimbelman 1997). However, psychology research finds that over 20 percent of people select biased evidence sources even after they are told of the bias (Soll 1999). Assuming that auditors have evidence beliefs in keeping with the general population, we expect that a significant minority of auditors make a similar mistake and judge that increasing sample size is sufficient.

We also expect that seeing substantive test results from reviews of the biased estimation process influences auditors' tendency to mistakenly increase such review. Standards allow “dual-purpose” tests that combine substantive tests and tests of control (IAASB 2008a, ¶22; AICPA 2006c, ¶33). As such, the potential exists for auditors to see substantive test results before addressing the control deficiency. Substantive tests from reviewing the biased estimation process generate falsely favorable results, because the tests will agree to the flawed estimate produced, but neither the process nor the test includes the omissions. Ex ante, it is unclear how auditors will react to seeing falsely favorable substantive test results that come from reviewing the biased estimation process. Competing theories suggest that auditors may either be misled because the test results are representative of a properly functioning process or be more aware of the bias because the falsely favorable results confirm the bias (Hackenbrack 1992; Hoffman and Patton 1997; Glover 1997; Wegener and Petty 1995, 1997).

We test our expectations using a case-based experiment with 81 Big 4 senior auditors. The experimental context is revenue recognition for long-term sales contracts determined by the percentage-of-completion method. Seeded control deficiencies, which systematically increase the likelihood of overstating current revenue, stem from omissions to management's estimation process (AICPA 1981). Prior to identifying the control deficiencies in the estimation process, planned substantive tests involve reviewing management's biased estimation process. As predicted, we find a significant minority (33 percent) of auditors judge that increasing sample size sufficiently addresses the control deficiency. We also find that seeing the falsely favorable substantive test results, on average, does not influence auditors' tendency to increase sample size. All reported findings are robust across auditor experience classifications. In addition, a supplemental sample of 14 managers produces a pattern of responses similar to our main results.

Next, we analyze how auditors select alternative tests to replace or supplement reviewing management's incomplete and biased estimation process. Standards and research indicate that auditors must select a test based on independent evidence, outside the biased estimation process, that is capable of identifying the omission (AICPA 1980, ¶17; Bell et al. 2005, 28). Further, the type of independent evidence needed to identify the omission depends on the omission's source. We analyze omissions from two different sources that cause bias in the estimation process: omitted data from externally prepared documents held by the client and omitted management judgment inputs.

If the omission in management's estimation process stems from data found on externally prepared documents held by the client, control deficiency documentation should identify the specific documents involved in the omission (PCAOB 2004). Identifying these documents should illustrate the effectiveness of using them to adjust the estimation process, thereby offering the auditor both a means of identifying the bias and a mental model of how to audit the biased process. Transfer learning theory suggests that this mental model will make auditors realize that generating an independent estimate would also effectively identify the bias (Cree and Macaulay 2000; Ellis 1965; Haskell 2001). Faced with two effective alternatives, auditors should choose using documents to adjust the estimate because it is more efficient than developing an auditor-generated estimate. However, following Payne, Bettman, and Johnson (1993), we assume that auditors act based on their individual choice preferences when trading off two effective solutions. As such, we expect that auditors select equally between using externally prepared documents to adjust the estimate and developing an auditor-generated estimate.

If the omission stems from management's judgment inputs to the estimate, developing an auditor-generated estimate provides the most effective independent evidence, because externally prepared documents are unlikely to contain the judgment evidence needed to adjust the estimate (IAASB 2008b, ¶¶A87, A91, A124–25). Here, control deficiency documentation focuses on the judgment omission, as opposed to an evidence source that could identify the omission, because deficiency documentation is not required to include alternative test strategies (PCAOB 2004). Without a specific document, auditors must create a mental model of how to audit the biased estimation process on their own, a task that imposes high cognitive demand. This in turn makes them susceptible to using heuristics (Kool, McGuire, Rosen, and Botvinick 2010). Given that the most common audit tests involve examining documents held by the client, we expect that the availability heuristic leads auditors to make the less effective choice of using documents to adjust the estimate, instead of the more effective choice of developing an auditor-generated estimate (Blay 2005; Tversky and Kahneman 1974).

We test these expectations using the previously discussed sample and experimental case. As predicted, when the bias is from externally prepared documents, we find that about one-half the auditors (54 percent) choose the more efficient alternative test, adjusting the estimate using documents. When the bias is from management judgment inputs, we find that most auditors (63 percent) choose to adjust the estimate using documents, even though this alternative is less effective than developing an auditor-generated estimate. Together, the results suggest that auditors often make inefficient or ineffective alternative test choices depending on the source of omission caused by the control deficiency.

Our study contributes to both practice and research. For practice, we provide evidence about the relation between control deficiencies and substantive tests in the integrated audit. Prior to SOX, auditors rarely relied on controls, potentially causing gaps in their ability to map internal control deficiencies into substantive test modifications (Allen, Hermanson, Kozloski, and Ramsay 2006; O'Keefe, Simunic, and Stein 1994; Waller 1993). Anecdotally, regulators have expressed concerns about such gaps. We provide theory-consistent empirical evidence that auditors often reach questionable, optimistic judgments about the capability of audit evidence to address control deficiencies. For research, we extend Griffith et al.'s (2013) field study data by providing experimental evidence about how overrelying on management's estimation process can occur when testing accounting estimates. In addition, we provide new empirical evidence that nonsampling risk (judgment risk) may be a larger risk than auditors realize, both confirming theory (Peecher, Schwartz, and Solomon 2007; Bell et al. 2005) and building on Budescu, Peecher, and Solomon's (2012) simulation results.

The next section develops hypotheses. Section 3 describes the experimental methods and section 4 presents results. Section 5 discusses the study's implications and limitations.

2 Background and hypotheses

Control deficiencies increase the risk of material misstatement, and auditors must modify substantive tests to offset this risk (PCAOB 2007, ¶B6; IAASB 2008a, ¶A46; AICPA 2006b, ¶121; AICPA 2006c, ¶¶70–74). Modifications involve increasing the sample size of planned tests or selecting alternative test strategies. As shown in Figure 1, when auditors plan on auditing accounting estimates by reviewing management's estimation process and subsequently identify a control deficiency, substantive test modifications depend on the control deficiency's source. The standards-based decision tree in Figure 1 indicates that increasing sample size is only sufficient if the estimation process can identify the errors caused by the control deficiency. If the estimation process cannot identify these errors, then auditors must select a test based on independent evidence, outside the estimation process, that is capable of identifying the errors (AICPA 1980, ¶17; Bell et al. 2005, 28). Figure 1 prescribes the two decision points when errors of omission bias management's estimation process: the first being the sufficiency of increasing sample size and the second being the selection of alternative tests to replace reviewing the biased estimation process.

Figure 1. Modifying accounting estimate substantive tests to address identified control deficiencies
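
The decision logic that the text attributes to Figure 1 can be sketched in code. The following is an illustrative reading of the two decision points described above, not a reproduction of the figure; the names `ErrorType`, `OmissionSource`, and `modify_substantive_tests` are hypothetical labels introduced for this sketch.

```python
from enum import Enum


class ErrorType(Enum):
    MECHANICAL = "mechanical error inside the estimation process"
    OMISSION = "input omitted from the estimation process"


class OmissionSource(Enum):
    EXTERNAL_DOCUMENTS = "data on externally prepared documents held by the client"
    MANAGEMENT_JUDGMENT = "management judgment input"


def modify_substantive_tests(error_type, omission_source=None):
    """Return the effective test modifications, most efficient first."""
    # Decision point 1: can the estimation process itself reveal the error?
    if error_type is ErrorType.MECHANICAL:
        return ["increase sample size of planned review"]
    if omission_source is None:
        raise ValueError("an omission needs a source to select an alternative test")
    # Decision point 2: which independent evidence can identify the omission?
    if omission_source is OmissionSource.EXTERNAL_DOCUMENTS:
        # Both alternatives are effective; using documents is the cheaper one.
        return ["adjust estimate using external documents",
                "develop auditor-generated estimate"]
    # A judgment omission does not appear on any externally prepared document.
    return ["develop auditor-generated estimate"]


print(modify_substantive_tests(ErrorType.OMISSION, OmissionSource.MANAGEMENT_JUDGMENT))
```

Under this reading, increasing sample size never appears in the returned list once the error is an omission, which is the point the hypotheses below build on.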

Contract revenue recognition provides an example illustrating the decisions in Figure 1. Management's estimation process for contract revenue recognition involves estimating the contract percentage of completion based on aggregating contract costs to date and estimating future costs (AICPA 1981). Reviewing management's estimation process for contract percentage of completion represents the most common substantive test for this estimate (Larson and Brown 2004; Griffith et al. 2013). If a control deficiency causes only mechanical errors in the calculation of contract percentage of completion, an increased sample of management's estimation process, tested with the correct calculation, can identify errors in the estimate. Conversely, a control deficiency that causes omissions of future costs in the calculation of contract percentage of completion cannot be identified by reviewing management's estimation process because the process only includes the costs that management has included in the estimate. Thus, increased review of the biased estimation process is insufficient. Instead, auditors need to find an evidence source outside management's estimation process that will identify the particular omission.
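
A small simulation can make the sampling argument concrete. The sketch below uses invented contract figures (none come from the experimental case) and assumes a cost-to-cost measure of completion. `review_process` stands in for reviewing management's estimation process, and `independent_test` stands in for any test that draws on evidence outside that process.

```python
import random


def percent_complete(cost_to_date, cost_to_complete):
    return cost_to_date / (cost_to_date + cost_to_complete)


def make_contract(rng):
    """Build one hypothetical contract whose report omits some future costs."""
    cost_to_date = rng.uniform(200, 600)
    reported_future = rng.uniform(100, 300)  # costs management included
    omitted_future = rng.uniform(50, 150)    # costs left out of the process
    return {
        "cost_to_date": cost_to_date,
        "reported_future": reported_future,
        "omitted_future": omitted_future,
        # The recorded estimate reflects only the reported costs.
        "recorded_pct": percent_complete(cost_to_date, reported_future),
    }


def review_process(contract):
    """Recompute from management's own inputs; True means an exception."""
    recomputed = percent_complete(contract["cost_to_date"],
                                  contract["reported_future"])
    return abs(recomputed - contract["recorded_pct"]) > 1e-9


def independent_test(contract):
    """Recompute with evidence from outside the process; True means an exception."""
    full_future = contract["reported_future"] + contract["omitted_future"]
    recomputed = percent_complete(contract["cost_to_date"], full_future)
    return abs(recomputed - contract["recorded_pct"]) > 1e-9


rng = random.Random(0)
population = [make_contract(rng) for _ in range(200)]

for sample_size in (10, 50, 200):
    sample = rng.sample(population, sample_size)
    found_by_review = sum(review_process(c) for c in sample)
    found_by_independent = sum(independent_test(c) for c in sample)
    print(sample_size, found_by_review, found_by_independent)
```

Even when the sample covers the whole population, the within-process review reports no exceptions, because it recomputes the estimate from the same incomplete inputs that produced it.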

Increased sampling of a biased estimation process

We expect that most auditors understand bias caused by a control deficiency. Audit standards require documentation of each identified control deficiency to support the internal control audit opinion (PCAOB 2007). This documentation identifies the nature of omissions in the estimation process, creating awareness that the process is in fact biased. Auditors also frequently encounter errors created by control deficiencies, so their knowledge structures for such errors are well developed (Zimbelman 1997). Finally, higher order strategic reasoning is not necessary to evaluate bias caused by control deficiencies, as management does not attempt to hide such bias (Wilks and Zimbelman 2004). Because we expect that most auditors understand bias caused by control deficiencies, we expect that most auditors recognize the insufficiency of increased sampling of a biased estimation process.

However, we also expect a significant minority of auditors will consider increased sampling of a biased process to be sufficient. Across several experiments, Soll (1999) examines how people evaluate biased evidence in tasks involving military intelligence, blood tests, and weight-measuring scales. He finds that some people systematically select and rely on biased evidence. In one of the experiments, Soll (1999) first defines the concept of biased evidence for participants and identifies the amount of bias in the experimental task. In this setting, Soll (1999) finds that just over 20 percent of participants still increase testing of a biased evidence source. When asked why, participants offer logic that precludes replacing the biased evidence, such as the belief that consistent use of an evidence source holds constant confounding attributes and therefore is a superior strategy regardless of the type of error in the evidence. Soll (1999) concludes that some people have systematic discrepancies between their beliefs about evidence and the normative use of a biased evidence source.

We expect the systematic discrepancies that Soll (1999) observes in the general population also occur in the auditor population, because auditors are not routinely trained on the normative use of a biased evidence source. As such, we expect that a significant minority of auditors systematically increase review of the biased estimation process. Accordingly, we hypothesize:

Hypothesis 1. A significant minority of auditors will judge that increased sampling of a biased estimation process is a sufficient response to the control deficiency that caused the bias.

If auditors combine tests of control and substantive tests, as allowed by standards, then the control deficiency can be discovered after auditors have begun substantive tests (IAASB 2008a, ¶22; AICPA 2006c, ¶33). Although we expect that Hypothesis 1 applies regardless of the presence or absence of test results, we also expect that seeing test results will influence auditor judgment. In the presence of test results, auditors possess two pieces of information when they determine the sufficiency of increased sampling of the biased process to address the control deficiency: (1) substantive test results from the review of management's estimation process, and (2) internal control documentation indicating that a deficiency has biased the estimation process. When the control deficiency is an omission from the estimation process, substantive test results will agree to the recorded accounting estimate, but neither the substantive test results nor the accounting estimate will include the omitted data. Such substantive test results are falsely favorable, and theory supports two competing predictions regarding auditor behavior upon seeing these substantive test results.

First, the falsely favorable substantive test results are representative, but not diagnostic, of an accurate accounting estimate. Prior research on fraud risk judgments indicates that auditors have difficulty ignoring representative, nondiagnostic information (Hackenbrack 1992; Hoffman and Patton 1997; Glover 1997). Further, Budescu et al.'s (2012) simulation results suggest that audit risk can increase with increased sampling of biased evidence, because the representativeness heuristic causes the biased evidence to distort auditor expectations. If auditors use the representativeness heuristic, reviewing the falsely favorable substantive test results creates the illusion of an effective estimation process, making them more likely to judge that increased sampling of the estimation process is sufficient (Bell et al. 2005; Budescu et al. 2012; Griffith et al. 2013):

Hypothesis 2a. In the presence, as compared to the absence, of substantive test results from a biased estimation process, auditors are more likely to judge that increased sampling of the biased estimation process is a sufficient response to the control deficiency that caused the bias.

In contrast to the representativeness heuristic, seeing substantive test results, with knowledge of the bias caused by the control deficiency, could lead auditors to engage in more reflective thought about the results. The flexible correction model postulates that, when people believe bias exists, they correct their judgment for the bias if motivated and able (Wegener and Petty 1995, 1997). Following this model, auditors would acquire an understanding of the biased estimation process from the control deficiency documentation and then attempt to eliminate the bias from the substantive test results (Wegener and Petty 1995, 1997). The reflective thought required to remove the bias should make auditors aware that increased sampling of the biased process is insufficient. If auditors behave consistently with the flexible correction model, seeing the substantive test results and recognizing that they are falsely favorable could reinforce the need not to rely on tests reviewing management's estimation process. As a result, auditors would be less likely to judge that increased sampling of the estimation process is sufficient:

Hypothesis 2b. In the presence, as compared to the absence, of substantive test results from a biased estimation process, auditors are less likely to judge that increased sampling of the biased estimation process is a sufficient response to the control deficiency that caused the bias.

Selecting alternative tests

When control deficiencies cause omissions in management's estimation process, biasing the accounting estimate, auditors should acquire evidence independent of the flawed estimation process to identify the bias (PCAOB 2007, ¶B6; IAASB 2008b, ¶¶A87, A124–25). As shown in Figure 1, such evidence can take one of two forms: (1) documents held by the client but created by parties external to the client, or (2) an independently developed estimate generated by the auditor (IAASB 2008b, ¶A91). If the source of bias stems from omissions found on externally prepared documents, either using these documents to adjust the estimate or an auditor-generated estimate can effectively identify the bias (IAASB 2008b, ¶¶A87, A91). Alternatively, if the source of bias stems from management judgment inputs, a setting in which externally prepared documents are incapable of identifying the bias, only an auditor-generated estimate can effectively identify the bias (IAASB 2008b, ¶¶A87, A91, A124–25; AICPA 1980, ¶17).

When the source of a control deficiency stems from omission of data found on externally prepared documents, the deficiency documentation should describe the specific documents involved, the evidence examined and conclusions reached (PCAOB 2004). Consequently, the control deficiency documentation identifies the documents needed to adjust the estimate and thereby detect the bias in management's estimate. With access to documentation that identifies both the bias in the estimation process and the externally prepared documents at their source, we expect that auditors readily understand that they could adjust the estimate using the documents. Identifying an effective test helps auditors create a mental model of how to audit the biased process, facilitating transfer learning (Cree and Macaulay 2000; Ellis 1965; Haskell 2001). We expect that knowledge about the effectiveness of using externally prepared documents to detect the bias in the estimation process transfers to evaluating the effectiveness of developing an independent estimate to detect the bias. In sum, auditors are likely to recognize that using the documents to adjust the estimate is effective and then transfer such recognition to conclude that independently developing their own estimate is also effective.

Using documents to adjust the estimate should be more efficient than independently developing an estimate. While independently developing an estimate may be more effective, because the auditor's estimation process is completely outside of management's, any added assurance from an independent estimate is small and unlikely to justify its cost when client-held documents can effectively detect the bias in management's estimate. When the source of a control deficiency stems from the omission of data found on externally prepared documents, using these documents to adjust the estimate is effective and the most efficient means of auditing the biased estimate. Regardless, we do not expect that auditors will necessarily select the most efficient means of detecting the bias in management's estimate. Instead, following Payne et al. (1993), we expect that auditors will make an efficiency/effectiveness trade-off by individually weighting efficiency and effectiveness according to their individual preference. Assuming that auditor preference across this trade-off is similarly distributed, we expect that auditors equally select between using documents to adjust the estimate and independently developing an estimate:

Hypothesis 3a. Auditors will choose equally between using documents to adjust a biased client estimate and developing an auditor-generated estimate when modifying substantive tests to address control deficiencies stemming from omitted data found on externally prepared documents.

Control deficiencies can also stem from omissions of a critical management judgment to the estimation process. In this case, independently developing an estimate is a more effective substantive test modification than using documents to adjust the estimate, because an externally prepared document will not contain the omitted judgment. Moreover, control deficiency documentation is unlikely to identify the need for an auditor-generated estimate, because deficiency documentation does not identify the tests needed to address the deficiency (PCAOB 2004). As such, auditors must rely on their cognitive resources, including long-term memory, to determine the appropriate procedure for detecting the bias in the estimation process.

Reliance on cognition and long-term memory places demands on auditors' cognitive resources. Human avoidance of cognitive demand has been observed and theorized in psychology since at least the 1920s (Kool et al. 2010). Such avoidance is a natural, automatic response that promotes simplified decision strategies through the use of decision heuristics (Gigerenzer and Goldstein 1996; Payne et al. 1993; Simon 1955). We expect that auditors revert to the availability heuristic when determining tests to detect bias from omissions of management judgment (Kool et al. 2010; Tversky and Kahneman 1974). The availability heuristic postulates that more frequently experienced items are easier to retrieve from memory and therefore appear to be the most valid response to a decision (Ranzilla, Chevalier, Hermann, Glover, and Prawitt 2011; Green 2008; Libby and Frederick 1990; Butt 1988). Because the majority of auditor testing experience involves using externally prepared documents, tests involving externally prepared documents are likely to be highly available in auditor memory (Blay 2005; Libby 1985; Butt 1988). As such, when bias in the estimation process stems from omissions of management judgment, we expect that most auditors will select using documents to adjust the estimate instead of choosing to develop an auditor-generated estimate:

Hypothesis 3b. A majority of auditors will choose using documents to adjust a biased client estimate versus developing an auditor-generated estimate when modifying substantive tests to address control deficiencies stemming from omitted management judgment inputs.

The implications of Hypothesis 3b are troublesome. Specifically, the predicted number of auditors choosing to adjust the estimation process with externally prepared documents is worrisome, because the diagnostic value of doing this is very low when the bias stems from management judgment inputs.

3 Experimental method

Participants

Eighty-seven auditors attending one Big 4 firm's national training for experienced audit seniors participated in the study. Though the audit of estimates is complex, seniors execute nearly every step in the audit process, including selecting an audit approach (Griffith et al. 2013). Further, the audit partner who reviewed the experimental materials indicated that senior auditors should be able to respond correctly to the experiment's biased accounting estimates.

Experimental task and manipulations

We employ a between-participants experimental design with two treatments: (1) substantive test results based on reviewing management's incomplete and biased estimation process (absent or present), and (2) the source of omissions that create bias (externally prepared documents or management judgments). We describe the treatments in sequence within the experimental task. We randomly assign participants to experimental treatments and ask them to complete a case study.

All participants receive the same background materials, which describe the audit client as a publicly traded company that designs, manufactures, and installs automated production systems. Sales contracts for the production systems routinely take a year or more to complete, and revenue for uncompleted contracts is recognized via the percentage-of-completion method. The materials offer a detailed explanation of the process for estimating and recognizing revenue. Finally, the background materials describe interim audit judgments and initial risk assessments, interim control tests with no deficiencies noted, and planned substantive tests for contract revenue.

After the background, the case describes a significant deficiency in revenue recognition for uncompleted contracts identified during year-end updates to control tests. In all treatments, the deficiency involves the omission of a critical input to contract cost-to-complete estimates used in recognized revenue calculations. Cost-to-complete estimates project future contract costs, including materials, subcontractor costs, and production efficiencies/inefficiencies. Engineering prepares cost-to-complete estimates, corroborates them with subcontractors and field supervisors, and documents the estimates in cost analysis reports. The deficiency in cost-to-complete estimates occurs when the lead engineer responsible for the estimates abruptly departs in the fourth quarter. His inexperienced replacement fails to perform required procedures for cost-to-complete estimates, resulting in consistently low estimates. In all treatments, the case summarizes the significant deficiency conclusion as follows:

Systematically understating estimated cost to complete results in systematically overstating revenue recognized for contracts in progress. Based on magnitude and likelihood of financial misstatement, the audit team assessed an internal control significant deficiency.
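
The direction of the bias stated in this conclusion follows from the percentage-of-completion arithmetic. As a sketch under assumed notation (the symbols are ours, and we assume a cost-to-cost measure of completion), let $P$ be the contract price, $C$ the costs incurred to date, $E$ the estimated cost to complete, and $R$ the cumulative revenue recognized:

```latex
\[
R(E) = P \cdot \frac{C}{C + E},
\qquad
\frac{\partial R}{\partial E} = -\frac{P\,C}{(C + E)^{2}} < 0
\quad \text{for } P, C > 0 .
\]
```

Because $R$ is strictly decreasing in $E$, any reported estimate $\hat{E} < E$ yields $R(\hat{E}) > R(E)$. A consistently low cost-to-complete estimate therefore overstates recognized revenue on every affected contract.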

Controls over cost-to-complete estimates had been effective prior to the fourth quarter's personnel change. So, the planned substantive tests for estimated cost to complete appropriately involve reviewing management's estimation process as documented on engineering's cost analysis reports. However, the control deficiency's omissions create bias in engineering's cost analysis reports, making increased sampling of the cost analysis reports insufficient and requiring the selection of alternative tests.

In the first treatment, we manipulate the absence/presence of substantive test results from reviewing the biased cost analysis reports. In one condition, the case informs participants that planned tests have not yet been completed. In the other condition, the case informs participants that planned substantive tests and tests of control were performed jointly. When tests are performed jointly, the case describes the substantive test results from reviewing the biased cost analysis reports, which validate management's estimated recognized revenue and support a conclusion that revenue is fairly stated. All participants evaluate whether it is sufficient to increase the sample size of the tests from the initial audit plan in response to the control deficiency.

In the second treatment, we manipulate the source of bias in the cost analysis reports. In one condition, the lead engineer fails to perform procedures examining “documents from subcontractors on estimates to complete their share of the work.” In this control deficiency, omissions in the cost analysis reports stem from cost-to-complete estimates found on externally prepared documents held by the client. In the other condition, the lead engineer fails to perform procedures “corroborating engineering estimates with estimates provided by project supervisors.” In this control deficiency, omissions in the cost analysis reports stem from management judgment inputs about field-determined future production costs. Participants respond to the conditions in this treatment by selecting an alternative to reviewing the biased cost analysis reports. The alternatives involve either using the externally prepared documents to adjust the estimate (vouching) or developing an auditor-generated estimate by confirming the entire cost-to-complete estimate with the customer (confirming).

When the source of bias is an omission from externally prepared documents, both vouching and confirming should effectively identify the bias, though vouching is more efficient than confirming. For bias from an omission of management judgment, confirming should effectively identify the bias. Vouching is less effective because it will only verify the subcontractor component of the estimate contained on documents, not the field-determined future production costs. As a result, vouching will not verify the entire cost-to-complete estimate, so confirming is more effective for bias stemming from management judgment inputs.

To verify our prescriptive response, we conducted structured interviews with five audit partners from different international firms who all have percentage-of-completion experience. Each partner indicated that increased sampling of the biased cost analysis reports was insufficient and that auditors should perform alternative substantive tests. With regard to choosing alternative substantive tests, each audit partner indicated that vouching is the preferred procedure when bias stems from externally prepared documents, because it is effective and less costly than confirming. Each audit partner also indicated that confirming is the preferred procedure when bias stems from management judgment inputs, because only confirming can account for all project costs, not just those found on externally prepared documents.

Experimental materials were pilot tested on graduate students with audit experience and changed where necessary. An audit partner from the firm that provided participants reviewed the final version of the materials. The partner noted the realism of the case and indicated that it is representative of judgments made on integrated audits.

Experimental administrators read a script introducing the experiment and distributed envelopes containing an information sheet, general instructions, and three packets of materials. Each packet was retrieved from the envelope, completed, and replaced in the envelope before the next packet was begun. Packet one contained the experimental case and the related response scales. Packet two contained a training tutorial and a copy of the experimental case with the related response scales. Packet three contained the experimental checks, debriefing questions, and demographic questions. Administrators monitored completion of the task and collected the completed packets. The experiment took about one hour to complete.

Dependent variables

Auditor judgments are captured on 21-point bipolar scales that range from −10 to +10 with zero as the midpoint (see Appendix). For the dependent variable capturing the sufficiency of increasing sample size judgments, scale anchors are “increasing sample size is not sufficient” and “increasing sample size is sufficient.” For the dependent variable capturing alternative audit procedure judgments, scale anchors represent “vouching” and “confirming.” To enhance the clarity of the results discussion, we use count data based on the number of auditors on each side of the bipolar scale midpoint in our primary analyses. All results are inferentially identical when using the original scale data (not tabulated).
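The conversion from scale responses to count data can be sketched as follows. The response values are hypothetical, the function name `side_counts` is an illustrative choice, and the handling of responses exactly at the midpoint (tallied separately, so that they fall on neither side) is an assumption that the text does not spell out.

```python
def side_counts(responses):
    """Count responses below, above, and exactly at the zero midpoint of a -10..+10 scale."""
    below = sum(1 for r in responses if r < 0)
    above = sum(1 for r in responses if r > 0)
    at_midpoint = sum(1 for r in responses if r == 0)
    return below, above, at_midpoint


# Hypothetical responses on the 21-point bipolar scale (not data from the study).
hypothetical_responses = [-10, -7, -3, 0, 2, 5, -1, 9]

below, above, at_midpoint = side_counts(hypothetical_responses)
print(below, above, at_midpoint)
```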

4 Results

Table 1 presents the participant profile. After eliminating six incomplete responses, our final sample is 81 auditors. The auditors had an average of 43.2 months (3.6 years) of experience. On average, they chose substantive test procedures on six (6.3) clients, participated in planning the audit on six (5.7) clients, participated in four (4.1) SOX 404 audits, and observed significant control deficiencies on two (2.1) clients. Additionally, the auditors changed audit plans in response to a control deficiency on an average of three (2.8) clients. Direct experience with percentage-of-completion accounting was relatively low at an average of 4.8 on a 21-point scale anchored on “none” and “a great deal.” However, the auditors assessed their understanding of the percentage-of-completion case materials as high, at an average of 17.3 on a 21-point scale anchored on “not well” and “very well.” Auditor demographics and self-assessments do not vary across treatments (all p-values > 0.10). It appears that the auditor participants in our study are adequately experienced in evaluating control deficiencies and formulating revisions to audit plans, and they appear to have understood the case materials.

Table 1. Participant profile—81 auditors
Measure                                                               Mean    Std. dev.
Months of audit experience                                            43.21   9.61
Number of times involved in choosing substantive test procedures      6.34    3.47
Number of times involved in the planning phase of the audit           5.72    2.85
Number of SOX 404 audits                                              4.12    2.80
Number of clients with significant control deficiencies               2.09    2.15
Number of clients where a control deficiency changed the audit plan   2.84    2.27
Experience with the percentage-of-completion method                   4.81    6.38
Understanding of percentage of completion in the case materials       17.30   2.80
  • Notes: (a) Auditor demographics and self-assessments do not vary across treatments (all p-values > 0.10).
  • (b) Self-assessments are made on 21-point scales with 20 representing high knowledge, experience, and understanding.

Manipulation and other checks

Manipulation checks, measured with 21-point scales ranging from zero to 20, verify that participants understood their assigned treatment. Scales are shown in the Appendix. Mean responses about whether auditors had already completed planned substantive tests were 1.7 (10.2) for the absence (presence) of results, on the correct end of the scale (test statistic = 6.04, p < 0.001). Mean responses about the source of bias were 8.2 for the bias from externally prepared documents condition and 16.4 for the bias from management judgment condition, on the correct end of the scale (test statistic = 5.97, p < 0.001).

Other experimental checks verify that the auditors understood the key elements of the experimental setting. To assure that auditors understood the revenue effect of the control deficiency, we asked if uncompleted contract revenue was “overstated” or “understated.” The mean response is 1.2 on the “overstated” end of the scale (scale point = 0), and the difference in responses between treatments is insignificant (p = 0.34). We also asked two questions to assure that auditors understood the alternative tests for addressing the control deficiency. The first question asked whether the tests were “not different” (scale point = 0) or “one required confirmations, one did not” (scale point = 20). The mean response is 18.3 on the correct end of the scale with no difference between treatments (p = 0.41). The second question asked auditors about the “difference in the needed hours to complete” between vouching and confirming. The mean response is 17.4 on the “confirmation procedures required more hours” end of the scale (scale point = 20) with no difference between treatments (p = 0.45).

Hypotheses tests

Hypothesis 1 indicates that a significant minority of auditors will judge that increased sampling of a biased estimation process is a sufficient response to the control deficiency that caused the bias. Consistent with Hypothesis 1, we find that 26 (32.9 percent) auditors judged that increased sampling is sufficient. To test Hypothesis 1, we apply a 95 percent confidence interval to the frequency count to evaluate whether or not the interval includes zero. As shown in Table 2 and supporting Hypothesis 1, we find that 26 ± 8.01 (22.8 percent–43.1 percent) auditors increased sampling of the biased estimation process to address the control deficiency. While most auditors judge that increased sampling is insufficient (53 vs. 26, χ² = 9.23, p < 0.01), a significant minority believe the opposite. Similar to findings in the psychology literature, a significant minority of auditors choose to increase tests of the biased estimation process after being told explicitly that the process is biased.
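The reported comparison of 53 versus 26 auditors can be reproduced with a one-sample chi-square test against an equal split. The sketch below assumes that this is the test behind the reported statistic, which the text does not state explicitly; the function name `chi_square_equal_split` is illustrative. With one degree of freedom, the p-value equals erfc(√(χ²/2)), so only the Python standard library is needed.

```python
import math


def chi_square_equal_split(count_a, count_b):
    """One-sample chi-square test of two counts against a 50/50 split (1 degree of freedom)."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts reported in the text: 53 judge increased sampling insufficient, 26 sufficient.
chi2, p_value = chi_square_equal_split(53, 26)
print(round(chi2, 2), p_value < 0.01)
```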

Table 2. Sufficiency of increased sampling of the biased estimation process

Hypothesis 2a posits that in the presence of results from reviewing the biased estimation process, auditors are more likely to increase sampling of the biased process. Conversely, Hypothesis 2b posits that in the presence of results from reviewing the biased estimation process, auditors are less likely to increase sampling of the biased process. As shown in Table 2, our findings do not support either Hypothesis 2a or 2b. We find that 21 (14) auditors judge that increased sampling of the biased estimation process is insufficient (sufficient) in the absence of biased substantive test results. Conversely, 32 (12) auditors judge that increased sampling of the biased estimation process is insufficient (sufficient) in the presence of biased substantive test results. A chi-square test of proportion shows no difference exists between auditor sufficiency judgments across the absence versus presence of substantive test result conditions (χ² = 1.43, p = 0.23, Table 2). We find that results from tests of the biased estimation process do not influence auditors' judgments about the sufficiency of increased testing of the process.
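The test of proportions across the two conditions can be sketched as a Pearson chi-square test on the 2 × 2 table of counts reported above. The sketch assumes no continuity correction, an assumption that happens to reproduce the reported statistic; the function name `pearson_chi_square_2x2` is illustrative.

```python
import math


def pearson_chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: results absent, results present. Columns: insufficient, sufficient.
counts = [[21, 14], [32, 12]]
chi2_2x2, p_2x2 = pearson_chi_square_2x2(counts)
print(round(chi2_2x2, 2), round(p_2x2, 2))
```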

Hypothesis 3a posits that auditors choose equally between vouching and confirming when addressing bias stemming from the omission of externally prepared documents. As shown in Table 3 and consistent with Hypothesis 3a, we find that 22 of 41 (53.7 percent) auditors address the bias by choosing vouching, but this does not differ from the proportion who choose confirming (22 vs. 19, χ² = 0.22, p = 0.64, two-tailed). Consistent with our theory, auditors choose equally between vouching and confirming, based apparently on their individual preferences.

Table 3. Auditor choice of the prescriptively appropriate procedure for detecting the bias—vouching or confirming

Hypothesis 3b posits that most auditors choose vouching over confirming when addressing bias stemming from the omission of management judgment inputs. As shown in Table 3 and consistent with Hypothesis 3b, we find that significantly more auditors address the bias by choosing vouching (24 of 38, 63.2 percent) than confirming (24 vs. 14, χ² = 2.63, p = 0.05, one-tailed). Consistent with our theory, most auditors select vouching, the most common of substantive tests, as opposed to the more effective confirming. As previously noted, this finding is worrisome, because the diagnostic value of vouching is very low when the bias stems from the omission of management judgment inputs.
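Because Hypothesis 3b is directional, the reported p-value is one-tailed. A minimal sketch of how a one-tailed value relates to the chi-square statistic for two counts is given below. It assumes an equal-split null and uses the equivalent z statistic, z = (a − b)/√(a + b), whose square equals the one-degree-of-freedom chi-square; the function name `one_tailed_equal_split` is illustrative.

```python
import math


def one_tailed_equal_split(count_a, count_b):
    """z test of count_a > count_b against an equal split; z squared equals the 1-df chi-square."""
    z = (count_a - count_b) / math.sqrt(count_a + count_b)
    chi2 = z ** 2
    # Upper-tail probability of the standard normal distribution.
    p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2))
    return chi2, p_one_tailed


# Counts reported in the text: 24 auditors choose vouching, 14 choose confirming.
chi2_h3b, p_h3b = one_tailed_equal_split(24, 14)
print(round(chi2_h3b, 2), round(p_h3b, 2))
```

The corresponding two-tailed value is twice the one-tailed value, which is why the directional prediction matters for the significance conclusion here.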

We examine a post-experimental question to verify that auditor test choice is consistent with our theory. The post-experimental question measures auditors' subjective perceptions about the quality of evidence provided by vouching versus confirming (the 21-point scale is anchored on “confirmation evidence is higher quality” and “confirmation evidence is lower quality”). We find that auditor perception of test quality is significantly related to their choice of vouching versus confirming when bias stems from externally prepared documents (p < 0.01, two-tailed) but not when bias stems from management judgment inputs (p = 0.26, two-tailed). These findings are consistent with our theory. When bias is from documents, we theorize that auditors will select an alternative test based on their individual preference, and perceived test quality should be an element of their preference. However, when bias is from management judgment inputs, we theorize that auditors will select an alternative test based on the availability heuristic, thereby overriding their individual preference.

Additional results

Because auditors are not routinely trained on biased processes, we theorize that a significant minority of auditors have systematic discrepancies in their beliefs about the normative use of biased evidence sources. Our results support this theory. We find that a significant minority of auditors select a biased evidence source after being told that it is biased. If this nonnormative selection is due to a lack of training, then a training intervention could improve auditor judgments in this domain. To test the potential effectiveness of a training intervention, we introduce a tutorial on bias after auditors have responded to our treatments, and then ask them to respond a second time to the treatment conditions.

The two-page tutorial first presents three categories of generic audit evidence: perfectly accurate, noisy, or biased. Next, the tutorial indicates the normative response when auditors encounter each type of evidence (respectively, no adjustment, increase sample size, acquire a new unbiased evidence source). Following the tutorial, auditors were asked which type of evidence was needed when auditing a biased process. Seventy-six (94 percent) of the auditors answered the question correctly.

We report our analysis of the 26 auditors (per Table 2) who judge increased sampling of the biased estimation process as sufficient after reading control deficiency documentation indicating that the estimation process is biased. As shown in Table 4, we find that 8 of 14 (57.1 percent) auditors changed their decision after completing the tutorial, in the absence of substantive test results from the biased estimation process. Conversely, none of 12 (0.0 percent) auditors changed their decision after completing the tutorial, in the presence of substantive test results from the biased estimation process. This difference in rate of change is significant (Fisher exact p < 0.01, two-tailed, Table 4).
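The Fisher exact comparison can be sketched directly from the hypergeometric distribution using the counts reported above. The two-tailed rule used here (summing the probabilities of all tables no more likely than the observed one) is a common convention and an assumption on our part; the function name `fisher_exact_two_tailed` is illustrative.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Rows: results absent, results present. Columns: changed, did not change.
p_fisher = fisher_exact_two_tailed(8, 6, 0, 12)
print(p_fisher < 0.01)
```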

Table 4. Effect of a tutorial on 26 auditors who initially judged that increased sampling of a biased estimation process is sufficient

We find that auditors receiving falsely favorable test results do not improve their decisions following the tutorial. This finding is consistent with the representativeness heuristic, because it appears that the falsely favorable test results made these auditors believe that the estimation process is effective. In the absence of the falsely favorable test results, our findings indicate that auditor judgment about the use of biased evidence sources can be improved with a relatively simple training intervention on the normative response to biased evidence. The recent emphasis on auditors' professional judgment has led firms to develop training interventions oriented to improving judgment (Ranzilla et al. 2011). Our findings suggest that training on the normative use of biased evidence sources would be helpful.

Robustness and other tests

Recognizing the insufficiency of increased sampling of the biased estimation process was the first step in our study. The second step involved selecting an alternative test to replace reviewing the biased estimation process. In supplemental tests (not tabulated) we find no evidence that judgments related to the sufficiency/insufficiency of increased sampling are associated with the choice of vouching or confirming audit procedures (p = 0.84, two-tailed). We also find no evidence that the absence or presence of substantive test results is associated with the choice of vouching or confirming audit procedures (p = 0.82, two-tailed). Overall, we do not observe carryover effects between the elements of our experiment.

On average, auditors in our study had a low rate of experience with percentage-of-completion accounting, the basis of our experimental task. To analyze the effect of experience, we compare the performance of auditors with higher and lower levels of percentage-of-completion accounting experience. We find that 65.3 percent (64.5 percent) of auditors who assessed their experience as lower (higher) judge that increasing sample size is insufficient, and we find that 44.9 percent (41.9 percent) of auditors who assessed their experience as lower (higher) choose the more appropriate procedure. For the auditors who assess their percentage-of-completion experience as very high, we find that 54.6 percent judge that increasing sample size is insufficient and 36.4 percent choose the more appropriate procedure. We observe no difference between performance of the highly experienced and less-experienced auditors on the sufficiency judgment (p = 1.00, two-tailed) or on test selection (p = 1.00, two-tailed). Variation in percentage-of-completion experience within our experiment does not influence our findings.

To further verify that our results are not driven by our participating auditors' relatively low experience with percentage-of-completion accounting, we conducted a second experiment with 14 managers from international audit firms. The managers had an average of 9.3 years of professional experience; each indicated extensive experience in substantive test selection, and 11 of the 14 had clients that used percentage-of-completion accounting. Each manager analyzed the sufficiency of increased sampling of the biased estimation process, and each manager selected tests to replace review of the biased process for bias stemming from both externally prepared documents and management judgment inputs.

As shown in Table 5, we find that 10 (71.4 percent) of the managers judge that increasing sample size is insufficient. For bias from documents, 10 (71.4 percent) managers choose the effective and most efficient procedure, vouching, and four (28.6 percent) managers choose confirming. For bias from judgment inputs, eight (57.1 percent) managers choose the more effective procedure, confirming, and six (42.9 percent) managers choose vouching. We observe no difference between the performance of these managers and our sample of senior auditors in terms of the sufficiency judgment (p = 1.00, two-tailed), test selection when bias stems from externally prepared documents (p = 0.35, two-tailed), or test selection when bias stems from judgment inputs (p = 0.22, two-tailed). Due to the small sample of managers, our results comparing their responses to those of our senior auditor participants must be interpreted with caution. Regardless, results from our sample of managers indicate that a nontrivial proportion choose to increase sampling of a biased estimation process and select the less appropriate alternative test. Importantly, our additional findings from managers suggest that dealing with a biased evidence source causes difficulty for many auditors, regardless of experience.

Table 5. Manager responses compared to those of senior auditors

5 Discussion

We find that a significant minority of senior auditors (33 percent) attempt to identify bias in an accounting estimate with increased sampling from the biased estimation process. Further, they do this after being told that the estimation process is biased. In supplemental tests, we find evidence that the observed results are not driven by lack of experience with percentage-of-completion accounting. We partition participants based on percentage-of-completion experience and find that those with high experience make judgments very similar to those with lower experience. In addition, we find that managers with significant task and test planning experience make judgments similar to the senior auditors. Our findings suggest that, regardless of experience, some auditors have flawed perceptions about evidence quality, precluding them from rejecting biased evidence sources (Soll 1999). Regarding tests that could identify the bias in the accounting estimate, we find that auditors often choose inefficient or ineffective tests. When the source of bias is externally prepared documents, about one-half of the auditors choose confirming instead of more efficient vouching. When the source of bias is management judgment, most auditors choose vouching instead of more effective confirming.

We discussed our findings with five audit partners experienced in percentage-of-completion accounting. Consistent with research indicating that audit supervisors are overconfident in their subordinates' competence, the partners expressed surprise that so many auditors judge that increased sampling of the biased estimation process is sufficient, given the case material's straightforward explanation of the bias (Han, Jamal, and Tan 2011; Kennedy and Peecher 1997). Several partners conjectured that increasing sample size is a common way to address control deficiencies and that some auditors routinely opt to do it without appropriately considering the situation.

The partners also indicated that the auditors' choices of alternative substantive tests are troubling, particularly when the bias stems from management judgment inputs, because most auditors chose to adjust the estimate using documents even though such a test is ineffective. They were surprised that auditors did not consistently revert to confirming in the face of any uncertainty, especially with the specter of PCAOB audit inspections. The partners admitted that mapping internal control deficiencies to substantive tests is difficult, particularly when the deficiency documentation offers no link to a substantive test, as in our bias from management judgment condition. Several suggested that if auditors fail to connect a deficiency's root cause to a substantive test, they will likely revert to what they know best (vouching), which is consistent with the availability heuristic. Finally, the partners indicated that managers should do better than seniors, but acknowledged that it is still difficult to get people to look beyond the familiar, regardless of experience. The interviews are consistent with our theoretical premises and offer practical insight into our results.

This research has limitations. Audit planning materials in practice are rich, but they are necessarily restricted in this study due to limits on access to the experimental participants. In addition, audits usually involve an audit team, and the ability to consult team members can affect audit judgments. In this experiment, we use individual judgments that do not capture dynamic team interactions. However, audit seniors' judgments and documentation can influence higher ranking members of the audit team (Hammersley, Johnstone, and Kadous 2011; Bellovary and Johnstone 2007; Ricchiute 1999). Firm partners that we interviewed indicated that seniors' judgments are important to the audit and that seniors are capable of addressing control deficiencies. Finally, our experimental setting of accounting estimates is complex, raising concerns about the experience level of our senior auditor participants. However, results from an additional sample of managers suggest lack of experience is not driving our findings.

With regard to practice, our findings inform the integrated audit. According to professional standards, auditors must integrate the internal control and financial statement audits (PCAOB 2007). Revised risk assessment standards were issued, in part, to improve the integration of controls into the financial statement audit (PCAOB 2010a). However, PCAOB inspections find that auditors sometimes do not appropriately change the nature, timing, and/or extent of their substantive tests in response to clients' internal controls (PCAOB 2008). Our findings are consistent with PCAOB concerns and shed light on potential sources of inefficient/ineffective auditor judgments surrounding the integration of control deficiencies and substantive test changes. The audit partners interviewed all indicated that auditor response to control deficiencies is an important issue. Several partners said that their firms struggle with integrating controls into the financial statement audit, and such integration is a common training topic, which adds credence to our findings about training the normative response to biased evidence sources.

Our study also contributes to theory. Specifically, our findings inform Griffith et al.'s (2013) field study on the audit of accounting estimates, helping explain why auditors do not “tend to generate independent estimates or consider what management has neglected to include in its estimation model” (Griffith et al. 2013, 4). We also find evidence of nonsampling risk when control deficiencies bias evidence used in substantive tests, supporting Peecher et al.'s (2007) contention that audit breakdowns often occur due to nonsampling risk/error. Overall, our findings provide a better understanding of how auditors perform when addressing errors of bias in management's estimation process.

Notes

  • 1 Substantive tests are audit procedures performed to ensure account balances are complete, valid, and accurate. Control deficiencies could require changes to the procedures' sample size and sampling characteristics, to the type of procedures performed, or to the timing of the procedures.
  • 2 While professional guidance does not define a significant minority, it does suggest that the risk of not detecting a material misstatement (detection risk) can be reduced to a negligible level through testing procedures combined with appropriate planning, personnel, supervision, skepticism, and quality control mechanisms (AICPA 2006a, ¶24; Bell et al. 2005). Therefore, any detection failure is important, because it would exceed a negligible level and can quickly amplify the risk of material misstatement, given the multiplicative nature of the audit risk model.
  • 3 Anecdotal discussions with audit partners confirm that audits generally do combine tests of control and substantive tests, particularly at year-end.
  • 4 Reviewing subsequent events can also identify errors, but only if the uncertainty is resolved prior to the audit report date (IAASB 2008b, ¶13a). Auditors rarely choose this test for accounting estimates (Griffith et al. 2013).
  • 5 Transfer of learning involves the application of skills and/or knowledge that are learned in one situation to another situation. This can most readily occur when the knowledge to be transferred is immediately useful, as would be knowledge of the documents needed to detect the bias in management's estimate (Haskell 2001).
  • 6 Structured interviews with five audit partners experienced in auditing accounting estimates, from different international firms, indicate that (1) using the documents to adjust the estimate is more efficient when bias stems from omitted data found on external documents, and (2) independently developing an estimate is likely the only effective procedure when bias stems from omitted management judgment.
  • 7 Audit firms often have checklists of audit procedures. However, cognitive effort is still required to assess the effectiveness and efficiency of procedures on such lists.
  • 8 Please see supporting information, “S1: Experimental Instrument,” as an addition to the online article.
  • 9 The percentage-of-completion revenue recognition formula, provided in the case, is (cost-to-date/(cost-to-date + estimated cost-to-complete)) × contract price. The case discusses estimated cost-to-complete with regard to this formula and revenue recognition.
  • 10 As previously discussed, results from these tests are falsely favorable.
  • 11 Vouching and confirming are common types of substantive tests (PCAOB 2010a ¶¶A8–15, 18). While other tests possibly exist for this estimation process, participants were provided with only these two alternatives.
  • 12 Participants are explicitly told that vouching is less costly than confirming, consistent with research suggesting non-standard, positive confirmations impose higher hour/dollar costs in the form of design, execution, and follow-up on discrepancies and non-responses (McConnell and Schweiger 2008; Castor, Elder, and Janvrin 2008).
  • 13 The partners' experience ranges from 16 to more than 30 years with their firm, and two have national audit practice responsibilities.
  • 14 The six participants dropped from our analyses left substantial portions of their instruments incomplete, including responses for the dependent variables.
  • 15 We compensate for possible lack of experience by explaining the mechanics and risks associated with the percentage-of-completion method, and our experimental check of participants' understanding suggests we were successful. Pilot tests also indicate that participants understood the experimental setting.
  • 16 As shown in the Appendix, this scale ranged from −10 to 10 with 0 as the midpoint. The mean response is −8.8 on the “overstated” end of the scale. For consistency with the other checks, we adjust the response to a range of 0 to 20.
  • 17 Because we predict the null, and our statistical analysis fails to reject it, we perform confidence interval analysis (Levine and Ensom 2001). A 90 percent confidence interval for selecting vouching is 16.91 to 27.09. For approximately 80 percent of this interval, 17 to 25, we would fail to reject H3a at conventional levels (p > 0.10). This provides evidence that population differences between vouching and confirming are unlikely to be statistically significant.
  • 18 These findings (not tabulated) are determined by regressing perceived evidence quality on the continuous response to vouching versus confirming.
  • 19 Prior literature in auditing suggests task-related cognitive interventions improve judgment (Hoffman and Zimbelman 2009; Earley, Hoffman, and Joe 2008; Glover, Prawitt, and Wilks 2005; Earley 2003, 2001; Ashton and Kennedy 2002; Renkl, Stark, Gruber, and Mandl 1998; Kennedy 1995). We extend this research by using a more generic intervention that explicitly makes auditors aware of a particular judgment trap.
  • 20 All tests in this section are performed with the Fisher exact probability test because many of the sub-sample comparisons are smaller than recommended for chi-square tests.
  • 21 The percentage-of-completion experience scale ranges from zero to 20 and is anchored on “none” and “a great deal.” Lower experience is defined as below four (mean of 0.7) and consists of 49 auditors. Higher experience is defined as four and above (mean 11.3) and consists of 31 auditors. (One auditor failed to disclose the percentage-of-completion experience.) This lower/higher split approximates a median split. Moving the lower/higher split plus or minus one produces results inferentially similar to those reported.
  • 22 Very high experience is determined by the upper one-third of the scale and consists of 11 auditors with experience assessments of 14 and above (mean 18.2).
  • 23 First, managers selected tests for bias from externally prepared documents, after which they were given instructions to “consider a different deficiency” and select tests for bias from management judgment inputs. Both test selection conditions were on the same page in close proximity, enhancing the comparison of the two conditions, potentially signaling their differences and improving manager performance. Given that we intentionally induced carryover effects, we did not counterbalance test selection across source of bias.
  • 24 When we analyze only the 11 managers with clients using percentage-of-completion accounting, we find that all inadequate sufficiency judgments are made by them. With regard to test selection, we find 8 of the 10 less appropriate procedure selections are made by these managers.
  • Appendix A: Data collection scales

    Dependent variables

    All participants were provided the following description:

    Your firm's audit methodology describes the following alternative procedures for testing estimated cost-to-complete:

    P1: Examine engineering cost analysis reports to test the reasonableness of estimated costs-to-complete (procedure in the initial audit plan shown on page 3).

    P2: Obtain engineering cost analysis reports and vouch selected estimates of cost to complete to source documents from vendors and subcontractors. Reconcile with estimated cost to complete used for revenue recognition at December 31.

    P3: Request positive confirmations from customers of the percent complete on contracts. Reconcile with estimated cost to complete used for revenue recognition at December 31.

    The dependent variable for sufficiency of increasing sample size is:

    Rate the likelihood that you will continue to select procedure P1 from the initial plan (i.e., increasing the sample size is a sufficient response) in your revised plan.

    The dependent variable for alternative audit procedure choice is:

    If you were to change from P1, rate the likelihood that you would select procedure P2 versus P3 to test estimated cost to complete in your revised plan.

    Experimental Checks

    1. Had the audit team already completed the year-end substantive tests of revenue from contracts in progress based on the initial audit plan?

    2. The significant deficiency in contract cost-to-complete estimates involved the temporary engineer failing to:

    3. The significant deficiency's effect on recognized revenue from contracts in progress would most likely be to:

    4. What was different in the source of evidence in substantive procedures P2 and P3?

    5. What was the difference in the needed hours to complete between substantive procedures P2 and P3?
