The Effect of Face Masks on the Recognition of Own- and Other-Race Faces
Funding: The authors received no specific funding for this work.
ABSTRACT
The other-race effect (ORE), the tendency to identify own-race faces more accurately than other-race faces, is typically attributed to diminished holistic or configural processing for other-race faces. However, other accounts suggest that the ORE can be mediated when observers specifically focus on particular facial features. For example, Black observers do not show an ORE for White faces when they attend to the eye region. This study examines these accounts when surgical face masks naturally occlude the lower region of the face, which may both disrupt holistic processing and facilitate or hamper selective feature processing, depending on the race of the face. Overall, our experiments showed that face masks disrupted the identification of both own- and other-race faces. In addition, internal meta-analyses showed that this effect was slightly larger for own- than other-race faces, providing more support for the holistic processing account of the ORE.
1 Introduction
One of the most studied phenomena in face recognition research is the other-race effect (ORE). This effect simply shows that human observers are generally better at recognizing own-race versus other-race faces (Estudillo et al. 2020; Malpass and Kravitz 1969; Meissner and Brigham 2001). The ORE is robust and has been replicated across different races and ethnic groups (Lee and Penrod 2022; Meissner and Brigham 2001). Interestingly, this effect is not only observed with objective measures, but participants also report lower confidence in their face identification abilities for other-race faces (Brigham et al. 2007; Estudillo 2021; Hourihan et al. 2012; Smith et al. 2001, 2004). In addition to its theoretical implications for human face recognition models (see e.g., Levin 2001; Valentine 1991; Valentine et al. 2016), the ORE can also have important consequences in different forensic settings, such as during eyewitness identification (Evans et al. 2009; Wilson et al. 2013) and at identity checkpoints (Kokje et al. 2018; Megreya and Bindemann 2018).
It is widely accepted that face recognition relies on holistic and/or configural processing (Estudillo et al. 2022; Maurer et al. 2002; Richler and Gauthier 2014; Rossion 2013). Although the precise definition of holistic processing is currently under debate (Lee et al. 2022; Rezlescu et al. 2017; Richler and Gauthier 2014; Rossion 2013; Wong et al. 2021), it is generally acknowledged that holistic processing involves the integration of individual facial features into an undecomposed whole (Maurer et al. 2002; Piepers and Robbins 2012; Rossion 2013). According to one of the most influential accounts of the ORE, this effect can be explained by diminished or even absent holistic processing of other-race faces (e.g., DeGutis et al. 2013; Michel et al. 2006; Tanaka et al. 2004, but see Wong et al. 2021). In support, some research has shown smaller inversion effects for other-race compared to own-race faces (Rhodes et al. 1989). Similar results have been reported with other measures of holistic processing, such as the part-whole task (Tanaka et al. 2004) and the composite face task (Michel et al. 2006). However, a recent multicultural study by Wong et al. (2021) showed no evidence of reduced holistic processing for other-race faces (see also Crookes et al. 2013; Mondloch et al. 2010).
An alternative, although not necessarily exclusive, view suggests that the ORE is related to the different physiognomic properties of faces from different races (Hills and Pake 2013; Levin 2001; Shepherd and Deregowski 1981; Valentine 1991; Valentine et al. 2016). Differential exposure levels to own- and other-race faces may mediate how efficiently observers spontaneously attend to features that are important for within-class discrimination, with observers less adept at this process for other-race faces (Hills and Pake 2013). In support, while White participants tend to focus on the eye region for face identification (Blais et al. 2008; Caldara et al. 2010; Hills and Pake 2013; Miellet et al. 2013), the ORE is reduced when they are instead trained to focus on and discriminate between Black faces on the basis of the disproportionately category-diagnostic information contained in the lower half of the face (Hills and Lewis 2006). Similarly, White participants recognize Black faces better when, during both learning and recognition, each face is preceded by a fixation cross at the tip of the nose rather than at the glabella (i.e., the area between the eyes; Hills et al. 2013; Hills and Lewis 2011). In contrast, although Black participants automatically focus on the nose area for face identification (Hills and Pake 2013), their recognition of White faces is better when a fixation cross signals the glabella, a comparatively more informative category-diagnostic region than the tip of the nose (Hills et al. 2013). Altogether, these findings highlight that the successful recognition of own- and other-race faces may be underpinned by a focus on different facial regions and that the ORE may arise when fixation placement is undifferentiated, causing the observer to default to a processing style specific to their own race.
During the COVID-19 pandemic, wearing surgical face masks became commonplace to prevent viral transmission. As surgical face masks cover approximately 50% of the face, it is perhaps unsurprising that they impair performance in different face processing tasks, including emotion recognition (Carbon 2020; Wong and Estudillo 2022), social judgments (Oldmeadow and Koch 2021), and sex and age perception (Wong and Estudillo 2022). It has also been shown that face masks impair face identification (Carragher et al. 2022; Carragher and Hancock 2020; Estudillo et al. 2021; Estudillo and Wong 2022). For example, using an adapted version of the Cambridge Face Memory Test, a highly reliable measure of face identification (Bowles et al. 2009; Duchaine and Nakayama 2006; Estudillo and Wong 2021), it was shown that face masks impair face memory performance for upright, but not for inverted, faces (Freud et al. 2020), suggesting that face masks disrupt holistic and/or configural processing of faces.
1.1 The Present Study
Most studies exploring the effects of face masks on face identification have exclusively used White face stimuli and tested White participants, so it is unknown how face masks affect the identification of other-race faces. The present study aims to shed light on this question by testing White and Black participants' recognition memory performance for full-view and masked Black and White faces. As face masks cover the bottom part of the face and thus impair holistic and/or configural processing of faces (Freud et al. 2020), they offer an ecological test for studying the relative contribution of holistic processing and attention to diagnostic features as potential mechanisms of the ORE. In this sense, if the ORE can be explained by a reduction in holistic and/or configural processing of faces, we expect to find an ORE for full-view faces observed by both Black and White participants, but reduced evidence of the ORE for masked faces. Alternatively, if the ORE is a consequence of misplaced attention to facial features, then divergent outcomes are expected in White and Black participants. As White participants rely on the upper region to identify faces, which is visible in both full-view and masked depictions, then the ORE should be evident across conditions as both allow the participant to adopt their default sub-optimal focus on less category-diagnostic regions of a Black face. In contrast, as Black participants have a bias to extract identity from the lower region of a face, we would only expect an ORE to emerge for full-view versus masked faces, as in the latter condition participants are forced to adopt a viewing pattern that complements extraction of identity-relevant information from White faces.
2 Experiment 1
Experiment 1 explored the effect of face masks on the recognition of own- and other-race faces. Black and White observers performed a standard old/new recognition memory task involving Black and White faces. Across learning and recognition stages, faces were presented either in full-view or with a face mask, with format held congruent across study and recognition stages for each ‘old’ item. That is, if a face was studied in full-view (or with a face mask), it had to be recognized in full-view (or with a face mask).
2.1 Methods
2.1.1 Participants
Participants were tested online with the use of a suitable device (e.g., a PC or laptop). We initially recruited a total of 205 participants using the platform Testable Minds (www.testable.org). Twenty-three participants were removed from further analysis as they failed attention checks during the experiment. Thus, our final sample comprised 182 participants (Mage = 35.04; SDage = 11.81). Seventy-nine participants reported being Black (39 female) and 103 White (52 female, 2 other). A sensitivity power analysis run with the software MorePower (Campbell and Thompson 2012) revealed that the present sample size was sufficient to detect a three-way interaction between face race, participant's race, and viewing condition with a small effect size (ηp² = 0.04), assuming a power of 80%. White participants were originally from White majority countries in Europe, the USA, and Canada. Sixty-four of our Black sample hailed from Black majority African countries, but 15 Black participants reported being non-African (see raw data files for nationalities).
2.2 Materials
A total of 64 identities were taken from the Chicago Face Database (Ma et al. 2015; 32 Black and 32 White identities). Within each ethnic group, half of the identities were female. Black and White faces were matched in terms of attractiveness and race-specific prototypicality according to the published norms of the database (Ma et al. 2015) [both ts(62) ≤ 0.81, ps ≥ 0.42]. For each identity, we selected one face displaying a happy expression and one face displaying a neutral expression. Using Adobe Photoshop, we fitted a surgical face mask to each face. Each face measured approximately 711 × 500 pixels. Example stimuli are shown in Figure 1. For counterbalancing between old and new trials (see below), we created two sets of faces, with faces from each masking condition distributed equally across the sets.

2.3 Procedure
Observers performed a standard old/new recognition task. In the study stage, observers were asked to study a total of 32 faces showing a neutral expression. Half of the faces were Black individuals, and half were White individuals. For each race, half of the faces were presented with surgical face masks. The face mask conditions were counterbalanced across participants such that they encountered different identities as masked and full-view depictions. Thus, there were a total of 8 faces in each condition. The list length of this study is comparable to other face recognition studies using the same number of factors (e.g., Hills and Lewis 2006; Shriver et al. 2008; Zhou et al. 2021). Each face was presented in the center of the screen for a total of 2 s, with an interval of 800 ms between faces. The presentation of the faces was randomized. Subsequently, in the recognition stage, observers were presented with a face in the center of the screen on each trial and had to determine whether the identity had been presented in the study stage. To avoid picture recognition (Bruce 1982; Estudillo 2012; Estudillo and Bindemann 2014), in the recognition stage the identities displayed a happy expression, whereas in the study stage they had been shown with a neutral expression. In addition, to avoid incongruency effects (Estudillo and Wong 2022; Manley et al. 2019; Toseeb et al. 2014), faces that were presented with face masks in the study stage were also presented with face masks in the recognition stage. The recognition stage comprised 64 trials: 32 studied and 32 new identities, which were presented randomly, with an equal number of new faces shown masked and in full view. The faces remained on the screen until the participant responded.
2.4 Results
We used correctly recognized old faces (i.e., hits) and incorrectly accepted new faces (i.e., false alarms) to calculate d-prime, a measure of sensitivity (Stanislaw and Todorov 1999). Higher d-prime values reflect better recognition. D-prime was corrected using Hautus' method for extreme values (Hautus 1995). Figure 2A shows mean d-prime across conditions.
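The sensitivity computation described above can be illustrated with a minimal sketch. It assumes that "Hautus' method" refers to the log-linear correction (adding 0.5 to the hit and false-alarm counts and 1 to each trial total); the function name `d_prime` and the example counts are hypothetical and are not taken from the study's analysis scripts.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity (d') with a log-linear correction for extreme rates."""
    n_old = hits + misses                      # number of studied ("old") faces
    n_new = false_alarms + correct_rejections  # number of new faces
    # Assumed correction: add 0.5 to each count of "old" responses and 1 to
    # each trial total, so neither rate can be exactly 0 or 1.
    hit_rate = (hits + 0.5) / (n_old + 1)
    fa_rate = (false_alarms + 0.5) / (n_new + 1)
    z = NormalDist().inv_cdf                   # inverse standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical participant in one design cell (8 old and 8 new faces).
example = d_prime(hits=6, misses=2, false_alarms=2, correct_rejections=6)
```

Because the correction is applied to every cell, a participant with perfect performance in a cell still receives a finite d-prime value.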

A 2 (viewing condition: full-view vs. mask; within) × 2 (face race: Black vs. White; within) × 2 (participant's race: Black vs. White; between) mixed ANOVA revealed that the three-way interaction between these factors did not reach statistical significance, [F < 1]. The main effects of viewing condition [F(1, 180) = 62.12, p < 0.001, ηp² = 0.25], face race [F(1, 180) = 28.16, p < 0.001, ηp² = 0.13] and participant's race [F(1, 180) = 4.25, p < 0.05, ηp² = 0.02] reached statistical significance. The interaction between face race and participant's race was also significant [F(1, 180) = 46.61, p < 0.001, ηp² = 0.20]. Post hoc analysis revealed that White participants were better at recognizing White (M = 1.40; SEM = 0.05) than Black faces (M = 0.84; SEM = 0.04) [t(102) = 9.21, p < 0.001, d = 0.81, 95% CI = 0.55–1.07]. However, Black participants showed no differences in identifying Black (M = 1.23; SEM = 0.05) and White (M = 1.30; SEM = 0.06) faces [t(78) = 1.01, p = 0.38].
The interaction between viewing condition and face race also reached statistical significance [F(1, 180) = 7.46, p < 0.01, ηp² = 0.04]. To explore this interaction, for each face race, we calculated the mask effect as the difference between sensitivity in the full-view and mask conditions (for full analysis, see Supporting Information). The mask effect was stronger for White faces (M = 0.45; SEM = 0.06) than for Black faces (M = 0.22; SEM = 0.06) [t(181) = 2.79, p < 0.01, d = 0.20, 95% CI = 0.06–0.35].
Finally, the interaction between viewing condition and participant race was also significant [F(1, 180) = 62.12, p < 0.001, ηp² = 0.07]. To explore this interaction, we again utilized mask effect sensitivity differences. The mask effect was stronger for Black (M = 0.53; SEM = 0.06) than for White participants (M = 0.18; SEM = 0.06) [t(180) = 3.80, p < 0.001, d = 0.57, 95% CI = 0.27–0.86].
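As a rough illustration of the mask effect used in these follow-up comparisons, the sketch below computes a per-participant difference score (full-view d-prime minus masked d-prime) and compares the scores for White and Black faces with a paired t statistic. The function names, the use of Cohen's d for paired differences, and the d-prime values are illustrative assumptions; they are not the study's data or its exact analysis pipeline.

```python
from math import sqrt
from statistics import mean, stdev


def mask_effect(d_full, d_mask):
    """Per-participant mask effect: d' in full view minus d' with a mask."""
    return [full - mask for full, mask in zip(d_full, d_mask)]


def paired_t(x, y):
    """Paired-samples t statistic and Cohen's d for the paired differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    d_z = mean(diffs) / stdev(diffs)  # standardized mean difference
    return d_z * sqrt(n), d_z         # t = d_z * sqrt(n), with n - 1 df


# Hypothetical d' values for four participants (not data from the study).
white_full, white_mask = [1.8, 1.5, 1.2, 1.6], [1.1, 1.2, 0.9, 1.0]
black_full, black_mask = [1.0, 0.9, 1.1, 0.8], [0.8, 0.9, 0.7, 0.7]

effect_white = mask_effect(white_full, white_mask)
effect_black = mask_effect(black_full, black_mask)
t_stat, d_z = paired_t(effect_white, effect_black)
```

The comparison between Black and White participants is between groups, so it would call for an independent-samples test rather than the paired version sketched here.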
2.5 Discussion
Experiment 1 explored the effect of face masks on the recognition of Black and White faces by Black and White participants. We found that face masks disrupted face identification, replicating previous results (Carragher et al. 2022; Carragher and Hancock 2020; Estudillo et al. 2021; Estudillo and Wong 2022; Freud et al. 2020). However, the effect of masking was stronger for White faces compared with Black faces and for Black participants compared with White participants. Figure 2A suggests that this effect is driven by the presence of mask effects for both Black and White faces in Black participants and the lack of mask effects for Black faces in White participants. In other words, in White participants, face masks reduce performance for own-race but not for other-race faces. This result supports a holistic explanation of the other-race effect in White participants as face masks may disrupt only the (holistic and/or configural) processes that typically underpin the recognition of own-race faces, leaving spared the alternate, comparatively suboptimal processing methods used for other-race faces (Freud et al. 2020).
Strikingly, we found no evidence of the ORE in Black participants. While this effect may simply suggest that the White faces used in this experiment were easier for these participants to identify than Black faces sampled, this is unlikely as both sets of stimuli were matched for attractiveness and prototypicality. An alternative account for the lack of ORE in Black participants could be the presence of face masks in the encoding stage. To ensure that our conclusions were based on the key factors manipulated, rather than encoding specificity violations, the presentation of our stimuli was congruent across learning and recognition stages, such that faces encoded with a mask were also presented with a mask at recognition (see also Toseeb et al. 2014; Manley et al. 2019; Estudillo and Wong 2022). However, the presence of face masks during encoding prevents Black participants from naturally defaulting to their preferred processing strategy, specifically, a disproportionate assessment of the nose region and below for cues diagnostic for later recognition—an optimal strategy for the recognition of Black, but not White, faces (e.g., Hills and Pake 2013). Instead, the presence of a face mask would force Black participants to focus their attention upon the visible eye region; a strategy that may simultaneously enhance recognition of White faces, while disadvantaging recognition of Black faces, nullifying differences according to stimulus race (Hills and Pake 2013; Hills et al. 2013). Given that face masks were applied to half of the faces during encoding, and trials differing in encoding format were intermixed, participants may have learned to apply this strategy also to full-view faces. In contrast, White participants can continue to use their default processing style (i.e., a focus on the eyes/upper region of the face) despite application of a face mask, explaining why they continued to show a classic ORE.
This second account can be directly tested by presenting Black participants with all faces unmasked during encoding. Thus, if the lack of ORE in Black participants is explained by a shift of attention to the top part of the faces for all stimuli at encoding as a consequence of including masked stimuli at this stage, presenting all the faces in full view here should reinstate the ORE in Black participants. Importantly, under this account, any difference between the performance of Black participants in Experiment 1 and Experiment 2 should only be evident in the full view condition for White faces. In other words, Black participants in Experiment 1 should perform better than Black participants in Experiment 2 only in the full view condition for White faces, but not for Black faces in the full view or mask condition, or for White faces in the mask condition. This possibility was explored in Experiment 2.
3 Experiment 2
In Experiment 2, Black participants were asked to learn and recognize own- and other-race faces in an old/new recognition task. In contrast to Experiment 1, the presence/absence of the mask was only manipulated in the recognition stage. As at encoding all the faces were presented in full view, the presence of face masks at test was unpredictable, preventing participants from strategically deploying their attention toward the top part of the face. Under these circumstances, we expect that our Black participants would show an ORE, at least, for full-view faces.
3.1 Methods
3.1.1 Participants
As in Experiment 1, participants were tested online, using a PC or a laptop. We initially recruited a total of 90 participants using the platform Testable Minds (www.testable.org). Seven participants were removed from further analysis as they failed attention checks during the experiment. Thus, our final sample comprised 83 participants (Mage = 29.90; SDage = 9.50). All our participants reported being Black (46 females, 2 other), but 20 were non-African. A sensitivity power analysis run with the software MorePower (Campbell and Thompson 2012) revealed that with this sample size, we would be able to detect a two-way interaction between face race and viewing condition of medium effect size (ηp² = 0.08), assuming a power of 80%.
3.2 Materials and Procedure
The stimuli and procedure were identical to Experiment 1, with the difference that all faces were presented in full view during encoding. As in Experiment 1, in the recognition stage, half of the faces were presented in full view and the other half with a face mask fitted.
3.3 Results
As in Experiment 1, d-prime was calculated and analyzed (see Figure 2B). A 2 (viewing condition: full-view vs. mask) × 2 (face race: Black vs. White) repeated measures ANOVA revealed a main effect of viewing condition [F(1, 82) = 23.71, p < 0.001, ηp² = 0.22], showing that full-view faces (M = 1.23; SEM = 0.06) were better identified than masked faces (M = 0.92; SEM = 0.06). The main effect of face race was also significant [F(1, 82) = 17.41, p < 0.001, ηp² = 0.17], showing that participants were better able to recognize Black faces (M = 1.20; SEM = 0.06) than White faces (M = 0.95; SEM = 0.06). The interaction between viewing condition and face race failed to reach statistical significance [F(1, 82) = 1.40, p = 0.23].
Visual inspection of Figure 2A,B suggests that the differences in Black participants' performance between Experiments 1 and 2 arose due to a reduction in accuracy for full-view White faces in Experiment 2. This pattern of results was confirmed by a 2 (viewing condition: full-view vs. mask) × 2 (face race: Black vs. White) × 2 (experiment: 1 vs. 2) mixed ANOVA, which revealed a three-way interaction between viewing condition, face race, and experiment [F(1, 160) = 3.92, p < 0.05, ηp² = 0.02]. Post hoc t-tests showed that Black participants in Experiment 1 performed better for full-view White faces (M = 1.55; SEM = 0.07) than Black participants in Experiment 2 (M = 1.07; SEM = 0.08) [t(160) = 7.76, p < 0.001, d = 0.64, 95% CI = 0.32–0.95]. However, Black participants in Experiment 1 and Experiment 2 performed similarly for Black faces in both the full-view and mask conditions, and for White faces in the mask condition [all ts(160) ≤ 1.10, ps ≥ 0.27].
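For readers who wish to check the effect sizes that accompany each F statistic, the sketch below recovers an effect size from F and its degrees of freedom. It assumes that the reported effect sizes are partial eta squared, which the source does not state explicitly here; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its dfs."""
    # Standard identity: eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported for the two main effects in Experiment 2.
viewing = partial_eta_squared(23.71, 1, 82)    # main effect of viewing condition
face_race = partial_eta_squared(17.41, 1, 82)  # main effect of face race
```

Under this assumption, the two values come out at roughly 0.22 and 0.175, in line with the effect sizes reported for Experiment 2.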
3.4 Discussion
In Experiment 2, face masks were only presented at the recognition stage. Regardless of this, the results of this experiment showed that Black participants were more accurate at recognizing faces in the full-view condition than in the mask condition, replicating the main results from Experiment 1 and previous studies (Carragher et al. 2022; Carragher and Hancock 2020; Estudillo et al. 2021; Estudillo and Wong 2022; Freud et al. 2020). In contrast to the results of Experiment 1, our Black participants also showed a clear ORE. These contrasting findings point to the importance of the eye region in the learning and recognition of White faces (see Hills and Pake 2013) and suggest that the lack of ORE in Experiment 1 for Black participants was a consequence of an attentional shift to the top part of the face; a strategy participants may have seen fit to adopt for all faces, given the randomized and unpredictable presence of masking during the encoding stage.
Nevertheless, it is important to note that the results of Experiment 2 simply revealed similar mask effects for own- and other-race faces, which by itself does not support the attentional shift account, nor the holistic account, as each would predict asymmetrical (albeit different) influences of masking across own- and other-race faces. In other words, only when the findings of Experiment 2 are interpreted in conjunction with those of Experiment 1, does the attentional shift account receive partial support.
4 Experiment 3
So far, our results seem to suggest that the lack of ORE for Black participants in Experiment 1 can be explained by an attentional shift to the top part of all faces during encoding, due to the unpredictable presence of face masks during this task stage. However, not all our findings support this account. In fact, as previous research has shown (see Hills and Pake 2013; Hills et al. 2013), such an attentional shift in Black participants should increase recognition performance with White faces but reduce it for Black faces. Thus, it is possible that alternate factors may contribute toward the difference in results observed across experiments. For example, in addition to the presence and absence of masks during the encoding stage, a key difference between our two previous experiments is the participant sample. Previous studies have shown that experience with other-race faces can significantly reduce or abolish the ORE (Estudillo et al. 2020; Meissner and Brigham 2001; Tanaka et al. 2013). Therefore, it is possible that the lack of ORE for Black participants in Experiment 1 could be explained by higher experience with White faces.
For these reasons we performed a third experiment in which Black participants were randomly allocated to either a full-view encoding condition (as in our previous Experiment 2) or a mixed encoding condition (as in our previous Experiment 1). In addition, to control for experience with White faces, we included an individuating questionnaire (Walker and Hewstone 2008, see appendix 1). If attention switching to the eyes explains the lack of ORE for Black participants in Experiment 1, in Experiment 3, we would expect to find a stronger ORE in participants who only encode faces in full view, compared to those who encode both full-view and masked faces. This study was pre-registered in the OSF Registries (https://osf.io/frt3a/).
4.1 Methods
4.1.1 Participants
Participants were tested online, using a PC or a laptop. To ensure that this experiment had sufficient sensitivity to detect a three-way interaction between encoding condition, viewing condition, and face race, we conducted a power analysis using the MorePower software (Campbell and Thompson 2012). Assuming a medium effect size (ηp² = 0.06) and a power of 0.80, these parameters indicate that we would need a sample of at least 74 Black participants. We recruited a total of 119 participants using the platform Testable Minds (www.testable.org). Twenty-three participants were removed from further analysis as they failed attention checks during the experiment. Thus, our final sample comprised 96 participants (48 females; Mage = 28.11; SDage = 8.35). All reported being Black (18 non-African). We intentionally oversampled in this experiment to account for potential data loss due to incomplete responses and failed attention checks. Additionally, we aimed to maximize the likelihood of detecting an interaction between viewing condition, face race, and encoding condition, should such an interaction exist.
4.2 Materials and Procedure
The stimuli and procedure were identical to the previous experiments with the following differences: (1) participants were randomly allocated to either a full-view encoding condition or to a mixed encoding condition, and (2) participants completed an individuating questionnaire adapted from Walker and Hewstone (2008) after completing the recognition task. The questionnaire measured the participants' experience with other-race faces and contained five items enquiring about their engagement in activities with White people (e.g., I have looked after or helped a White friend when someone was causing them trouble or being mean to them). Participants rated each statement on a five-point Likert scale, ranging from “Never” (anchored “0”) to “Very often” (“4”) (see Walker and Hewstone 2008 for details).
5 Results
Figure 3 shows mean d-prime across conditions. We conducted a 2 (viewing condition: full-view vs. mask; within) × 2 (face race: Black vs. White; within) × 2 (encoding condition: full-view encoding vs. mixed encoding; between) mixed ANOVA controlling for the variable other-race contact. The ANOVA revealed a main effect of face race [F(1, 93) = 9.201, p < 0.01, ηp² = 0.09], showing that our participants were more accurate with Black than with White faces. Neither the main effect of viewing condition [F(1, 93) = 1.59, p = 0.21] nor any other main effect or interactions reached statistical significance [all Fs(1, 93) ≤ 3.65, ps ≥ 0.06].

5.1 Discussion
Experiment 3 replicated the ORE observed in Experiment 2. Interestingly, this effect was consistent across both the full-view and mixed encoding conditions. This latter finding challenges the notion that the absence of the ORE for Black participants in Experiment 1 was due to an attention shift towards the top part of the face, encouraged by the mixed and unpredictable nature of masking during encoding. Similarly, the presence of the ORE for both masked and unmasked faces challenges predictions made by the holistic account. Notably, one unexpected discovery from the current experiment is the absence of mask effects. Although Figure 3 reveals a trend in the expected direction, it was not statistically significant (p = 0.21). One potential explanation for the lack of a mask effect could be the use of happy faces in the recognition stage. Compared to the neutral faces sampled during the encoding stage, happy expressions involve more pronounced changes in the lower part of the face. Thus, the change in expression from study to test might have had a stronger impact in the full-view condition, potentially reducing mask effects. However, given that masking effects were clearly obtained in Experiments 1 and 2, despite the employment of the same expression-incongruent stimuli across encoding and recognition, this explanation seems unlikely. Alternatively, the absence of mask effects may reflect participants' accumulated experience with face masks. While Experiments 1 and 2 were conducted during the COVID-19 pandemic, Experiment 3 took place in the spring of 2023. The extensive experience with masks acquired during the pandemic may have diminished their overall impact. Importantly, this explanation contrasts with a recent study that found no improvement in the recognition of masked faces despite prolonged and natural exposure to face masks (Freud et al. 2022).
6 Experiment 4
Since the outcomes of Experiment 3 appear to contradict not only our prior findings, but also those obtained by other researchers, a final experiment is required to further interrogate the mechanisms involved. In Experiment 4, our objectives are to further investigate whether (1) the absence of the ORE for Black participants in Experiment 1, interpreted as supportive of the attentional shift account, was simply an experimental artifact, as the results of Experiment 3 would suggest, and (2) the lack of a mask effect in Experiment 3 is also an experimental artifact, as indicated by the results of Experiments 1 and 2, along with recent research (Freud et al. 2022).
As in Experiment 1, we recruited both Black and White participants, which directly allows us to assess whether the lack of mask effects for Black faces in White participants arose due to the mixed encoding procedure (i.e., masks and no masks) adopted in Experiment 1. As in Experiment 3, one group of participants studied both full-view and masked faces (mixed encoding condition), and another group studied these faces in full view only. If the lack of ORE in Black participants found in Experiment 1 was a consequence of an attentional switch to the top part of the face, we would expect to find a stronger ORE in those Black participants who only encoded faces in full view compared to those who encoded both full-view and masked faces. This study was pre-registered in the OSF Registries (https://osf.io/frt3a/).
6.1 Methods
6.1.1 Participants, Materials and Procedure
Participants were tested online, using a PC or a laptop. To ensure that this experiment had sufficient sensitivity to detect a four-way interaction between participants' race, encoding condition, viewing condition, and face race, we conducted a power analysis using the MorePower software (Campbell and Thompson 2012). We assumed a medium effect size (ηp² = 0.06) and a power of 0.80; these parameters indicate that we would need a final sample of 128 participants to detect such an effect. To maximize the probability of finding an effect, if one exists, we recruited a total of 242 participants using the platform Testable Minds (www.testable.org). Thirty-five participants were removed from further analysis as they failed attention checks during the experiment. Thus, our final sample comprised 207 participants. Ninety-eight participants reported being Black (55 females; Mage = 28.95; SDage = 7.51; 25 non-African) and 109 reported being White (55 females, 1 other; Mage = 39.33; SDage = 11.40, all from White-majority countries). The stimuli and procedure were identical to Experiment 3, except that for White participants, the questionnaire items were instead worded to probe contact with Black people.
6.2 Results
Figure 4 shows mean d-prime across conditions. We conducted a 2 (viewing condition: full-view vs. mask; within) × 2 (face race: Black vs. White; within) × 2 (participants' race: Black vs. White; between) × 2 (encoding condition: full-view encoding vs. mixed encoding; between) mixed ANOVA controlling for the variable other-race contact. The ANOVA revealed an interaction between face race and participants' race [F(1, 202) = 23.872, p < 0.001, ηp² = 0.254]. Post hoc t-tests revealed that White participants were better at recognizing White (M = 1.43; SEM = 0.05) than Black (M = 0.94; SEM = 0.04) faces [t(108) = 8.414, p < 0.001, d = 0.79, 95% CI = 0.57–1.01]. In contrast, Black participants were better at recognizing Black (M = 1.31; SEM = 0.05) than White (M = 1.09; SEM = 0.05) faces [t(97) = 3.854, p < 0.001, d = 0.39, 95% CI = 0.18–0.59].

The ANOVA also revealed a main effect of viewing condition [F(1, 202) = 12.169, p < 0.01, ηp² = 0.02], showing that participants were better at identifying full-view faces (M = 1.38; SEM = 0.07) compared to masked faces (M = 1.08; SEM = 0.07). This main effect was qualified by a three-way interaction between viewing condition, face race, and encoding condition [F(1, 202) = 4.540, p < 0.05, ηp² = 0.022]. None of the other main effects or interactions reached statistical significance [all Fs(1, 202) ≤ 3.72, ps ≥ 0.05].
To explore the three-way interaction, we calculated the mask effect as the difference between sensitivity in the full-view and mask conditions for each face race. We then conducted a 2 (face race: Black vs. White) × 2 (encoding condition: full-view encoding vs. mixed encoding) mixed ANOVA on the mask effect. However, none of the main effects or interactions reached statistical significance [all Fs(1, 205) ≤ 1.16, ps ≥ 0.28].
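The sensitivity measure and the mask effect described above can be illustrated with a minimal Python sketch. It assumes the standard signal-detection definition of d-prime, z(hit rate) minus z(false-alarm rate), together with a log-linear correction for extreme rates. The correction, the function names, and the trial counts are illustrative assumptions and are not details taken from the experiment.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count, 1 to each total) is
    assumed here to avoid infinite z-scores at rates of 0 or 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def mask_effect(d_full_view, d_masked):
    """Mask effect: full-view sensitivity minus masked sensitivity."""
    return d_full_view - d_masked


# Hypothetical counts for one participant and one face race
d_full = d_prime(hits=16, misses=4, false_alarms=5, correct_rejections=15)
d_mask = d_prime(hits=13, misses=7, false_alarms=7, correct_rejections=13)
effect = mask_effect(d_full, d_mask)
```

Under this definition, a positive mask effect indicates poorer recognition of masked than full-view faces, and one such value per participant and face race would feed into the follow-up ANOVA.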
6.3 Discussion
Experiment 4 revealed that both Black and White participants were better at recognizing faces from their own race. This contrasts with the results of Experiment 1, in which only White participants showed a clear ORE. In addition, we found that participants were better at recognizing full-view faces than masked faces, and this mask effect was of similar magnitude irrespective of encoding condition, participant race, and stimulus race. Firstly, this latter finding contrasts with that obtained in Experiment 3 and suggests that face masks still negatively impact face recognition, despite increased natural exposure to these stimuli in our environment (see also Freud et al. 2022). Secondly, it contrasts with the results of Experiment 1; contrary to the expectations of an attentional shift (or holistic) hypothesis, masking had an undifferentiated, negative impact irrespective of whether stimuli comprised own- or other-race faces.
6.4 Internal Meta-Analyses
To gain a better understanding of masking effects for own- and other-race faces, and whether these effects differ across Black and White participants and encoding conditions, we conducted two internal meta-analyses (see Goh et al. 2016). The aim of the first meta-analysis was to compare the mask effect for own- and other-race faces. If, as previous research suggests (e.g., Freud et al. 2020), face masks disrupt holistic face processing, larger mask effects for own- versus other-race faces would support the holistic account of the ORE (DeGutis et al. 2013; Michel et al. 2006; Rhodes et al. 1989; Tanaka et al. 2004). Conversely, the attentional shift account of the ORE predicts that this mask effect would be modulated by participants' race and the encoding condition. This possibility was explored in the second internal meta-analysis, which included participants' race and encoding condition as moderators.
Across the different conditions and experiments, we calculated the mask effect as the difference in performance between full-view and masked faces. We then calculated the Cohen's d of these mask effects when comparing own- vs. other-race faces across encoding conditions and participants' race. The first meta-analysis revealed stronger mask effects for own- compared to other-race faces [Z = 2.54, p < 0.001, d = 0.23]. However, the second meta-analysis revealed that neither participants' race nor encoding condition moderated mask effects [Qb(2) = 0.53, p = 0.76].
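As a rough illustration of how effect sizes from several experiments can be pooled in an internal meta-analysis, the sketch below uses a fixed-effect, inverse-variance model. This is an assumed, generic procedure: the text does not specify the weighting scheme or the variance formula that was used. The effect sizes and sample sizes in the example are hypothetical placeholders, not the study's data.

```python
import math
from statistics import NormalDist


def pooled_effect(effects, sample_sizes):
    """Fixed-effect, inverse-variance pooling of Cohen's d values.

    Assumes a within-participant d with the large-sample variance
    approximation var(d) = 1/n + d**2 / (2 * n).
    Returns (pooled d, Z statistic, two-tailed p-value).
    """
    variances = [1 / n + d ** 2 / (2 * n) for d, n in zip(effects, sample_sizes)]
    weights = [1 / v for v in variances]  # more precise samples weigh more
    d_pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))  # standard error of the pooled d
    z = d_pooled / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return d_pooled, z, p


# Hypothetical per-sample effects (own-race minus other-race mask effect)
example_d = [0.30, 0.10, 0.25, 0.20]
example_n = [60, 80, 70, 100]
d_pooled, z, p = pooled_effect(example_d, example_n)
```

A moderator analysis such as the second meta-analysis would additionally compare pooled effects between subgroups (e.g., by participants' race or encoding condition), which this sketch does not implement.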
7 General Discussion
The aim of this study was to investigate the effects of face masks on the recognition of own- and other-race faces. In Experiment 1, participants performed a standard old/new face recognition task including Black and White faces. Faces were presented with and without masks, with congruent presentation of face stimuli at encoding and test. Results showed that face masks impaired face recognition in both Black and White participants, but this effect was stronger in the former group. This difference across races seems to reflect the lack of mask effects for Black faces in White participants. In other words, in White participants, face masks did not reduce recognition of Black faces. While these findings may provide asymmetric, participant-race-specific support for the holistic view of the ORE (e.g., Carragher et al. 2022; Carragher and Hancock 2020; Estudillo et al. 2021; Estudillo and Wong 2022), other effects obtained in this experiment instead support an attentional shift account (e.g., Hills et al. 2013; Hills and Lewis 2011). Specifically, Black participants showed no evidence of the ORE in either the full-view or masked conditions. This may suggest that race-specific differences in the default focus of attention to facial features are nullified when the presence of masks forces participants to focus on the upper region of the face. This strategy is optimal for the processing of White (but not Black) faces, and may be adopted in response to both full-view and masked faces when the presence of masks is unpredictable at encoding.
Experiment 2 thus sought to assess whether the intermixed, randomised encoding conditions utilised in Experiment 1 were responsible for driving trends supportive of the attentional shift hypothesis. As such, in Experiment 2, all faces were encoded in full view, allowing Black participants to use their default lower-region focus for all faces, with masks only superimposed during the recognition stage. An ORE, undifferentiated by masking status, was indeed found. Experiment 2 thus seemed to confirm that Experiment 1's support for the attentional shift account was a methodological artefact, while itself providing strong support for neither the attentional shift nor the holistic account of the ORE, as each would predict an interaction between stimulus race (own vs. other) and masking status. Experiments 3 and 4 thus aimed to further interrogate the differences observed in the findings across Experiments 1 and 2 by (a) including both mixed and full-view encoding as a between-subjects variable, and (b) controlling for another factor that may have specifically driven disparities in the presence and magnitude of the ORE, i.e., participant differences in the quantity and quality of contact with other-race persons. In both experiments, we found a clear ORE in our Black-only and mixed-race samples. In addition, Experiment 4 also revealed a comparable and undifferentiated mask effect. In other words, face masks impaired the identification of both own- and other-race faces, replicating Experiment 2, and reducing the likelihood that the absence of significant masking effects in Experiment 3 resulted from increased exposure to masked stimuli across the course of the pandemic. Interestingly, however, our internal meta-analyses revealed that while the mask effect was stronger for own- compared to other-race faces, this difference was not moderated by participants' race or encoding condition.
Taken together, our experiments generally replicate the detrimental effects of masks on face identification previously reported (Carragher et al. 2022; Carragher and Hancock 2020; Estudillo et al. 2021; Estudillo and Wong 2022). From a theoretical perspective, while some of our results provided partial and limited support for both the attentional shift and holistic accounts of the ORE, others supported neither. However, our internal meta-analyses appear to favour the holistic account, albeit with a small overall effect size. One possible explanation for this small effect is that face masks may not effectively engage the intended cognitive processes. For example, it is possible that in addition to holistic and/or configural processing, face masks also impair featural processing (Stajduhar et al. 2022). It is also possible that the role of holistic processing and/or attention to specific facial features in the ORE is smaller than previously proposed (DeGutis et al. 2013; Hills and Pake 2013; Hills et al. 2013; Michel et al. 2006; Tanaka et al. 2004). In fact, recent research has suggested that holistic processing is similar for own- and other-race faces (Wong et al. 2021). Similarly, it has also been recently shown that forcing participants to drop their default, own-race-biased focus of attention, in favour of a style that increases focus on other-race category-diagnostic features, does not always decrease the ORE (Wittwer et al. 2019).
There is one important limitation of the current study to highlight. Our White participants all came from White-majority countries (e.g., US, France, and UK). However, although across our experiments around 80% of our Black participants came from Black-majority countries (e.g., South Africa, Nigeria, and Zimbabwe), some came from White-majority countries (e.g., US and UK). These differences could be problematic for two reasons. First, Black participants from White-majority countries could have higher social contact and perceptual experience with White faces, which might reduce the magnitude of the ORE in this group. Indeed, while we initially attributed the lack of the ORE among Black participants in Experiment 1 to an attentional shift induced by mixed and unpredictable views at face encoding, this effect went unreplicated in the subsequent experiments that included this same encoding manipulation (Experiments 3 and 4). This might suggest that differences in our samples' level of other-race contact were instead responsible for these disparities, particularly as other-race contact was controlled for in Experiments 3 and 4. Second, our face stimuli were sourced from Black-American individuals, while most of our Black participants were of African origin. This mismatch could potentially explain the absence of the ORE for this group in Experiment 1. However, this explanation seems unlikely, as the ORE was clearly observed in Experiments 2, 3, and 4, despite a similar ratio of African to non-African Black participants as in Experiment 1.
The use of specific processing strategies can potentially impact both the effects of face masking (Carragher et al. 2022) and race in face identification (Hills et al. 2013; Hills and Lewis 2006, 2011). For example, recent research suggests that directing participants to focus on diagnostic facial features reduces the effect of face masks on face identification (Carragher et al. 2022). To avoid introducing such strategies that could alter the natural processing of facial stimuli (e.g., Michel et al. 2007; Richler et al. 2011), we intermixed the study lists (Experiments 1, 3 and 4) and recognition stimuli (in all the experiments). Nevertheless, it is possible that the effect of masks, race, and their interaction might differ if conditions were blocked, as participants could develop specific, optimal strategies over time. This represents an interesting avenue for future research.
In conclusion, the current study suggests that face masks impair the recognition of both own- and other-race faces, a finding with important forensic implications. However, this effect appears to be slightly larger for own-race faces than for other-race faces, supporting the holistic processing account of the ORE. From a methodological standpoint, this study also underscores the importance of replication and the use of internal meta-analyses to minimize the risk of Type I and Type II errors.
Author Contributions
Alejandro J. Estudillo: conceptualization, funding acquisition, investigation, writing – original draft, data curation, formal analysis, project administration, writing – review and editing, methodology, visualization. Chang Hong Liu: conceptualization, funding acquisition, investigation, writing – review and editing, methodology. Emma Portch: conceptualization, investigation, funding acquisition, methodology, writing – review and editing.
Ethics Statement
This study was approved by the research ethics committee of Bournemouth University.
Consent
Informed consent was obtained from participants before participation.
Conflicts of Interest
The authors declare no conflicts of interest.
Open Research
Data Availability Statement
The data that support the findings of this study are openly available in OSF at https://osf.io/frt3a/.