Audiovisual synchrony detection for fluent speech in early childhood: An eye-tracking study
Han-yu Zhou
Neuropsychology and Applied Cognitive Neuroscience Laboratory, CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
Han-xue Yang
Neuropsychology and Applied Cognitive Neuroscience Laboratory, CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
Zhen Wei
Affiliated Shenzhen Maternity and Child Healthcare Hospital, Shenzhen, China
Guo-bin Wan
Affiliated Shenzhen Maternity and Child Healthcare Hospital, Shenzhen, China
Simon S. Y. Lui
Department of Psychiatry, The University of Hong Kong, Hong Kong Special Administrative Region, China
Corresponding Author
Raymond C. K. Chan
Neuropsychology and Applied Cognitive Neuroscience Laboratory, CAS Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
Correspondence
Professor Raymond C. K. Chan, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Beijing 100101, China.
Email: [email protected]
Funding information: National Natural Science Foundation of China, Grant/Award Number: 31970997
Abstract
During childhood, the ability to detect audiovisual synchrony gradually sharpens for simple stimuli such as flash-beeps and single syllables. However, little is known about how children perceive synchrony in natural, continuous speech. This study investigated young children's gaze patterns while they watched movies of two identical speakers telling stories side by side. Only one speaker's lip movements matched the voice; the other speaker's lip movements either led or lagged behind the soundtrack by 600 ms. Children aged 3–6 years (n = 94, 52.13% males) showed an overall preference for the synchronous speaker, with no age-related changes in synchrony-detection sensitivity, as indicated by similar gaze patterns across ages. However, viewing time to the synchronous speech was significantly longer in the auditory-leading (AL) condition than in the visual-leading (VL) condition, suggesting that asymmetric sensitivities to AL versus VL asynchrony are already established in early childhood. When further examining gaze patterns on dynamic faces, we found that focusing more attention on the mouth region was an adaptive strategy for reading visual speech signals and was thus associated with increased viewing time of the synchronous videos. Attention to detail, a dimension of autistic traits characterized by local processing, was correlated with worse performance in speech synchrony processing. These findings extend previous research by characterizing the development of speech synchrony perception in young children, and may have implications for clinical populations (e.g., autism) with impaired multisensory integration.
CONFLICT OF INTEREST
The authors declare that there are no conflicts of interest.
Supporting Information
Filename | Description |
---|---|
pchj538-sup-0001-supinfo.docx (Word 2007 document, 39.5 KB) | Appendix S1. Supporting Information |