Elevenlabs Text-To-Speech and Arabic Listening Achievement in Pesantren
DOI: https://doi.org/10.32332/an-nabighoh.v28i1.1-20
Keywords: ElevenLabs, Text-to-Speech, Arabic Listening Skills, Artificial Intelligence, Pesantren Education
Abstract
Background: Listening comprehension remains a major challenge in Arabic language learning, particularly in non-Arabic-speaking contexts such as Indonesian pesantren. Limited exposure to authentic auditory input often hinders phoneme recognition, prosodic awareness, and comprehension of connected speech. Recent advances in artificial intelligence, especially Text-to-Speech (TTS) technology, offer new opportunities to provide consistent, high-quality auditory input that can support listening skill development.
Research Objectives: This study examines the effectiveness of ElevenLabs Text-to-Speech technology in improving students’ Arabic listening comprehension and explores students’ perceptions of the clarity, authenticity, and motivational impact of AI-generated Arabic audio.
Methodology: The study employed a mixed-methods, quasi-experimental pre-test–post-test design complemented by interviews and classroom observations. Sixty intermediate students from Pondok Pesantren Darullughah Wadda’wah were assigned to an experimental group and a control group. The experimental group used ElevenLabs TTS materials, while the control group used teacher-narrated audio. Quantitative data were analyzed using paired-sample t-tests, and qualitative data were processed through thematic analysis.
Results: The experimental group’s mean listening score improved significantly, from 63.2 to 82.7, while the control group showed only moderate improvement, from 64.5 to 71.4. The statistical analysis indicated a significant difference (p < 0.05). Qualitative results also showed that students perceived the AI-generated voices as clearer, more authentic, and more emotionally engaging, which enhanced their motivation and reduced listening anxiety.
Unique Contribution: This study contributes to the emerging field of AI-assisted Arabic language education by demonstrating how human-like TTS technology can be integrated effectively into pesantren-based learning environments while maintaining both pedagogical and spiritual values.
Conclusion: The study confirms that ElevenLabs TTS significantly enhances Arabic listening comprehension by providing consistent, expressive, and cognitively accessible auditory input for learners.
Recommendations: Future studies should explore long-term applications of AI-based listening tools across other Arabic language skills and in broader Islamic educational contexts.
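The reported means and the paired-sample t-test named in the methodology can be illustrated with a minimal sketch. The group means below are taken from the abstract; the helper names `mean_gain` and `paired_t` and the five-student score lists are assumptions introduced for illustration only and are not the authors' data or analysis script.

```python
from math import sqrt
from statistics import mean, stdev

# Group means reported in the abstract: (pre-test, post-test).
reported = {
    "experimental": (63.2, 82.7),
    "control": (64.5, 71.4),
}


def mean_gain(pre, post):
    """Post-test mean minus pre-test mean, rounded to one decimal place."""
    return round(post - pre, 1)


def paired_t(pre_scores, post_scores):
    """Paired-sample t statistic: mean difference divided by its standard error."""
    diffs = [after - before for before, after in zip(pre_scores, post_scores)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Gain per group, computed from the reported means only.
gains = {group: mean_gain(pre, post) for group, (pre, post) in reported.items()}
print(gains)

# HYPOTHETICAL scores for five students, for illustration only (not study data).
demo_pre = [60, 62, 65, 63, 66]
demo_post = [80, 81, 85, 82, 86]
t_stat = paired_t(demo_pre, demo_post)
print(t_stat)
```

From the reported means alone, the experimental group gained 19.5 points and the control group 6.9 points. The t statistic requires each student's paired scores, which the abstract does not provide, so the second half of the sketch runs on placeholder values only.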
License
Copyright (c) 2026 Asep Sunarko, Muhamad Solehudin, Nurin Sakinah

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.