Text-to-Speech Reader for Visually Impaired

Dr.S.Maheswari, S.Tejashree, V.S.Dharun, S.Udhaya Krishnan, M.Tharani Kumar; Dr.N.Saravanakumar

doi:10.15662/IJSRAT.2025.0802005

Abstract

Assistive devices can help visually impaired individuals access printed and visual information. This paper proposes a Text-to-Speech (TTS) and object detection system built on a Raspberry Pi. The system uses Tesseract OCR for text recognition and YOLOv8 for object detection, and Google Text-to-Speech (gTTS) converts the extracted text and the names of identified objects into audible speech. It is implemented in Python and uses OpenCV for image processing. With a user-friendly interface and minimal hardware, the system is designed for real-time processing. Built for accessibility and portability, it aims to increase the independence of visually impaired users and to make navigation and information access easier in diverse environments.
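The pipeline named in the abstract (OpenCV, Tesseract OCR, YOLOv8, gTTS) could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `build_announcement` and `describe_image`, the announcement wording, the `yolov8n.pt` weights file, and the output path are all hypothetical, and the sketch assumes the `opencv-python`, `pytesseract`, `ultralytics`, and `gTTS` packages are installed.

```python
def build_announcement(ocr_text, labels):
    """Combine recognized text and detected object labels into one sentence to speak."""
    parts = []
    # Collapse the line breaks and repeated spaces that OCR output usually contains.
    text = " ".join(ocr_text.split())
    if text:
        parts.append(f"Text reads: {text}.")
    if labels:
        # dict.fromkeys removes duplicate labels while keeping first-seen order.
        unique = list(dict.fromkeys(labels))
        parts.append("Objects detected: " + ", ".join(unique) + ".")
    return " ".join(parts) if parts else "Nothing recognized."


def describe_image(image_path, audio_path="announcement.mp3"):
    """Run OCR and object detection on one image and save the spoken description."""
    # Third-party imports stay local so build_announcement works without them.
    import cv2
    import pytesseract
    from gtts import gTTS
    from ultralytics import YOLO

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # Grayscale input is a common preprocessing step before OCR (an assumption here).
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    ocr_text = pytesseract.image_to_string(gray)

    model = YOLO("yolov8n.pt")  # assumed weights file; the paper does not name one
    result = model(image)[0]
    labels = [result.names[int(c)] for c in result.boxes.cls]

    # gTTS calls an online service, so this step needs a network connection.
    gTTS(text=build_announcement(ocr_text, labels), lang="en").save(audio_path)
    return audio_path
```

Calling `describe_image("page.jpg")` with a hypothetical image file would write an MP3 that an audio player on the Raspberry Pi could then play back. Keeping the sentence-building step separate from the library calls makes it possible to check the spoken wording without a camera or network.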

Article Information

Journal	International Journal of Science, Research and Technology
Volume (Issue)	Vol. 8 No. 2 (2025)
DOI	https://doi.org/10.15662/IJSRAT.2025.0802005
Pages	13874-13877
Published	April 8, 2025
Copyright	All rights reserved
Open Access	This work is licensed under a Creative Commons Attribution 4.0 International License.
How to Cite	Dr.S.Maheswari, S.Tejashree, V.S.Dharun, S.Udhaya Krishnan, M.Tharani Kumar, Dr.N.Saravanakumar (2025). Text-to-Speech Reader for Visually Impaired. International Journal of Science, Research and Technology (IJSRAT), Vol. 8 No. 2 (2025), pp. 13874-13877. https://doi.org/10.15662/IJSRAT.2025.0802005
