Examining the Best Speech-to-Text Method for Audio Files in Podcasting
Keywords:
Podcast, Subtitles, Spectral Gating, Speech-to-Text, Silero, Vosk, Mozilla DeepSpeech, SpeechRecognition, Word Error Rate, Accuracy
Abstract
Podcasting is an effective way to share insights or opinions on any topic with an audience. Traditionally, recording a podcast required both parties to be physically present at the same location. The pandemic made this difficult, so recording moved to online platforms. The drawbacks of recording online are noise in the audio files and miscommunication between participants. The Spectral Gating method is used to remove the noise. This paper compares several pretrained speech-to-text models for converting audio to text. We ran an experiment on a range of audio files, and the SpeechRecognition pretrained model achieved the best accuracy.
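The keywords name Word Error Rate as an evaluation metric, but this page does not give the exact scoring procedure used in the experiment. As a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words) and simple lowercase whitespace tokenization, the metric could be computed as follows. The function name `word_error_rate` and the sample transcripts are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Assumption: both transcripts are lowercased and split on whitespace.
    The paper may normalize text differently (e.g. punctuation removal).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    truth = "the quick brown fox"
    model_output = "the quick fox"
    print(word_error_rate(truth, model_output))
```

Accuracy is often reported as one minus the Word Error Rate. Whether the paper uses that convention is an assumption here. Note also that the rate can exceed 1 when the hypothesis contains many inserted words.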
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.