Please use this identifier to cite or link to this item: http://ena.lp.edu.ua:8080/handle/ntb/42557
Title: Embedding speech recognition tools for custom software: Engines Overview
Authors: Dovbysh, Arthur
Alieksieiev, Vladyslav
Affiliation: Department of Applied Mathematics, Lviv Polytechnic National University
Bibliographic description: Dovbysh A. Embedding speech recognition tools for custom software: Engines Overview / Arthur Dovbysh, Vladyslav Alieksieiev // Computational linguistics and intelligent systems, 25-27 June 2018. — Lviv : Lviv Polytechnic National University, 2018. — Vol 2 : Workshop. — P. 114–121. — (Part 2. Workshop conference tracks. Section I. Computational Linguistics).
Is part of: Computational linguistics and intelligent systems (2), 2018
Publication date: 25-Jun-2018
Publisher: Lviv Polytechnic National University
Place of publication/event: Lviv
Temporal coverage: 25-27 June 2018
Keywords: speech recognition
speech engine
API
voice command detection
voice control
Google
Microsoft
Yandex
Julius
overview and analysis
Number of pages: 8
Page range: 114-121
Start page: 114
End page: 121
Abstract: Many solutions and tools for speech recognition are now available. Nevertheless, implementing natural language processing remains an open problem. Developing custom software with good UI/UX requires the integration of speech recognition. Evidently, the most common solution is to embed an existing engine as a standard tool. In this paper we present an overview and an analysis of several popular speech recognition engines: Google Speech Recognition API, Microsoft Speech API, Yandex Speech Kit and Julius. These speech recognition tools are ready to use and suitable for supplementing your own software with reliable voice command detection or a voice control feature. The results of our analysis come from a voice recognition experiment that used these tools as embedded components in custom software.
URI (Uniform Resource Identifier): http://ena.lp.edu.ua:8080/handle/ntb/42557
ISSN: 2523-4013
Copyright owner: © 2018 for the individual papers by the papers’ authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.
URL of related material: https://tech.yandex.ru/speechkit/
https://ru.wikipedia.org/wiki/Yandex.SpeechKit
https://cloud.google.com/speech-to-text/
https://techcrunch.com/2018/04/09/google-launchesan-
https://en.wikipedia.org/wiki/Microsoft_Speech_API
https://msdn.microsoft.com/ru-ru/library/hh362873(v=office.14).aspx
https://github.com/julius-speech/julius
References: 1. Lawrence R. Rabiner and Bernard Gold. Theory and Application of Digital Signal Processing - Prentice Hall, 1975 – 762 p.
2. Plotnikov V., Sukhanov V., Zhigulevtsev Yu. Rechevoy dialog v sistemakh upravleniya, Moscow: Mashinostroenie, 1988 – 223 p, In Russian, ISBN 5-217-00148-8
3. Yandex SpeechKit, Yandex – https://tech.yandex.ru/speechkit/ (Retrieved on May 2018)
4. Yandex.SpeechKit, Wikipedia.org, 18.05.2018 – https://ru.wikipedia.org/wiki/Yandex.SpeechKit (Retrieved on May 2018)
5. Minimum Prediction Residual Principle Applied to Speech Recognition, Itakura F., IEEE Transactions on Acoustics, Speech, and Signal Processing, February 1975, Vol. 23, No. 1, P. 67–72.
6. Cloud Speech-to-Text – Speech Recognition, Google Cloud – https://cloud.google.com/speech-to-text/ (Retrieved in May 2018)
7. Google launches an improved speech-to-text service for developers, F. Lardinois, Techcrunch.com – April 9, 2018, https://techcrunch.com/2018/04/09/google-launches-an-improved-speech-to-text-service-for-developers/
8. Microsoft Speech API, Wikipedia.org, 08.11.2017 – https://en.wikipedia.org/wiki/Microsoft_Speech_API (Retrieved on May 2018)
9. Microsoft Speech Platform SDK 11 Requirements and Installation, Microsoft Developer Network – https://msdn.microsoft.com/ru-ru/library/hh362873(v=office.14).aspx (Retrieved on May 2018)
10. Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine, Akinobu Lee, GitHub, April 2018 - https://github.com/julius-speech/julius (Retrieved on May 2018)
11. T. Kawahara, A. Lee, T. Kobayashi, K. Takeda, N. Minematsu, S. Sagayama, K. Itou, A. Ito, M. Yamamoto, A. Yamada, T. Utsuro and K. Shikano. "Free software toolkit for Japanese large vocabulary continuous speech recognition." In Proc. Int'l Conf. on Spoken Language Processing (ICSLP), Vol. 4, pp. 476-479, 2000.
12. A. Lee, T. Kawahara and K. Shikano. "Julius – an open source real-time large vocabulary recognition engine." In Proc. European Conference on Speech Communication and Technology (EUROSPEECH), pp. 1691-1694, 2001.
13. A. Lee and T. Kawahara. "Recent Development of Open-Source Speech Recognition Engine Julius" Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2009.
Content type: Conference Abstract
Appears in collections: Computational linguistics and intelligent systems. – 2018



All materials in the electronic resource archive are protected by copyright; all rights reserved.