就其整体架构和 Mirco Ravanelli 等人表现出来的“野心”来看,PyTorch-Kaldi 的潜力是比较大的。项目文档中关于下一个版本的描述是这样写的:“The architecture of the toolkit will be more modular and flexible. Beyond speech recognition, the new toolkit will be suitable for other applications such as speaker recognition, speech enhancement, speech separation, etc.”。当然,这只是一个工具而已,如果没有对语音识别技术的深刻理解,肯定是做不出更好东西的。许愿:有更多的人力和资源积极地投入到这个领域,帮助让 PyTorch-Kaldi 变得更好,或者打造出全新的比 PyTorch-Kaldi 更好的工具。
参考资料
[1] M. Ravanelli, T. Parcollet and Y. Bengio, "The Pytorch-kaldi Speech Recognition Toolkit," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 6465-6469.doi: 10.1109/ICASSP.2019.8683713
[2] D. Yu and L. Deng, Automatic Speech Recognition – A Deep Learning Approach, Springer, 2015.
[3] Kaldi 文档(kaldi-asr.org/doc/)(http://kaldi-asr.org/doc/%EF%BC%89)
[4] PyTorch-Kaldi Github 仓库(https://github.com/mravanelli/pytorch-kaldi)(https://github.com/mravanelli/pytorch-kaldi%EF%BC%89)
[5] 王赟. 语音识别技术的前世今生(https://www.zhihu.com/lives/843853238078963712%EF%BC%89)(https://www.zhihu.com/lives/843853238078963712%EF%BC%89%EF%BC%89)
更多关注微信公众号:jiuwenwang