Audio and Speech

Tutorial

Publishers

超级大脑 (Super Brain)

  • https://www.bilibili.com/video/av29501100/
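  • Implanted chips, exoskeletons, EEG control, OpenBCI, EEG interference, sentiment analysis
  • Bipedal robots, machine training, simulated neurons / neural towers, OpenWorm (nematode simulation), IBM Blue Brain (Blue Brain Project), OpenCog (framework for AI and artificial general intelligence), Mind Uploading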

AI Speech Algorithms

  • search taobao
  • ReSpeaker
  • seq2seq
  • Question-answering bots, chatbots, dialogue bots

Python 3.7

http://www.cnblogs.com/devilmaycry812839668/p/9274547.html

pytorch

GoogleNews-vectors-negative300.bin.gz
https://github.com/lucatosto/chatbot
https://github.com/nicolenair/chatbot

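自然语言处理与深度学习 (Natural Language Processing and Deep Learning)

自然言語処理と深層学習 C言語によるシミュレーション (Natural Language Processing and Deep Learning: Simulation in C) by 小高 知宏
http://cs.nju.edu.cn/rinc/book2017.html
search baidupan nlp_code_data
機械学習と深層学習 (Machine Learning and Deep Learning)
https://www.ohmsha.co.jp/book/9784274220333/
https://www.ohmsha.co.jp/book/9784274218873/
机器学习与深度学习 (Machine Learning and Deep Learning)
(baidupan) ml_and_dl_code_data.zip
(baidupan) search 机器学习与深度学习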

csdn

  • search baidupan csdn.7z
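  • HowNet lexicon
  • Chinese word-segmentation lexicon with nearly 400,000 entries
  • Chinese word segmentation: compiled synonym collection
  • Classified lexicon of Chinese synonyms and near-synonyms, plus NLP lexicons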
  • search cwb
  • search EvaluationSystem

Tencent AI Lab Embedding Corpus for Chinese Words and Phrases

https://zhuanlan.zhihu.com/p/47133426
https://ai.tencent.com/ailab/nlp/embedding.html

funNLP

https://github.com/fighting41love/funNLP

PocketSphinx

https://github.com/cmusphinx/pocketsphinx
https://www.cnblogs.com/bhlsheji/p/4514475.html
https://blog.csdn.net/zouxy09/article/details/7942784

dingdang-robot

snowboy

https://github.com/Kitt-AI/snowboy

解析深度学习:语音识别实践 (Analyzing Deep Learning: Speech Recognition in Practice)

CNTK

https://github.com/Microsoft/CNTK

HTK-Android

https://github.com/lichard49/HTK-Android
search HMM speech

HMM

https://www.cnblogs.com/skyme/p/4651331.html
https://github.com/hankcs/Viterbi

nn_speach_rec

https://github.com/zavarovkv/nn_speach_rec

Keras

https://github.com/keras-team/keras

d2l-zh

https://github.com/diveintodeeplearning/d2l-zh

Books

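  • Python机器学习经典实例 (Python Machine Learning Cookbook)
  • 数字语音处理及MATLAB仿真 (Digital Speech Processing and MATLAB Simulation)

Python机器学习经典实例 (Python Machine Learning Cookbook)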

https://github.com/PacktPublishing/Python-Machine-Learning-Cookbook/blob/master/Chapter07/speech_recognizer.py
hmm-speech-recognition
https://github.com/reddyb/hmm-speech-recognition
https://code.google.com/archive/p/hmm-speech-recognition/downloads

Speech recognition (Python on Raspberry Pi)

http://tieba.baidu.com/p/5971072647
Baidu
https://github.com/zh2209645/Sample-Raspberry-Chatbot
iFLYTEK (科大讯飞)
https://github.com/wwptrdudu/Voice_Recognition_Control_Robot

scikit-learn

https://scikit-learn.org/stable/
scipy
http://www.scipy.org
numpy
http://www.numpy.org

Watermelon Book (西瓜书)

Zhou Zhihua, 《机器学习》 (Machine Learning)
https://github.com/datawhalechina/pumpkin-book

The three essential machine learning libraries

numpy (foundation library)
Matplotlib (plotting)
pandas (data processing, reading CSV)

OC Volume

http://ocvolume.sourceforge.net

sunplus

https://github.com/super-1943/MCU/tree/master/sunplus/voice

PyTorch_Tutorial

https://github.com/tensor-yu/PyTorch_Tutorial

Three major improvements to DQN (1): Double DQN

https://www.jianshu.com/p/fae51b5fe000
https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow
https://morvanzhou.github.io/tutorials/

facebookresearch/wav2letter

https://github.com/facebookresearch/wav2letter

facebookresearch/fastText

https://github.com/facebookresearch/fastText
https://fasttext.cc
NLP | Advanced word-vector representations (2): FastText (overview and study notes)
https://blog.csdn.net/sinat_26917383/article/details/54850933

Python自然语言处理实战:核心技术与算法 (Python Natural Language Processing in Action: Core Techniques and Algorithms)

seq2seq
https://github.com/nlpinaction/learning-nlp/tree/master/chapter-10/seq2seq

语音信号处理(C++版) (Speech Signal Processing, C++ Edition)

China Machine Press (机械工业出版社)
search baidupan, 语音信号处理C++

caffe

https://github.com/BVLC/caffe
Deep learning practice with Caffe
https://blog.csdn.net/san_junipero/article/details/79219730

深入理解TENSORFLOW架构设计与实现原理 (Understanding TensorFlow in Depth: Architecture Design and Implementation Principles)

http://www.ituring.com.cn/book/2397
https://github.com/DjangoPeng/tensorflow-in-depth

大规模中文自然语言处理语料 Large Scale Chinese Corpus for NLP

https://github.com/brightmart/nlp_chinese_corpus

stylegan

https://github.com/NVlabs/stylegan
https://mobile.twitter.com/roadrunning01/status/1095151658034757633

leon

https://getleon.ai
https://github.com/leon-ai/leon

SC-FEGAN

https://github.com/JoYoungjoo/SC-FEGAN

NeroParser

https://github.com/yaoguangluo/NeroParser

Practical Pytorch

https://github.com/spro/practical-pytorch

深度学习入门之PyTorch (Introduction to Deep Learning with PyTorch)

https://github.com/L1aoXingyu/code-of-learn-deep-learning-with-pytorch

面向自然语言处理的深度学习:用Python创建神经网络 (Deep Learning for Natural Language Processing: Creating Neural Networks with Python)

https://github.com/Apress/deep-learning-for-natural-language-processing
https://www.apress.com/gp/book/9781484236840

ShiqiYu/libfacedetection

https://github.com/ShiqiYu/libfacedetection

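物联网应用设计与实战 基于AVR单片机和Python (IoT Application Design and Practice with AVR Microcontrollers and Python)

神经网络与深度学习 (Neural Networks and Deep Learning)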

https://nndl.github.io/
https://github.com/nndl/nndl.github.io

机器学习实战 (Machine Learning in Action)

https://github.com/pbharrin/machinelearninginaction
https://github.com/apachecn/AiLearning
search baidupan 机器学习实战
https://feisky.xyz/machine-learning/basic.html

PyTorch机器学习从入门到实战 (PyTorch Machine Learning: From Beginner to Practice)

https://github.com/xiaobaoonline/pytorch-in-action

tensorflow2_tutorials_chinese

https://github.com/czy36mengfei/tensorflow2_tutorials_chinese

alibaba/MNN

https://github.com/alibaba/MNN

《程序员代码面试指南》 (The Programmer's Coding Interview Guide) by 左程云 (Zuo Chengyun)

https://www.jianshu.com/p/c6b26f3a97b6
https://github.com/wutengfei/ZuoChengyun

Python-100-Days

https://github.com/jackfrued/Python-100-Days

动手学深度学习 (Dive into Deep Learning)

https://github.com/d2l-ai/d2l-zh

统计学习方法 (Statistical Learning Methods)

https://github.com/fengdu78/lihang-code

Microsoft AI education and learning community

https://github.com/microsoft/ai-edu

An entry-level project for face, video, and text detection and recognition

https://github.com/vipstone/faceai

Chinese pre-trained BERT-wwm

https://github.com/ymcui/Chinese-BERT-wwm

mmdetection

https://github.com/open-mmlab/mmdetection

leeml-notes

https://github.com/datawhalechina/leeml-notes

makcedward/nlp

https://github.com/makcedward/nlp

Study notes on Zhou Zhihua's 《机器学习》 (Machine Learning)

https://github.com/Vay-keen/Machine-learning-learning-notes

C-OCR

https://github.com/ctripcorp/C-OCR

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

https://github.com/freewym/espresso

Notes and resources for Andrew Ng's deep learning courses

https://github.com/fengdu78/deeplearning_ai_books

face_recognition

https://github.com/ageitgey/face_recognition

machine-learning-yearning-cn

https://github.com/deeplearning-ai/machine-learning-yearning-cn

1 MB lightweight generic face detection model

https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB

Open-source deep learning book, hands-on with TensorFlow 2.0

https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book

TensorFlow 2.0 introductory example code and hands-on tutorials

https://github.com/dragen1860/TensorFlow-2.x-Tutorials

Some Books

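  • Python机器学习算法:原理、实现与案例 (Python Machine Learning Algorithms: Principles, Implementation, and Case Studies)
  • 树莓派创客:手把手教你搭建机器人 (Raspberry Pi Maker: Build a Robot Step by Step)
  • 游戏AI程序设计实战 (Game AI Programming in Practice)
    Written by a Chinese author; covers Unity and behavior trees
  • 微信小程序开发实践 (WeChat Mini Program Development in Practice)
    Mentions wepy and mpvue
  • Python 3破冰人工智能:从入门到实战 (Breaking the Ice on AI with Python 3: From Beginner to Practice)
  • Java趣味编程100例 (100 Fun Java Programming Examples)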

DeepSpeech

https://github.com/mozilla/DeepSpeech

Dive-into-DL-PyTorch

https://github.com/ShusenTang/Dive-into-DL-PyTorch

AI_Sudoku

https://github.com/neeru1207/AI_Sudoku

mit-deep-learning

https://github.com/lexfridman/mit-deep-learning

numpy-ml

https://github.com/ddbourgin/numpy-ml

Dive-into-DL-TensorFlow2.0

https://github.com/TrickyGo/Dive-into-DL-TensorFlow2.0

深度学习:语音识别技术实践 (Deep Learning: Speech Recognition Technology in Practice)

kaldi

libfacedetection

https://github.com/ShiqiYu/libfacedetection

chineseocr_lite

https://github.com/ouyanghuiyu/chineseocr_lite

CVPR2020-Code

https://github.com/amusi/CVPR2020-Code

esp-sr speech command recognition / wake command (not open source ???)

https://www.espressif.com/zh-hans/node/4127
ESP32 has two ASR frameworks: WakeNet and MultiNet
https://github.com/espressif/esp-sr/tree/master/speech_command_recognition
https://github.com/DEV-IA/wakenet/tree/master/components/esp-sr/speech_command_recognition
https://github.com/espressif/esp-sr/blob/master/wake_word_engine/README.md
https://github.com/espressif/esp-skainet

Introduction to an open-source Chinese speech recognition project: ASRFrame

https://github.com/sailist/ASRFrame
https://blog.csdn.net/sailist/article/details/95751825

DiDi reports new progress in speech recognition: an attention-based model significantly improves Chinese recognition accuracy

https://blog.csdn.net/weixin_40789411/article/details/85238766

mediapipe

https://github.com/google/mediapipe

Intro to Reinforcement Learning (强化学习纲要)

https://github.com/zhoubolei/introRL

Leon is your open-source personal assistant

https://getleon.ai
https://github.com/leon-ai/leon
https://github.com/leon-ai/leon/blob/develop/server/src/tts/tts.js
https://docs.getleon.ai/glossary.html#nlu

Found on CSDN: OC Volume, an old Java speech recognition engine

http://ocvolume.sourceforge.net/ocvolume.php

Depth-Aware Video Frame Interpolation (CVPR 2019)

https://github.com/baowenbo/DAIN

The world's simplest facial recognition api for Python and the command line

https://github.com/ageitgey/face_recognition

《神经网络与深度学习》 邱锡鹏著 Neural Network and Deep Learning

https://nndl.github.io
https://github.com/nndl/nndl.github.io

wukong-robot

https://github.com/wzpan/wukong-robot
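wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project,
and possibly the first open-source smart speaker project to support brain-computer interaction.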
https://wukong.hahack.com/

Isolated-word speech recognition on STM32

https://github.com/gk969/stm32-speech-recognition

MegEngine

https://github.com/MegEngine/MegEngine/blob/master/README_CN.md

自美 (Zimei) intelligent system

http://docs.16302.com/1100464
https://gitee.com/kxdev/zimeimojing
https://github.com/drbdrb/zimei

Frontend for Home Assistant

https://demo.home-assistant.io
https://github.com/home-assistant/frontend

corvin_zhang / ros_voice_system

https://code.corvin.cn/corvin_zhang/ros_voice_system
https://github.com/stefantasy/ros_voice_system
https://code.corvin.cn/corvin_zhang/raspberryPi_AI_soundCard_driver
https://github.com/dabing3000/ros_voice_system
search baidupan, 木星中文语音对话系统

Respeaker

Smart speaker comparison, chip comparison

Waveshare (微雪) 13.3-inch smart magic mirror

Maixduino / MaixPy

https://github.com/Technica-Corporation/Speech_Recognition-Maixduino
https://github.com/andriyadi/Maix-SpeechRecognizer
https://github.com/sipeed/Maixduino/tree/master/libraries/Maix_Speech_Recognition

Baidu AI

iFLYTEK (科大讯飞) speech recognition

  • baidupan, search SpeechDemo
  • https://www.xfyun.cn
  • Official SDK download (cross-platform calls into the msc dynamic library); baidupan, search Android_iat or 讯飞

Companion code for the book 语音信号处理 (Speech Signal Processing), C++ edition

  • baidupan, search 语音信号处理

(Year of the Rat greeting) A perfect match for Raspberry Pi, building a high-end AI speaker: ReSpeaker Mic Array v2.0 review

http://wiki.seeedstudio.com/cn/ReSpeaker-USB-Mic-Array/
https://github.com/respeaker/usb_4_mic_array
https://www.cirmall.com/articles/22139

NAU88C10

WM8978

WM8960, Waveshare (微雪)

VS1053, VS1003, Waveshare (微雪)

Smart speaker development kits

*-audio-kit

  • Raspberry Pi + ReSpeaker
  • AliOS Things Starter Kit / Developer Kit
  • ROC-RK3308-CC (only the older revision can connect a microphone array)
  • Maix Dock, Maixduino, Maix bit, Maix Go
  • Breakout for LinkIt Smart 7688 v2.0 expansion board
  • Orange Pi 3

INMP441

I2S MEMS microphone

Speech recognition on Windows with the Microsoft Speech SDK

https://blog.csdn.net/marleylee/article/details/77116609

julius, for Raspberry Pi

https://osdn.net/projects/julius/
ラズパイで音声認識をしてみる (Trying speech recognition on a Raspberry Pi)
http://usicolog.nomaki.jp/engineering/raspberryPi/raspberryPi_Julius.html
http://sourceforge.jp/frs/redir.php?m=osdn&f=%2Fjulius%2F60273%2Fjulius-4.3.1.tar.gz
http://sourceforge.jp/frs/redir.php?m=jaist&f=%2Fjulius%2F60416%2Fdictation-kit-v4.3.1-linux.tgz
http://sourceforge.jp/frs/redir.php?m=osdn&f=%2Fjulius%2F51159%2Fgrammar-kit-v4.1.tar.gz

web-audio-recognition

https://github.com/google/web-audio-recognition

AndroidMaryTTS

https://github.com/AndroidMaryTTS/AndroidMaryTTS
search gh, HammingWindow fft

WholeWordAutomaticSpeechRecognizer

https://github.com/gianpaolocoro/WholeWordAutomaticSpeechRecognizer
search gh, HammingWindow fft Mfcc
https://github.com/ptemplin/SpeechRecognition

DBFace is a real-time, single-stage detector for face detection, with faster speed and higher accuracy

https://github.com/dlunion/DBFace

Speech recognition: building an HMM-GMM isolated-word recognition model in Python

https://blog.csdn.net/chinatelecom08/article/details/82901480
https://github.com/audier/my_python_play/tree/master/hmm_gmm_speech_model

DeepSpeechRecognition

https://github.com/audier/DeepSpeechRecognition
https://audier.github.io/2018/09/11/chinese-speech-asrt/

Google AIY Voice Kit

https://github.com/google/aiyprojects-raspbian
Google AIY Voice Kit: installation and hands-on experience with the voice development kit
http://www.soomal.com/doc/10100008372.htm
https://dl.google.com/aiyprojects/voice/aiyprojects-2017-09-11.img.xz

数字语音处理及MATLAB仿真 (Digital Speech Processing and MATLAB Simulation)

search baidupan 数字语音处理及MATLAB仿真
matlab

Python 3破冰人工智能:从入门到实战 (Breaking the Ice on AI with Python 3: From Beginner to Practice)

Python

Baidu PaddlePaddle (飞桨)

https://github.com/PaddlePaddle/Paddle
https://www.paddlepaddle.org.cn

lstm

search lstm_bias, esp32 mn static lib
https://github.com/thomasschmied/Speech_Recognition_with_Tensorflow
https://github.com/ChaosCY/LAS-asr
search baidupan, esp-sr.rar
(xxxxxx)
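So I repeated an old trick, the one I used last time to decompile the static library of the Sunplus 61 MCU (凌阳61单片机): I opened the ESP32 speech recognition static library in a text editor and discovered a new world (scratch that), discovered that it really does use an RNN (recurrent neural network). It contains a string named lstm_bias (LSTM stands for long short-term memory, which sounds like a contradiction in terms). On GitHub there is a fairly well-starred project called Speech_Recognition_with_Tensorflow:
https://github.com/thomasschmied/Speech_Recognition_with_Tensorflow
There is also a paper, "Listen, Attend and Spell":
https://arxiv.org/pdf/1508.01211.pdf
(xxxxxx)
As for the ESP32's closed-source speech recognition engine (only a static library is provided for linking, with no source), this introduction is worth reading:
https://www.espressif.com/zh-hans/node/4127
In short, the ESP32 splits the problem into two smaller ones (as most voice assistants do): WakeNet handles the wake word and MultiNet handles speech command recognition:
https://github.com/espressif/esp-sr/blob/master/wake_word_engine/README_cn.md
https://github.com/espressif/esp-sr/blob/master/speech_command_recognition/README_cn.md
There is also a unified voice assistant framework (added note: speech recognition plus speech synthesis):
https://github.com/espressif/esp-skainet/blob/master/README_cn.md
https://github.com/espressif/esp-skainet/blob/master/components/esp-tts/README.md
They are all based on MFCC. The catch is that the deep learning part is not open source, and even if it were, I probably could not follow it.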
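
The inspection described above (opening the static library in a text editor and spotting a string such as lstm_bias) can be approximated in a few lines of Python. This is only a sketch of a `strings`-like scan for printable ASCII runs; the function names and the sample bytes are invented for illustration and are not taken from any real library file:

```python
import re


def extract_strings(data, min_len=4):
    """Return printable-ASCII runs of at least `min_len` bytes found in `data`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_keywords(data, keywords):
    """Return the extracted strings that contain any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [s for s in extract_strings(data)
            if any(k in s.lower() for k in lowered)]


if __name__ == "__main__":
    # Synthetic bytes standing in for a static library (hypothetical content).
    blob = b"\x00\x01lstm_bias\x00\xff\x02ab\x00_VTBStartCon\x00"
    print(extract_strings(blob))
    print(find_keywords(blob, ["lstm", "vtb"]))
```

For a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `find_keywords`. Short runs such as `ab` are dropped by the minimum-length filter, which keeps random byte noise out of the result.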

LSTM explained in detail

https://www.cnblogs.com/yjybupt/p/10861904.html

实时语音处理实践指南 (A Practical Guide to Real-Time Speech Processing)

http://www.broadview.com.cn/book/6099
search baidupan 实时语音处理实践指南

可解释的机器学习:黑盒模型可解释性理解指南 (Interpretable Machine Learning: A Guide for Making Black Box Models Explainable)

https://github.com/MingchaoZhu/InterpretableMLBook

《深度学习》 (Deep Learning, the "flower book")

https://github.com/MingchaoZhu/DeepLearning

Hand-written implementations of every algorithm in Li Hang's 《统计学习方法》 (Statistical Learning Methods)

https://github.com/Dod-o/Statistical-Learning-Method_Code

CloudWalk (云从科技)

https://www.cloudwalk.cn/ (on the Entity List)

olivia

https://github.com/olivia-ai/olivia

DeepSpeech

https://github.com/PaddlePaddle/DeepSpeech/blob/develop/README_cn.md

speech-mfcc

https://github.com/education-service/speech-mfcc

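MATLAB在语音信号分析和合成中的应用 (Applications of MATLAB in Speech Signal Analysis and Synthesis)

a.k.a. MATLAB语音信号分析与合成(第2版) (MATLAB Speech Signal Analysis and Synthesis, 2nd edition), 宋知用, Beihang University (北京航空航天大学)
https://www.ilovematlab.cn/forum-173-1.html
https://www.ilovematlab.cn/thread-530017-1-1.html
search 语音 on this website (20131118mlyy.zip and 20171013085903_22369.zip):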
http://press.buaa.edu.cn

tensorflow lite (tflite) micro_speech

(o) search baidupan, speech_commands_v0.02.tar.gz
(o) search baidupan, Blink_esp32_rpd2017_v2_success.tar.gz, for raspberry pi desktop
(o) search baidupan, Blink_esp32_v3_build_success.rar, for esp32, not tested
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/micro/examples/
https://github.com/squix78/esp32-tensorflow-microspeech
https://github.com/boochow/TFLite_Micro_MicroSpeech_M5Stack
32F746GDISCOVERY, STM32F746 Discovery kit, waveshare RMB488
https://www.st.com/en/evaluation-tools/32f746gdiscovery.html
STM32H747I-DISCO, MEMS digital microphone, RMB784
https://www.st.com/en/evaluation-tools/stm32h747i-disco.html
STM32F4DISCOVERY, waveshare RMB168
https://www.st.com/en/evaluation-tools/stm32f4discovery.html
https://circuitpython.org/board/stm32f4_discovery/
32F411EDISCOVERY, RMB120
https://www.st.com/en/evaluation-tools/32f411ediscovery.html
32L476GDISCOVERY, RMB224
https://www.st.com/en/evaluation-tools/32l476gdiscovery.html
TinyML
https://www.oreilly.com/library/view/tinyml/9781492052036/

TensorflowTTS

https://github.com/TensorSpeech/TensorflowTTS

数字语音处理及MATLAB仿真(第2版) (Digital Speech Processing and MATLAB Simulation, 2nd edition)

https://www.hxedu.com.cn/hxedu/hg/book/bookInfo.html?code=G0280790
baidupan, search [数字语音处理及MATLAB仿真(第2版)][张雪英][程序源代码].zip

espnet

https://github.com/espnet/espnet

Installing the latest TensorFlow (2.2 at the time of writing) on Xubuntu 20.04 64-bit

(Check whether Python is a 64-bit build)
$ python3
>>> import platform
>>> platform.architecture()
$ sudo apt-get update
$ sudo apt-get install python3-pip
$ pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow-cpu
$ python3
>>> import tensorflow
(crashes with SIGILL in nsync::nsync_mu_init)
search: avx tensorflow
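
The steps above end with `import tensorflow` dying with SIGILL, and the search keywords point at AVX. A minimal sketch for checking both preconditions from Python before installing, assuming an x86 Linux machine where `/proc/cpuinfo` lists CPU features on `flags` lines; the helper names `is_64bit_python`, `cpu_has_flag`, and `read_cpuinfo` are made up for this illustration:

```python
import platform


def is_64bit_python():
    """Return True when the running interpreter is a 64-bit build."""
    return platform.architecture()[0] == "64bit"


def cpu_has_flag(flag, cpuinfo_text):
    """Return True if `flag` appears on any 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags" and flag in value.split():
            return True
    return False


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the Linux CPU description; return '' where the file is missing."""
    try:
        with open(path, encoding="utf-8", errors="replace") as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("64-bit Python:", is_64bit_python())
    print("CPU reports AVX:", cpu_has_flag("avx", read_cpuinfo()))
```

If the second line prints False, the official pip wheels (which have been built with AVX since TensorFlow 1.6) will most likely not run on that CPU, and a build without AVX is needed instead.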

python2 pip install

https://www.cnblogs.com/zhuangliu/archive/2016/11/20/6083063.html
wget https://bootstrap.pypa.io/get-pip.py
sudo python2.7 get-pip.py
sudo python2.7 -m pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
$ python2.7  (start the interpreter and run from there)

(DO NOT USE!) Installing tensorflow 2.0.0a0 from a mirror in China

https://blog.csdn.net/Lip_tom/article/details/89761639
Mirror method:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow-gpu==2.0.0a0
The general form is: pip install -i <mirror URL> <tensorflow package and version>
Mirror URLs include:
Tsinghua: https://pypi.tuna.tsinghua.edu.cn/simple (very complete, has everything)
Douban: http://pypi.douban.com/simple/ (version 1.0 personally tested and working)
tensorflow 2.0 versions
GPU: tensorflow-gpu==2.0.0a0  # 258.4 MB
CPU: tensorflow-cpu==2.0.0a0  # about 49 MB
Official guides:
https://tensorflow.google.cn/install/source_rpi?hl=zh_cn#python-2.7
https://tensorflow.google.cn/install/pip?hl=zh_cn&lang=python2
Installing TensorFlow (Ubuntu 16.04)
https://www.cnblogs.com/zhuangliu/archive/2016/11/20/6083063.html
https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

Speech recognition from scratch (with explanations), Python

https://www.bilibili.com/video/BV1pE411B7Ja
search baidupan, 从0开始语音识别

TensorFlow speech_commands

https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands
Can use Tensorflow 1.5.0
https://github.com/tensorflow/tensorflow/releases/tag/v1.5.0
Train data
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/speech_commands/train.py
Recognition
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/speech_commands/label_wav.py
how to run, see this
https://www.cnblogs.com/lijianming180/p/12258774.html
RKNN
http://t.rock-chips.com/forum.php?mod=viewthread&tid=456&extra=page%3D1
https://github.com/tensorflow/docs/blob/master/site/en/r1/tutorials/sequences/audio_recognition.md

tflite-speech-recognition

https://github.com/ShawnHymel/tflite-speech-recognition
https://www.digikey.com/en/maker/projects/tensorflow-lite-tutorial-part-1-wake-word-feature-extraction/54e1ce8520154081a58feb301ef9d87a
https://www.digikey.com/en/maker/projects/intro-to-tinyml-part-1-training-a-model-for-arduino-in-tensorflow/8f1fc8c0b83d417ab521c48864d2a8ec

Deep learning models for speech recognition

http://www.hackcha.cn/?p=402

TensorFlow releases an introductory speech recognition tutorial, with a 1 GB dataset and code

https://www.sohu.com/a/167209693_798050

Training your own speech recognition AI with TensorFlow

https://www.cnblogs.com/lijianming180/p/12258774.html
(detailed record of the command-line arguments and steps)

EasyOCR

https://github.com/JaidedAI/EasyOCR

hmmlearn and python_speech_features

https://github.com/jameslyons/python_speech_features
https://github.com/hmmlearn/hmmlearn
Speech recognition in Python with the hmmlearn library (hidden Markov models)
https://blog.csdn.net/zhuyijun09/article/details/82086000

《语音信号处理实验教程》 (Speech Signal Processing Lab Tutorial)

http://www.cmpedu.com/books/book/2052865.htm
search baidupan, 语音信号处理代码.rar

Resolving the problem of TensorFlow not supporting the AVX2 instruction set

https://github.com/fo40225/tensorflow-windows-wheel
https://www.imooc.com/article/details/id/289425

RKNN, tensorflow

RK1808
RK3399Pro
https://github.com/tensorflow/docs/blob/master/site/en/r1/tutorials/sequences/audio_recognition.md
http://t.rock-chips.com/forum.php?mod=viewthread&tid=456&extra=page%3D1
http://t.rock-chips.com/portal.php?mod=list&catid=11&product_id=28

Speech on Jetson Nano

https://blog.csdn.net/chencef/article/details/96900061
pip install SpeechRecognition
pip install gTTS-token
pip install gTTS
pip install pygame
sudo apt install python-pyaudio python3-pyaudio ( version )
sudo apt install portaudio19-dev python-all-dev python3-all-dev
pip install PyAudio

Speech Recognition

https://blog.csdn.net/chen_gong_ping/article/details/91442422
Speech recognition: a deep-learning-based Chinese speech recognition tutorial (code walkthrough) (continued below)
https://blog.csdn.net/chinatelecom08/article/details/85013535
(continued) GitHub link: deep-learning-based Chinese speech recognition system
https://github.com/audier/DeepSpeechRecognition
Deep-learning-based Chinese speech recognition (code walkthrough of a self-attention language model)
https://blog.csdn.net/chinatelecom08/article/details/85051817
Speech recognition with Python and Keras
https://github.com/BenShuai/kerasTfPoj
https://blog.csdn.net/sunshuai_coder/article/details/83658625
search baidupan python_keras实现语音识别
Datasets: Tsinghua University THCHS30 Chinese speech dataset
data_thchs30.tgz: http://cn-mirror.openslr.org/resources/18/data_thchs30.tgz
test-noise.tgz: http://cn-mirror.openslr.org/resources/18/test-noise.tgz
resource.tgz: http://cn-mirror.openslr.org/resources/18/resource.tgz
2. Free ST Chinese Mandarin Corpus
ST-CMDS-20170001_1-OS.tar.gz: http://cn-mirror.openslr.org/resources/38/ST-CMDS-20170001_1-OS.tar.gz
https://blog.csdn.net/chen_gong_ping/article/details/91442422

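BSRV215.LIB hack, Sunplus 61 MCU (凌阳61单片机)

I once read through an old book which said that the feature extraction in the Sunplus 61 MCU's speech recognition algorithm might be based on MFCC (mel-frequency cepstral coefficients),
or might instead be based on LPCC (linear prediction cepstral coefficients), which are easier for an MCU to compute. (The book ruled out MFCC, but the lib file does mention Mel.)
The book gave no details of how the MCU's HMM model is trained. Inspecting the binary contents of the lib file (all text extracted with the Linux command strings -a BSRV215.LIB)
suggests that it most likely uses the Viterbi algorithm, because these strings appear in it: fixvtb_con, _VTBStartCon, _VTBNextCon
It also involves the following techniques:
<1> CMS
Cepstral mean subtraction (CMS): effectively reduces the influence of the speech input channel on the feature parameters.
https://www.cnblogs.com/welen/p/4096708.html?utm_source=tuicool&utm_medium=referral
<2> IDCT
Inverse discrete cosine transform. The discrete cosine transform (DCT) decorrelates the signal dimensions and maps the signal into a lower-dimensional space.
<3> Pre-Emphasis
Pre-emphasis. A noise-shaping technique from analog signal processing; in a speech front end it boosts the high-frequency part of the signal before feature extraction.
<4> HammingWindow
Hamming window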
search baidupan, BSRV215.LIB_readme.txt
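
The four techniques listed above can be sketched with NumPy. This is a generic illustration of pre-emphasis, Hamming windowing, a DCT, and cepstral mean subtraction, not a reconstruction of what BSRV215.LIB actually does. The 0.97 coefficient, the frame sizes, and all function names are assumptions, and the mel filterbank that a real MFCC front end needs is left out, so the result is a plain cepstrum:

```python
import numpy as np


def pre_emphasis(signal, coeff=0.97):
    """y[n] = x[n] - coeff * x[n-1]; the first sample is passed through."""
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])


def frame_and_window(signal, frame_len, hop):
    """Split into overlapping frames and apply a Hamming window to each."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)


def dct2(x, n_out):
    """Unnormalised DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T


def cepstral_mean_subtraction(cepstra):
    """Subtract the per-coefficient mean taken over all frames (CMS)."""
    return cepstra - cepstra.mean(axis=0, keepdims=True)


if __name__ == "__main__":
    rate = 8000  # assumed sample rate for the demo tone
    t = np.arange(rate) / rate
    x = np.sin(2 * np.pi * 440 * t)
    frames = frame_and_window(pre_emphasis(x), frame_len=256, hop=128)
    log_spec = np.log(np.abs(np.fft.rfft(frames, axis=1)) ** 2 + 1e-10)
    cepstra = cepstral_mean_subtraction(dct2(log_spec, n_out=13))
    print(cepstra.shape)
```

Keeping only the first few DCT terms is what maps each frame into a lower-dimensional space, and subtracting the mean over frames removes any component that stays constant across the utterance, which is how CMS reduces channel effects.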

[Speech recognition] From beginner to expert: the most complete collection of practical resources

https://yq.aliyun.com/articles/665231?utm_content=m_1000022483

Principles of hidden Markov models for speech recognition (HMM+GMM)

https://www.cnblogs.com/christlxl/articles/5017746.html

search github: mfcc Dist Weight

Getting started with Kaldi speech recognition through the yesno model

https://blog.csdn.net/u011930705/article/details/81737937

P-Brain.ai

https://github.com/patrickjquinn/P-Brain.ai
https://github.com/patrickjquinn/P-Brain.ai-RasPi
Trying out the P-Brain.ai virtual assistant
https://www.jianshu.com/p/f0c7cb4bd624
SpeechKITT
https://github.com/TalAter/SpeechKITT

Speech recognition: feature parameter extraction (1)

https://blog.csdn.net/w_manhong/article/details/78977833

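DTW algorithm

Espressif ESP32 learning journey 23: porting the latest esp-adf audio framework to the Ai-Thinker esp32-a1s audio development board, and a first try at online text-to-speech playback.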

https://blog.csdn.net/xh870189248/article/details/104160104
https://github.com/Ai-Thinker-Open/Ai-Thinker-Open_ESP32-A1S_ASR_SDK
https://github.com/Ai-Thinker-Open/ESP32-A1S-AudioKit
https://github.com/weimingtom/ESP32-A1S-AudioKit

P3421, with Raspberry Pi, Arduino Zero, Feather M0

https://learn.adafruit.com/adafruit-i2s-mems-microphone-breakout/overview

voicetainment with Qt and pocketsphinx

https://github.com/PacktPublishing/Hands-On-Embedded-Programming-with-CPP-17/tree/master/Chapter08/voicetainment
https://github.com/MayaPosch/EmbeddedProgrammingWithCpp17

AndroidCMUSphinx

https://github.com/liufuliang/AndroidCMUSphinx

essentia

https://github.com/MTG/essentia

Isolated-word-speech-recognition

https://github.com/echos2019/Isolated-word-speech-recognition
in Matlab

DTW_Digital_Voice_Recognition

https://github.com/zhengyima/DTW_Digital_Voice_Recognition
in Matlab

MNIST handwritten digit recognition on the K210

https://blog.csdn.net/weixin_44874976/article/details/104487069

Yahboom (亚博智能), microphone array

https://www.yahboom.com/study_module/MicArray

MicArray----MaixBit        MicArray------MaixBit
LED_CK------IO25           LED_DA--------IO24
MIC_D0------IO23           MIC_D1--------IO22
MIC_D2------IO21           MIC_D3--------IO20
MIC_WS------IO19           MIC_CK--------IO18
VIN---------5V             GND-----------GND

https://cn.dl.sipeed.com/MAIX/HDK/Sipeed-R6%2B1_MicArray/
https://www.waveshare.net/wiki/Maix_R6%2B1_Microphone_Array

google speech api

https://github.com/googleapis/python-speech
https://github.com/MhageGH/esp32_CloudSpeech/

whyengineer

https://github.com/whyengineer/esp32_circle/blob/master/baidu_rest/components/rest/baidu_rest.c

Esp32_speech_to_text

https://github.com/YaxiongWu/Esp32_speech_to_text

Tsinghua University Chinese speech corpus

https://github.com/xxbb1234021/speech_recognition
http://www.openslr.org/18/

Wake-Word-Speech-Recognition, with tensorflow

https://github.com/chrishorton/Wake-Word-Speech-Recognition

Scikit-Learn

DNN

search DNN speech

Lingkong robot (灵空机器人)

https://gitee.com/lingkonggzs/lingkong-robot/blob/master/snowboydecoder.py

ffmpeg, audio file transcoding

https://ai.baidu.com/ai-doc/SPEECH/7k38lxpwf

Intelligent phone robot: an iFLYTEK ASR MRCP server built on UniMRCP

https://blog.csdn.net/dalangtaosha2011/article/details/82854173

VAD

Baidu AI Open Platform -> Speech Recognition -> Developer Tools
https://ai.baidu.com/ai-doc/SPEECH/Ek39uxgre
Open-source VAD audio segmentation tool
https://ai.baidu.com/ai-doc/SPEECH/xk38lxq46
https://github.com/Baidu-AIP/speech-vad-demo
Android
https://github.com/LH-YU/SpeechVadDemo
py-webrtcvad
https://github.com/wiseman/py-webrtcvad

2020-02-25 Generating subtitles in Python with ffmpeg, speech-vad-demo, and Baidu speech recognition

https://www.jianshu.com/p/d87b3d0f0618

snowboy docs

http://docs.kitt.ai/snowboy/

Raspibot

https://github.com/LoveThinkinghard/Raspibot

mobvoi/lstm_ctc

https://github.com/mobvoi/lstm_ctc

Digital microphone PDM signal capture and the STM32 I2S interface (1), by 啊哈彭, cnblogs (博客园)

https://www.cnblogs.com/pingwen/p/11298675.html

olami

https://cn.olami.ai/open/website/home/home_show
https://blog.csdn.net/ls0609/article/details/73920229
julius-js
https://www.cnblogs.com/lhb25/p/julius-js-speech-recognition-library-web.html

Lecture notes: "The past and present of speech recognition: GMM+HMM and deep learning"

https://www.cnblogs.com/lyu0709/p/6929659.html

A smart control device based on cloud speech recognition, similar to Tmall Genie and XiaoAI. Chips used: stm32f407, wm8978, esp8266

https://github.com/lovelyterry/SmartSpeaker

mlpack

GMM-HMM (multiple Gaussian) for isolated words recognition

https://web.ece.ucsb.edu/Faculty/Rabiner/ece259/speech%20recognition%20course.html
https://www.mathworks.com/matlabcentral/fileexchange/64297-gmm-hmm-multiple-gaussian-for-isolated-words-recognition
https://web.ece.ucsb.edu/Faculty/Rabiner/ece259/
https://web.ece.ucsb.edu/Faculty/Rabiner/ece259/speech%20course.html
https://github.com/hhle88/GMM-HMM

Speech recognition (6): FBank, evaluation metrics for speech recognition, advanced acoustic models, advanced language models, GMM-HMM, WFST (1)

http://antkillerfarm.github.io/speech/2018/07/26/speech_6.html

sourceforge, matlab, hmm asr matlab

https://sourceforge.net/projects/hmm-asr-matlab/

A speech algorithm solution that runs on MCUs: AID.Speech

https://www.sohu.com/a/315371189_120080940
https://zhuanlan.zhihu.com/p/68938575
https://github.com/OAID/SpeechRecognition

tensorflow-speech-recognition

https://github.com/pannous/tensorflow-speech-recognition/

esp32_CloudSpeech

https://github.com/MhageGH/esp32_CloudSpeech

I2S Microphone, stm32

https://github.com/har-in-air/STM32_CODE_EXAMPLES/blob/3eb512ade9b3a7f3ad45d0d0d0a0c1f790b2fd95/dsp/f407_i2s_mic/README.md

STM32 audio interface: I2S

https://www.cnblogs.com/jianfengjin/articles/4943694.html

Embedfire (野火) I2S

https://github.com/Embedfire-stm32f429-tiaozhanzhe/ebf_stm32f429_tiaozhanzhe_hal_code/blob/master/I2S—录音与回放/User/main.c

ALIENTEK (正点原子)

https://github.com/shuimuyangsha/STM32F407StandardRoutine/blob/master/Experiment_44_RecordingMachine/USER/main.c

AZ3166

https://github.com/F-ARobert/DeteX_Firmware/tree/master/.build/sketch/AZ3166
https://microsoft.github.io/azure-iot-developer-kit/docs/get-started/
https://microsoft.github.io/azure-iot-developer-kit/docs/apis/audio-v2/
https://github.com/VSChina/azureiotdevkit_tools/blob/master/package_azureboard_index.json
https://github.com/microsoft/devkit-sdk/blob/master/AZ3166/src/libraries/AudioV2/examples/VoiceRecord/VoiceRecord.ino

Helix MP3

SparkFun_Edge, tensorflow lite

https://github.com/sparkfun/SparkFun_Edge
https://learn.sparkfun.com/tutorials/using-sparkfun-edge-board-with-ambiq-apollo3-sdk/example-applications
https://github.com/sparkfun/SparkFun_Edge_BSP

ADMP441

ESP8266_MP3_DECODER

https://github.com/espressif/ESP8266_MP3_DECODER
https://www.icxbk.com/ask/detail?tid=5537

nrf-tensorflow

https://github.com/oivoii/nrf-tensorflow

ADMP401

https://learn.sparkfun.com/tutorials/mems-microphone-hookup-guide
https://os.mbed.com/users/rayxke/notebook/sparkfun-mems-microphone-breakout---inmp401-admp40/
https://github.com/sparkfun/MEMS_Mic_Breakout-ADMP401/blob/V_1.3/Firmware/SparkFun_ADMP401_Simple_Sketch/SparkFun_INMP401.ino

TFLite (TensorFlow Lite) Micro support boards, micro_speech

Here is a cost comparison (shipping excluded) of development boards that appear able to run micro_speech, the TFLite (TensorFlow Lite) Micro speech recognition example, including boards that might support it:
(1) ESP32: Ai-Thinker NodeMCU-32S, 24 RMB; Espressif ESP32-DevKitC, 55 RMB
(2) K210: Maix Bit, 80 RMB (or the more expensive M5StickV or Maix Dock)
(3) Apollo3: SparkFun Edge, 90 RMB
(4) stm32f746: STM32F746 Discovery kit (Mbed), i.e. 32F746GDISCOVERY, 450 RMB
(5) ATSAMD51J19: Adafruit EdgeBadge (PyBadge, also called the TensorFlow Lite for Microcontrollers Kit), 330 RMB
(6) nRF52840: Arduino Nano 33 BLE Sense, 310 RMB
(7) stm32f407vg: STM32F4Discovery, 148 RMB
(8) STM32H743VI: OpenMV4 H7 Cam, 463 RMB
see https://blog.boochow.com/article/ensorflow-lite-mcu-microspeech.html

Arduino Nano 33 BLE Sense (Arduino)  
ESP32-DevKitC, ESP-EYE (ESP-IDF 4.0): Hello world only  
SparkFun Edge (no platform used)  
STM32F746 Discovery kit (Mbed): Hello world and Micro speech only  
Adafruit EdgeBadge (Arduino)  
Adafruit TensorFlow Lite for Microcontrollers Kit (Arduino)  

Wio Terminal, microphone

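Besides nrf and samd51, some less common development boards can also do I2S, for example w600, RTL8711AF, and MT7688.
Other boards use a single-wire ADC microphone instead, for example the Wio Terminal:
https://wiki.seeedstudio.com/Wio-Terminal-Mic/
In short: if threads are available, read the ADC results in a separate thread. If they are not, but the read does not block, a delay can be used to approximate the sample rate. If the read does block (takes time),
use a bounded queue or a ring buffer to keep or discard earlier ADC readings as appropriate, so that the values read are real-time or close to it (delayed, but by a fixed amount).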
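
The bounded-queue idea above can be sketched in Python with `collections.deque`. The class name `SampleRing` and its methods are hypothetical; on a real microcontroller this would more likely be a fixed array in C, but the drop-oldest behaviour is the same:

```python
from collections import deque


class SampleRing:
    """Fixed-length queue that keeps only the newest ADC readings."""

    def __init__(self, capacity):
        self._buf = deque(maxlen=capacity)
        self.dropped = 0  # how many old readings were overwritten

    def push(self, value):
        """Store a reading; when full, the oldest reading is discarded."""
        if len(self._buf) == self._buf.maxlen:
            self.dropped += 1
        self._buf.append(value)

    def drain(self):
        """Return the buffered readings oldest-first and empty the buffer."""
        out = list(self._buf)
        self._buf.clear()
        return out


if __name__ == "__main__":
    ring = SampleRing(capacity=4)
    for reading in range(6):  # stand-in values for ADC samples
        ring.push(reading)
    print(ring.drain(), "dropped:", ring.dropped)
```

Because the capacity is fixed, a slow consumer never sees data older than `capacity` samples, which gives the bounded delay described above at the cost of losing the overwritten readings.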

seeedstudio forum, sipeed forum

https://github.com/edgeimpulse/firmware-arduino-nano-33-ble-sense/tree/master/src/tflite-model
see https://forum.seeedstudio.com/t/regarding-tensorflow-lite-library/252284 az3166
https://forum.seeedstudio.com/t/training-an-ai-model-using-real-sensor-data-in-less-than-10-minutes/251233/3
k210, maixduino
https://cn.bbs.sipeed.com/t/maixduino
https://en.bbs.sipeed.com

qiita

https://qiita.com/iwatake2222/items/4d198f6203348ef7fd31

webkitSpeechRecognition, SpeechSynthesisUtterance

Recognizing speech and reading content aloud with the HTML5 Web Speech API
https://www.jianshu.com/p/e42638839475
https://qiita.com/hmmrjn/items/4b77a86030ed0071f548

maixcube espeak

https://github.com/fukuen/maixcube-espeak-demo
https://github.com/fukuen/MaixCube_eSpeak
https://github.com/fukuen/MaixCube_ES8374
https://github.com/fukuen/maixcube-tensorflow-lite-micro
https://github.com/espeak-ng/espeak-ng
https://qiita.com/fukuebiz/items/573f9822d3c15a585081