Speech Recognition in Python

tech 2025-09-13

AudioRecognition

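Introduction

Written in Python. It can be used for speech recognition on ARM-based development boards, in GUI programs, and in similar settings.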

Repository

Hosted on Gitee (码云); the original page links to the repository.

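Installation

1. The project uses the Baidu speech recognition AIP SDK. Set the three API parameters in __init__ (APP_ID, API_KEY, SECRET_KEY).
2. Call AudioRecognition.microphone() to inspect the input devices, then adjust the corresponding parameters. A high audio sampling rate is not recommended for Baidu speech recognition.
3. Install the pyaudio and baidu-aip Python packages. baidu-aip is not available through conda, so it must be installed with pip.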
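Step 2 can be partly automated. The following is a minimal sketch, not part of the project: pick_input_device and list_devices are hypothetical helper names, and the sample device list is made up for illustration. The dictionary keys (index, name, maxInputChannels, defaultSampleRate) are the ones PyAudio returns from get_device_info_by_index.

```python
def pick_input_device(devices):
    """Return the first device dict that can record audio, or None."""
    for info in devices:
        if info.get("maxInputChannels", 0) > 0:
            return info
    return None


def list_devices():
    """Collect PyAudio's device info dicts (requires pyaudio to be installed)."""
    import pyaudio  # imported lazily so pick_input_device works without it

    p = pyaudio.PyAudio()
    try:
        return [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
    finally:
        p.terminate()


# Hypothetical device list, shaped like PyAudio's device info dicts.
sample_devices = [
    {"index": 0, "name": "Speakers", "maxInputChannels": 0, "defaultSampleRate": 44100.0},
    {"index": 1, "name": "USB Microphone", "maxInputChannels": 1, "defaultSampleRate": 16000.0},
]
chosen = pick_input_device(sample_devices)
print(chosen["index"], chosen["name"])
```

On a real machine, pick_input_device(list_devices()) would give a candidate whose index and maxInputChannels can then be copied into __init__.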

Usage

1. Call record() to record audio.
2. Call recognition() to run speech recognition.
3. The member variable result holds the recognition result as a string.
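Because result is expected to be a string, it can help to turn the service response into a string defensively. The sketch below is an illustration only: extract_text is a hypothetical helper, the two sample responses are invented, and the field names err_no, err_msg, and result are my understanding of the Baidu response format and should be checked against the official documentation.

```python
def extract_text(response, default="not recognized"):
    """Return the first transcript in an ASR response dict, else `default`."""
    if not isinstance(response, dict):
        return default
    # Assumption: a non-zero err_no signals a failed request.
    if response.get("err_no", 0) != 0:
        return default
    candidates = response.get("result") or []
    return candidates[0] if candidates else default


# Hypothetical responses, for illustration only.
ok = {"err_no": 0, "err_msg": "success.", "result": ["hello world"]}
failed = {"err_no": 3301, "err_msg": "speech quality error."}

print(extract_text(ok))
print(extract_text(failed))
```

With a helper like this, a failed request leaves a readable placeholder in result and avoids a KeyError.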

Contributing

1. Fork this repository
2. Create a Feat_xxx branch
3. Commit your code
4. Open a Pull Request

Contact

If you have any questions, contact QQ 773323518.

Code preview

import pyaudio
import wave
from aip import AipSpeech


class AudioRecognition(object):
    def __init__(self):
        p = pyaudio.PyAudio()
        # Info for device 0; run microphone() to confirm it is your input device
        self.dir = p.get_device_info_by_index(0)
        p.terminate()
        self.chunk = 1024  # frames read per buffer
        self.sample_format = pyaudio.paInt16  # 16-bit samples
        self.channels = self.dir['maxInputChannels']  # maximum channel count of the current microphone
        self.fs = 16000  # sampling rate (Hz)
        self.seconds = 2  # length of each recording (seconds)
        self.filename = "output.wav"  # output file name
        self.result = 'not recognized'  # recognition result
        # Baidu AIP credentials
        self.APP_ID = 'xxxxx'
        self.API_KEY = 'xxxxxx'
        self.SECRET_KEY = 'xxxxx'

    def record(self):
        """Record self.seconds of audio and save it to self.filename."""
        p = pyaudio.PyAudio()  # Create an interface to PortAudio
        stream = p.open(format=self.sample_format,
                        channels=self.channels,
                        rate=self.fs,
                        frames_per_buffer=self.chunk,
                        input=True)
        frames = []
        for i in range(0, int(self.fs / self.chunk * self.seconds)):
            data = stream.read(self.chunk)
            frames.append(data)
            if i % 5 == 0:
                print("*")  # progress indicator
        stream.stop_stream()
        stream.close()
        sample_width = p.get_sample_size(self.sample_format)  # read before terminating PortAudio
        p.terminate()

        wf = wave.open(self.filename, 'wb')
        wf.setnchannels(self.channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(self.fs)
        wf.writeframes(b''.join(frames))
        wf.close()

    def recognition(self):
        """Send the recorded file to Baidu and store the text in self.result."""
        client = AipSpeech(self.APP_ID, self.API_KEY, self.SECRET_KEY)

        # Read the file
        def get_file_content(file_path):
            with open(file_path, 'rb') as fp:
                return fp.read()

        # Recognize the local file; pass the rate that was actually recorded
        result = client.asr(get_file_content(self.filename), 'wav', self.fs, {
            'dev_pid': 1537,  # default 1537 (Mandarin input-method model)
        })
        if result.get('result'):
            self.result = result['result'][0]
        else:
            # Keep the previous result and show the raw response instead of raising KeyError
            print("Recognition failed:", result)

    def microphone(self):
        """Print the parameters of every audio device on the system."""
        p = pyaudio.PyAudio()
        print(p)
        for i in range(p.get_device_count()):
            print(p.get_device_info_by_index(i))
        p.terminate()


if __name__ == '__main__':
    # The whole process, from recording to the recognized result
    a = AudioRecognition()
    print("Recording...")
    a.record()
    print("Recognizing...")
    a.recognition()
    print("Result:")
    print(a.result)
    # a.microphone()
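Before uploading a recording it can be useful to confirm what record() actually wrote to disk. This is a rough sketch using only the standard library: wav_format and matches_expected are hypothetical helpers, and the target of mono, 16-bit, 16 kHz is an assumption inferred from the script's paInt16 and fs = 16000 settings, not a requirement stated in the original. The demo file is half a second of generated silence so the check can be tried without a microphone.

```python
import os
import tempfile
import wave


def wav_format(path):
    """Return (channels, sample_width_bytes, sample_rate) of a WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getnchannels(), wf.getsampwidth(), wf.getframerate()


def matches_expected(path, channels=1, sample_width=2, sample_rate=16000):
    """True if the file has the assumed target format (mono, 16-bit, 16 kHz)."""
    return wav_format(path) == (channels, sample_width, sample_rate)


# Write half a second of silence as a stand-in for output.wav.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)  # 2 bytes per sample = 16-bit, like pyaudio.paInt16
    wf.setframerate(16000)
    wf.writeframes(b"\x00\x00" * 8000)

print(wav_format(demo_path))
print(matches_expected(demo_path))
```

If the microphone reports more than one channel, a check like this flags the mismatch locally before any request is sent.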
