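Speex Speech Preprocessing (Part 1)
1. Introduction
During capture and transmission, speech signals vary widely in amplitude because of differences between speech sources, channel attenuation, noise interference, and the near-far effect. Speech data therefore needs to be conditioned before any further speech processing. This conditioning consists of preprocessing (AGC, VAD, echo cancellation), resampling, and noise suppression.
All of the code is based on the open-source Speex library; see http://speex.org/ for details.
This development manual currently provides only the AGC interface and test code; the remaining parts will be added over time.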
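2. Interface Description
2.1 Overview
The preprocessing module covers automatic gain control (AGC), voice activity detection (VAD), and echo cancellation. The interface functions are listed below; see speex\speex_preprocess.h for details.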
Function | Summary
speex_preprocess_state_init | Creates a preprocessor
speex_preprocess_state_destroy | Destroys a preprocessor
speex_preprocess_run | Processes one frame of data
speex_preprocess | Processes one frame of data (deprecated)
speex_preprocess_estimate_update | Updates the preprocessor
speex_preprocess_ctl | Sets and reads preprocessor parameters
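2.1.1 speex_preprocess_state_init
Prototype | SpeexPreprocessState *speex_preprocess_state_init(int frame_size, int sampling_rate);
Purpose | Creates a preprocessor
Parameters | frame_size [in] size of each frame, in samples (a frame length of 20 ms is recommended); sampling_rate [in] sampling rate (8k, 16k, and 44k are supported)
Return value | A pointer to the preprocessor on success, NULL on failure
Notes | For example, with 16k speech data, a 20 ms frame equals 320 samples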
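The frame-size arithmetic in the note for speex_preprocess_state_init can be sketched as a small helper. The name frame_samples is hypothetical and is not part of Speex; the function only converts a sampling rate and a frame duration into the sample count that would be passed as frame_size.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of Speex): number of samples in one frame
 * of frame_ms milliseconds at sampling_rate_hz samples per second.
 */
static int frame_samples(int sampling_rate_hz, int frame_ms)
{
    assert(sampling_rate_hz > 0 && frame_ms > 0);
    return sampling_rate_hz * frame_ms / 1000;
}
```

Under these assumptions, frame_samples(16000, 20) gives the 320 samples mentioned in the note, and the result can be passed as frame_size together with the same sampling rate.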
2.1.2 speex_preprocess_state_destroy
Prototype | void speex_preprocess_state_destroy(SpeexPreprocessState *st);
Purpose | Destroys a preprocessor
Parameters | st [in] pointer to the preprocessor
Return value | void
Notes | None
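2.1.3 speex_preprocess_run
Prototype | int speex_preprocess_run(SpeexPreprocessState *st, spx_int16_t *x);
Purpose | Processes one frame of speech data
Parameters | st [in] pointer to the preprocessor; x [in|out] data buffer; the processed data is written back into the same buffer
Return value | If VAD is enabled, 1 indicates speech and 0 indicates silence or noise
Notes | None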
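2.1.4 speex_preprocess
Prototype | int speex_preprocess(SpeexPreprocessState *st, spx_int16_t *x, spx_int32_t *echo);
Purpose | Processes one frame of speech data (deprecated; it indirectly calls speex_preprocess_run)
Parameters | st [in] pointer to the preprocessor; x [in|out] data buffer; the processed data is written back into the same buffer
Return value | Same as speex_preprocess_run, which this function calls
Notes | None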
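2.1.5 speex_preprocess_estimate_update
Prototype | void speex_preprocess_estimate_update(SpeexPreprocessState *st, spx_int16_t *x);
Purpose | Updates the preprocessor without computing the output speech
Parameters | st [in] pointer to the preprocessor; x [in] data buffer
Return value | void
Notes | None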
2.1.6 speex_preprocess_ctl
Prototype | int speex_preprocess_ctl(SpeexPreprocessState *st, int request, void *ptr);
Purpose | Sets or reads a preprocessor parameter
Parameters | st [in] pointer to the preprocessor; request [in] parameter type (each parameter is identified by a macro); ptr [in|out] parameter value (in when setting a parameter, out when reading one, as determined by the macro)
Return value | 0 on success, -1 on failure (unknown request)
Notes | The following macros identify the parameter type:

/** Set preprocessor denoiser state */
#define SPEEX_PREPROCESS_SET_DENOISE 0
/** Get preprocessor denoiser state */
#define SPEEX_PREPROCESS_GET_DENOISE 1

/** Set preprocessor Automatic Gain Control state */
#define SPEEX_PREPROCESS_SET_AGC 2
/** Get preprocessor Automatic Gain Control state */
#define SPEEX_PREPROCESS_GET_AGC 3

/** Set preprocessor Voice Activity Detection state */
#define SPEEX_PREPROCESS_SET_VAD 4
/** Get preprocessor Voice Activity Detection state */
#define SPEEX_PREPROCESS_GET_VAD 5

/** Set preprocessor Automatic Gain Control level (float) */
#define SPEEX_PREPROCESS_SET_AGC_LEVEL 6
/** Get preprocessor Automatic Gain Control level (float) */
#define SPEEX_PREPROCESS_GET_AGC_LEVEL 7

/** Set preprocessor dereverb state */
#define SPEEX_PREPROCESS_SET_DEREVERB 8
/** Get preprocessor dereverb state */
#define SPEEX_PREPROCESS_GET_DEREVERB 9

/** Set preprocessor dereverb level */
#define SPEEX_PREPROCESS_SET_DEREVERB_LEVEL 10
/** Get preprocessor dereverb level */
#define SPEEX_PREPROCESS_GET_DEREVERB_LEVEL 11

/** Set preprocessor dereverb decay */
#define SPEEX_PREPROCESS_SET_DEREVERB_DECAY 12
/** Get preprocessor dereverb decay */
#define SPEEX_PREPROCESS_GET_DEREVERB_DECAY 13

/** Set probability required for the VAD to go from silence to voice */
#define SPEEX_PREPROCESS_SET_PROB_START 14
/** Get probability required for the VAD to go from silence to voice */
#define SPEEX_PREPROCESS_GET_PROB_START 15

/** Set probability required for the VAD to stay in the voice state (integer percent) */
#define SPEEX_PREPROCESS_SET_PROB_CONTINUE 16
/** Get probability required for the VAD to stay in the voice state (integer percent) */
#define SPEEX_PREPROCESS_GET_PROB_CONTINUE 17

/** Set maximum attenuation of the noise in dB (negative number) */
#define SPEEX_PREPROCESS_SET_NOISE_SUPPRESS 18
/** Get maximum attenuation of the noise in dB (negative number) */
#define SPEEX_PREPROCESS_GET_NOISE_SUPPRESS 19

/** Set maximum attenuation of the residual echo in dB (negative number) */
#define SPEEX_PREPROCESS_SET_ECHO_SUPPRESS 20
/** Get maximum attenuation of the residual echo in dB (negative number) */
#define SPEEX_PREPROCESS_GET_ECHO_SUPPRESS 21

/** Set maximum attenuation of the residual echo in dB when near end is active (negative number) */
#define SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE 22
/** Get maximum attenuation of the residual echo in dB when near end is active (negative number) */
#define SPEEX_PREPROCESS_GET_ECHO_SUPPRESS_ACTIVE 23

/** Set the corresponding echo canceller state so that residual echo suppression can be performed (NULL for no residual echo suppression) */
#define SPEEX_PREPROCESS_SET_ECHO_STATE 24
/** Get the corresponding echo canceller state */
#define SPEEX_PREPROCESS_GET_ECHO_STATE 25

/** Set maximal gain increase in dB/second (int32) */
#define SPEEX_PREPROCESS_SET_AGC_INCREMENT 26
/** Get maximal gain increase in dB/second (int32) */
#define SPEEX_PREPROCESS_GET_AGC_INCREMENT 27

/** Set maximal gain decrease in dB/second (int32) */
#define SPEEX_PREPROCESS_SET_AGC_DECREMENT 28
/** Get maximal gain decrease in dB/second (int32) */
#define SPEEX_PREPROCESS_GET_AGC_DECREMENT 29

/** Set maximal gain in dB (int32) */
#define SPEEX_PREPROCESS_SET_AGC_MAX_GAIN 30
/** Get maximal gain in dB (int32) */
#define SPEEX_PREPROCESS_GET_AGC_MAX_GAIN 31

/* Can't set loudness */
/** Get loudness */
#define SPEEX_PREPROCESS_GET_AGC_LOUDNESS 33

/* Can't set gain */
/** Get current gain (int32 percent) */
#define SPEEX_PREPROCESS_GET_AGC_GAIN 35

/* Can't set spectrum size */
/** Get spectrum size for power spectrum (int32) */
#define SPEEX_PREPROCESS_GET_PSD_SIZE 37

/* Can't set power spectrum */
/** Get power spectrum (int32[] of squared values) */
#define SPEEX_PREPROCESS_GET_PSD 39

/* Can't set noise size */
/** Get spectrum size for noise estimate (int32) */
#define SPEEX_PREPROCESS_GET_NOISE_PSD_SIZE 41

/* Can't set noise estimate */
/** Get noise estimate (int32[] of squared values) */
#define SPEEX_PREPROCESS_GET_NOISE_PSD 43

/* Can't set speech probability */
/** Get speech probability in last frame (int32). */
#define SPEEX_PREPROCESS_GET_PROB 45

/** Set preprocessor Automatic Gain Control level (int32) */
#define SPEEX_PREPROCESS_SET_AGC_TARGET 46
/** Get preprocessor Automatic Gain Control level (int32) */
#define SPEEX_PREPROCESS_GET_AGC_TARGET 47
3. Sample Code
3.1 AGC
#include <stdio.h>
#include <speex/speex_preprocess.h>

#define NN 320  /* samples per frame: 20 ms at 16k */

/* The speech data is mono, 16-bit, 16k raw PCM, read from stdin and written to stdout. */
int main(void)
{
    short in[NN];
    spx_int32_t enable = 1;
    float level = 16000;
    SpeexPreprocessState *st;

    st = speex_preprocess_state_init(NN, 16000);
    if (st == NULL)
        return 1;

    /* Turn AGC on and set its level. */
    speex_preprocess_ctl(st, SPEEX_PREPROCESS_SET_AGC, &enable);
    speex_preprocess_ctl(st, SPEEX_PREPROCESS_SET_AGC_LEVEL, &level);

    /* Process whole frames in place until the input runs out. */
    while (fread(in, sizeof(short), NN, stdin) == NN) {
        /* The return value is the VAD result; it is not used here. */
        speex_preprocess_run(st, in);
        fwrite(in, sizeof(short), NN, stdout);
    }

    speex_preprocess_state_destroy(st);
    return 0;
}
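The AGC sample only processes whole frames, so a final frame shorter than NN samples is dropped. If that tail matters, one possible approach is to pad it with silence before processing. The helper below is a minimal sketch of that idea; read_frame_padded is a hypothetical name, not a Speex function, and zero padding is an assumption of this sketch rather than something the library requires.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Speex): read up to n 16-bit samples.
 * A short final frame is padded with zeros (silence) so that it can still
 * be handed to a fixed-size frame processor. Returns the number of samples
 * actually read from the stream; 0 means there is no more input.
 */
static size_t read_frame_padded(FILE *fp, short *buf, size_t n)
{
    size_t got;

    assert(fp != NULL && buf != NULL && n > 0);
    got = fread(buf, sizeof(short), n, fp);
    if (got > 0 && got < n)
        memset(buf + got, 0, (n - got) * sizeof(short));
    return got;
}
```

In the sample loop, the condition could then become (got = read_frame_padded(stdin, in, NN)) > 0, with fwrite writing only got samples so that the output has the same length as the input.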