Lag between microphone and speaker
Hi, good afternoon!
First of all, I'm new to the mailing list, so I'm not sure whether this is the right place for this question. I'm asking here because I don't know of a better place; if this is not the right one, please let me know.
The thing is, I have a USB sound card plugged into a Beaglebone AI64. It works correctly: I can record my voice and play a WAV file without problems using the arecord and aplay tools.
The problem is with software I'm writing that reads audio from the microphone, processes it (which takes less than 20 ms), and then writes the processed audio to the speaker. The software currently consists of several files, but the most important components are the Recorder and Player C++ classes, which I show at the end.
The important parts are the following:

1. I initialize the ALSA context with these calls:

    audioPlayer.OpenDevice("plughw:CARD=Device,DEV=0");
    micRecorder.OpenDevice("plughw:CARD=Device,DEV=0");

2. I'm using a sample rate of 16 kHz, a sample size of 16 bits and 1 channel. So, assuming I've done this correctly, this should give me a frame size of 342 samples which, multiplied by 2 bytes per sample, gives a total size of 684 bytes.

3. Finally, in main, I do the following:

    while(1)
    {
        //bool keyPressed = id.IsKeyPressed(0);
        //printf("Key is pressed: %s\n", keyPressed ? "true" : "false");
        const int8_t *pAudioBuffer = micRecorder.ReadAudio();
        if(pAudioBuffer)
        {
            // Process a single frame of audio
            // Note: output_frame can be the same buffer as input_frame for in-place processing
            if(keyPressed)
            {
                // DO A PROCESS WHICH TAKES 19ms more or less
            }
            else
            {
                memcpy(buff_in, pAudioBuffer, buff_size);
            }
            // Optionally, read from / write to a file.
            //write(noisy_raw_fd, pAudioBuffer, micRecorder.GetBufferSize());
            //write(clean_raw_fd, g_out_buffer, g_out_buffer_size);

            if(audioPlayer.WriteAudio((int8_t*)buff_in) < 0)
            {
                return 1;
            }
        }
        //printf("Number of frames written: %d of a total of: %d\n", pcm, frames);
    }

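The sizes quoted above can be checked with a little arithmetic. This is only a minimal sketch, assuming 16-bit mono audio at 16 kHz as stated in the post; the helper names PeriodBytes and PeriodMillis are hypothetical and are not part of the posted code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers (not in the posted code).

// Bytes occupied by `frames` frames of interleaved PCM audio.
constexpr std::size_t PeriodBytes(std::size_t frames,
                                  std::size_t bytesPerSample,
                                  std::size_t channels)
{
    return frames * bytesPerSample * channels;
}

// Duration in milliseconds of `frames` frames at `sampleRate` Hz.
constexpr double PeriodMillis(std::size_t frames, unsigned sampleRate)
{
    return 1000.0 * static_cast<double>(frames) / sampleRate;
}
```

With the values from the post, PeriodBytes(342, 2, 1) gives 684 bytes and PeriodMillis(342, 16000) gives 21.375 ms, so a processing step of about 19 ms leaves roughly 2.4 ms per block for everything else in the loop.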
This works correctly at first. The problem is that, after some time, I start to notice a delay between speaking and hearing myself, and I don't know why. I've enabled all traces in both the recorder and the player, but I don't see any error; there is just a delay between my speaking and my listening. The delay reaches 10 seconds or more.
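One way to put a number on such a delay is to convert a count of queued frames into time. The following is a minimal sketch, assuming the frame counts would come from ALSA calls such as snd_pcm_delay() on the playback handle and snd_pcm_avail() on the capture handle; wiring those calls into the loop is left out, and QueuedMillis is a hypothetical helper that is not part of the posted code.

```cpp
#include <cassert>

// Hypothetical helper: milliseconds of audio represented by `frames`
// queued frames at `sampleRate` Hz. Negative counts (which ALSA uses
// for error codes) and a zero rate are reported as 0 ms.
double QueuedMillis(long frames, unsigned sampleRate)
{
    if (frames < 0 || sampleRate == 0)
        return 0.0;
    return 1000.0 * static_cast<double>(frames) / sampleRate;
}
```

At 16 kHz, a delay of 10 seconds corresponds to 160000 frames waiting somewhere between the microphone and the speaker, so logging these values once per loop iteration may help show whether the capture side or the playback side is filling up.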
Please, can anyone help me with this problem?
PS: Please be aware that I'm just a software engineer, not an expert in the technical side of audio. I wrote this code based on some examples I found on the internet:
I used this code to implement the player class: https://gist.github.com/ghedo/963382/815c98d1ba0eda1b486eb9d80d9a91a81d99528...
I used this code to implement the recorder class: https://gist.github.com/albanpeignier/104902
Here is the code:

// Player.h
#pragma once

#include <iostream>
#include <string>
#include <alsa/asoundlib.h>

#define SAMPLE_RATE 16000
#define FRAME_SIZE 342
#define CHANNELS 1

class Player
{
public:
    Player(const std::string &devname = "", int sampleRate = SAMPLE_RATE, int channels = CHANNELS);
    ~Player();

    int OpenDevice(const std::string &devname = "", int sampleRate = SAMPLE_RATE, int channels = CHANNELS);
    void CloseDevice();

    size_t GetFrameSize();
    size_t GetBufferSize();
    int WriteAudio(const int8_t *pAudioBuffer);

private:
    int InitAudioContext();

private:
    int m_SampleRate;
    int m_Channels;
    std::string m_DevName;

    snd_pcm_t *m_AudioHandle;

    size_t m_FrameBufferSize;
    size_t m_BufferSize;
    int8_t *m_pAudioBuffer;
};

// Player.cpp
#include "Player.h"

Player::Player(const std::string &devname, int sampleRate, int channels):
    m_DevName{devname},
    m_SampleRate{sampleRate},
    m_Channels{channels},
    m_AudioHandle{nullptr},
    m_FrameBufferSize{FRAME_SIZE}
{
}

Player::~Player()
{
    CloseDevice();
}

int Player::OpenDevice(const std::string &devname, int sampleRate, int channels)
{
    if(!devname.empty())
        m_DevName = devname;

    if(sampleRate != SAMPLE_RATE)
        m_SampleRate = sampleRate;

    if(channels != CHANNELS)
        m_Channels = channels;

    return InitAudioContext();
}

void Player::CloseDevice()
{
    std::cout << "buffer freed" << std::endl;

    snd_pcm_close (m_AudioHandle);

    //free(m_pAudioBuffer);

    std::cout << "audio interface closed" << std::endl;
}

size_t Player::GetFrameSize()
{
    return m_FrameBufferSize;
}

size_t Player::GetBufferSize()
{
    return m_BufferSize;
}
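
int Player::WriteAudio(const int8_t *pAudioBuffer)
{
    int pcm;
    // The inner parentheses matter: without them, pcm receives the result
    // of the comparison (0 or 1) instead of the value returned by snd_pcm_writei().
    if ((pcm = snd_pcm_writei(m_AudioHandle, pAudioBuffer, FRAME_SIZE)) == -EPIPE)
    {
        printf("XRUN.\n");
        snd_pcm_prepare(m_AudioHandle);
    }
    else if (pcm < 0)
    {
        printf("ERROR. Can't write to PCM device. %s\n", snd_strerror(pcm));
        // Return a negative value so that callers testing for < 0 see the failure.
        return -1;
    }
    return 1;
}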

int Player::InitAudioContext()
{
    std::cout << "Initializing Player" << std::endl;
    int i;
    int err;
    snd_pcm_hw_params_t *hw_params;
    snd_pcm_format_t format = SND_PCM_FORMAT_S16_LE;

    if ((err = snd_pcm_open (&m_AudioHandle, m_DevName.c_str(), SND_PCM_STREAM_PLAYBACK, 0)) < 0)
    {
        std::cerr << "cannot open audio device " << m_DevName << " Error: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "audio interface opened" << std::endl;

    if ((err = snd_pcm_hw_params_malloc (&hw_params)) < 0)
    {
        std::cerr << "cannot allocate hardware parameters structure: " << snd_strerror(err) << std::endl;
        return (1);
    }

    std::cout << "hw_params allocated" << std::endl;

    if ((err = snd_pcm_hw_params_any (m_AudioHandle, hw_params)) < 0)
    {
        std::cerr << "cannot initialize hardware parameter structure: " << snd_strerror(err) << std::endl;
        return (1);
    }

    std::cout << "hw_params initialized" << std::endl;

    if ((err = snd_pcm_hw_params_set_access (m_AudioHandle, hw_params, SND_PCM_ACCESS_RW_INTERLEAVED)) < 0)
    {
        std::cerr << "cannot set access type: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params access set" << std::endl;

    if ((err = snd_pcm_hw_params_set_format (m_AudioHandle, hw_params, format)) < 0)
    {
        std::cerr << "cannot set sample format: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params format set" << std::endl;

    if ((err = snd_pcm_hw_params_set_rate_near (m_AudioHandle, hw_params, (unsigned int *)&m_SampleRate, 0)) < 0)
    {
        std::cerr << "cannot set sample rate: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params rate set" << std::endl;

    if ((err = snd_pcm_hw_params_set_channels (m_AudioHandle, hw_params, CHANNELS)) < 0)
    {
        std::cerr << "cannot set channel count: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params channels set" << std::endl;

    if ((err = snd_pcm_hw_params (m_AudioHandle, hw_params)) < 0)
    {
        std::cerr << "cannot set parameters: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params set" << std::endl;

    snd_pcm_hw_params_free (hw_params);

    std::cout << "hw_params freed" << std::endl;

    if ((err = snd_pcm_prepare (m_AudioHandle)) < 0)
    {
        std::cerr << "cannot prepare audio interface for use: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "audio interface prepared" << std::endl;

    std::cout << "format width: " << snd_pcm_format_width(format) << std::endl;

    // Total size in bytes of the allocated memory buffer
    m_BufferSize = m_FrameBufferSize * snd_pcm_format_width(SND_PCM_FORMAT_S16_LE) / 8 * CHANNELS;

    return 0;
}
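
An assignment inside a condition, as in Player::WriteAudio, needs its own parentheses, because in C++ the == operator binds more tightly than =. The following is a minimal sketch of the difference that does not need ALSA; FakeWrite, AssignWithoutParens and AssignWithParens are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for a write call that fails with -EPIPE.
inline long FakeWrite() { return -EPIPE; }

// Parsed as: result = (FakeWrite() == -EPIPE), so result holds 1 or 0
// and the real return value is lost.
inline long AssignWithoutParens()
{
    long result = 0;
    result = FakeWrite() == -EPIPE;
    return result;
}

// The inner parentheses make the assignment happen first, so result
// holds the real return value, which can then be compared or logged.
inline long AssignWithParens()
{
    long result = 0;
    bool isXrun = ((result = FakeWrite()) == -EPIPE);
    (void)isXrun; // only the stored value matters for this illustration
    return result;
}
```

Here AssignWithoutParens() returns 1, while AssignWithParens() returns -EPIPE. In the first form, a later check such as `pcm < 0` can never be true.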

// Recorder.h
#pragma once

#include <iostream>
#include <string>
#include <alsa/asoundlib.h>

#define SAMPLE_RATE 16000
#define FRAME_SIZE 342
#define CHANNELS 1

class Recorder
{
public:
    Recorder(const std::string &devname = "", int sampleRate = SAMPLE_RATE, int channels = CHANNELS);
    ~Recorder();

    int OpenDevice(const std::string &devname = "", int sampleRate = SAMPLE_RATE, int channels = CHANNELS);
    void CloseDevice();

    size_t GetFrameSize();
    size_t GetBufferSize();
    const int8_t *ReadAudio();

private:
    int InitAudioContext();

private:
    int m_SampleRate;
    int m_Channels;
    std::string m_DevName;

    snd_pcm_t *m_AudioHandle;

    size_t m_FrameBufferSize;
    size_t m_BufferSize;
    int8_t *m_pAudioBuffer;
};

// Recorder.cpp
#include "Recorder.h"

/*
 * Recovery from ALSA errors.
 */
static int xrun_recovery(snd_pcm_t *handle, int err)
{
    if (err == -EPIPE)
    {
        /* xrun (for a capture stream this is an overrun) */
        printf("underrun!\n");
        err = snd_pcm_prepare(handle);
        if (err < 0)
            printf("Can't recover from underrun, prepare failed: %s\n", snd_strerror(err));
        return 0;
    }
    else if (err == -ESTRPIPE)
    {
        while ((err = snd_pcm_resume(handle)) == -EAGAIN)
            sleep(1); /* wait until the suspend flag is released */
        if (err < 0)
        {
            err = snd_pcm_prepare(handle);
            if (err < 0)
                printf("Can't recover from suspend, prepare failed: %s\n", snd_strerror(err));
        }
        return 0;
    }
    return err;
}

Recorder::Recorder(const std::string &devname, int sampleRate, int channels):
    m_DevName{devname},
    m_SampleRate{sampleRate},
    m_Channels{channels},
    m_AudioHandle{nullptr},
    m_FrameBufferSize{FRAME_SIZE}
{
}

Recorder::~Recorder()
{
    CloseDevice();
}

int Recorder::OpenDevice(const std::string &devname, int sampleRate, int channels)
{
    if(!devname.empty())
        m_DevName = devname;

    if(sampleRate != SAMPLE_RATE)
        m_SampleRate = sampleRate;

    if(channels != CHANNELS)
        m_Channels = channels;

    return InitAudioContext();
}

size_t Recorder::GetFrameSize()
{
    return m_FrameBufferSize;
}

/*
 * Total size in bytes of the allocated memory buffer.
 */
size_t Recorder::GetBufferSize()
{
    return m_BufferSize;
}

/*
 * Reads the audio samples captured from the audio device (microphone)
 * into the allocated memory buffer.
 */
const int8_t *Recorder::ReadAudio()
{
    int err;
    // The cast avoids a signed/unsigned comparison between err and the frame count.
    if ((err = snd_pcm_readi(m_AudioHandle, m_pAudioBuffer, m_FrameBufferSize)) != (int)m_FrameBufferSize)
    {
        // If the error is EPIPE or ESTRPIPE, try to recover the device.
        if(xrun_recovery(m_AudioHandle, err) < 0)
        {
            std::cerr << "read from audio interface failed " << snd_strerror(err) << std::endl;
            return nullptr;
        }
    }

    return m_pAudioBuffer;
}

void Recorder::CloseDevice()
{
    std::cout << "buffer freed" << std::endl;

    snd_pcm_close (m_AudioHandle);

    free(m_pAudioBuffer);

    std::cout << "audio interface closed" << std::endl;
}

int Recorder::InitAudioContext()
{
    std::cout << "Initializing Recorder" << std::endl;
    int i;
    int err;
    snd_pcm_hw_params_t *hw_params;
    snd_pcm_format_t format = SND_PCM_FORMAT_S16_LE;

    if ((err = snd_pcm_open (&m_AudioHandle, m_DevName.c_str(), SND_PCM_STREAM_CAPTURE, 0)) < 0)
    {
        std::cerr << "cannot open audio device " << m_DevName << " Error: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "audio interface opened" << std::endl;

    if ((err = snd_pcm_hw_params_malloc (&hw_params)) < 0)
    {
        std::cerr << "cannot allocate hardware parameters structure: " << snd_strerror(err) << std::endl;
        return (1);
    }

    std::cout << "hw_params allocated" << std::endl;

    if ((err = snd_pcm_hw_params_any (m_AudioHandle, hw_params)) < 0)
    {
        std::cerr << "cannot initialize hardware parameter structure: " << snd_strerror(err) << std::endl;
        return (1);
    }

    std::cout << "hw_params initialized" << std::endl;

    if ((err = snd_pcm_hw_params_set_access (m_AudioHandle, hw_params, SND_PCM_ACCESS_RW_INTERLEAVED)) < 0)
    {
        std::cerr << "cannot set access type: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params access set" << std::endl;

    if ((err = snd_pcm_hw_params_set_format (m_AudioHandle, hw_params, format)) < 0)
    {
        std::cerr << "cannot set sample format: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params format set" << std::endl;

    if ((err = snd_pcm_hw_params_set_rate_near (m_AudioHandle, hw_params, (unsigned int *)&m_SampleRate, 0)) < 0)
    {
        std::cerr << "cannot set sample rate: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params rate set" << std::endl;

    if ((err = snd_pcm_hw_params_set_channels (m_AudioHandle, hw_params, CHANNELS)) < 0)
    {
        std::cerr << "cannot set channel count: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params channels set" << std::endl;

    if ((err = snd_pcm_hw_params (m_AudioHandle, hw_params)) < 0)
    {
        std::cerr << "cannot set parameters: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "hw_params set" << std::endl;

    snd_pcm_hw_params_free (hw_params);

    std::cout << "hw_params freed" << std::endl;

    if ((err = snd_pcm_prepare (m_AudioHandle)) < 0)
    {
        std::cerr << "cannot prepare audio interface for use: " << snd_strerror (err) << std::endl;
        return (1);
    }

    std::cout << "audio interface prepared" << std::endl;

    std::cout << "format width: " << snd_pcm_format_width(format) << std::endl;

    // Total size in bytes of the allocated memory buffer
    m_BufferSize = m_FrameBufferSize * snd_pcm_format_width(format) / 8 * CHANNELS;

    // Pointer to the allocated memory buffer.
    m_pAudioBuffer = (int8_t *)malloc(m_BufferSize);

    return 0;
}
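
Recorder::ReadAudio returns the buffer even when fewer frames than requested were read. One possible way to handle that case is to keep reading until the block is complete. The following is a minimal sketch under that assumption; ReadFn and ReadFull are hypothetical names, and the reader callback stands in for a call such as snd_pcm_readi() so that the example does not depend on ALSA.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical reader signature: fills `dst` with up to `frames` frames and
// returns the number of frames read, or a negative error code.
using ReadFn = std::function<long(std::int16_t *dst, std::size_t frames)>;

// Keeps reading until `frames` frames are collected or the reader fails.
// Returns the number of frames collected, or the negative error code.
long ReadFull(const ReadFn &read, std::int16_t *dst,
              std::size_t frames, std::size_t channels)
{
    std::size_t done = 0;
    while (done < frames)
    {
        long n = read(dst + done * channels, frames - done);
        if (n < 0)
            return n;   // the caller decides how to recover
        if (n == 0)
            break;      // nothing more available right now
        done += static_cast<std::size_t>(n);
    }
    return static_cast<long>(done);
}
```

With a reader that delivers at most 100 frames per call, ReadFull needs four calls to fill a 342-frame block; a negative return value is passed straight back so that recovery code like xrun_recovery can run before the next attempt.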
Guillermo Bernaldo de Quiros Maraver