It gets -ENOMEM, which appears to be returned by the imx-sdma driver when it attempts to allocate memory with dma_alloc_coherent(...). This is reached through a chain of calls that starts in pcm_dmaengine.c at snd_dmaengine_pcm_trigger(...): the SNDRV_PCM_TRIGGER_START case calls dmaengine_pcm_prepare_and_submit(...), which calls dmaengine_prep_dma_cyclic(...), which in turn calls the driver's device_prep_dma_cyclic(...). In my case, I'm running on an NXP IMX8M-Quad, hence the imx-sdma driver.
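A minimal userspace sketch of the call chain described above, assuming that dmaengine_pcm_prepare_and_submit() turns a NULL descriptor into -ENOMEM (which is how mainline pcm_dmaengine.c reads). The mock_* types and functions and the two prep_* callbacks are hypothetical stand-ins, not kernel APIs. The point it illustrates: the application sees -ENOMEM for any reason the driver callback returns NULL, so the error code alone does not prove that an allocation failed.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA descriptor; not the kernel definition. */
struct mock_desc { int unused; };

/* Hypothetical stand-in for a DMA channel. The function pointer plays the
 * role of chan->device->device_prep_dma_cyclic. */
struct mock_chan {
	struct mock_desc *(*prep_cyclic)(size_t buf_len, size_t period_len);
};

/* Driver callback that fails. The cause (failed allocation, rejected
 * parameter, ...) is not visible to the caller, which only sees NULL. */
struct mock_desc *prep_fails(size_t buf_len, size_t period_len)
{
	(void)buf_len;
	(void)period_len;
	return NULL;
}

/* Driver callback that succeeds and hands back a descriptor. */
struct mock_desc *prep_ok(size_t buf_len, size_t period_len)
{
	static struct mock_desc desc;

	(void)buf_len;
	(void)period_len;
	return &desc;
}

/* Models dmaengine_prep_dma_cyclic(): forwards to the driver callback. */
struct mock_desc *mock_prep_dma_cyclic(struct mock_chan *chan,
				       size_t buf_len, size_t period_len)
{
	if (!chan || !chan->prep_cyclic)
		return NULL;
	return chan->prep_cyclic(buf_len, period_len);
}

/* Models dmaengine_pcm_prepare_and_submit(): a NULL descriptor is
 * reported upwards as -ENOMEM, whatever the underlying reason was. */
int mock_prepare_and_submit(struct mock_chan *chan,
			    size_t buf_len, size_t period_len)
{
	struct mock_desc *desc = mock_prep_dma_cyclic(chan, buf_len, period_len);

	if (!desc)
		return -ENOMEM;
	return 0;
}
```

In the real driver, a printk in each early-return path of the prep callback would show which condition produces the NULL.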
Alex.
On Tue, Jul 27, 2021 at 11:26 AM Takashi Iwai <tiwai@suse.de> wrote:
On Mon, 26 Jul 2021 17:07:52 +0200, Alex Roberts wrote:
Hello,
I am developing a dummy codec to interface with an 8-channel, 24-bit ADC. I've got it working on an NXP imx8m through the fsl_sai driver on kernel 5.4.85. I can capture all 8 channels at varying sample rates using arecord, and I've verified correct data capture by opening the resulting .wav file in Audacity. The problem I am having is that occasionally, upon starting arecord after a fresh power cycle, I get an out-of-memory error. Other times I get an out-of-memory error after a non-deterministic period of capture. Starting capture again also reports out of memory, but if I wait several minutes and then start capture, it will record again. A power cycle usually helps but, as stated earlier, not 100% of the time.
Do you mean that the application gets an -ENOMEM error from the API, or that the system exhausts its memory? The former is often some buffer management issue (e.g. the buffer preallocation didn't happen and yet the dynamic allocation failed), but the latter is rather about memory leaks.
Takashi
I'm trying to track down where the out-of-memory error is coming from, but haven't had much luck. My colleague tried running arecord under valgrind to check for memory leaks and nothing of note was observed. My suspicion is that something is going on with the memory allocated for DMA, for example fragmentation that prevents it from getting a contiguous region for operation. Reserving a larger pool, either via device tree or kernel cmdline arguments in the bootloader, did not seem to help.
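One way to watch the reserved pool from userspace, assuming the coherent DMA allocations come from CMA and the kernel is built with CONFIG_CMA (so /proc/meminfo carries CmaTotal and CmaFree lines). This is only a sketch; the helper names meminfo_kb and read_meminfo are made up for illustration. Note that a large CmaFree value does not rule out fragmentation, since it counts free pages, not the largest contiguous run.

```c
#include <stdio.h>
#include <string.h>

/* Returns the kB value for `key` (e.g. "CmaFree") found in
 * /proc/meminfo-style text, or -1 if the key is not present. */
long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		/* Match "key:" at the start of a line only. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			long val;

			if (sscanf(line + klen + 1, "%ld", &val) == 1)
				return val;
			return -1;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Reads /proc/meminfo into buf (NUL-terminated).
 * Returns 0 on success, -1 if the file cannot be opened. */
int read_meminfo(char *buf, size_t size)
{
	FILE *f = fopen("/proc/meminfo", "r");
	size_t n;

	if (!f || size == 0) {
		if (f)
			fclose(f);
		return -1;
	}
	n = fread(buf, 1, size - 1, f);
	buf[n] = '\0';
	fclose(f);
	return 0;
}
```

Logging meminfo_kb(buf, "CmaFree") once just before starting arecord and again right after a failure would show whether the pool is actually running low when -ENOMEM appears.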
Another thought is that it's a boundary/alignment issue due to the 24-bit data, and the error is the result of trying to allocate a chunk of memory for DMA that doesn't align.
I'm very new to ALSA dev with some exposure to kernel dev in general, so please correct me if I'm wrong or completely misunderstanding something.
Any suggestions on where I should / how I can debug this memory error?
Thanks, Alex.
PS: Previously sent this to just alsa-devel mailing list on 7/21, but never saw it show up in the archives. Here is more info since then:
The goal is 8 channels at a 96k sampling rate. I've reduced the sampling rate and still have the issue. Reducing to 4 channels helps, but I haven't tested for long enough to evaluate by how much.
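For the alignment question, the arithmetic for the 8-channel, 96k target can be sketched as below. Which sample format the driver negotiates is an assumption here: ALSA's S24_LE stores each 24-bit sample in a 4-byte container, while S24_3LE packs it into 3 bytes. The function names are hypothetical helpers, not ALSA or kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one frame (one sample for every channel). */
size_t frame_bytes(unsigned channels, unsigned container_bytes)
{
	assert(channels > 0 && container_bytes > 0);
	return (size_t)channels * container_bytes;
}

/* Sustained data rate the DMA has to move, in bytes per second. */
size_t bytes_per_second(unsigned channels, unsigned container_bytes,
			unsigned rate_hz)
{
	return frame_bytes(channels, container_bytes) * rate_hz;
}

/* Non-zero if a period of period_bytes holds a whole number of frames. */
int period_is_frame_aligned(size_t period_bytes, unsigned channels,
			    unsigned container_bytes)
{
	return period_bytes % frame_bytes(channels, container_bytes) == 0;
}
```

With 4-byte containers a frame is 8 * 4 = 32 bytes and the stream runs at 32 * 96000 = 3,072,000 bytes per second; with packed 3-byte samples a frame is 24 bytes and the rate is 2,304,000 bytes per second. A 4096-byte period is a whole number of 32-byte frames but not of 24-byte frames, which is the kind of mismatch the alignment theory would depend on.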
I've narrowed it down to device_prep_dma_cyclic(...) returning NULL within dmaengine_prep_dma_cyclic(...). I'm still tracing through the source to learn exactly what is going on.