[alsa-devel] Xrun stack trace for 1010LT
Im trying to get xrun free recording from a pair of 1010LTs (ice1712) and ive resorted to setting xrun_debug to record stack traces on xruns.
Im getting xruns every few minutes (between 2 and 15 minute gaps between single xrun events) and every xrun produces identical stack traces like this:
[17127.486287] Pid: 775, comm: IRQ-3 Not tainted 2.6.24.1 #2 [17127.486309] [<c02ade37>] snd_pcm_period_elapsed+0x247/0x2f0 [17127.486325] [<c010312e>] __switch_to+0x9e/0x150 [17127.486336] [<c0128674>] finish_task_switch+0x34/0xd0 [17127.486350] [<c02c2119>] snd_ice1712_interrupt+0xa9/0x1a0 [17127.486364] [<c016ab7c>] handle_IRQ_event+0x5c/0x100 [17127.486379] [<c016b1ba>] thread_simple_irq+0x4a/0x90 [17127.486390] [<c016bb09>] do_irqd+0x229/0x290 [17127.486404] [<c016b8e0>] do_irqd+0x0/0x290 [17127.486411] [<c0141ef2>] kthread+0x42/0x70 [17127.486418] [<c0141eb0>] kthread+0x0/0x70 [17127.486425] [<c0105857>] kernel_thread_helper+0x7/0x10 [17127.486437] =======================
Running the rt kernel (Linux ubuntu-studio 2.6.24.1 #2 SMP PREEMPT RT Thu Dec 4 21:30:26 GMT 2008 i686 GNU/Linux) - although the original installation was Ubuntu studio, the kernel is recompiled. The xruns seem to be all on the same card. IRQs are set to priority of 99 with chrt. Software for record is arecord, modified to use a 10 second ring buffer which is written to disk in a separate thread.
I was hoping that from the stack trace I could work out what was causing the xruns, but from the above it dosent look like there is anything causing them. However this is the first time I am using stack traces so maybe im missing something. Can anyone point to the fault for me?
Thanks, Pete
At Fri, 5 Dec 2008 06:18:04 -0800 (PST), Pete cat wrote:
Im trying to get xrun free recording from a pair of 1010LTs (ice1712) and ive resorted to setting xrun_debug to record stack traces on xruns.
Im getting xruns every few minutes (between 2 and 15 minute gaps between single xrun events) and every xrun produces identical stack traces like this:
[17127.486287] Pid: 775, comm: IRQ-3 Not tainted 2.6.24.1 #2 [17127.486309] [<c02ade37>] snd_pcm_period_elapsed+0x247/0x2f0 [17127.486325] [<c010312e>] __switch_to+0x9e/0x150 [17127.486336] [<c0128674>] finish_task_switch+0x34/0xd0 [17127.486350] [<c02c2119>] snd_ice1712_interrupt+0xa9/0x1a0 [17127.486364] [<c016ab7c>] handle_IRQ_event+0x5c/0x100 [17127.486379] [<c016b1ba>] thread_simple_irq+0x4a/0x90 [17127.486390] [<c016bb09>] do_irqd+0x229/0x290 [17127.486404] [<c016b8e0>] do_irqd+0x0/0x290 [17127.486411] [<c0141ef2>] kthread+0x42/0x70 [17127.486418] [<c0141eb0>] kthread+0x0/0x70 [17127.486425] [<c0105857>] kernel_thread_helper+0x7/0x10 [17127.486437] =======================
Running the rt kernel (Linux ubuntu-studio 2.6.24.1 #2 SMP PREEMPT RT Thu Dec 4 21:30:26 GMT 2008 i686 GNU/Linux) - although the original installation was Ubuntu studio, the kernel is recompiled. The xruns seem to be all on the same card. IRQs are set to priority of 99 with chrt. Software for record is arecord, modified to use a 10 second ring buffer which is written to disk in a separate thread.
Sorry, but this information is useless for RT kernel because the irq is all threaded... On RT kernel, you'd better to use a better analyze tool.
Takashi
I understand. Im only a fairly basic C coder and such, so could you recommend a tool which is fairly easy to use? Or would it be better to revert to a non RT kernel (there did not seem to be much improvement when I moved to the RT kernel)?
Thanks, Pete
Takashi Iwai tiwai@suse.de wrote: At Fri, 5 Dec 2008 06:18:04 -0800 (PST), Pete cat wrote:
Im trying to get xrun free recording from a pair of 1010LTs (ice1712) and ive resorted to setting xrun_debug to record stack traces on xruns.
Im getting xruns every few minutes (between 2 and 15 minute gaps between single xrun events) and every xrun produces identical stack traces like this:
[17127.486287] Pid: 775, comm: IRQ-3 Not tainted 2.6.24.1 #2 [17127.486309] [] snd_pcm_period_elapsed+0x247/0x2f0 [17127.486325] [] __switch_to+0x9e/0x150 [17127.486336] [] finish_task_switch+0x34/0xd0 [17127.486350] [] snd_ice1712_interrupt+0xa9/0x1a0 [17127.486364] [] handle_IRQ_event+0x5c/0x100 [17127.486379] [] thread_simple_irq+0x4a/0x90 [17127.486390] [] do_irqd+0x229/0x290 [17127.486404] [] do_irqd+0x0/0x290 [17127.486411] [] kthread+0x42/0x70 [17127.486418] [] kthread+0x0/0x70 [17127.486425] [] kernel_thread_helper+0x7/0x10 [17127.486437] =======================
Running the rt kernel (Linux ubuntu-studio 2.6.24.1 #2 SMP PREEMPT RT Thu Dec 4 21:30:26 GMT 2008 i686 GNU/Linux) - although the original installation was Ubuntu studio, the kernel is recompiled. The xruns seem to be all on the same card. IRQs are set to priority of 99 with chrt. Software for record is arecord, modified to use a 10 second ring buffer which is written to disk in a separate thread.
Sorry, but this information is useless for RT kernel because the irq is all threaded... On RT kernel, you'd better to use a better analyze tool.
Takashi _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On Fri, Dec 05, 2008 at 06:48:43AM -0800, Pete cat wrote:
I understand. Im only a fairly basic C coder and such, so could you recommend a tool which is fairly easy to use? Or would it be better to revert to a non RT kernel (there did not seem to be much improvement when I moved to the RT kernel)?
FWIW I've never been able to run an RT-patched kernel with pcm_multi. I've been using a non-rt preempt SMP kernel with three Delta 1010s for about three years now without too many problems, currently with alsa-lib-1.0.16 and 2.6.25.x kernel.
John
John thanks,
I spoke to you about 2 weeks ago about this and have been trying the various tweaks you suggested. But so far I have found no configs which have worked flawlessly, although one config had it down to about one xrun every 20 mins or so average.
Thing is I am after capturing live gigs, which you dont get a second chance with. Although I suspect a small xrun would be easy to disguise but its not ideal. So I am trying to hunt down the elusive 'perfection' config.
Its interesting that when I comment out the actual write call in arecord, there are no xruns. Tonight I will try the exact config you mention below. I suspect that the problem I am getting might be related to the hardware I am on though, which does not have much in the way of configurable bios. Perhaps something should be turned off which I can not access through this basic bios.
Anyway on discovering the stack trace system it was obvious that I must get it working so if it works under a non RT kernel thats where I will go next. Although it would be nice if someone could tell me a tool which would work with the RT kernel as it does interest me what could be screwing up this supposedly real time system, just from a technical interest point of view.
John Rigg aldev@sound-man.co.uk wrote: On Fri, Dec 05, 2008 at 06:48:43AM -0800, Pete cat wrote:
I understand. Im only a fairly basic C coder and such, so could you recommend a tool which is fairly easy to use? Or would it be better to revert to a non RT kernel (there did not seem to be much improvement when I moved to the RT kernel)?
FWIW I've never been able to run an RT-patched kernel with pcm_multi. I've been using a non-rt preempt SMP kernel with three Delta 1010s for about three years now without too many problems, currently with alsa-lib-1.0.16 and 2.6.25.x kernel.
John _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Also while I am here, could someone clear up a simple question I have been wondering about for a while.
Is there likely to be any disadvantage to running arecord rather than say Jack to capture? Does (say) Jack read the card in a 'better' way?
Thanks.
On Fri, Dec 05, 2008 at 07:23:00AM -0800, Pete wrote:
Is there likely to be any disadvantage to running arecord rather than say Jack to capture? Does (say) Jack read the card in a 'better' way?
jackd runs with realtime priority if -R option is specified when starting it. Running it without -R produces xruns fairly often, but with -R they almost never happen on a properly configured system. I don't know if arecord can run with realtime priority.
John
So even if the kernel is not a RT kernel, processes can run with real time priority? How is this so? From what ive read about the RT kernel it sounds like the kernel has to cooperate with the processes running on the system to allow a process to awaken (on an interrupt for instance). Otherwise the kernel will be stuck in some non interruptable state for some time, blocking the interrupt from being serviced - is that wrong?
If Jack has a way to grab higher priority through some other mechanisim, could I build this into arecord I wonder. I have of course tried nicing the process to very unnice.
John Rigg aldev@sound-man.co.uk wrote: On Fri, Dec 05, 2008 at 07:23:00AM -0800, Pete wrote:
Is there likely to be any disadvantage to running arecord rather than say Jack to capture? Does (say) Jack read the card in a 'better' way?
jackd runs with realtime priority if -R option is specified when starting it. Running it without -R produces xruns fairly often, but with -R they almost never happen on a properly configured system. I don't know if arecord can run with realtime priority.
John _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
On Fri, Dec 05, 2008 at 08:11:36AM -0800, Pete wrote:
So even if the kernel is not a RT kernel, processes can run with real time priority? How is this so? From what ive read about the RT kernel it sounds like the kernel has to cooperate with the processes running on the system to allow a process to awaken (on an interrupt for instance). Otherwise the kernel will be stuck in some non interruptable state for some time, blocking the interrupt from being serviced - is that wrong?
If Jack has a way to grab higher priority through some other mechanisim, could I build this into arecord I wonder. I have of course tried nicing the process to very unnice.
nice values are irrelevant to processes with rtprio. Any rtprio process will have priority over any process with just a nice value. jackd -R defaults to rtprio 10 (in a range of 0-99). You can raise that by specifying the value on the command line eg. jackd -R -P80 which would increase it to 80 (usually unnecessary and possibly inadvisable). You can look at what's running with rtprio on your system with ps, eg.: ps -Leo pid,cmd,rtprio
rtprio makes use of SCHED_FIFO or SCHED_RR (as opposed to SCHED_NORMAL). These are static soft realtime scheduling policies provided by the standard kernel. The -rt kernels attempt to get closer to hard realtime. Don't confuse the two.
On the current kernels, a non-root user has to belong to a realtime scheduling group (eg. the audio group) and the kernel config needs to have CONFIG_RT_GROUP_SCHED=y to allow rtprio to be used. The relevant configuration must also be present in /etc/security/limits.conf. There are plenty of articles on the web describing how to set up limits.conf.
John
I was confusing the two. Ill work on the kernel over the weekend. In the meantime I recompiled it without the rt patch today and have this in my stack trace:
Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623613] Pid: 4785, comm: arecord Not tainted 2.6.24.1 #1 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623636] [<c02a5a07>] snd_pcm_period_elapsed+0x247/0x2f0 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623653] [<f88ed832>] do_get_write_access+0x232/0x520 [jbd] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623678] [<c02b9cf9>] snd_ice1712_interrupt+0xa9/0x1a0 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623694] [<c0168000>] handle_IRQ_event+0x30/0x60 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623705] [<c016996d>] handle_level_irq+0x7d/0xf0 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623715] [<c010711b>] do_IRQ+0x3b/0x70 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623724] [<c01b3693>] __block_prepare_write+0x233/0x410 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623736] [<c0105523>] common_interrupt+0x23/0x30 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623749] [<f88f007b>] journal_commit_transaction+0xe9b/0xf10 [jbd] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623764] [<c0345597>] _spin_unlock+0x7/0x20 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623776] [<f88ee8f2>] journal_dirty_data+0xb2/0x250 [jbd] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623793] [<f890f8a8>] ext3_journal_dirty_data+0x18/0x50 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623811] [<f89104b0>] ext3_get_block+0x0/0x100 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623830] [<f890eb12>] walk_page_buffers+0x32/0x70 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623849] [<f89116d4>] ext3_ordered_write_end+0x74/0x180 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623866] [<f890f890>] ext3_journal_dirty_data+0x0/0x50 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623887] [<c016f6ee>] generic_file_buffered_write+0x18e/0x670 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623911] [<f89176a4>] __ext3_journal_stop+0x24/0x50 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623931] [<c01ae4b0>] __mark_inode_dirty+0x30/0x1a0 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623948] [<c016fe74>] __generic_file_aio_write_nolock+0x2a4/0x540 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623970] [<c01318d5>] irq_exit+0x45/0x50 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623981] [<c0170175>] generic_file_aio_write+0x65/0xe0 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623998] [<f890d830>] ext3_file_write+0x30/0xc0 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.624016] [<c01908c5>] do_sync_write+0xd5/0x120 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.624035] [<c0140600>] autoremove_wake_function+0x0/0x40 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.624047] [<c0143cac>] hrtimer_nanosleep+0x5c/0x100 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.624064] [<c01907f0>] do_sync_write+0x0/0x120 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.624070] [<c01911c9>] vfs_write+0xb9/0x170 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.624080] [<c0191901>] sys_write+0x41/0x70 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.624090] [<c01044d2>] sysenter_past_esp+0x6b/0xa9 Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.624108] =======================
Several times over. Would this suggest that the ext3 filesystem is my problem? What is going on exactly?
Thanks, Pete
nice values are irrelevant to processes with rtprio. Any rtprio process will have priority over any process with just a nice value. jackd -R defaults to rtprio 10 (in a range of 0-99). You can raise that by specifying the value on the command line eg. jackd -R -P80 which would increase it to 80 (usually unnecessary and possibly inadvisable). You can look at what's running with rtprio on your system with ps, eg.: ps -Leo pid,cmd,rtprio
rtprio makes use of SCHED_FIFO or SCHED_RR (as opposed to SCHED_NORMAL). These are static soft realtime scheduling policies provided by the standard kernel. The -rt kernels attempt to get closer to hard realtime. Don't confuse the two.
On the current kernels, a non-root user has to belong to a realtime scheduling group (eg. the audio group) and the kernel config needs to have CONFIG_RT_GROUP_SCHED=y to allow rtprio to be used. The relevant configuration must also be present in /etc/security/limits.conf. There are plenty of articles on the web describing how to set up limits.conf.
John
On Fri, Dec 05, 2008 at 01:39:33PM -0800, Pete wrote:
I was confusing the two. Ill work on the kernel over the weekend. In the meantime I recompiled it without the rt patch today and have this in my stack trace:
<snip>
Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623776] [<f88ee8f2>] journal_dirty_data+0xb2/0x250 [jbd] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623793] [<f890f8a8>] ext3_journal_dirty_data+0x18/0x50 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623811] [<f89104b0>] ext3_get_block+0x0/0x100 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623830] [<f890eb12>] walk_page_buffers+0x32/0x70 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623849] [<f89116d4>] ext3_ordered_write_end+0x74/0x180 [ext3] Dec 5 21:28:31 ubuntu-studio kernel: [ 1115.623866] [<f890f890>] ext3_journal_dirty_data+0x0/0x50 [ext3]
</snip>
Several times over. Would this suggest that the ext3 filesystem is my problem? What is going on exactly?
Looks like ext3 could be a problem. jackd uses /dev/shm as a temporary filesystem to get round this. You might have to mount the filesystem as ext2 (or use jackd).
John
I have fixed my problem and thought it was best to make a record of what worked in case anyone is ever searching for it. This will I guess be obvious to most people seeing this but if I had been able to find a snippet like this a month ago, I would have saved many dozens of hours searching and trying things! So for those like me who dont intuitivley know this...
Turned out very simple, I just used /dev/sda3 (a 'raw' partition on the disk) as the output file. Now having tested for many hours, it works perfectly, and there has not been a single overrun, on a non RT kernel, with no special permissions, priority, prempt flags or anything. This is probably not suitable for many people as the output 'file' is not as conveniently accessable as a simple file, but if you need to make it work, thats how to do it.
I found that I couldent use arecord to write directly to the partition, im not exactly sure why but I guess its because arecord dosent quite know how to write to a partition. However making arecord output the digital data to stdout and piping it to dd which then writes it to the partition works fine. For the record thats 16 channels at 88200 kHz, 24 bit.
The partition can then be turned into a file using DD again to read it from the partition and write it to a file.
BTW if any developers are reading this, maybe you can improve the performance of some of the tools youre writing by using a scratch disk accessed directly? I would be interested to try if anyone does write it.
Thanks for the help.
John Rigg aldev@sound-man.co.uk wrote: On Fri, Dec 05, 2008 at 07:23:00AM -0800, Pete wrote:
Is there likely to be any disadvantage to running arecord rather than say Jack to capture? Does (say) Jack read the card in a 'better' way?
jackd runs with realtime priority if -R option is specified when starting it. Running it without -R produces xruns fairly often, but with -R they almost never happen on a properly configured system. I don't know if arecord can run with realtime priority.
John _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
participants (4)
-
John Rigg
-
Pete
-
Pete cat
-
Takashi Iwai