[alsa-devel] ESI Juli@ crash with external clock switch - patch
Hi,
I have investigated a crash/kernel thread lockup when Juli@ is switched to external SPDIF clock and the incoming SPDIF stream changes samplerate. The problem appears to occur in the timed thread in charge of reading incoming samplerate and acting upon its change.
The problem disappears with the following patch:
diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f5633..68bb326 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -154,7 +154,7 @@ void snd_ak4114_reinit(struct ak4114 *chip) { chip->init = 1; mb(); - flush_delayed_work(&chip->work); + //flush_delayed_work(&chip->work); ak4114_init_regs(chip); /* bring up statistics / event queing */ chip->init = 0;
I am afraid I do not know enough about kernel workqueues to determine whether this "fix" is OK.
Interestingly, the almost identical driver ak4113.c for a very similar card Infrasonic Quartet (ice1724/quartet.c) does not suffer from this problem (tested now).
Thanks a lot for your opinion.
Best regards,
Pavel Hofman.
At Sun, 11 Jan 2015 14:58:12 +0100, Pavel Hofman wrote:
Hi,
I have investigated a crash/kernel thread lockup when Juli@ is switched to external SPDIF clock and the incoming SPDIF stream changes samplerate. The problem appears to occur in the timed thread in charge of reading incoming samplerate and acting upon its change.
The problem disappears with the following patch:
diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f5633..68bb326 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -154,7 +154,7 @@ void snd_ak4114_reinit(struct ak4114 *chip) { chip->init = 1; mb();
flush_delayed_work(&chip->work);
//flush_delayed_work(&chip->work); ak4114_init_regs(chip); /* bring up statistics / event queing */ chip->init = 0;
I am afraid I do not know enough about kernel workqueues to determine whether this "fix" is OK.
Interestingly, the almost identical driver ak4113.c for a very similar card Infrasonic Quartet (ice1724/quartet.c) does not suffer from this problem (tested now).
Does the patch below work instead? The reason it appears only on Juli@ is that juli@ is the only board using this function.
thanks,
Takashi
--- diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f56339415d..fd7ed9b42e48 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -612,10 +612,10 @@ static void ak4114_stats(struct work_struct *work) { struct ak4114 *chip = container_of(work, struct ak4114, work.work);
- if (!chip->init) + if (!chip->init) { snd_ak4114_check_rate_and_errors(chip, chip->check_flags); - - schedule_delayed_work(&chip->work, HZ / 10); + schedule_delayed_work(&chip->work, HZ / 10); + } }
EXPORT_SYMBOL(snd_ak4114_create);
Hi Takashi,
Dne 11.1.2015 v 16:52 Takashi Iwai napsal(a):
Does the patch below work instead?
Thanks a lot for your Sunday response. Unfortunately this patch does not help, again stuck in the flush_delayed work call - see stack trace below.
The reason it appears only on Juli@ is that juli@ is the only board using this function.
I see, snd_ak4113_reinit of ak4113.c is never called, only ak4113_init_regs. Perhaps Juli should not touch the workqueue in ak4114_reinit and only initialize the regs in similar manner to ak4113?
But then should we restart the workqueue in juli_resume or just reinit ak4114 regs? Quartet does not have the resume functionality implemented at all yet.
The following patch removing any workqueue functions seems to work too (making it the same functionality as in quartet - only reinitializing ak411x registers). The workqueue gets executed since upon changing the incoming rate the running capture is stopped. The recorded file stops growing but arecord does not stop, must be killed with kill -9. Not a problem for now.
========== diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f5633..65efac7 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -154,12 +154,9 @@ void snd_ak4114_reinit(struct ak4114 *chip) { chip->init = 1; mb(); - flush_delayed_work(&chip->work); ak4114_init_regs(chip); /* bring up statistics / event queing */ chip->init = 0; - if (chip->kctls[0]) - schedule_delayed_work(&chip->work, HZ / 10); }
static unsigned int external_rate(unsigned char rcs1)
==========
Thanks a lot,
Pavel.
[ 360.656039] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 360.656040] kworker/3:1 D ffff88021fd94480 0 49 2 0x00000000 [ 360.656052] Workqueue: events ak4114_stats [snd_ak4114] [ 360.656055] ffff880213c9fba0 0000000000000046 ffff880214233000 ffff880213c9ffd8 [ 360.656058] 0000000000014480 0000000000014480 ffff880214233000 ffff880213c9fce8 [ 360.656061] ffff880213c9fcf0 7fffffffffffffff ffff880214233000 ffff880212cb02a8 [ 360.656063] Call Trace: [ 360.656188] [<ffffffff81723129>] schedule+0x29/0x70 [ 360.656191] [<ffffffff81722379>] schedule_timeout+0x239/0x2d0 [ 360.656193] [<ffffffff81722cb1>] ? __schedule+0x381/0x7d0 [ 360.656196] [<ffffffff81723c46>] wait_for_completion+0xa6/0x160 [ 360.656198] [<ffffffff8109a8d0>] ? wake_up_state+0x20/0x20 [ 360.656201] [<ffffffff81084b8d>] flush_work+0xed/0x1b0 [ 360.656203] [<ffffffff81080e60>] ? wake_up_worker+0x30/0x30 [ 360.656205] [<ffffffff81084e2f>] flush_delayed_work+0x3f/0x50 [ 360.656208] [<ffffffffa03e8695>] snd_ak4114_reinit+0x25/0x60 [snd_ak4114] [ 360.656214] [<ffffffffa042b26a>] juli_akm_set_rate_val+0xca/0xf0 [snd_ice1724] [ 360.656219] [<ffffffffa0420642>] snd_vt1724_set_pro_rate+0x122/0x220 [snd_ice1724] [ 360.656224] [<ffffffffa042300d>] snd_vt1724_pro_internal_clock_put+0xad/0x1f0 [snd_ice1724] [ 360.656234] [<ffffffffa0302df5>] ? snd_ctl_find_id+0xb5/0xe0 [snd] [ 360.656240] [<ffffffffa03034fd>] snd_ctl_elem_write+0x11d/0x1a0 [snd] [ 360.656243] [<ffffffff811a1dc5>] ? __kmalloc+0xa5/0x230 [ 360.656248] [<ffffffffa02ff01d>] ? __snd_kmalloc+0x1d/0x90 [snd] [ 360.656253] [<ffffffffa0305266>] ? snd_ctl_ioctl+0xa6/0x7a0 [snd] [ 360.656258] [<ffffffffa03052cc>] snd_ctl_ioctl+0x10c/0x7a0 [snd] [ 360.656261] [<ffffffff8109a8d0>] ? wake_up_state+0x20/0x20 [ 360.656265] [<ffffffff811d03a0>] do_vfs_ioctl+0x2e0/0x4c0 [ 360.656268] [<ffffffff811bd0ce>] ? vfs_read+0xee/0x160 [ 360.656270] [<ffffffff811d0601>] SyS_ioctl+0x81/0xa0 [ 360.656272] [<ffffffff811bdb89>] ? SyS_read+0x49/0xa0 [ 360.656275] [<ffffffff8172f82d>] system_call_fastpath+0x1a/0x1f [ 360.656277] INFO: task amixer:2917 blocked for more than 120 seconds. [ 360.656278] Tainted: G OX 3.13.0-37-generic #64-Ubuntu [ 360.656279] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 360.656281] amixer D ffff88021fd14480 0 2917 2916 0x00000004 [ 360.656283] ffff8800d2ba7d98 0000000000000082 ffff880212d84800 ffff8800d2ba7fd8 [ 360.656286] 0000000000014480 0000000000014480 ffff880212d84800 ffff880212d84800 [ 360.656288] ffff8802119709c0 ffff8802119709c8 ffffffff00000000 ffff8802119709d0 [ 360.656291] Call Trace: [ 360.656293] [<ffffffff81723129>] schedule+0x29/0x70 [ 360.656296] [<ffffffff81725da5>] rwsem_down_write_failed+0x115/0x230 [ 360.656298] [<ffffffff811a1dc5>] ? __kmalloc+0xa5/0x230 [ 360.656302] [<ffffffff8136fd13>] call_rwsem_down_write_failed+0x13/0x20 [ 360.656304] [<ffffffff817257bd>] ? down_write+0x2d/0x30 [ 360.656310] [<ffffffffa030324d>] snd_ctl_release+0x7d/0x130 [snd] [ 360.656312] [<ffffffff811bed64>] __fput+0xe4/0x260 [ 360.656315] [<ffffffff811bef2e>] ____fput+0xe/0x10 [ 360.656318] [<ffffffff81088227>] task_work_run+0xa7/0xe0 [ 360.656321] [<ffffffff81013df7>] do_notify_resume+0x97/0xb0 [ 360.656324] [<ffffffff8172faea>] int_signal+0x12/0x17 [ 360.656326] INFO: task amixer:2934 blocked for more than 120 seconds. [ 360.656327] Tainted: G OX 3.13.0-37-generic #64-Ubuntu [ 360.656329] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 360.656330] amixer D ffff88021fd94480 0 2934 2933 0x00000004 [ 360.656332] ffff8802110cfd98 0000000000000086 ffff88003569c800 ffff8802110cffd8 [ 360.656335] 0000000000014480 0000000000014480 ffff88003569c800 ffff88003569c800 [ 360.656337] ffff8802119709c0 ffff8802119709c8 ffffffff00000000 ffff8802119709d0 [ 360.656339] Call Trace: [ 360.656342] [<ffffffff81723129>] schedule+0x29/0x70 [ 360.656344] [<ffffffff81725da5>] rwsem_down_write_failed+0x115/0x230 [ 360.656347] [<ffffffff8136fd13>] call_rwsem_down_write_failed+0x13/0x20 [ 360.656349] [<ffffffff817257bd>] ? down_write+0x2d/0x30 [ 360.656355] [<ffffffffa030324d>] snd_ctl_release+0x7d/0x130 [snd] [ 360.656357] [<ffffffff811bed64>] __fput+0xe4/0x260 [ 360.656360] [<ffffffff811bef2e>] ____fput+0xe/0x10 [ 360.656362] [<ffffffff81088227>] task_work_run+0xa7/0xe0 [ 360.656365] [<ffffffff81013df7>] do_notify_resume+0x97/0xb0 [ 360.656367] [<ffffffff8172faea>] int_signal+0x12/0x17 [ 360.656369] INFO: task amixer:2938 blocked for more than 120 seconds. [ 360.656370] Tainted: G OX 3.13.0-37-generic #64-Ubuntu [ 360.656372] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 360.656373] amixer D ffff88021fd94480 0 2938 2937 0x00000004 [ 360.656375] ffff88020ff1dd98 0000000000000086 ffff8800d2a1b000 ffff88020ff1dfd8 [ 360.656378] 0000000000014480 0000000000014480 ffff8800d2a1b000 ffff8800d2a1b000 [ 360.656380] ffff8802119709c0 ffff8802119709c8 ffffffff00000000 ffff8802119709d0 [ 360.656383] Call Trace: [ 360.656385] [<ffffffff81723129>] schedule+0x29/0x70 [ 360.656387] [<ffffffff81725da5>] rwsem_down_write_failed+0x115/0x230 [ 360.656390] [<ffffffff8136fd13>] call_rwsem_down_write_failed+0x13/0x20 [ 360.656392] [<ffffffff817257bd>] ? down_write+0x2d/0x30 [ 360.656398] [<ffffffffa030324d>] snd_ctl_release+0x7d/0x130 [snd] [ 360.656400] [<ffffffff811bed64>] __fput+0xe4/0x260 [ 360.656402] [<ffffffff811bef2e>] ____fput+0xe/0x10 [ 360.656405] [<ffffffff81088227>] task_work_run+0xa7/0xe0 [ 360.656408] [<ffffffff81013df7>] do_notify_resume+0x97/0xb0 [ 360.656410] [<ffffffff8172faea>] int_signal+0x12/0x17
At Sun, 11 Jan 2015 18:57:38 +0100, Pavel Hofman wrote:
Hi Takashi,
Dne 11.1.2015 v 16:52 Takashi Iwai napsal(a):
Does the patch below work instead?
Thanks a lot for your Sunday response. Unfortunately this patch does not help, again stuck in the flush_delayed work call - see stack trace below.
OK, then this should be cancel_delayed_work_sync() instead, I suppose. The revised patch (also for ak4113.c) below.
The reason it appears only on Juli@ is that juli@ is the only board using this function.
I see, snd_ak4113_reinit of ak4113.c is never called, only ak4113_init_regs. Perhaps Juli should not touch the workqueue in ak4114_reinit and only initialize the regs in similar manner to ak4113?
No, it's just quartet driver doesn't handle it properly :)
But then should we restart the workqueue in juli_resume or just reinit ak4114 regs? Quartet does not have the resume functionality implemented at all yet.
It doesn't matter much because PM doesn't work with Quartet. But the juli.c also should be improved regarding PM. It should stop the workq at suspend. Also, it'd be preferable to have some control start/stop this background work, e.g. via a control element. Otherwise your machine will be constantly loaded unnecessarily.
I'll prepare a fix patch to these later.
Takashi
--- diff --git a/sound/i2c/other/ak4113.c b/sound/i2c/other/ak4113.c index 1a3a6fa27158..5465506032fd 100644 --- a/sound/i2c/other/ak4113.c +++ b/sound/i2c/other/ak4113.c @@ -141,7 +141,7 @@ void snd_ak4113_reinit(struct ak4113 *chip) { chip->init = 1; mb(); - flush_delayed_work(&chip->work); + cancel_delayed_work_sync(&chip->work); ak4113_init_regs(chip); /* bring up statistics / event queing */ chip->init = 0; @@ -632,8 +632,8 @@ static void ak4113_stats(struct work_struct *work) { struct ak4113 *chip = container_of(work, struct ak4113, work.work);
- if (!chip->init) + if (!chip->init) { snd_ak4113_check_rate_and_errors(chip, chip->check_flags); - - schedule_delayed_work(&chip->work, HZ / 10); + schedule_delayed_work(&chip->work, HZ / 10); + } } diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f56339415d..801ea4692339 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -154,7 +154,7 @@ void snd_ak4114_reinit(struct ak4114 *chip) { chip->init = 1; mb(); - flush_delayed_work(&chip->work); + cancel_delayed_work_sync(&chip->work); ak4114_init_regs(chip); /* bring up statistics / event queing */ chip->init = 0; @@ -612,10 +612,10 @@ static void ak4114_stats(struct work_struct *work) { struct ak4114 *chip = container_of(work, struct ak4114, work.work);
- if (!chip->init) + if (!chip->init) { snd_ak4114_check_rate_and_errors(chip, chip->check_flags); - - schedule_delayed_work(&chip->work, HZ / 10); + schedule_delayed_work(&chip->work, HZ / 10); + } }
EXPORT_SYMBOL(snd_ak4114_create);
Dne 11.1.2015 v 21:36 Takashi Iwai napsal(a):
OK, then this should be cancel_delayed_work_sync() instead, I suppose. The revised patch (also for ak4113.c) below.
I am afraid it is getting stuck in the same way - see the thread stack below.
I see, snd_ak4113_reinit of ak4113.c is never called, only ak4113_init_regs. Perhaps Juli should not touch the workqueue in ak4114_reinit and only initialize the regs in similar manner to ak4113?
No, it's just quartet driver doesn't handle it properly :)
Why is actually the restart of the workqueue needed at reinit? The work (snd_ak4114_check_rate_and_errors) only reads ak4114 regs to controls (using i2c routine synchronized with mutexes) and handles the stream stop.
It doesn't matter much because PM doesn't work with Quartet. But the juli.c also should be improved regarding PM. It should stop the workq at suspend. Also, it'd be preferable to have some control start/stop this background work, e.g. via a control element. Otherwise your machine will be constantly loaded unnecessarily.
I think we can extend the timer, perhaps to HZ/2 - the thread is just a security measure anyway.
I'll prepare a fix patch to these later.
Thanks a lot,
Pavel.
[11280.656021] INFO: task kworker/0:1:46 blocked for more than 120 seconds. [11280.656027] Tainted: G OX 3.13.0-37-generic #64-Ubuntu [11280.656028] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [11280.656031] kworker/0:1 D ffff88021fc14480 0 46 2 0x00000000 [11280.656041] Workqueue: events ak4114_stats [snd_ak4114] [11280.656043] ffff880213c71b80 0000000000000046 ffff880213c69800 ffff880213c71fd8 [11280.656047] 0000000000014480 0000000000014480 ffff880213c69800 ffff880213c71cc0 [11280.656051] ffff880213c71cc8 7fffffffffffffff ffff880213c69800 ffff8801e206e968 [11280.656054] Call Trace: [11280.656062] [<ffffffff81723129>] schedule+0x29/0x70 [11280.656065] [<ffffffff81722379>] schedule_timeout+0x239/0x2d0 [11280.656070] [<ffffffff810980f5>] ? check_preempt_curr+0x85/0xa0 [11280.656074] [<ffffffff81098129>] ? ttwu_do_wakeup+0x19/0xc0 [11280.656077] [<ffffffff8109827d>] ? ttwu_do_activate.constprop.74+0x5d/0x70 [11280.656080] [<ffffffff81723c46>] wait_for_completion+0xa6/0x160 [11280.656084] [<ffffffff8109a8d0>] ? wake_up_state+0x20/0x20 [11280.656088] [<ffffffff81084b8d>] flush_work+0xed/0x1b0 [11280.656091] [<ffffffff81080e60>] ? wake_up_worker+0x30/0x30 [11280.656095] [<ffffffff81084cc5>] __cancel_work_timer+0x75/0xf0 [11280.656098] [<ffffffff81084d73>] cancel_delayed_work_sync+0x13/0x20 [11280.656102] [<ffffffffa03ca4e5>] snd_ak4114_reinit+0x25/0x60 [snd_ak4114] [11280.656112] [<ffffffffa05e526a>] juli_akm_set_rate_val+0xca/0xf0 [snd_ice1724] [11280.656119] [<ffffffffa05e52de>] juli_ak4114_change+0x4e/0x60 [snd_ice1724] [11280.656123] [<ffffffffa03cab97>] snd_ak4114_check_rate_and_errors+0x1f7/0x390 [snd_ak4114] [11280.656127] [<ffffffffa03cad58>] ak4114_stats+0x28/0x50 [snd_ak4114] [11280.656130] [<ffffffff810839c2>] process_one_work+0x182/0x450 [11280.656134] [<ffffffff810847b1>] worker_thread+0x121/0x410 [11280.656137] [<ffffffff81084690>] ? rescuer_thread+0x430/0x430 [11280.656140] [<ffffffff8108b492>] kthread+0xd2/0xf0 [11280.656144] [<ffffffff8108b3c0>] ? kthread_create_on_node+0x1c0/0x1c0 [11280.656147] [<ffffffff8172f77c>] ret_from_fork+0x7c/0xb0 [11280.656150] [<ffffffff8108b3c0>] ? kthread_create_on_node+0x1c0/0x1c0 [11280.656205] INFO: task pulseaudio:8418 blocked for more than 120 seconds. [11280.656206] Tainted: G OX 3.13.0-37-generic #64-Ubuntu [11280.656207] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [11280.656209] pulseaudio D ffff88021fc14480 0 8418 8416 0x00000000 [11280.656211] ffff8800bb903bd0 0000000000000082 ffff880210263000 ffff8800bb903fd8 [11280.656214] 0000000000014480 0000000000014480 ffff880210263000 ffff8800bb903d18 [11280.656216] ffff8800bb903d20 7fffffffffffffff ffff880210263000 ffff8801e206e968 [11280.656219] Call Trace: [11280.656222] [<ffffffff81723129>] schedule+0x29/0x70 [11280.656224] [<ffffffff81722379>] schedule_timeout+0x239/0x2d0 [11280.656227] [<ffffffff81722cb1>] ? __schedule+0x381/0x7d0 [11280.656229] [<ffffffff81723c46>] wait_for_completion+0xa6/0x160 [11280.656232] [<ffffffff8109a8d0>] ? wake_up_state+0x20/0x20 [11280.656234] [<ffffffff81084b8d>] flush_work+0xed/0x1b0 [11280.656237] [<ffffffff81080e60>] ? wake_up_worker+0x30/0x30 [11280.656239] [<ffffffff81084cf2>] __cancel_work_timer+0xa2/0xf0 [11280.656242] [<ffffffff81084d73>] cancel_delayed_work_sync+0x13/0x20 [11280.656244] [<ffffffffa03ca4e5>] snd_ak4114_reinit+0x25/0x60 [snd_ak4114] [11280.656250] [<ffffffffa05e526a>] juli_akm_set_rate_val+0xca/0xf0 [snd_ice1724] [11280.656255] [<ffffffffa05da642>] snd_vt1724_set_pro_rate+0x122/0x220 [snd_ice1724] [11280.656259] [<ffffffffa05da768>] snd_vt1724_capture_pro_close+0x28/0x40 [snd_ice1724] [11280.656271] [<ffffffffa038db0f>] snd_pcm_release_substream.part.34+0x3f/0x90 [snd_pcm] [11280.656275] [<ffffffffa038dc38>] snd_pcm_release+0xa8/0xd0 [snd_pcm] [11280.656280] [<ffffffff811bed64>] __fput+0xe4/0x260 [11280.656282] [<ffffffff811bef2e>] ____fput+0xe/0x10 [11280.656285] [<ffffffff81088227>] task_work_run+0xa7/0xe0 [11280.656290] [<ffffffff81013df7>] do_notify_resume+0x97/0xb0 [11280.656292] [<ffffffff8172faea>] int_signal+0x12/0x17
At Sun, 11 Jan 2015 22:00:59 +0100, Pavel Hofman wrote:
Dne 11.1.2015 v 21:36 Takashi Iwai napsal(a):
OK, then this should be cancel_delayed_work_sync() instead, I suppose. The revised patch (also for ak4113.c) below.
I am afraid it is getting stuck in the same way - see the thread stack below.
I see, snd_ak4113_reinit of ak4113.c is never called, only ak4113_init_regs. Perhaps Juli should not touch the workqueue in ak4114_reinit and only initialize the regs in similar manner to ak4113?
No, it's just quartet driver doesn't handle it properly :)
Why is actually the restart of the workqueue needed at reinit? The work (snd_ak4114_check_rate_and_errors) only reads ak4114 regs to controls (using i2c routine synchronized with mutexes) and handles the stream stop.
Yeah, restart is necessary only in a certain situation, and is a bug that is done through work itself. This was the cause. I'll prepare fix patches later.
It doesn't matter much because PM doesn't work with Quartet. But the juli.c also should be improved regarding PM. It should stop the workq at suspend. Also, it'd be preferable to have some control start/stop this background work, e.g. via a control element. Otherwise your machine will be constantly loaded unnecessarily.
I think we can extend the timer, perhaps to HZ/2 - the thread is just a security measure anyway.
The HZ/10 isn't that bad, but the problem is that it's unconditionally running even if user doesn't need/want.
Takashi
On 12.1.2015 09:21, Takashi Iwai wrote:
Yeah, restart is necessary only in a certain situation, and is a bug that is done through work itself. This was the cause. I'll prepare fix patches later.
I wish I could help but unfortunately my practical knowledge of kernel workqueues is close to zero :-( Of course I will test the patches and will extend them for quartet with testing too.
The HZ/10 isn't that bad, but the problem is that it's unconditionally running even if user doesn't need/want.
It is useful only for the external clock mode. In fact the detection of incoming SPDIF rate is not reliable for internal clock in Juli (while it works just fine in Quartet, its FPGA pins configure the SPDIF receiver differently). IMO the thread could be running only when clock is switched to external.
Thanks a lot,
Pavel
At Mon, 12 Jan 2015 09:34:52 +0100, Pavel Hofman wrote:
On 12.1.2015 09:21, Takashi Iwai wrote:
Yeah, restart is necessary only in a certain situation, and is a bug that is done through work itself. This was the cause. I'll prepare fix patches later.
I wish I could help but unfortunately my practical knowledge of kernel workqueues is close to zero :-( Of course I will test the patches and will extend them for quartet with testing too.
How about the patch below? This is a quick fix for 3.19 (and stable). More better fixes will follow later once after it's confirmed to work.
The HZ/10 isn't that bad, but the problem is that it's unconditionally running even if user doesn't need/want.
It is useful only for the external clock mode. In fact the detection of incoming SPDIF rate is not reliable for internal clock in Juli (while it works just fine in Quartet, its FPGA pins configure the SPDIF receiver differently). IMO the thread could be running only when clock is switched to external.
Yeah, we can do some smart task change in addition to manual on/off. Maybe it's good to have an enum control for that.
Takashi
-- diff --git a/include/sound/ak4114.h b/include/sound/ak4114.h index 52f02a60dba7..796834b7790c 100644 --- a/include/sound/ak4114.h +++ b/include/sound/ak4114.h @@ -169,6 +169,7 @@ struct ak4114 { ak4114_read_t * read; void * private_data; unsigned int init: 1; + bool in_workq; spinlock_t lock; unsigned char regmap[6]; unsigned char txcsb[5]; diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f56339415d..0f923809522c 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -152,9 +152,11 @@ static void ak4114_init_regs(struct ak4114 *chip)
void snd_ak4114_reinit(struct ak4114 *chip) { + if (chip->in_workq) + return; chip->init = 1; mb(); - flush_delayed_work(&chip->work); + cancel_delayed_work_sync(&chip->work); ak4114_init_regs(chip); /* bring up statistics / event queing */ chip->init = 0; @@ -612,10 +614,12 @@ static void ak4114_stats(struct work_struct *work) { struct ak4114 *chip = container_of(work, struct ak4114, work.work);
- if (!chip->init) + chip->in_workq = true; + if (!chip->init) { snd_ak4114_check_rate_and_errors(chip, chip->check_flags); - - schedule_delayed_work(&chip->work, HZ / 10); + schedule_delayed_work(&chip->work, HZ / 10); + } + chip->in_workq = false; }
EXPORT_SYMBOL(snd_ak4114_create);
Dne 12.1.2015 v 16:43 Takashi Iwai napsal(a):
At Mon, 12 Jan 2015 09:34:52 +0100, Pavel Hofman wrote:
I wish I could help but unfortunately my practical knowledge of kernel workqueues is close to zero :-( Of course I will test the patches and will extend them for quartet with testing too.
How about the patch below? This is a quick fix for 3.19 (and stable). More better fixes will follow later once after it's confirmed to work.
Hi,
Thanks a lot, the patch seems to run fine. Only ak4114_init_regs has to be called every samplerate change, otherwise the receiver does not re-run the samplerate detection and its register with detected samplerate does not update its value.
The following patch seems to run ok:
diff --git a/include/sound/ak4114.h b/include/sound/ak4114.h index 52f02a6..796834b 100644 --- a/include/sound/ak4114.h +++ b/include/sound/ak4114.h @@ -169,6 +169,7 @@ struct ak4114 { ak4114_read_t * read; void * private_data; unsigned int init: 1; + bool in_workq; spinlock_t lock; unsigned char regmap[6]; unsigned char txcsb[5]; diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f5633..6d643b7 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -152,10 +152,12 @@ static void ak4114_init_regs(struct ak4114 *chip)
void snd_ak4114_reinit(struct ak4114 *chip) { + ak4114_init_regs(chip); + if (chip->in_workq) + return; chip->init = 1; mb(); - flush_delayed_work(&chip->work); - ak4114_init_regs(chip); + cancel_delayed_work_sync(&chip->work); /* bring up statistics / event queing */ chip->init = 0; if (chip->kctls[0]) @@ -612,10 +614,12 @@ static void ak4114_stats(struct work_struct *work) { struct ak4114 *chip = container_of(work, struct ak4114, work.work);
- if (!chip->init) + chip->in_workq = true; + if (!chip->init) { snd_ak4114_check_rate_and_errors(chip, chip->check_flags); - - schedule_delayed_work(&chip->work, HZ / 10); + schedule_delayed_work(&chip->work, HZ / 10); + } + chip->in_workq = false; }
EXPORT_SYMBOL(snd_ak4114_create);
The HZ/10 isn't that bad, but the problem is that it's unconditionally running even if user doesn't need/want.
It is useful only for the external clock mode. In fact the detection of incoming SPDIF rate is not reliable for internal clock in Juli (while it works just fine in Quartet, its FPGA pins configure the SPDIF receiver differently). IMO the thread could be running only when clock is switched to external.
Yeah, we can do some smart task change in addition to manual on/off. Maybe it's good to have an enum control for that.
I am not sure users would want/need to disable a feature which detects incoming samplerate. IMO if the work thread is running only in the external clock mode, nothing more is needed.
Thanks a lot and regards,
Pavel.
At Thu, 15 Jan 2015 22:15:46 +0100, Pavel Hofman wrote:
Dne 12.1.2015 v 16:43 Takashi Iwai napsal(a):
At Mon, 12 Jan 2015 09:34:52 +0100, Pavel Hofman wrote:
I wish I could help but unfortunately my practical knowledge of kernel workqueues is close to zero :-( Of course I will test the patches and will extend them for quartet with testing too.
How about the patch below? This is a quick fix for 3.19 (and stable). More better fixes will follow later once after it's confirmed to work.
Hi,
Thanks a lot, the patch seems to run fine. Only ak4114_init_regs has to be called every samplerate change, otherwise the receiver does not re-run the samplerate detection and its register with detected samplerate does not update its value.
OK, I'm going to send a fix series including the relevant correction. Give it a try later.
The following patch seems to run ok:
diff --git a/include/sound/ak4114.h b/include/sound/ak4114.h index 52f02a6..796834b 100644 --- a/include/sound/ak4114.h +++ b/include/sound/ak4114.h @@ -169,6 +169,7 @@ struct ak4114 { ak4114_read_t * read; void * private_data; unsigned int init: 1;
bool in_workq; spinlock_t lock; unsigned char regmap[6]; unsigned char txcsb[5];
diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f5633..6d643b7 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -152,10 +152,12 @@ static void ak4114_init_regs(struct ak4114 *chip)
void snd_ak4114_reinit(struct ak4114 *chip) {
ak4114_init_regs(chip);
if (chip->in_workq)
return; chip->init = 1; mb();
flush_delayed_work(&chip->work);
ak4114_init_regs(chip);
cancel_delayed_work_sync(&chip->work); /* bring up statistics / event queing */ chip->init = 0; if (chip->kctls[0])
@@ -612,10 +614,12 @@ static void ak4114_stats(struct work_struct *work) { struct ak4114 *chip = container_of(work, struct ak4114, work.work);
if (!chip->init)
chip->in_workq = true;
if (!chip->init) { snd_ak4114_check_rate_and_errors(chip, chip->check_flags);
schedule_delayed_work(&chip->work, HZ / 10);
schedule_delayed_work(&chip->work, HZ / 10);
}
chip->in_workq = false;
}
EXPORT_SYMBOL(snd_ak4114_create);
The HZ/10 isn't that bad, but the problem is that it's unconditionally running even if user doesn't need/want.
It is useful only for the external clock mode. In fact the detection of incoming SPDIF rate is not reliable for internal clock in Juli (while it works just fine in Quartet, its FPGA pins configure the SPDIF receiver differently). IMO the thread could be running only when clock is switched to external.
Yeah, we can do some smart task change in addition to manual on/off. Maybe it's good to have an enum control for that.
I am not sure users would want/need to disable a feature which detects incoming samplerate. IMO if the work thread is running only in the external clock mode, nothing more is needed.
Hm, but you can still see the other attributes of SPDIF input frames, right? Or all these useless when the clock is set to internal? If so, it'd be easy to add the dynamic turn on/off per the clock mode.
Takashi
At Fri, 16 Jan 2015 18:13:09 +0100, Takashi Iwai wrote:
At Thu, 15 Jan 2015 22:15:46 +0100, Pavel Hofman wrote:
Dne 12.1.2015 v 16:43 Takashi Iwai napsal(a):
At Mon, 12 Jan 2015 09:34:52 +0100, Pavel Hofman wrote:
I wish I could help but unfortunately my practical knowledge of kernel workqueues is close to zero :-( Of course I will test the patches and will extend them for quartet with testing too.
How about the patch below? This is a quick fix for 3.19 (and stable). More better fixes will follow later once after it's confirmed to work.
Hi,
Thanks a lot, the patch seems to run fine. Only ak4114_init_regs has to be called every samplerate change, otherwise the receiver does not re-run the samplerate detection and its register with detected samplerate does not update its value.
OK, I'm going to send a fix series including the relevant correction. Give it a try later.
The following patch seems to run ok:
diff --git a/include/sound/ak4114.h b/include/sound/ak4114.h index 52f02a6..796834b 100644 --- a/include/sound/ak4114.h +++ b/include/sound/ak4114.h @@ -169,6 +169,7 @@ struct ak4114 { ak4114_read_t * read; void * private_data; unsigned int init: 1;
bool in_workq; spinlock_t lock; unsigned char regmap[6]; unsigned char txcsb[5];
diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index c7f5633..6d643b7 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -152,10 +152,12 @@ static void ak4114_init_regs(struct ak4114 *chip)
void snd_ak4114_reinit(struct ak4114 *chip) {
ak4114_init_regs(chip);
if (chip->in_workq)
return; chip->init = 1; mb();
flush_delayed_work(&chip->work);
ak4114_init_regs(chip);
cancel_delayed_work_sync(&chip->work); /* bring up statistics / event queing */ chip->init = 0; if (chip->kctls[0])
@@ -612,10 +614,12 @@ static void ak4114_stats(struct work_struct *work) { struct ak4114 *chip = container_of(work, struct ak4114, work.work);
if (!chip->init)
chip->in_workq = true;
if (!chip->init) { snd_ak4114_check_rate_and_errors(chip, chip->check_flags);
schedule_delayed_work(&chip->work, HZ / 10);
schedule_delayed_work(&chip->work, HZ / 10);
}
chip->in_workq = false;
}
EXPORT_SYMBOL(snd_ak4114_create);
The HZ/10 isn't that bad, but the problem is that it's unconditionally running even if user doesn't need/want.
It is useful only for the external clock mode. In fact the detection of incoming SPDIF rate is not reliable for internal clock in Juli (while it works just fine in Quartet, its FPGA pins configure the SPDIF receiver differently). IMO the thread could be running only when clock is switched to external.
Yeah, we can do some smart task change in addition to manual on/off. Maybe it's good to have an enum control for that.
I am not sure users would want/need to disable a feature which detects incoming samplerate. IMO if the work thread is running only in the external clock mode, nothing more is needed.
Hm, but you can still see the other attributes of SPDIF input frames, right? Or all these useless when the clock is set to internal? If so, it'd be easy to add the dynamic turn on/off per the clock mode.
The patch below can be applied on top of the patch series I've sent.
Takashi
--- diff --git a/include/sound/ak4114.h b/include/sound/ak4114.h index b6feb7e225f2..9adcd06e4134 100644 --- a/include/sound/ak4114.h +++ b/include/sound/ak4114.h @@ -199,6 +199,7 @@ int snd_ak4114_build(struct ak4114 *ak4114, struct snd_pcm_substream *capture_substream); int snd_ak4114_external_rate(struct ak4114 *ak4114); int snd_ak4114_check_rate_and_errors(struct ak4114 *ak4114, unsigned int flags); +void snd_ak4114_enable_check_rate(struct ak4114 *chip, bool enable);
#ifdef CONFIG_PM void snd_ak4114_suspend(struct ak4114 *chip); diff --git a/sound/i2c/other/ak4114.c b/sound/i2c/other/ak4114.c index 5a4cf3fab4ae..497d50e0a6d5 100644 --- a/sound/i2c/other/ak4114.c +++ b/sound/i2c/other/ak4114.c @@ -64,10 +64,21 @@ static void reg_dump(struct ak4114 *ak4114) } #endif
-static void snd_ak4114_free(struct ak4114 *chip) +static void snd_ak4114_disable_work(struct ak4114 *chip) { atomic_inc(&chip->wq_processing); /* don't schedule new work */ cancel_delayed_work_sync(&chip->work); +} + +static void snd_ak4114_enable_work(struct ak4114 *chip) +{ + if (atomic_dec_and_test(&chip->wq_processing)) + schedule_delayed_work(&chip->work, HZ / 10); +} + +static void snd_ak4114_free(struct ak4114 *chip) +{ + snd_ak4114_disable_work(chip); kfree(chip); }
@@ -161,8 +172,7 @@ void snd_ak4114_reinit(struct ak4114 *chip) ak4114_init_regs(chip); mutex_unlock(&chip->reinit_mutex); /* bring up statistics / event queing */ - if (atomic_dec_and_test(&chip->wq_processing)) - schedule_delayed_work(&chip->work, HZ / 10); + snd_ak4114_enable_work(chip); } EXPORT_SYMBOL(snd_ak4114_reinit);
@@ -506,6 +516,7 @@ int snd_ak4114_build(struct ak4114 *ak4114, } snd_ak4114_proc_init(ak4114); /* trigger workq */ + ak4114->check_rate_enabled = true; schedule_delayed_work(&ak4114->work, HZ / 10); return 0; } @@ -621,15 +632,27 @@ static void ak4114_stats(struct work_struct *work)
if (atomic_inc_return(&chip->wq_processing) == 1) snd_ak4114_check_rate_and_errors(chip, chip->check_flags); - if (atomic_dec_and_test(&chip->wq_processing)) - schedule_delayed_work(&chip->work, HZ / 10); + snd_ak4114_enable_work(chip); +} + +void snd_ak4114_enable_check_rate(struct ak4114 *chip, bool enable) +{ + mutex_lock(&chip->reinit_mutex); + changed = chip->check_rate_enabled != enable; + chip->check_rate_enabled = enable; + mutex_unlock(&chip->reinit_mutex); + if (!changed) + return; + if (enable) + snd_ak4114_enable_work(chip); + else + snd_ak4114_disable_work(chip); }
#ifdef CONFIG_PM void snd_ak4114_suspend(struct ak4114 *chip) { - atomic_inc(&chip->wq_processing); /* don't schedule new work */ - cancel_delayed_work_sync(&chip->work); + snd_ak4114_disable_work(chip); } EXPORT_SYMBOL(snd_ak4114_suspend);
diff --git a/sound/pci/ice1712/juli.c b/sound/pci/ice1712/juli.c index 4f0213427152..5134833d0fa8 100644 --- a/sound/pci/ice1712/juli.c +++ b/sound/pci/ice1712/juli.c @@ -475,8 +475,13 @@ static int juli_add_controls(struct snd_ice1712 *ice) return err;
/* only capture SPDIF over AK4114 */ - return snd_ak4114_build(spec->ak4114, NULL, + err = snd_ak4114_build(spec->ak4114, NULL, ice->pcm->streams[SNDRV_PCM_STREAM_CAPTURE].substream); + if (err < 0) + return err; + + snd_ak4114_enable_check_rate(spec->ak4114, ice->is_spdif_master(ice)); + return 0; }
/* @@ -530,6 +535,7 @@ static unsigned int juli_get_rate(struct snd_ice1712 *ice) /* setting new rate */ static void juli_set_rate(struct snd_ice1712 *ice, unsigned int rate) { + struct juli_spec *spec = ice->spec; unsigned int old, new; unsigned char val;
@@ -543,6 +549,7 @@ static void juli_set_rate(struct snd_ice1712 *ice, unsigned int rate) /* switching to external clock - supplied by external circuits */ val = inb(ICEMT1724(ice, RATE)); outb(val | VT1724_SPDIF_MASTER, ICEMT1724(ice, RATE)); + snd_ak4114_enable_check_rate(spec->ak4114, false); }
static inline unsigned char juli_set_mclk(struct snd_ice1712 *ice, @@ -555,11 +562,13 @@ static inline unsigned char juli_set_mclk(struct snd_ice1712 *ice, /* setting clock to external - SPDIF */ static int juli_set_spdif_clock(struct snd_ice1712 *ice, int type) { + struct juli_spec *spec = ice->spec; unsigned int old; old = ice->gpio.get_data(ice); /* external clock (= 0), multiply 1x, 48kHz */ ice->gpio.set_data(ice, (old & ~GPIO_RATE_MASK) | GPIO_MULTI_1X | GPIO_FREQ_48KHZ); + snd_ak4114_enable_check_rate(spec->ak4114, true); return 0; }
Dne 16.1.2015 v 18:13 Takashi Iwai napsal(a):
OK, I'm going to send a fix series including the relevant correction. Give it a try later.
Thanks a lot for the patches, the external rate switching on Juli now works perfectly (not tested on Quartet yet). One can tell you are seasoned in kernel development, I would get lost in the synchronization and workqueue facilities.
I am not sure users would want/need to disable a feature which detects incoming samplerate. IMO if the work thread is running only in the external clock mode, nothing more is needed.
Hm, but you can still see the other attributes of SPDIF input frames, right? Or all these useless when the clock is set to internal? If so, it'd be easy to add the dynamic turn on/off per the clock mode.
You are right, that would disable update of other controls informing about incoming SPDIF details. These are useful in internal clock mode too - if the soundcard is master for the spdif chain. A new control would make sense then.
I am leaving for a week, then I will test quartet and the PM features.
Again, thanks a lot for the patches.
Pavel.
At Fri, 16 Jan 2015 21:36:10 +0100, Pavel Hofman wrote:
Dne 16.1.2015 v 18:13 Takashi Iwai napsal(a):
OK, I'm going to send a fix series including the relevant correction. Give it a try later.
Thanks a lot for the patches, the external rate switching on Juli now works perfectly (not tested on Quartet yet). One can tell you are seasoned in kernel development, I would get lost in the synchronization and workqueue facilities.
I am not sure users would want/need to disable a feature which detects incoming samplerate. IMO if the work thread is running only in the external clock mode, nothing more is needed.
Hm, but you can still see the other attributes of SPDIF input frames, right? Or all these useless when the clock is set to internal? If so, it'd be easy to add the dynamic turn on/off per the clock mode.
You are right, that would disable update of other controls informing about incoming SPDIF details. These are useful in internal clock mode too - if the soundcard is master for the spdif chain. A new control would make sense then.
Alright.
I am leaving for a week, then I will test quartet and the PM features.
There should be no change regarding quartet, also about PM. The patches doesn't add the PM support to quartet, but rather robustify only the PM of Juli@ (and ak4113/4114 codec side support).
In anyway, I'm going to merge them once when you confirm them working.
thanks,
Takashi
Dne 16.1.2015 v 21:39 Takashi Iwai napsal(a):
At Fri, 16 Jan 2015 21:36:10 +0100,
There should be no change regarding quartet, also about PM. The patches doesn't add the PM support to quartet, but rather robustify only the PM of Juli@ (and ak4113/4114 codec side support).
In anyway, I'm going to merge them once when you confirm them working.
OK, I will do so.
Thanks,
Pavel.
Dne 16.1.2015 v 21:39 Takashi Iwai napsal(a):
At Fri, 16 Jan 2015 21:36:10 +0100, Pavel Hofman wrote:
Dne 16.1.2015 v 18:13 Takashi Iwai napsal(a):
OK, I'm going to send a fix series including the relevant correction. Give it a try later.
Thanks a lot for the patches, the external rate switching on Juli now works perfectly (not tested on Quartet yet). One can tell you are seasoned in kernel development, I would get lost in the synchronization and workqueue facilities.
I am not sure users would want/need to disable a feature which detects incoming samplerate. IMO if the work thread is running only in the external clock mode, nothing more is needed.
Hm, but you can still see the other attributes of SPDIF input frames, right? Or all these useless when the clock is set to internal? If so, it'd be easy to add the dynamic turn on/off per the clock mode.
You are right, that would disable update of other controls informing about incoming SPDIF details. These are useful in internal clock mode too - if the soundcard is master for the spdif chain. A new control would make sense then.
Alright.
I am leaving for a week, then I will test quartet and the PM features.
There should be no change regarding quartet, also about PM. The patches doesn't add the PM support to quartet, but rather robustify only the PM of Juli@ (and ak4113/4114 codec side support).
In anyway, I'm going to merge them once when you confirm them working.
I confirm the patches are working both for Juli and Quartet. Juli tested also for pm-suspend, working fine after resume.
Thanks a lot,
Pavel.
At Wed, 28 Jan 2015 21:51:14 +0100, Pavel Hofman wrote:
Dne 16.1.2015 v 21:39 Takashi Iwai napsal(a):
At Fri, 16 Jan 2015 21:36:10 +0100, Pavel Hofman wrote:
Dne 16.1.2015 v 18:13 Takashi Iwai napsal(a):
OK, I'm going to send a fix series including the relevant correction. Give it a try later.
Thanks a lot for the patches, the external rate switching on Juli now works perfectly (not tested on Quartet yet). One can tell you are seasoned in kernel development, I would get lost in the synchronization and workqueue facilities.
I am not sure users would want/need to disable a feature which detects incoming samplerate. IMO if the work thread is running only in the external clock mode, nothing more is needed.
Hm, but you can still see the other attributes of SPDIF input frames, right? Or all these useless when the clock is set to internal? If so, it'd be easy to add the dynamic turn on/off per the clock mode.
You are right, that would disable update of other controls informing about incoming SPDIF details. These are useful in internal clock mode too - if the soundcard is master for the spdif chain. A new control would make sense then.
Alright.
I am leaving for a week, then I will test quartet and the PM features.
There should be no change regarding quartet, also about PM. The patches doesn't add the PM support to quartet, but rather robustify only the PM of Juli@ (and ak4113/4114 codec side support).
In anyway, I'm going to merge them once when you confirm them working.
I confirm the patches are working both for Juli and Quartet. Juli tested also for pm-suspend, working fine after resume.
OK, I'm going to merge the patches.
Thanks for testing.
Takashi
Dne 28.1.2015 v 22:21 Takashi Iwai napsal(a):
I confirm the patches are working both for Juli and Quartet. Juli tested also for pm-suspend, working fine after resume.
OK, I'm going to merge the patches.
Thanks for testing.
I do thank a lot for the patches.
Regards,
Pavel.
participants (2)
-
Pavel Hofman
-
Takashi Iwai