[alsa-devel] crash/reboot with rawmidi on ice1712 dual opteron
Hi,
on one particular machine, an IBM workstation (Intellistation A Pro with dual opteron 250, 4GB RAM), opening a rawmidi input port on an M-Audio 2496 card will cause an immediate reboot of the machine. I've tried with different SMP kernels and different ALSA versions (including 1.0.14rc3), but always the same. Syslog does not show anything.
Using a USB MIDI interface works fine. Also, MIDI input on a similar workstation (Intellistation Z Pro, dual xeon, 1GB RAM) with M-Audio 2496 works without problems.
Any ideas?
Thanks, Florian
At Fri, 13 Apr 2007 21:16:55 +0200, Florian wrote:
Hi,
on one particular machine, an IBM workstation (Intellistation A Pro with dual opteron 250, 4GB RAM), opening a rawmidi input port on an M-Audio 2496 card will cause an immediate reboot of the machine. I've tried with different SMP kernels and different ALSA versions (including 1.0.14rc3), but always the same. Syslog does not show anything.
Using a USB MIDI interface works fine. Also, MIDI input on a similar workstation (Intellistation Z Pro, dual xeon, 1GB RAM) with M-Audio 2496 works without problems.
Any ideas?
Hm, this sounds like a problem in the ice1712 driver, not on the rawmidi core side. The symptom is a bit puzzling, though; most mpu401 problems show up as a lock-up rather than an immediate reboot.
Or did you set up the machine to reboot automatically on a kernel panic?
Takashi
thanks for the reply. Some more info:
- system is RH Enterprise AS 4 update 4
- the soundcard is in a 64-bit PCI-X slot (so few soundcards fit at all)
- pcm with ice1712 is working fine
- I disabled /proc/sys/kernel/panic_on_oops, but it still rebooted the same way. Is there another way to prevent rebooting on panic or so?
- by using printk with serial console, I could trace the reboot to occur at the very first outb() call in mpu401_uart.c:64 with data 0, addr 0x304C
Is this likely to be a hardware incompatibility?
On a 3rd machine with dual Xeon (same OS, same PCI slots, same soundcard), the computer freezes after successfully receiving a few MIDI bytes (between 4 and 30 bytes). Any hints where I can start debugging such a lock-up?
Thanks, Florian
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
At Wed, 18 Apr 2007 17:15:41 +0200, Florian wrote:
thanks for the reply. Some more info:
- system is RH Enterprise AS 4 update 4
- the soundcard is in a 64-bit PCI-X slot (so few soundcards fit at all)
- pcm with ice1712 is working fine
- I disabled /proc/sys/kernel/panic_on_oops, but it still rebooted the same way. Is there another way to prevent rebooting on panic or so?
- by using printk with serial console, I could trace the reboot to occur at the very first outb() call in mpu401_uart.c:64 with data 0, addr 0x304C
The address sounds a bit strange to me, but that may depend on the BIOS. Check /proc/ioports to see whether it really is within the range of the corresponding soundcard.
Takashi
Thanks for the reply. The relevant excerpt from /proc/ioports is
3000-3fff : PCI Bus #02
  3000-303f : 0000:02:01.0
    3000-303f : ICE1712
  3040-305f : 0000:02:01.0
    3040-305f : ICE1712
  3060-306f : 0000:02:01.0
    3060-306f : ICE1712
  3070-307f : 0000:02:01.0
    3070-307f : ICE1712
[full listing at end of message]
so I guess 304C is in range of the M-Audio Audiophile 24/96. Any other ideas what I can try, either in ALSA code or elsewhere?
Thanks, Florian
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
02f8-02ff : serial
0376-0376 : ide1
0378-037a : parport0
037b-037f : parport0
03c0-03df : vga+
03f8-03ff : serial
04d0-04d1 : pnp 00:05
1000-10ff : 0000:00:07.5
  1000-10ff : AMD AMD8111
1100-117f : pnp 00:05
1180-11ff : pnp 00:05
1400-143f : 0000:00:07.5
  1400-143f : AMD AMD8111
1440-145f : 0000:00:07.2
  1440-145f : amd8111_smbus2
1460-146f : 0000:00:07.1
  1460-1467 : ide0
  1468-146f : ide1
2000-2fff : PCI Bus #01
  2000-200f : 0000:01:02.0
    2000-200f : sata_sil
  2010-2013 : 0000:01:02.0
    2010-2013 : sata_sil
  2014-2017 : 0000:01:02.0
    2014-2017 : sata_sil
  2018-201f : 0000:01:02.0
    2018-201f : sata_sil
  2020-2027 : 0000:01:02.0
    2020-2027 : sata_sil
3000-3fff : PCI Bus #02
  3000-303f : 0000:02:01.0
    3000-303f : ICE1712
  3040-305f : 0000:02:01.0
    3040-305f : ICE1712
  3060-306f : 0000:02:01.0
    3060-306f : ICE1712
  3070-307f : 0000:02:01.0
    3070-307f : ICE1712
4000-4fff : PCI Bus #81
  4000-4fff : PCI Bus #83
    4000-40ff : 0000:83:04.0
    4400-44ff : 0000:83:04.0
    4800-48ff : 0000:83:04.1
    4c00-4cff : 0000:83:04.1
8000-8003 : PM1a_EVT_BLK
8004-8005 : PM1a_CNT_BLK
8008-800b : PM_TMR
8010-8015 : ACPI CPU throttle
8020-8023 : GPE0_BLK
80b0-80b7 : GPE1_BLK
80e0-80ef : amd756_smbus
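The range check described above (is 0x304C inside one of the ICE1712 regions?) can also be done mechanically. The following is a minimal sketch in C and is not part of ALSA: ioports_line_matches() is a hypothetical helper that parses one line in the /proc/ioports format and tests whether a port lies inside the range and whether the owner name contains a given string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not an ALSA or kernel function.
 * Parses one /proc/ioports-style line such as "    3040-305f : ICE1712"
 * and returns 1 if `port` lies inside the range and the owner name
 * contains `owner`. Returns 0 otherwise, including for unparsable lines.
 */
static int ioports_line_matches(const char *line, unsigned int port,
                                const char *owner)
{
    unsigned int start, end;
    char name[64];

    assert(line != NULL && owner != NULL);

    /* The leading space in the format skips the indentation; the owner
     * name may itself contain spaces, so read up to the end of the line. */
    if (sscanf(line, " %x-%x : %63[^\n]", &start, &end, name) != 3)
        return 0;

    return port >= start && port <= end && strstr(name, owner) != NULL;
}
```

With the excerpt from this message, the line "3040-305f : ICE1712" matches port 0x304c, while "3000-303f : ICE1712" does not, which agrees with the manual reading.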
At Tue, 24 Apr 2007 15:27:23 +0200, Florian wrote:
Thanks for the reply. The relevant excerpt from /proc/ioports is
3000-3fff : PCI Bus #02
  3000-303f : 0000:02:01.0
    3000-303f : ICE1712
  3040-305f : 0000:02:01.0
    3040-305f : ICE1712
  3060-306f : 0000:02:01.0
    3060-306f : ICE1712
  3070-307f : 0000:02:01.0
    3070-307f : ICE1712
[full listing at end of message]
so I guess 304C is in range of the M-Audio Audiophile 24/96. Any other ideas what I can try, either in ALSA code or elsewhere?
What does /proc/asound/cards show? Does it point to 0x3000 or 0x3040?
Takashi
Hi Takashi,
the ice1712 points to 0x3040.
Thanks, Florian
At Tue, 24 Apr 2007 15:43:54 +0200, Florian wrote:
Hi Takashi,
the ice1712 points to 0x3040.
Hm... could you show the output of "lspci -v" (regarding ice1712) ? I wonder why 0x3000-0x303f is ignored.
Takashi
lspci -v shows:
02:01.0 Multimedia audio controller: VIA Technologies Inc. ICE1712 [Envy24] PCI Multi-Channel I/O Controller (rev 02)
        Subsystem: VIA Technologies Inc. M-Audio Delta Audiophile
        Flags: bus master, medium devsel, latency 64, IRQ 20
        I/O ports at 3040 [size=32]
        I/O ports at 3070 [size=16]
        I/O ports at 3060 [size=16]
        I/O ports at 3000 [size=64]
        Capabilities: [80] Power Management version 1
Florian
-- Florian Bomers bome.com
Music Software, Development Tools: http://www.bome.com Java Sound extensions, plugins: http://www.tritonus.org The Java Sound Resources: http://www.jsresources.org
Please quote this email in your reply. Thanks!
At Tue, 24 Apr 2007 16:12:10 +0200, Florian wrote:
lspci -v shows:
02:01.0 Multimedia audio controller: VIA Technologies Inc. ICE1712 [Envy24] PCI Multi-Channel I/O Controller (rev 02)
        Subsystem: VIA Technologies Inc. M-Audio Delta Audiophile
        Flags: bus master, medium devsel, latency 64, IRQ 20
        I/O ports at 3040 [size=32]
        I/O ports at 3070 [size=16]
        I/O ports at 3060 [size=16]
        I/O ports at 3000 [size=64]
        Capabilities: [80] Power Management version 1
Ah OK, it's non-linear...
When the hang-up occurs at the first write, it must be in snd_mpu401_uart_cmd(). At the very beginning, it calls mpu->write(mpu, 0x00, MPU401D(mpu)); Try commenting this out and see what happens.
Do I understand correctly that this bug happens when you open a rawmidi device for read, e.g. with the following?
    % cat /dev/snd/midiC0D0 > /dev/null
Takashi
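To make the "non-linear" layout concrete: the lspci output lists four I/O regions whose base addresses are not in ascending order, so the first region starts at 0x3040 rather than 0x3000. The sketch below is illustrative user-space C, not driver code. The region table is copied from the lspci output above, on the assumption that lspci prints the regions in BAR order. The 0x0c offset of the MPU-401 data register within the first region is also an assumption, chosen because it reproduces the 0x304C reported in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* One PCI I/O region as printed by "lspci -v": base port and size in bytes. */
struct io_region {
    unsigned int base;
    unsigned int size;
};

/* The four regions from the lspci output in this thread, in listed order. */
static const struct io_region ice1712_regions[] = {
    { 0x3040, 32 },
    { 0x3070, 16 },
    { 0x3060, 16 },
    { 0x3000, 64 },
};

/* ASSUMPTION: offset of the MPU-401 data register inside the first region. */
#define MPU_DATA_OFFSET 0x0c

/* Return the index of the region that contains `port`, or -1 if none does. */
static int region_of_port(const struct io_region *r, size_t n,
                          unsigned int port)
{
    assert(r != NULL);
    for (size_t i = 0; i < n; i++) {
        if (port >= r[i].base && port - r[i].base < r[i].size)
            return (int)i;
    }
    return -1;
}
```

Under these assumptions, ice1712_regions[0].base + MPU_DATA_OFFSET is 0x304c, and region_of_port() places that port in the first listed region (0x3040, size 32). That is consistent with /proc/asound/cards showing 0x3040 while the failing outb() goes to 0x304C.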
When the hang-up occurs at the first write, it must be in snd_mpu401_uart_cmd(). At the very beginning, it calls mpu->write(mpu, 0x00, MPU401D(mpu)); Try to comment out this and see what happens.
I had tried that - I think that I just commented out the reset command. It would not crash or reboot, but it did not have any functionality either.
Do I understand correctly that this bug happens when you open a rawmidi device for read, e.g. % cat /dev/snd/midiC0D0 > /dev/null
yes. I usually used amidi -p hw:0 -d
Perhaps the easiest but most foolish way to trace this is to put a printk at each io-port access and at any other important points, add some sleep at each point, then watch the kernel messages. You can get rid of the spin_lock_*() calls around that, just for testing.
I've done this until I traced it to the first outb() call, i.e. the initialization mentioned above. The first outb() will cause the reboot.
Florian
At Tue, 24 Apr 2007 16:53:20 +0200, Florian wrote:
When the hang-up occurs at the first write, it must be in snd_mpu401_uart_cmd(). At the very beginning, it calls mpu->write(mpu, 0x00, MPU401D(mpu)); Try to comment out this and see what happens.
I had tried that - I think that I just commented out the reset command.
The reset command contains a series of writes. The write access (zero to 0x304c) is the very first part, and this isn't always necessary. For example, trident doesn't like this sequence. So, just commenting out this write should be fairly harmless to the later behavior.
So commenting out only the first zero write is worth trying (if you haven't done so yet).
It would not crash or reboot, but it did not have any functionality either.
Do I understand correctly that this bug happens when you open a rawmidi device for read, e.g. % cat /dev/snd/midiC0D0 > /dev/null
yes. I usually used amidi -p hw:0 -d
Perhaps the easiest but most foolish way to trace this is to put a printk at each io-port access and at any other important points, add some sleep at each point, then watch the kernel messages. You can get rid of the spin_lock_*() calls around that, just for testing.
I've done this until I traced it to the first outb() call, i.e. the initialization mentioned above. The first outb() will cause the reboot.
And this causes an immediate reboot, not a panic or oops, right? BTW, you should do this kind of debugging on the VGA console, not under X.
Takashi
IT WORKED! I fixed it in this way:
mpu401_uart.c:229
if (mpu->hardware != MPU401_HW_TRID4DWAVE &&
    mpu->hardware != MPU401_HW_ICE1712) {
        mpu->write(mpu, 0x00, MPU401D(mpu));
        /*snd_mpu401_uart_clear_rx(mpu);*/
}
I don't know if this will work on non-AMD machines and if it will work on all ice1712 machines... Next week I can test it with different M-Audio cards on single-processor machines (Pentium and AMD).
Thanks a lot! Florian
Hi Florian,
IT WORKED! I fixed it in this way:
mpu401_uart.c:229
if (mpu->hardware != MPU401_HW_TRID4DWAVE &&
    mpu->hardware != MPU401_HW_ICE1712) {
        mpu->write(mpu, 0x00, MPU401D(mpu));
        /*snd_mpu401_uart_clear_rx(mpu);*/
}
I don't know if this will work on non-AMD machines and if it will work on all ice1712 machines...
I can test it here - which version is your patch against?
Cheers!
Daniel
it's the hg version from last Monday. In any case, just look for the string "MPU401_HW_TRID" - it only appears once in the file, then add this condition:
&& mpu->hardware != MPU401_HW_ICE1712) {
One other thing: this fix does not fix MIDI output. Unfortunately I can't test this now.
Florian
At Wed, 25 Apr 2007 15:07:44 +0200, Florian wrote:
it's the hg version from last Monday. In any case, just look for the string "MPU401_HW_TRID" - it only appears once in the file, then add this condition:
&& mpu->hardware != MPU401_HW_ICE1712) {
One other thing: this fix does not fix MIDI output.
I don't know about the MIDI output problem. Can you describe it?
Takashi
I don't know about the MIDI output problem. Can you describe it?
it's the same symptom: MIDI input works, but opening MIDI output will reboot immediately. I haven't found a similar line to exclude for MIDI output...
Florian
Hi Takashi, hi Florian,
I found a similar MIDI crashing bug on a dual Opteron machine in 2005, even with only one processor installed, which I reported at the time on alsa-devel. The card was an M-Audio Audiophile 24/96, normally reliable.
I couldn't replicate this problem on an Asus single processor Opteron board, so I concluded it was a quirk of my dual socket Tyan S2875 motherboard:
http://article.gmane.org/gmane.linux.alsa.devel/24682/ http://article.gmane.org/gmane.linux.alsa.devel/25323/
Last time I tested it, earlier this year I think, the bug was still there. I even flashed the BIOS of the Tyan board in case it was a BIOS bug, but it made no difference. I installed a 32-bit distro and that made no difference either.
Maybe there is something more generally wrong here, which only affects dual-processor AMD64 hardware. I can make the Tyan machine available over SSH if that helps, it has a fixed IP address.
Cheers!
Daniel
At Tue, 24 Apr 2007 15:23:52 +0100, Daniel James wrote:
Hi Takashi, hi Florian,
I found a similar MIDI crashing bug on a dual Opteron machine in 2005, even with only one processor installed, which I reported at the time on alsa-devel. The card was an M-Audio Audiophile 24/96, normally reliable.
I couldn't replicate this problem on an Asus single processor Opteron board, so I concluded it was a quirk of my dual socket Tyan S2875 motherboard:
http://article.gmane.org/gmane.linux.alsa.devel/24682/ http://article.gmane.org/gmane.linux.alsa.devel/25323/
Last time I tested it, earlier this year I think, the bug was still there. I even flashed the BIOS of the Tyan board in case it was a BIOS bug, but it made no difference. I installed a 32-bit distro and that made no difference either.
Maybe there is something more generally wrong here, which only affects dual-processor AMD64 hardware. I can make the Tyan machine available over SSH if that helps, it has a fixed IP address.
The most important thing is to find out what triggers which result. As far as I understand from Florian's analysis, the io-port access results in a machine reboot, not a kernel panic or so. It's scary because control is then completely out of the kernel's hands.
Perhaps the easiest but most foolish way to trace this is to put a printk at each io-port access and at any other important points, add some sleep at each point, then watch the kernel messages. You can get rid of the spin_lock_*() calls around that, just for testing.
Takashi
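The tracing idea above can be sketched as a wrapper that logs before and after every port write, so that the last line to reach the console identifies the access that took the machine down. This is a user-space illustration only: traced_write(), fake_port_write() and the message format are invented for the example, fprintf() stands in for printk, and the sleep between steps is left out.

```c
#include <assert.h>
#include <stdio.h>

/* Signature of whatever performs the actual port write. */
typedef void (*port_write_fn)(unsigned char data, unsigned int addr);

static int trace_count;          /* number of traced writes so far */
static unsigned int last_addr;   /* last address seen by the fake writer */

/* Fake writer for the sketch: remembers the address, touches no hardware. */
static void fake_port_write(unsigned char data, unsigned int addr)
{
    (void)data;
    last_addr = addr;
}

/* Log, flush, write, log again. If the write itself resets the machine,
 * the "before" line is the last one that makes it to the console. */
static void traced_write(port_write_fn do_write, unsigned char data,
                         unsigned int addr, const char *where)
{
    assert(do_write != NULL && where != NULL);

    trace_count++;
    fprintf(stderr, "[%d] %s: about to write 0x%02x to 0x%04x\n",
            trace_count, where, (unsigned int)data, addr);
    fflush(stderr);

    do_write(data, addr);

    fprintf(stderr, "[%d] %s: write returned\n", trace_count, where);
    fflush(stderr);
}
```

A call such as traced_write(fake_port_write, 0x00, 0x304c, "reset") prints one "about to write" line and one "write returned" line. In the scenario from this thread, only the first of the two would be expected to appear before the reboot.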
Hi Takashi,
The most important thing is to find out what triggers which result. As far as I understand from Florian's analysis, the io-port access results in a machine reboot, not a kernel panic or so.
In my case, I saw a hard lock-up as soon as I typed the name of any MIDI program, even 'aconnect'. There was no panic or log information, just a complete freeze.
Cheers!
Daniel
participants (3): Daniel James, Florian, Takashi Iwai