snd_hda_intel initialization failure with Xen PCI passthrough
Jason Andryuk
jandryuk at gmail.com
Thu Mar 24 16:16:38 CET 2022
On Wed, Mar 23, 2022 at 3:05 PM Takashi Iwai <tiwai at suse.de> wrote:
>
> On Wed, 23 Mar 2022 19:52:21 +0100,
> Jason Andryuk wrote:
> >
> > On Wed, Mar 23, 2022 at 5:41 AM Takashi Iwai <tiwai at suse.de> wrote:
> > >
> > > On Tue, 22 Mar 2022 19:57:27 +0100,
> > > Jason Andryuk wrote:
> > > >
> > > > Hi,
> > > >
> > > > I'm running Xen hypervisor and using PCI passthrough to assign an
> > > > Intel HDA audio device (00:1f.3 Audio device: Intel Corporation Cannon
> > > > Point-LP High Definition Audio Controller (rev 30)) to a Xen HVM
> > > > virtual machine. I do this for both Linux 5.4.185 and a different
> > > > Windows 10 VM (only one at a time). The Windows VM seems to work
> > > > every time. The Linux VM has issues after the first VM boot. This is
> > > > one boot of the physical hardware and multiple boots of the virtual
> > > > machines.
> > > >
> > > > For Linux, on first boot, the sound card is detected and works
> > > > properly. After that, things usually don't work. I just ran a reboot
> > > > loop and it was:
> > > > 1st boot - audio detected and working
> > > > 2 & 3 - no audio
> > > > 4th - audio detected and working
> > > > 5 - 20 - no audio
> > > >
> > > > For boots 2, 3, 5-7, dmesg shows:
> > > > [ 0.760401] hdaudio hdaudioC0D0: no AFG or MFG node found
> > > > [ 0.760415] snd_hda_intel 0000:00:06.0: no codecs initialized
> > > >
> > > > For boots 8+, the errors changed to:
> > > > [ 0.783397] hdaudio hdaudioC0D0: cannot read sub nodes for FG 0x10
> > > > [ 0.783413] snd_hda_intel 0000:00:06.0: no codecs initialized
> > > >
> > > > At this point, I booted a Windows 10 VM and audio works
> > > >
> > > > Trying to boot Linux again gives a new error message
> > > > [ 0.789041] snd_hda_intel 0000:00:06.0: Unknown capability 0
> > > > [ 1.811205] snd_hda_intel 0000:00:06.0: No response from codec,
> > > > resetting bus: last cmd=0x0eef0004
> > > > [ 1.811246] hdaudio hdaudioC0D0: cannot read sub nodes for FG 0x10ee
> > > > [ 1.811263] snd_hda_intel 0000:00:06.0: no codecs initialized
> > > >
> > > > Reboot VM and it's back to:
> > > > [ 0.775917] hdaudio hdaudioC0D0: no AFG or MFG node found
> > > > [ 0.775932] snd_hda_intel 0000:00:06.0: no codecs initialized
> > > >
> > > > Reboot VM and again:
> > > > [ 0.789069] hdaudio hdaudioC0D0: cannot read sub nodes for FG 0x10
> > > > [ 0.789084] snd_hda_intel 0000:00:06.0: no codecs initialized
> > > >
> > > > Reboot physical laptop:
> > > > 1. boot Windows 10 - audio works
> > > > 2. boot Linux - audio works
> > > > 3. reboot Linux - no audio
> > > > [ 0.773111] hdaudio hdaudioC0D0: no AFG or MFG node found
> > > > [ 0.773151] snd_hda_intel 0000:00:06.0: no codecs initialized
> > > >
> > > > This seems to me like Windows does a better job resetting the card to
> > > > get the audio hardware working. Any suggestions on what to
> > > > investigate?
> >
> > Thanks for taking a look, Takashi.
> >
> > > First off, 5.4.x is way too old to debug, please confirm the issue
> > > with the latest kernel.
> > >
> > > And, one test I'd try is to unload snd-hda-intel module before
> > > rebooting. Does the problem persist?
> >
> > For my 5.4.186 VM, the module is built-in. I tried `echo 0000:00:03.0
> > > /sys/bus/pci/drivers/snd_hda_intel/unbind` before rebooting, but that
> > did not work.
> >
> > I switched to Fedora 35 in the VM with kernel 5.16.16. That worked
> > the first time and failed the second.
> >
> > First working:
> > [ 3.094907] snd_hda_intel 0000:00:06.0: DSP detected with PCI
> > class/subclass/prog-if info 0x040380
> > [ 3.094912] snd_hda_intel 0000:00:06.0: NHLT table not found
> > [ 3.197480] snd_hda_codec_realtek hdaudioC0D0: autoconfig for
> > ALC3204: line_outs=1 (0x14/0x0/0x0/0x0/0x0) type:speaker
> > [ 3.197484] snd_hda_codec_realtek hdaudioC0D0: speaker_outs=0
> > (0x0/0x0/0x0/0x0/0x0)
> > [ 3.197485] snd_hda_codec_realtek hdaudioC0D0: hp_outs=1
> > (0x21/0x0/0x0/0x0/0x0)
> > [ 3.197486] snd_hda_codec_realtek hdaudioC0D0: mono: mono_out=0x0
> > [ 3.197487] snd_hda_codec_realtek hdaudioC0D0: inputs:
> > [ 3.197488] snd_hda_codec_realtek hdaudioC0D0: Headset Mic=0x19
> > [ 3.197489] snd_hda_codec_realtek hdaudioC0D0: Headphone Mic=0x1a
> > [ 3.197489] snd_hda_codec_realtek hdaudioC0D0: Internal Mic=0x12
> > [ 66.801958] snd_hda_intel 0000:00:06.0: azx_get_response timeout,
> > switching to polling mode: last cmd=0x00170500
> >
> > Second boot audio still failed after doing `echo 0000:00:06.0 >
> > /sys/bus/pci/drivers/snd_hda_intel/unbind` and rmmod-ing lots of snd_*
> > modules. I rmmod-ed the snd_*intel ones, but other snd* modules
> > including snd_hrtimer were in use and could not be removed.
>
> That's weird. If you logout the desktop and go to VT, you can unload
> snd-hda-intel. Then the other modules should be unloadable.
>
> And do you see the problem without VM? That is, the host shows the
> same symptom?

Good idea. Yes, it does work back in Dom0 (the host). By default,
the sound card remains bound to the pciback driver after
de-assignment. If I do `xl pci-assignable-remove -r 0000:00:1f.3`,
the audio driver is re-bound and it works.

The xen-pciback driver has a permissive configuration knob. The
default is 0, and in that mode it limits access to PCI configuration
space to try to allow only known-good fields. When set to 1, there is
no restriction on the config space. This doesn't apply to dom0, which
has no restrictions.
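For reference, a minimal sketch of flipping that knob for a single device from dom0. It assumes the xen-pciback driver exposes a writable `permissive` node at /sys/bus/pci/drivers/pciback/permissive that accepts a PCI address. That path and the helper names `normalize_bdf` and `set_permissive` are illustrative assumptions, not something taken from this thread. It would need root in dom0, with the device already owned by pciback.

```python
from pathlib import Path

# Assumed location of the xen-pciback per-device permissive control in dom0.
PERMISSIVE_NODE = Path("/sys/bus/pci/drivers/pciback/permissive")


def normalize_bdf(bdf: str) -> str:
    """Return a full domain:bus:dev.fn address, adding domain 0000 if missing."""
    parts = bdf.strip().split(":")
    if len(parts) == 2:
        parts.insert(0, "0000")
    if len(parts) != 3 or "." not in parts[2]:
        raise ValueError(f"not a PCI address: {bdf!r}")
    return ":".join(parts)


def set_permissive(bdf: str, node: Path = PERMISSIVE_NODE) -> str:
    """Write the device address to the permissive node and return what was written."""
    full = normalize_bdf(bdf)
    node.write_text(full + "\n")
    return full
```

Calling `set_permissive("00:1f.3")` would write `0000:00:1f.3` to the assumed node. The `node` parameter exists only so the helper can be pointed at a different path, for example when trying it outside dom0.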

With permissive=1, the Linux snd_hda_intel driver seems to work fine.
It's when permissive=0 that we have the "only first boot works" issue.

Here is the message logged when an access is inhibited:
Linux (19): [ 0.718092] pci 0000:00:06.0: reg 0x20: [mem
0xf2000000-0xf20fffff 64bit]
Linux-dm (20): [00:06.0] Write-back to unknown field 0x44 (partially)
inhibited (0x00)
Linux-dm (20): [00:06.0] If the device doesn't work, try enabling
permissive mode
Linux-dm (20): [00:06.0] (unsafe) and if it helps report the problem
to xen-devel
Linux (19): [ 0.762790] snd_hda_intel 0000:00:06.0: no codecs initialized

Windows triggers the same warning, but audio still works there.
Windows10-dm (24): [00:06.0] Write-back to unknown field 0x79
(partially) inhibited (0x00)
Windows10-dm (24): [00:06.0] If the device doesn't work, try enabling
permissive mode
Windows10-dm (24): [00:06.0] (unsafe) and if it helps report the
problem to xen-devel

Those messages are from QEMU, which prints the warning only for the
first occurrence, so other fields could be affected as well.

Sorry I didn't notice those earlier.
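To compare which config-space fields get inhibited for each guest, the QEMU messages could be collected with something like the following sketch. The log format (a `<guest>-dm (<domid>):` prefix) is assumed from the excerpts above, and `inhibited_fields` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches e.g.
# "Linux-dm (20): [00:06.0] Write-back to unknown field 0x44 (partially) inhibited"
PATTERN = re.compile(
    r"(?P<guest>[\w-]+)-dm \(\d+\):\s+\[(?P<dev>[0-9a-fA-F:.]+)\]\s+"
    r"Write-back to unknown field\s+(?P<offset>0x[0-9a-fA-F]+)\s+"
    r"\(partially\)\s+inhibited"
)


def inhibited_fields(log_text: str) -> dict:
    """Map (guest, device) to the sorted config-space offsets reported as inhibited."""
    # Mail clients wrap long log lines, so collapse all whitespace before matching.
    flat = " ".join(log_text.split())
    found = defaultdict(set)
    for m in PATTERN.finditer(flat):
        found[(m.group("guest"), m.group("dev"))].add(int(m.group("offset"), 16))
    return {key: sorted(offsets) for key, offsets in found.items()}


sample = """
Linux-dm (20): [00:06.0] Write-back to unknown field 0x44 (partially)
inhibited (0x00)
Windows10-dm (24): [00:06.0] Write-back to unknown field 0x79
(partially) inhibited (0x00)
"""

for (guest, dev), offsets in inhibited_fields(sample).items():
    print(guest, dev, [hex(o) for o in offsets])
```

Since QEMU reports only the first occurrence, this can list at most the first offset seen per device model instance, not every inhibited field.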

Preferably it would work with permissive=0. While looking at this,
I've noticed some other issues with the Xen PCI passthrough code not
resetting the device. I'm going to look into that more.

Thanks,
Jason