On Thu, 15 Apr 2021 04:26:18 +0200, Hans Hu(SH-RD) wrote:
On Wed, 14 Apr 2021 14:00:23 +0200, Hans Hu(SH-RD) wrote:
Hi Takashi,
ZHAOXIN HDAC controller has one design limitation when it works in non-snoop mode. This design limitation is: hardware can't guarantee that the write CORB cycle always completes first before the write CORBWP cycle. Here is the error scene: int snd_hdac_bus_send_cmd(struct hdac_bus *bus, unsigned int val) { ... bus->corb.buf[wp] = cpu_to_le32(val); // cycle_1: write value to CORB snd_hdac_chip_writew(bus, CORBWP, wp); // cycle_2: write wp to CORBWP ... } Normally, after cycle_2, DMA engine will start working and get data from CORB and sent it out. But if cycle_1 haven’t finished yet at this time(limitation occurs), which means the value haven’t been written into CORB, then DMA engine will get unexpected value, then error occurred. (feels like one kind of CORB underrun).
If we add one read CORB cycle between cycle_1 and cycle_2, cycle_1 and cycle_2 operations will be synchronized, this limitation will be fixed. We have written a draft patch based on this situation and hope to be accepted, the patch example is as follows: diff --git a/include/sound/hdaudio.h b/include/sound/hdaudio.h index 22af68b..c338db00 100644 --- a/include/sound/hdaudio.h +++ b/include/sound/hdaudio.h @@ -339,6 +339,7 @@ struct hdac_bus { bool sync_write:1; /* sync after verb write */ bool use_posbuf:1; /* use position buffer */ bool snoop:1; /* enable snooping */
bool no_snoop_corb_sync:1; bool align_bdle_4k:1; /* BDLE align 4K boundary */ bool reverse_assign:1; /* assign devices in reverse order */ bool corbrp_self_clear:1; /* CORBRP clears itself after reset */
diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c index 062da7a..6c90cdd 100644 --- a/sound/hda/hdac_controller.c +++ b/sound/hda/hdac_controller.c @@ -167,6 +167,8 @@ int snd_hdac_bus_send_cmd(struct hdac_bus *bus, unsigned int val)
bus->rirb.cmds[addr]++; bus->corb.buf[wp] = cpu_to_le32(val);
if (bus->no_snoop_corb_sync)
val = bus->corb.buf[wp]; snd_hdac_chip_writew(bus, CORBWP, wp);
Just wondering whether using WRITE_ONCE() macro would be enough? e.g. WRITE_ONCE(bus->corb.buf[wp], cpu_to_le32(val)); Also, no_snoop_corb_sync is a bit confusing. What you do is rather sync of the written value, so it can be taken as if other way round. Maybe no_coherent_corb_write or such would be better understandable.
thanks,
Takashi
Thanks for your suggestion, it's my fault not clearly explain the root case about this limitation, which is, these two kind of instructions have independent physical paths, even C2M instruction(cycle_1) already retired before C2P instruction(cycle_2 with non-snoop) start, hardware still can't guarantee the coherence. But we can use WRITE_ONCE() to enhance the patch and to prevent it from being optimized by the compiler.
Well, I'm not entirely sure whether WRITE_ONCE() would really work in this case. I know some other chips require the explicit read (e.g. i915 driver), and this might be the case, too. So it's just a question and something worth to check.
if (bus->no_coherent_corb_write)
WRITE_ONCE(cpu_to_le32(val), bus->corb.buf[wp]);
This limitation only appears in the non-snoop mode, does this need to be reflected in the variable name?
I suggested renaming just because the term "sync" is a bit confusing. We have already sync_write flag that is meant for a completely different "sync" meaning (to sync with the response for each verb write).
thanks,
Takashi