[alsa-devel] [PATCH 0/6] Yet another experiments with LPE audio
Another day, another patchset. Here I put a largish cleanup patch to just shuffle the code, in addition to the new S16 and S32 formats support and the support for a single period with no period wakeup as we've discussed in the previous thread. One patch to remove BATCH flag is same and kept in the middle of this series.
I guess I can merge the first three patches now. As Pierre still seems to have a concern about BATCH, the rest three patches can postpone. It worked stably on my machine, but it'd be appreciated if anyone else can test it on other machines, too (especially with BYT).
As usual, they are found in topic/intel-lpe-audio-cleanup branch.
Takashi
===
Takashi Iwai (6): ALSA: x86: Rearrange defines ALSA: x86: Support S32 format ALSA: x86: Support S16 format ALSA: x86: Don't pass SNDRV_PCM_INFO_BATCH flag ALSA: x86: Allow single period PCM operation ALSA: x86: Allow no-period-wakeup setup
sound/x86/intel_hdmi_audio.c | 53 +++++++++++++++------- sound/x86/intel_hdmi_audio.h | 64 ++++++++++++++++---------- sound/x86/intel_hdmi_lpe_audio.h | 97 +++++++++++++++++----------------------- 3 files changed, 118 insertions(+), 96 deletions(-)
We have two header files and everything is mixed up chaotically. Move the chip-specific definitions like the hardware registers to intel_hdmi_lpe_audio.h, and the rest, the implementation specific stuff into intel_hdmi_audio.h.
In addition, put some more comments to the register fields, and fix the incorrect name prefix for AUD_HDMI_STATUS bits, too.
The whole changes are merely a code shuffling, and there is no functional change.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/x86/intel_hdmi_audio.c | 5 ++- sound/x86/intel_hdmi_audio.h | 64 +++++++++++++++++---------- sound/x86/intel_hdmi_lpe_audio.h | 95 +++++++++++++++++----------------------- 3 files changed, 83 insertions(+), 81 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index dd7944c5ebc2..eb7044d2f314 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -252,7 +252,8 @@ static void had_ack_irqs(struct snd_intelhad *ctx) /* Reset buffer pointers */ static void had_reset_audio(struct snd_intelhad *intelhaddata) { - had_write_register(intelhaddata, AUD_HDMI_STATUS, 1); + had_write_register(intelhaddata, AUD_HDMI_STATUS, + AUD_HDMI_STATUSG_MASK_FUNCRST); had_write_register(intelhaddata, AUD_HDMI_STATUS, 0); }
@@ -989,7 +990,7 @@ static void wait_clear_underrun_bit(struct snd_intelhad *intelhaddata) for (i = 0; i < MAX_CNT; i++) { /* clear bit30, 31 AUD_HDMI_STATUS */ had_read_register(intelhaddata, AUD_HDMI_STATUS, &val); - if (!(val & AUD_CONFIG_MASK_UNDERRUN)) + if (!(val & AUD_HDMI_STATUS_MASK_UNDERRUN)) return; had_write_register(intelhaddata, AUD_HDMI_STATUS, val); } diff --git a/sound/x86/intel_hdmi_audio.h b/sound/x86/intel_hdmi_audio.h index 7e2546b853ca..fe8d99cb839f 100644 --- a/sound/x86/intel_hdmi_audio.h +++ b/sound/x86/intel_hdmi_audio.h @@ -35,32 +35,50 @@ #define PCM_INDEX 0 #define MAX_PB_STREAMS 1 #define MAX_CAP_STREAMS 0 - -#define HDMI_INFO_FRAME_WORD1 0x000a0184 -#define DP_INFO_FRAME_WORD1 0x00441b84 -#define FIFO_THRESHOLD 0xFE -#define DMA_FIFO_THRESHOLD 0x7 #define BYTES_PER_WORD 0x4 +#define INTEL_HAD "HdmiLpeAudio" + +/* + * CEA speaker placement: + * + * FL FLC FC FRC FR + * + * LFE + * + * RL RLC RC RRC RR + * + * The Left/Right Surround channel _notions_ LS/RS in SMPTE 320M + * corresponds to CEA RL/RR; The SMPTE channel _assignment_ C/LFE is + * swapped to CEA LFE/FC. + */ +enum cea_speaker_placement { + FL = (1 << 0), /* Front Left */ + FC = (1 << 1), /* Front Center */ + FR = (1 << 2), /* Front Right */ + FLC = (1 << 3), /* Front Left Center */ + FRC = (1 << 4), /* Front Right Center */ + RL = (1 << 5), /* Rear Left */ + RC = (1 << 6), /* Rear Center */ + RR = (1 << 7), /* Rear Right */ + RLC = (1 << 8), /* Rear Left Center */ + RRC = (1 << 9), /* Rear Right Center */ + LFE = (1 << 10), /* Low Frequency Effect */ +};
-/* Sampling rate as per IEC60958 Ver 3 */ -#define CH_STATUS_MAP_32KHZ 0x3 -#define CH_STATUS_MAP_44KHZ 0x0 -#define CH_STATUS_MAP_48KHZ 0x2 -#define CH_STATUS_MAP_88KHZ 0x8 -#define CH_STATUS_MAP_96KHZ 0xA -#define CH_STATUS_MAP_176KHZ 0xC -#define CH_STATUS_MAP_192KHZ 0xE +struct cea_channel_speaker_allocation { + int ca_index; + int speakers[8];
-#define MAX_SMPL_WIDTH_20 0x0 -#define MAX_SMPL_WIDTH_24 0x1 -#define SMPL_WIDTH_16BITS 0x1 -#define SMPL_WIDTH_24BITS 0x5 -#define CHANNEL_ALLOCATION 0x1F -#define VALID_DIP_WORDS 3 -#define LAYOUT0 0 -#define LAYOUT1 1 -#define SWAP_LFE_CENTER 0x00fac4c8 -#define AUD_CONFIG_CH_MASK 0x70 + /* derived values, just for convenience */ + int channels; + int spk_mask; +}; + +struct channel_map_table { + unsigned char map; /* ALSA API channel map position */ + unsigned char cea_slot; /* CEA slot value */ + int spk_mask; /* speaker position bit mask */ +};
struct pcm_stream_info { struct snd_pcm_substream *substream; diff --git a/sound/x86/intel_hdmi_lpe_audio.h b/sound/x86/intel_hdmi_lpe_audio.h index 48cab1b84c7b..a13a66771f73 100644 --- a/sound/x86/intel_hdmi_lpe_audio.h +++ b/sound/x86/intel_hdmi_lpe_audio.h @@ -23,7 +23,6 @@ #ifndef __INTEL_HDMI_LPE_AUDIO_H #define __INTEL_HDMI_LPE_AUDIO_H
-#define HAD_MAX_DEVICES 1 #define HAD_MIN_CHANNEL 2 #define HAD_MAX_CHANNEL 8 #define HAD_NUM_OF_RING_BUFS 4 @@ -55,9 +54,7 @@ #define DIS_SAMPLE_RATE_74_25 74250 #define DIS_SAMPLE_RATE_148_5 148500 #define HAD_REG_WIDTH 0x08 -#define HAD_MAX_HW_BUFS 0x04 #define HAD_MAX_DIP_WORDS 16 -#define INTEL_HAD "HdmiLpeAudio"
/* DP Link Rates */ #define DP_2_7_GHZ 270000 @@ -112,72 +109,34 @@ enum hdmi_ctrl_reg_offset { AUD_HDMIW_INFOFR = 0x68, /* v2 */ };
-/* - * CEA speaker placement: - * - * FL FLC FC FRC FR - * - * LFE - * - * RL RLC RC RRC RR - * - * The Left/Right Surround channel _notions_ LS/RS in SMPTE 320M - * corresponds to CEA RL/RR; The SMPTE channel _assignment_ C/LFE is - * swapped to CEA LFE/FC. - */ -enum cea_speaker_placement { - FL = (1 << 0), /* Front Left */ - FC = (1 << 1), /* Front Center */ - FR = (1 << 2), /* Front Right */ - FLC = (1 << 3), /* Front Left Center */ - FRC = (1 << 4), /* Front Right Center */ - RL = (1 << 5), /* Rear Left */ - RC = (1 << 6), /* Rear Center */ - RR = (1 << 7), /* Rear Right */ - RLC = (1 << 8), /* Rear Left Center */ - RRC = (1 << 9), /* Rear Right Center */ - LFE = (1 << 10), /* Low Frequency Effect */ -}; - -struct cea_channel_speaker_allocation { - int ca_index; - int speakers[8]; - - /* derived values, just for convenience */ - int channels; - int spk_mask; -}; - -struct channel_map_table { - unsigned char map; /* ALSA API channel map position */ - unsigned char cea_slot; /* CEA slot value */ - int spk_mask; /* speaker position bit mask */ -}; - /* Audio configuration */ union aud_cfg { struct { u32 aud_en:1; - u32 layout:1; + u32 layout:1; /* LAYOUT[01], see below */ u32 fmt:2; u32 num_ch:3; u32 set:1; u32 flat:1; u32 val_bit:1; u32 user_bit:1; - u32 underrun:1; - u32 packet_mode:1; - u32 left_align:1; - u32 bogus_sample:1; - u32 dp_modei:1; + u32 underrun:1; /* 0: send null packets, + * 1: send silence stream + */ + u32 packet_mode:1; /* 0: 32bit container, 1: 16bit */ + u32 left_align:1; /* 0: MSB bits 0-23, 1: bits 8-31 */ + u32 bogus_sample:1; /* bogus sample for odd channels */ + u32 dp_modei:1; /* 0: HDMI, 1: DP */ u32 rsvd:16; } regx; u32 regval; };
-#define AUD_CONFIG_BLOCK_BIT (1 << 7) #define AUD_CONFIG_VALID_BIT (1 << 9) #define AUD_CONFIG_DP_MODE (1 << 15) +#define AUD_CONFIG_CH_MASK 0x70 +#define LAYOUT0 0 /* interleaved stereo */ +#define LAYOUT1 1 /* for channels >= 2 */
/* Audio Channel Status 0 Attributes */ union aud_ch_status_0 { @@ -190,13 +149,22 @@ union aud_ch_status_0 { u32 ctg_code:8; u32 src_num:4; u32 ch_num:4; - u32 samp_freq:4; + u32 samp_freq:4; /* CH_STATUS_MAP_XXX */ u32 clk_acc:2; u32 rsvd:2; } regx; u32 regval; };
+/* samp_freq values - Sampling rate as per IEC60958 Ver 3 */ +#define CH_STATUS_MAP_32KHZ 0x3 +#define CH_STATUS_MAP_44KHZ 0x0 +#define CH_STATUS_MAP_48KHZ 0x2 +#define CH_STATUS_MAP_88KHZ 0x8 +#define CH_STATUS_MAP_96KHZ 0xA +#define CH_STATUS_MAP_176KHZ 0xC +#define CH_STATUS_MAP_192KHZ 0xE + /* Audio Channel Status 1 Attributes */ union aud_ch_status_1 { struct { @@ -207,6 +175,11 @@ union aud_ch_status_1 { u32 regval; };
+#define MAX_SMPL_WIDTH_20 0x0 +#define MAX_SMPL_WIDTH_24 0x1 +#define SMPL_WIDTH_16BITS 0x1 +#define SMPL_WIDTH_24BITS 0x5 + /* CTS register */ union aud_hdmi_cts { struct { @@ -239,6 +212,9 @@ union aud_buf_config { u32 regval; };
+#define FIFO_THRESHOLD 0xFE +#define DMA_FIFO_THRESHOLD 0x7 + /* Audio Sample Swapping offset */ union aud_buf_ch_swap { struct { @@ -255,6 +231,8 @@ union aud_buf_ch_swap { u32 regval; };
+#define SWAP_LFE_CENTER 0x00fac4c8 /* octal 76543210 */ + /* Address for Audio Buffer */ union aud_buf_addr { struct { @@ -306,6 +284,9 @@ union aud_info_frame1 { u32 regval; };
+#define HDMI_INFO_FRAME_WORD1 0x000a0184 +#define DP_INFO_FRAME_WORD1 0x00441b84 + /* DIP frame 2 */ union aud_info_frame2 { struct { @@ -333,13 +314,15 @@ union aud_info_frame3 { u32 regval; };
+#define VALID_DIP_WORDS 3 + /* AUD_HDMI_STATUS bits */ #define HDMI_AUDIO_UNDERRUN (1U << 31) #define HDMI_AUDIO_BUFFER_DONE (1U << 29)
/* AUD_HDMI_STATUS register mask */ -#define AUD_CONFIG_MASK_UNDERRUN 0xC0000000 -#define AUD_CONFIG_MASK_SRDBG 0x00000002 -#define AUD_CONFIG_MASK_FUNCRST 0x00000001 +#define AUD_HDMI_STATUS_MASK_UNDERRUN 0xC0000000 +#define AUD_HDMI_STATUS_MASK_SRDBG 0x00000002 +#define AUD_HDMI_STATUSG_MASK_FUNCRST 0x00000001
#endif
The hardware has the support for the left-aligned 24bit format in 32bit packet. This corresponds to S32 format in ALSA. We need to set the msbits restriction as well to inform user-space that only MSB 24bit are available.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/x86/intel_hdmi_audio.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index eb7044d2f314..a0401476cd93 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -136,7 +136,8 @@ static const struct snd_pcm_hardware had_pcm_hardware = { SNDRV_PCM_INFO_MMAP| SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_BATCH), - .formats = SNDRV_PCM_FMTBIT_S24, + .formats = (SNDRV_PCM_FMTBIT_S24_LE | + SNDRV_PCM_FMTBIT_S32_LE), .rates = SNDRV_PCM_RATE_32000 | SNDRV_PCM_RATE_44100 | SNDRV_PCM_RATE_48000 | @@ -267,7 +268,6 @@ static int had_prog_status_reg(struct snd_pcm_substream *substream, union aud_cfg cfg_val = {.regval = 0}; union aud_ch_status_0 ch_stat0 = {.regval = 0}; union aud_ch_status_1 ch_stat1 = {.regval = 0}; - int format;
ch_stat0.regx.lpcm_id = (intelhaddata->aes_bits & IEC958_AES0_NONAUDIO) >> 1; @@ -307,17 +307,20 @@ static int had_prog_status_reg(struct snd_pcm_substream *substream, had_write_register(intelhaddata, AUD_CH_STATUS_0, ch_stat0.regval);
- format = substream->runtime->format; - - if (format == SNDRV_PCM_FORMAT_S16_LE) { + switch (substream->runtime->format) { +#if 0 /* FIXME: not supported yet */ + case SNDRV_PCM_FORMAT_S16_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_20; ch_stat1.regx.wrd_len = SMPL_WIDTH_16BITS; - } else if (format == SNDRV_PCM_FORMAT_S24_LE) { + break; +#endif + case SNDRV_PCM_FORMAT_S24_LE: + case SNDRV_PCM_FORMAT_S32_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_24; ch_stat1.regx.wrd_len = SMPL_WIDTH_24BITS; - } else { - ch_stat1.regx.max_wrd_len = 0; - ch_stat1.regx.wrd_len = 0; + break; + default: + return -EINVAL; }
had_write_register(intelhaddata, @@ -351,6 +354,9 @@ static int had_init_audio_ctrl(struct snd_pcm_substream *substream, else cfg_val.regx.layout = LAYOUT1;
+ if (substream->runtime->format == SNDRV_PCM_FORMAT_S32_LE) + cfg_val.regx.left_align = 1; + cfg_val.regx.val_bit = 1;
/* fix up the DP bits */ @@ -1057,6 +1063,10 @@ static int had_pcm_open(struct snd_pcm_substream *substream) if (retval < 0) goto error;
+ retval = snd_pcm_hw_constraint_msbits(runtime, 0, 32, 24); + if (retval < 0) + goto error; + /* expose PCM substream */ spin_lock_irq(&intelhaddata->had_spinlock); intelhaddata->stream_info.substream = substream;
Now we support S16 PCM format in addition. For this, we need to set packet_mode=1 in AUD_CONFIG register.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/x86/intel_hdmi_audio.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index a0401476cd93..5697552f489a 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -136,7 +136,8 @@ static const struct snd_pcm_hardware had_pcm_hardware = { SNDRV_PCM_INFO_MMAP| SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_BATCH), - .formats = (SNDRV_PCM_FMTBIT_S24_LE | + .formats = (SNDRV_PCM_FMTBIT_S16_LE | + SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S32_LE), .rates = SNDRV_PCM_RATE_32000 | SNDRV_PCM_RATE_44100 | @@ -308,12 +309,10 @@ static int had_prog_status_reg(struct snd_pcm_substream *substream, AUD_CH_STATUS_0, ch_stat0.regval);
switch (substream->runtime->format) { -#if 0 /* FIXME: not supported yet */ case SNDRV_PCM_FORMAT_S16_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_20; ch_stat1.regx.wrd_len = SMPL_WIDTH_16BITS; break; -#endif case SNDRV_PCM_FORMAT_S24_LE: case SNDRV_PCM_FORMAT_S32_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_24; @@ -354,6 +353,9 @@ static int had_init_audio_ctrl(struct snd_pcm_substream *substream, else cfg_val.regx.layout = LAYOUT1;
+ if (substream->runtime->format == SNDRV_PCM_FORMAT_S16_LE) + cfg_val.regx.packet_mode = 1; + if (substream->runtime->format == SNDRV_PCM_FORMAT_S32_LE) cfg_val.regx.left_align = 1;
On 02/07/2017 07:11 AM, Takashi Iwai wrote:
Now we support S16 PCM format in addition. For this, we need to set packet_mode=1 in AUD_CONFIG register.
While I think of it, there is a hidden benefit here. If this mode works, then most likely we can push IEC61937 data formatted as S16 PCM directly. For passthrough we needed to left-shit the data, if this mode does what it says it should make some people happy. The U,C,V should not be added in the data stream however (as done by the hdmi: plugin), there is a separate register to program then.
Signed-off-by: Takashi Iwai tiwai@suse.de
sound/x86/intel_hdmi_audio.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index a0401476cd93..5697552f489a 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -136,7 +136,8 @@ static const struct snd_pcm_hardware had_pcm_hardware = { SNDRV_PCM_INFO_MMAP| SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_BATCH),
- .formats = (SNDRV_PCM_FMTBIT_S24_LE |
- .formats = (SNDRV_PCM_FMTBIT_S16_LE |
SNDRV_PCM_FMTBIT_S32_LE), .rates = SNDRV_PCM_RATE_32000 | SNDRV_PCM_RATE_44100 |SNDRV_PCM_FMTBIT_S24_LE |
@@ -308,12 +309,10 @@ static int had_prog_status_reg(struct snd_pcm_substream *substream, AUD_CH_STATUS_0, ch_stat0.regval);
switch (substream->runtime->format) { -#if 0 /* FIXME: not supported yet */ case SNDRV_PCM_FORMAT_S16_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_20; ch_stat1.regx.wrd_len = SMPL_WIDTH_16BITS; break; -#endif case SNDRV_PCM_FORMAT_S24_LE: case SNDRV_PCM_FORMAT_S32_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_24; @@ -354,6 +353,9 @@ static int had_init_audio_ctrl(struct snd_pcm_substream *substream, else cfg_val.regx.layout = LAYOUT1;
- if (substream->runtime->format == SNDRV_PCM_FORMAT_S16_LE)
cfg_val.regx.packet_mode = 1;
- if (substream->runtime->format == SNDRV_PCM_FORMAT_S32_LE) cfg_val.regx.left_align = 1;
On Tue, 07 Feb 2017 17:48:59 +0100, Pierre-Louis Bossart wrote:
On 02/07/2017 07:11 AM, Takashi Iwai wrote:
Now we support S16 PCM format in addition. For this, we need to set packet_mode=1 in AUD_CONFIG register.
While I think of it, there is a hidden benefit here. If this mode works, then most likely we can push IEC61937 data formatted as S16 PCM directly. For passthrough we needed to left-shit the data, if this mode does what it says it should make some people happy. The U,C,V should not be added in the data stream however (as done by the hdmi: plugin), there is a separate register to program then.
Yes, embedding the raw IEC data stream is an interesting move, indeed. But right now I'm not sure which configuration allows that operation mode.
In my patch, I just played with AUD_CONFIG packet_mode bit. Maybe other fields (fmt, flat, user_bit) may play some relevant role?
Also, Channel Status 0 register have lots of uninitialized fields. Some of ch0 bits are carried from AES status bits, but many are left untouched.
thanks,
Takashi
Signed-off-by: Takashi Iwai tiwai@suse.de
sound/x86/intel_hdmi_audio.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index a0401476cd93..5697552f489a 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -136,7 +136,8 @@ static const struct snd_pcm_hardware had_pcm_hardware = { SNDRV_PCM_INFO_MMAP| SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_BATCH),
- .formats = (SNDRV_PCM_FMTBIT_S24_LE |
- .formats = (SNDRV_PCM_FMTBIT_S16_LE |
SNDRV_PCM_FMTBIT_S32_LE), .rates = SNDRV_PCM_RATE_32000 | SNDRV_PCM_RATE_44100 |SNDRV_PCM_FMTBIT_S24_LE |
@@ -308,12 +309,10 @@ static int had_prog_status_reg(struct snd_pcm_substream *substream, AUD_CH_STATUS_0, ch_stat0.regval); switch (substream->runtime->format) { -#if 0 /* FIXME: not supported yet */ case SNDRV_PCM_FORMAT_S16_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_20; ch_stat1.regx.wrd_len = SMPL_WIDTH_16BITS; break; -#endif case SNDRV_PCM_FORMAT_S24_LE: case SNDRV_PCM_FORMAT_S32_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_24; @@ -354,6 +353,9 @@ static int had_init_audio_ctrl(struct snd_pcm_substream *substream, else cfg_val.regx.layout = LAYOUT1;
- if (substream->runtime->format == SNDRV_PCM_FORMAT_S16_LE)
cfg_val.regx.packet_mode = 1;
- if (substream->runtime->format == SNDRV_PCM_FORMAT_S32_LE) cfg_val.regx.left_align = 1;
On 02/07/2017 11:09 AM, Takashi Iwai wrote:
On Tue, 07 Feb 2017 17:48:59 +0100, Pierre-Louis Bossart wrote:
On 02/07/2017 07:11 AM, Takashi Iwai wrote:
Now we support S16 PCM format in addition. For this, we need to set packet_mode=1 in AUD_CONFIG register.
While I think of it, there is a hidden benefit here. If this mode works, then most likely we can push IEC61937 data formatted as S16 PCM directly. For passthrough we needed to left-shit the data, if this mode does what it says it should make some people happy. The U,C,V should not be added in the data stream however (as done by the hdmi: plugin), there is a separate register to program then.
Yes, embedding the raw IEC data stream is an interesting move, indeed. But right now I'm not sure which configuration allows that operation mode.
In my patch, I just played with AUD_CONFIG packet_mode bit. Maybe other fields (fmt, flat, user_bit) may play some relevant role?
Also, Channel Status 0 register have lots of uninitialized fields. Some of ch0 bits are carried from AES status bits, but many are left untouched.
I think there are two registers for a total for 40bits that can store what will go in the channel status bits. The remaining bits to 192 are not supported. That's an optimization that existed even before I joined, no idea why they tried to save 152 bits...
In most cases you don't even need to configure stuff, the receivers don't trust all these bits and will determine PCM/IEC61937 encoding based on the actual payload, not the control bits (with the side effect that switching between PCM/IC61937 takes time)
thanks,
Takashi
Signed-off-by: Takashi Iwai tiwai@suse.de
sound/x86/intel_hdmi_audio.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index a0401476cd93..5697552f489a 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -136,7 +136,8 @@ static const struct snd_pcm_hardware had_pcm_hardware = { SNDRV_PCM_INFO_MMAP| SNDRV_PCM_INFO_MMAP_VALID | SNDRV_PCM_INFO_BATCH),
- .formats = (SNDRV_PCM_FMTBIT_S24_LE |
- .formats = (SNDRV_PCM_FMTBIT_S16_LE |
.rates = SNDRV_PCM_RATE_32000 | SNDRV_PCM_RATE_44100 |SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S32_LE),
@@ -308,12 +309,10 @@ static int had_prog_status_reg(struct snd_pcm_substream *substream, AUD_CH_STATUS_0, ch_stat0.regval); switch (substream->runtime->format) { -#if 0 /* FIXME: not supported yet */ case SNDRV_PCM_FORMAT_S16_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_20; ch_stat1.regx.wrd_len = SMPL_WIDTH_16BITS; break; -#endif case SNDRV_PCM_FORMAT_S24_LE: case SNDRV_PCM_FORMAT_S32_LE: ch_stat1.regx.max_wrd_len = MAX_SMPL_WIDTH_24; @@ -354,6 +353,9 @@ static int had_init_audio_ctrl(struct snd_pcm_substream *substream, else cfg_val.regx.layout = LAYOUT1;
- if (substream->runtime->format == SNDRV_PCM_FORMAT_S16_LE)
cfg_val.regx.packet_mode = 1;
- if (substream->runtime->format == SNDRV_PCM_FORMAT_S32_LE) cfg_val.regx.left_align = 1;
The PCM engine on LPE audio isn't like a batch-style process any longer, but rather it deals with the standard ring buffer. Remove the BATCH info flag so that PA can handle the buffer in timer-sched mode.
Similarly, the DOUBLE flag is also superfluous. Drop both bits.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/x86/intel_hdmi_audio.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index 5697552f489a..8f87ac3057c6 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -132,10 +132,8 @@ static const struct channel_map_table map_tables[] = { /* hardware capability structure */ static const struct snd_pcm_hardware had_pcm_hardware = { .info = (SNDRV_PCM_INFO_INTERLEAVED | - SNDRV_PCM_INFO_DOUBLE | - SNDRV_PCM_INFO_MMAP| - SNDRV_PCM_INFO_MMAP_VALID | - SNDRV_PCM_INFO_BATCH), + SNDRV_PCM_INFO_MMAP | + SNDRV_PCM_INFO_MMAP_VALID), .formats = (SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S32_LE),
This is an implementation of PCM streaming with only 1 period. Since the hardware requires the refresh of BDs after each BD processing finishes, we'd need at least two BDs. The trick is that both BDs point to the same content: the address of the PCM buffer head, and the whole buffer size. Then it loops over to the whole buffer again after it finished once.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/x86/intel_hdmi_audio.c | 9 ++++++++- sound/x86/intel_hdmi_lpe_audio.h | 2 +- 2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index 8f87ac3057c6..979b396272ae 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -846,6 +846,11 @@ static int had_prog_n(u32 aud_samp_freq, u32 *n_param, * * For nperiods < 4, the remaining BDs out of 4 are marked as invalid, so that * the hardware skips those BDs in the loop. + * + * An exceptional setup is the case with nperiods=1. Since we have to update + * BDs after finishing one BD processing, we'd need at least two BDs, where + * both BDs point to the same content, the same address, the same size of the + * whole PCM buffer. */
#define AUD_BUF_ADDR(x) (AUD_BUF_A_ADDR + (x) * HAD_REG_WIDTH) @@ -888,6 +893,8 @@ static void had_init_ringbuf(struct snd_pcm_substream *substream,
num_periods = runtime->periods; intelhaddata->num_bds = min(num_periods, HAD_NUM_OF_RING_BUFS); + /* set the minimum 2 BDs for num_periods=1 */ + intelhaddata->num_bds = max(intelhaddata->num_bds, 2U); intelhaddata->period_bytes = frames_to_bytes(runtime, runtime->period_size); WARN_ON(intelhaddata->period_bytes & 0x3f); @@ -897,7 +904,7 @@ static void had_init_ringbuf(struct snd_pcm_substream *substream, intelhaddata->pcmbuf_filled = 0;
for (i = 0; i < HAD_NUM_OF_RING_BUFS; i++) { - if (i < num_periods) + if (i < intelhaddata->num_bds) had_prog_bd(substream, intelhaddata); else /* invalidate the rest */ had_invalidate_bd(intelhaddata, i); diff --git a/sound/x86/intel_hdmi_lpe_audio.h b/sound/x86/intel_hdmi_lpe_audio.h index a13a66771f73..195918f1a4ae 100644 --- a/sound/x86/intel_hdmi_lpe_audio.h +++ b/sound/x86/intel_hdmi_lpe_audio.h @@ -31,7 +31,7 @@ #define HAD_MAX_BUFFER ((1024 * 1024 - 1) & ~0x3f) #define HAD_DEFAULT_BUFFER (600 * 1024) /* default prealloc size */ #define HAD_MAX_PERIODS 256 /* arbitrary, but should suffice */ -#define HAD_MIN_PERIODS 2 +#define HAD_MIN_PERIODS 1 #define HAD_MAX_PERIOD_BYTES ((HAD_MAX_BUFFER / HAD_MIN_PERIODS) & ~0x3f) #define HAD_MIN_PERIOD_BYTES 1024 /* might be smaller */ #define HAD_FIFO_SIZE 0 /* fifo not being used */
In the current implementation, the driver may update the BDs even at PCM pointer callback. This allows us to skip the period interrupt effectively.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/x86/intel_hdmi_audio.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index 979b396272ae..9dacac2b833c 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -133,7 +133,8 @@ static const struct channel_map_table map_tables[] = { static const struct snd_pcm_hardware had_pcm_hardware = { .info = (SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_MMAP | - SNDRV_PCM_INFO_MMAP_VALID), + SNDRV_PCM_INFO_MMAP_VALID | + SNDRV_PCM_INFO_NO_PERIOD_WAKEUP), .formats = (SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S32_LE), @@ -864,7 +865,9 @@ static void had_prog_bd(struct snd_pcm_substream *substream, int ofs = intelhaddata->pcmbuf_filled * intelhaddata->period_bytes; u32 addr = substream->runtime->dma_addr + ofs;
- addr |= AUD_BUF_VALID | AUD_BUF_INTR_EN; + addr |= AUD_BUF_VALID; + if (!substream->runtime->no_period_wakeup) + addr |= AUD_BUF_INTR_EN; had_write_register(intelhaddata, AUD_BUF_ADDR(idx), addr); had_write_register(intelhaddata, AUD_BUF_LEN(idx), intelhaddata->period_bytes);
On Tue, 07 Feb 2017 14:11:16 +0100, Takashi Iwai wrote:
Another day, another patchset. Here I put a largish cleanup patch to just shuffle the code, in addition to the new S16 and S32 formats support and the support for a single period with no period wakeup as we've discussed in the previous thread. One patch to remove BATCH flag is same and kept in the middle of this series.
I guess I can merge the first three patches now.
It turned out that S16 and S32 causes xrun with dmix by some reason. Still investigating, but we can postpone these, too.
Takashi
On Tue, 07 Feb 2017 16:44:12 +0100, Takashi Iwai wrote:
On Tue, 07 Feb 2017 14:11:16 +0100, Takashi Iwai wrote:
Another day, another patchset. Here I put a largish cleanup patch to just shuffle the code, in addition to the new S16 and S32 formats support and the support for a single period with no period wakeup as we've discussed in the previous thread. One patch to remove BATCH flag is same and kept in the middle of this series.
I guess I can merge the first three patches now.
It turned out that S16 and S32 causes xrun with dmix by some reason. Still investigating, but we can postpone these, too.
And this gave some interesting result. At first I thought this being a bug in the kernel driver. Then I found out that this is no kernel driver bug but some performance problem of dmix code, appearing only when accessing uncached pages on Cherrytrail. (The same issue may happen likely on other Atom platforms, but need to check.)
By using a generic dmix code with semaphore, this performance problem is resolved. So, S16/S32 supports are OK in the end.
But this leads to another question wrt the kernel driver code: why the driver allocates / maps with uncached page, not with write- combined? Pierre, Jerome, any clue?
The WC is a better choice than UC in general (of course only if possible), and this seems working fine as long as I've tested.
Below is the patch to convert to WC.
BTW, Ian Morrison (Cc'ed) has tested my recent changes intensively, including the BATCH flag removal. His tests were positive, so far. So I'm likely going to merge the stuff into for-next branch in this week until any regression is found.
One odd thing Ian reported is, however, the occasional GPU error (e.g. "Atomic update failure on pipe A"). IIRC, I've seen such GPU issues in the past while debugging, too. This seemed happening with the old branch, and it doesn't look directly related with my changes. My guess is that a fast repeated sequence of audio operations may block the GPU. That is, we might need to add some delay to avoid the too quick operations. I'll check more about this in tomorrow.
thanks,
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: x86: Use write-combine page attr instead of uncached
WC page attr seems working fine, and it's much better from performance POV.
Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/x86/intel_hdmi_audio.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/x86/intel_hdmi_audio.c b/sound/x86/intel_hdmi_audio.c index ac07c67a41cf..643a302ae883 100644 --- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -1126,9 +1126,9 @@ static int had_pcm_hw_params(struct snd_pcm_substream *substream, /* mark the pages as uncached region */ addr = (unsigned long) substream->runtime->dma_area; pages = (substream->runtime->dma_bytes + PAGE_SIZE - 1) / PAGE_SIZE; - retval = set_memory_uc(addr, pages); + retval = set_memory_wc(addr, pages); if (retval) { - dev_err(intelhaddata->dev, "set_memory_uc failed.Error:%d\n", + dev_err(intelhaddata->dev, "set_memory_wc failed.Error:%d\n", retval); return retval; } @@ -1296,7 +1296,7 @@ static snd_pcm_uframes_t had_pcm_pointer(struct snd_pcm_substream *substream) static int had_pcm_mmap(struct snd_pcm_substream *substream, struct vm_area_struct *vma) { - vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); return remap_pfn_range(vma, vma->vm_start, substream->dma_buffer.addr >> PAGE_SHIFT, vma->vm_end - vma->vm_start, vma->vm_page_prot);
-----Original Message----- From: alsa-devel-bounces@alsa-project.org [mailto:alsa-devel-bounces@alsa- project.org] On Behalf Of Takashi Iwai Sent: Friday, February 10, 2017 12:59 AM To: alsa-devel@alsa-project.org Cc: Ian W MORRISON ianwmorrison@gmail.com; Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; Anand, Jerome jerome.anand@intel.com Subject: Re: [alsa-devel] [PATCH 0/6] Yet another experiments with LPE audio
On Tue, 07 Feb 2017 16:44:12 +0100, Takashi Iwai wrote:
On Tue, 07 Feb 2017 14:11:16 +0100, Takashi Iwai wrote:
Another day, another patchset. Here I put a largish cleanup patch to just shuffle the code, in addition to the new S16 and S32 formats support and the support for a single period with no period wakeup as we've discussed in the previous thread. One patch to remove BATCH flag is same and kept in the middle of this series.
I guess I can merge the first three patches now.
It turned out that S16 and S32 causes xrun with dmix by some reason. Still investigating, but we can postpone these, too.
And this gave some interesting result. At first I thought this being a bug in the kernel driver. Then I found out that this is no kernel driver bug but some performance problem of dmix code, appearing only when accessing uncached pages on Cherrytrail. (The same issue may happen likely on other Atom platforms, but need to check.)
By using a generic dmix code with semaphore, this performance problem is resolved. So, S16/S32 supports are OK in the end.
But this leads to another question wrt the kernel driver code: why the driver allocates / maps with uncached page, not with write- combined? Pierre, Jerome, any clue?
In CHT and BYT the organization of the hardware fabric is such that the HDMI DMA transactions are not snooped and so it will fetch data only from DDR. In most non-atom platforms it is snooped, and so fabric will return data from cache if it is updated.
In the past we faced problem where the DMA was fetching some old data because cache was not flushed into DDR. That's where we marked the pages as uncached.
On Fri, 10 Feb 2017 04:01:21 +0100, Ughreja, Rakesh A wrote:
-----Original Message----- From: alsa-devel-bounces@alsa-project.org [mailto:alsa-devel-bounces@alsa- project.org] On Behalf Of Takashi Iwai Sent: Friday, February 10, 2017 12:59 AM To: alsa-devel@alsa-project.org Cc: Ian W MORRISON ianwmorrison@gmail.com; Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com; Anand, Jerome jerome.anand@intel.com Subject: Re: [alsa-devel] [PATCH 0/6] Yet another experiments with LPE audio
On Tue, 07 Feb 2017 16:44:12 +0100, Takashi Iwai wrote:
On Tue, 07 Feb 2017 14:11:16 +0100, Takashi Iwai wrote:
Another day, another patchset. Here I put a largish cleanup patch to just shuffle the code, in addition to the new S16 and S32 formats support and the support for a single period with no period wakeup as we've discussed in the previous thread. One patch to remove BATCH flag is same and kept in the middle of this series.
I guess I can merge the first three patches now.
It turned out that S16 and S32 causes xrun with dmix by some reason. Still investigating, but we can postpone these, too.
And this gave some interesting result. At first I thought this being a bug in the kernel driver. Then I found out that this is no kernel driver bug but some performance problem of dmix code, appearing only when accessing uncached pages on Cherrytrail. (The same issue may happen likely on other Atom platforms, but need to check.)
By using a generic dmix code with semaphore, this performance problem is resolved. So, S16/S32 supports are OK in the end.
But this leads to another question wrt the kernel driver code: why the driver allocates / maps with uncached page, not with write- combined? Pierre, Jerome, any clue?
In CHT and BYT the organization of the hardware fabric is such that the HDMI DMA transactions are not snooped and so it will fetch data only from DDR. In most non-atom platforms it is snooped, and so fabric will return data from cache if it is updated.
In the past we faced problem where the DMA was fetching some old data because cache was not flushed into DDR. That's where we marked the pages as uncached.
Thanks, that's my expectation. The similar hacks are applied to some audio platforms like AMD HDMI HD-audio. But my question is about the write-combined (WC) pages. There are four modes about page caching: write-back (WB), writhe-through (WT), write-combined (WC) and uncached (UC).
Usually WC is enough to work as uncached like the use case above, not necessary to disable the whole cache via UC. WC worked fine for HD-audio, at least. For LPE audio, is UC really stated as mandatory requirement?
thanks,
Takashi
By using a generic dmix code with semaphore, this performance problem is resolved. So, S16/S32 supports are OK in the end.
But this leads to another question wrt the kernel driver code: why the driver allocates / maps with uncached page, not with write- combined? Pierre, Jerome, any clue?
In CHT and BYT the organization of the hardware fabric is such that the HDMI DMA transactions are not snooped and so it will fetch data only from DDR. In most non-atom platforms it is snooped, and so fabric will return data from cache if it is updated.
In the past we faced problem where the DMA was fetching some old data because cache was not flushed into DDR. That's where we marked the pages as uncached.
Thanks, that's my expectation. The similar hacks are applied to some audio platforms like AMD HDMI HD-audio. But my question is about the write-combined (WC) pages. There are four modes about page caching: write-back (WB), writhe-through (WT), write-combined (WC) and uncached (UC).
Usually WC is enough to work as uncached like the use case above, not necessary to disable the whole cache via UC. WC worked fine for HD-audio, at least. For LPE audio, is UC really stated as mandatory requirement?
No, as such there is no guidelines from HW guys. Technically WC would work as well.
Question I have is, when we use the smaller period size with say 2 periods with WC, how do we ensure that the last write has been propagated to the main memory. Last write size may be smaller than the WC cache size.
thanks,
Takashi
On Fri, 10 Feb 2017 09:06:14 +0100, Ughreja, Rakesh A wrote:
By using a generic dmix code with semaphore, this performance problem is resolved. So, S16/S32 supports are OK in the end.
But this leads to another question wrt the kernel driver code: why the driver allocates / maps with uncached page, not with write- combined? Pierre, Jerome, any clue?
In CHT and BYT the organization of the hardware fabric is such that the HDMI DMA transactions are not snooped and so it will fetch data only from DDR. In most non-atom platforms it is snooped, and so fabric will return data from cache if it is updated.
In the past we faced problem where the DMA was fetching some old data because cache was not flushed into DDR. That's where we marked the pages as uncached.
Thanks, that's my expectation. The similar hacks are applied to some audio platforms like AMD HDMI HD-audio. But my question is about the write-combined (WC) pages. There are four modes about page caching: write-back (WB), writhe-through (WT), write-combined (WC) and uncached (UC).
Usually WC is enough to work as uncached like the use case above, not necessary to disable the whole cache via UC. WC worked fine for HD-audio, at least. For LPE audio, is UC really stated as mandatory requirement?
No, as such there is no guidelines from HW guys. Technically WC would work as well.
Question I have is, when we use the smaller period size with say 2 periods with WC, how do we ensure that the last write has been propagated to the main memory. Last write size may be smaller than the WC cache size.
A good question. I vaguely remembered that it's rather cache align size, but I might be wrong here. It needs more check.
In anyway, WC cache can't help for dmix problem, as the dmix x86 implementation requires the read from the buffer, where the read on WC page ends up with the total flushing. So this can be postponed.
thanks,
Takashi
participants (3)
-
Pierre-Louis Bossart
-
Takashi Iwai
-
Ughreja, Rakesh A