[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <s5hr1b38fe7.wl-tiwai@suse.de>
Date: Fri, 26 Nov 2021 09:55:44 +0100
From: Takashi Iwai <tiwai@...e.de>
To: Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Jaroslav Kysela <perex@...ex.cz>,
Takashi Iwai <tiwai@...e.com>,
Cezary Rojewski <cezary.rojewski@...el.com>,
Liam Girdwood <liam.r.girdwood@...ux.intel.com>,
Jie Yang <yang.jie@...ux.intel.com>,
alsa-devel@...a-project.org
Subject: Re: ALSA: hda: Make proper use of timecounter
On Wed, 24 Nov 2021 23:40:01 +0100,
Thomas Gleixner wrote:
>
> HDA uses a timecounter to read a hardware clock running at 24 MHz. The
> conversion factor is set with a mult value of 125 and a shift value of 0,
> which is not converting the hardware clock to nanoseconds, it is converting
> to 1/3 nanoseconds because the conversion factor from 24Mhz to nanoseconds
> is 125/3. The usage sites divide the "nanoseconds" value returned by
> timecounter_read() by 3 to get a real nanoseconds value.
>
> There is a lengthy comment in azx_timecounter_init() explaining this
> choice. That comment makes blatantly wrong assumptions about how
> timecounters work and what can overflow.
>
> The comment says:
>
> * Applying the 1/3 factor as part of the multiplication
> * requires at least 20 bits for a decent precision, however
> * overflows occur after about 4 hours or less, not a option.
>
> timecounters operate on time deltas between two readouts of a clock and use
> the mult/shift pair to calculate a precise nanoseconds value:
>
> delta_nsec = (delta_clock * mult) >> shift;
>
> The fractional part is also taken into account and preserved to prevent
> accumulated rounding errors. For details see cyclecounter_cyc2ns().
>
> The mult/shift pair has to be chosen so that the multiplication of the
> maximum expected delta value does not result in a 64bit overflow. As the
> counter wraps around on 32bit, the maximum observable delta between two
> reads is (1 << 32) - 1 which is about 178.9 seconds.
>
> That in turn means the maximum multiplication factor which fits into an u32
> will not cause a 64bit overflow ever because it's guaranteed that:
>
> ((1 << 32) - 1) ^ 2 < (1 << 64)
>
> The resulting correct multiplication factor is 2796202667 and the shift
> value is 26, i.e. 26 bit precision. The overflow of the multiplication
> would happen exactly at a clock readout delta of 6597069765 which is way
> after the wrap around of the hardware clock at around 274.8 seconds which
> is off from the claimed 4 hours by more than an order of magnitude.
>
> If the counter ever wraps around the last read value then the calculation
> is off by the number of wrap arounds times 178.9 seconds because the
> overflow cannot be observed.
>
> Use clocks_calc_mult_shift(), which calculates the most accurate mult/shift
> pair based on the given clock frequency, and remove the bogus comment along
> with the divisions at the readout sites.
>
> Fixes: 5d890f591d15 ("ALSA: hda: support for wallclock timestamps")
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Looks like a nice fix / cleanup. Pierre, could you check it?
thanks,
Takashi
> ---
> sound/hda/hdac_stream.c | 14 ++++----------
> sound/pci/hda/hda_controller.c | 1 -
> sound/soc/intel/skylake/skl-pcm.c | 1 -
> 3 files changed, 4 insertions(+), 12 deletions(-)
>
> --- a/sound/hda/hdac_stream.c
> +++ b/sound/hda/hdac_stream.c
> @@ -534,17 +534,11 @@ static void azx_timecounter_init(struct
> cc->mask = CLOCKSOURCE_MASK(32);
>
> /*
> - * Converting from 24 MHz to ns means applying a 125/3 factor.
> - * To avoid any saturation issues in intermediate operations,
> - * the 125 factor is applied first. The division is applied
> - * last after reading the timecounter value.
> - * Applying the 1/3 factor as part of the multiplication
> - * requires at least 20 bits for a decent precision, however
> - * overflows occur after about 4 hours or less, not a option.
> + * Calculate the optimal mult/shift values. The counter wraps
> + * around after ~178.9 seconds.
> */
> -
> - cc->mult = 125; /* saturation after 195 years */
> - cc->shift = 0;
> + clocks_calc_mult_shift(&cc->mult, &cc->shift, 24000000,
> + NSEC_PER_SEC, 178);
>
> nsec = 0; /* audio time is elapsed time since trigger */
> timecounter_init(tc, cc, nsec);
> --- a/sound/pci/hda/hda_controller.c
> +++ b/sound/pci/hda/hda_controller.c
> @@ -504,7 +504,6 @@ static int azx_get_time_info(struct snd_
> snd_pcm_gettime(substream->runtime, system_ts);
>
> nsec = timecounter_read(&azx_dev->core.tc);
> - nsec = div_u64(nsec, 3); /* can be optimized */
> if (audio_tstamp_config->report_delay)
> nsec = azx_adjust_codec_delay(substream, nsec);
>
> --- a/sound/soc/intel/skylake/skl-pcm.c
> +++ b/sound/soc/intel/skylake/skl-pcm.c
> @@ -1251,7 +1251,6 @@ static int skl_platform_soc_get_time_inf
> snd_pcm_gettime(substream->runtime, system_ts);
>
> nsec = timecounter_read(&hstr->tc);
> - nsec = div_u64(nsec, 3); /* can be optimized */
> if (audio_tstamp_config->report_delay)
> nsec = skl_adjust_codec_delay(substream, nsec);
>
>
Powered by blists - more mailing lists