linux-kernel - Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230418130837.zfueixeuxrallhtc@pengutronix.de>
Date:   Tue, 18 Apr 2023 15:08:37 +0200
From:   Uwe Kleine-König <u.kleine-koenig@...gutronix.de>
To:     Conor Dooley <conor.dooley@...rochip.com>
Cc:     Thierry Reding <thierry.reding@...il.com>,
        Daire McNamara <daire.mcnamara@...rochip.com>,
        linux-kernel@...r.kernel.org, linux-pwm@...r.kernel.org,
        linux-riscv@...ts.infradead.org
Subject: Re: [PATCH v16 1/2] pwm: add microchip soft ip corePWM driver

On Tue, Apr 18, 2023 at 12:27:33PM +0100, Conor Dooley wrote:
> Hey Uwe,
> 
> On Tue, Apr 11, 2023 at 06:25:54PM +0200, Uwe Kleine-König wrote:
> > On Tue, Apr 11, 2023 at 02:56:15PM +0100, Conor Dooley wrote:
> > > On Tue, Apr 11, 2023 at 12:55:47PM +0200, Uwe Kleine-König wrote:
> > > > On Tue, Apr 11, 2023 at 09:56:34AM +0100, Conor Dooley wrote:
> > > > > Add a driver that supports the Microchip FPGA "soft" PWM IP core.
> 
> > > > > +static int mchp_core_pwm_calc_period(const struct pwm_state *state, unsigned long clk_rate,
> > > > > +				     u16 *prescale, u16 *period_steps)
> > > > > +{
> > > > > +	u64 tmp;
> > > > > +	u32 remainder;
> > > > > +
> > > > > +	/*
> > > > > +	 * Calculate the period cycles and prescale values.
> > > > > +	 * The registers are each 8 bits wide & multiplied to compute the period
> > > > > +	 * using the formula:
> > > > > +	 *           (prescale + 1) * (period_steps + 1)
> > > > > +	 * period = -------------------------------------
> > > > > +	 *                      clk_rate
> > > > > +	 * so the maximum period that can be generated is 0x10000 times the
> > > > > +	 * period of the input clock.
> > > > > +	 * However, due to the design of the "hardware", it is not possible to
> > > > > +	 * attain a 100% duty cycle if the full range of period_steps is used.
> > > > > +	 * Therefore period_steps is restricted to 0xfe and the maximum multiple
> > > > > +	 * of the clock period attainable is (0xff + 1) * (0xfe + 1) = 0xff00
> > > > > +	 *
> > > > > +	 * The prescale and period_steps registers operate similarly to
> > > > > +	 * CLK_DIVIDER_ONE_BASED, where the value used by the hardware is that
> > > > > +	 * in the register plus one.
> > > > > +	 * It's therefore not possible to set a period lower than 1/clk_rate, so
> > > > > +	 * if tmp is 0, abort. Without aborting, we will set a period that is
> > > > > +	 * greater than that requested and, more importantly, will trigger the
> > > > > +	 * neg-/pos-edge issue described in the limitations.
> > > > > +	 */
> > > > > +	tmp = mul_u64_u64_div_u64(state->period, clk_rate, NSEC_PER_SEC);
> > > > > +	if (!tmp)
> > > > > +		return -EINVAL;
> > > > > +
> > > > > +	if (tmp >= MCHPCOREPWM_PERIOD_MAX) {
> > > > > +		*prescale = MCHPCOREPWM_PRESCALE_MAX;
> > > > > +		*period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX;
> > > > > +
> > > > > +		return 0;
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > > +	 * There are multiple strategies that could be used to choose the
> > > > > +	 * prescale & period_steps values.
> > > > > +	 * Here the idea is to pick values so that the selection of duty cycles
> > > > > +	 * is as finegrain as possible.
> > > > > +	 * This "optimal" value for prescale can be calculated using the maximum
> > > > > +	 * permitted value of period_steps, 0xfe.
> > > > > +	 *
> > > > > +	 *                period * clk_rate
> > > > > +	 * prescale = ------------------------- - 1
> > > > > +	 *            NSEC_PER_SEC * (0xfe + 1)
> > > > > +	 *
> > > > > +	 * However, we are purely interested in the integer upper bound of this
> > > > > +	 * calculation, so this division should be rounded up before subtracting
> > > > > +	 * 1
> > > > > +	 *
> > > > > +	 *  period * clk_rate
> > > > > +	 * ------------------- was precomputed as `tmp`
> > > > > +	 *    NSEC_PER_SEC
> > > > > +	 */
> > > > > +	*prescale = DIV64_U64_ROUND_UP(tmp, MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1;
> > > > 
> > > > If state->period * clk_rate is 765000000001 you get tmp = 765 and then
> > > > *prescale = 2. However roundup(765000000001 / (1000000000 * 255)) - 1 is
> > > > 3. The problem here is that you're rounding down in the calculation of
> > > > tmp. Of course this is constructed because 765000000001 is prime, but
> > > > I'm sure you get the point :-)
> > > 
> > > Hold that thought for a moment..
> > 
> > OK, so the correction below is to make up for the wrong rounding here.
> > I'd like to have that in a comment. Otherwise it wasn't clear to me.
> 
> Sure, no problem.
> 
> > > > Also we know that tmp is < 0xff00, so we don't need a 64 bit division
> > > > here.
> > > 
> > > Neither here nor below, true.
> > > 
> > > > > +	/*
> > > > > +	 * Because 0xff is not a permitted value some error will seep into the
> > > > > +	 * calculation of prescale as prescale grows. Specifically, this error
> > > > > +	 * occurs where the remainder of the prescale calculation is less than
> > > > > +	 * prescale.
> > > > > +	 * For small values of prescale, only a handful of values will need
> > > > > +	 * correction, but overall this applies to almost half of the valid
> > > > > +	 * values for tmp.
> > > > > +	 *
> > > > > +	 * To keep the algorithm's decision making consistent, this case is
> > > > > +	 * checked for and the simple solution is to, in these cases,
> > > > > +	 * decrement prescale and check that the resulting value of period_steps
> > > > > +	 * is valid.
> > > > > +	 *
> > > > > +	 * period_steps can be computed from prescale:
> > > > > +	 *                      period * clk_rate
> > > > > +	 * period_steps = ----------------------------- - 1
> > > > > +	 *                NSEC_PER_SEC * (prescale + 1)
> > > > > +	 *
> > > > > +	 */
> > > > > +	div_u64_rem(tmp, (MCHPCOREPWM_PERIOD_STEPS_MAX + 1), &remainder);
> > > > > +	if (remainder < *prescale) {
> > > > > +		u16 smaller_prescale = *prescale - 1;
> > > > > +
> > > > > +		*period_steps = div_u64(tmp, smaller_prescale + 1) - 1;
> > > > > +		if (*period_steps < 255) {
> > > > > +			*prescale = smaller_prescale;
> > > > > +
> > > > > +			return 0;
> > > > > +		}
> > > > > +	}
> > > 
> > > ...so in your prime case above, we would initially compute a prescale
> > > value that is too large, and then wind up hitting the test of the
> > > remainder here, thereby realising that the smaller prescale value is a
> > > better fit?
> > > Perhaps that's not an acceptable way to handle the issue though.
> > 
> > IMHO it is, but the comment explaining needs some improvement. It should
> > also make clear why rounding cannot lead to prescale - 2 being a
> > good/better candidate. (I haven't thought it through, maybe needs some
> > improvement?) I wonder if the computation can find all improvements
> > given that it only used tmp, but not ->period and clk_rate?!
> 
> For tmp > 0xff00 it may be off by more than 255, but that isn't possible
> at this stage as we've already eliminated the possibility above.
> 
> > > > I don't understand that part. It triggers for tmp = 511. So you prefer
> > > > 
> > > > 	prescale = 1
> > > > 	period_steps = 254
> > > > 
> > > > yielding period = 510 / clkrate over
> > > > 
> > > > 	prescale = 2
> > > > 	period_steps = 170
> > > > 
> > > > yielding 513 / clkrate. I wonder why.
> > 
> > I missed the -= 1 in my example. So it's:
> > 
> > 	It triggers for tmp = 511. So you prefer
> > 
> > 		prescale = 1
> > 		period_steps = 254
> > 
> > 	yielding period = 510 / clkrate over
> > 
> > 		prescale = 2
> > 		period_steps = 169
> > 
> > 	yielding 510 / clkrate.
> > 
> > Here it's obvious that the former is the better one. But I wonder why
> > the former isn't found instantly. Wouldn't
> > 
> > 	*prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1
> > 
> > give a better approximation in general? (Of course with an additional
> > check that *prescale >= 0 then.) 
> > 
> > ... thinking a bit ... yes, I think that's true:
> > 
> > If you pick *prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1,
> > then for each period_steps value ≤ 254 we have:
> > 
> > 	  (*prescale + 1) * (period_steps + 1)
> > 	≤ (*prescale + 1) * 255
> > 	≤ (tmp // (MCHPCOREPWM_PERIOD_STEPS_MAX + 1)) * 255
> > 	≤ (tmp // 255) * 255
> > 	≤ (tmp / 255) * 255
> > 	= tmp
> 
> tmp = 256
> 
> *prescale = 256 // (254 + 1) - 1
>           ≈ 0
> 
> *prescale = 256 // (prescale + 1) - 1
>           = 256 / (0 + 1) - 1
> 	  = 255

I don't understand what you wanna say here. If tmp = 256 my suggestion
is to pick prescale = 0 and period_steps = 254. Then

	 (prescale + 1) * (period_steps + 1) ≤ tmp

and period_steps is big enough to ensure a a finegrained choice for the
duty_cycle.

> That's then gonna give us one of the broken configurations from the
> limitations.
> 
> tmp = 257
> 
> *prescale = 257 // (254 + 1) - 1
>           ≈ 0
> 
> *prescale = 257 // (prescale + 1) - 1
>           = 257 / (0 + 1) - 1
> 	  = 256
> 	  = 0 (registers are 8-bit)

I think you mean s/prescale/period_steps/ in the second part, but that's
not what I meant. I meant to suggest:

	*prescale = tmp / (MCHPCOREPWM_PERIOD_STEPS_MAX + 1) - 1
	period_steps = MCHPCOREPWM_PERIOD_STEPS_MAX = 254

> I'm quite obviously missing something that you may think is obvious
> here, but is not immediately clear to me.

That would be an explanation, yes. :-)

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)