lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aa20591f-3939-4776-9025-b8d7159f4c63@linaro.org>
Date: Thu, 11 Jul 2024 12:22:30 +0100
From: Bryan O'Donoghue <bryan.odonoghue@...aro.org>
To: Quentin Schulz <quentin.schulz@...rry.de>,
 Sakari Ailus <sakari.ailus@...ux.intel.com>,
 Mauro Carvalho Chehab <mchehab@...nel.org>,
 Quentin Schulz <quentin.schulz@...obroma-systems.com>,
 Jacopo Mondi <jacopo@...ndi.org>
Cc: Johan Hovold <johan@...nel.org>,
 Kieran Bingham <kieran.bingham@...asonboard.com>,
 linux-media@...r.kernel.org, linux-kernel@...r.kernel.org,
 stable@...r.kernel.org
Subject: Re: [PATCH 2/2] media: ov5675: Elongate reset to first transaction
 minimum gap

On 11/07/2024 11:40, Quentin Schulz wrote:
> Hi Bryan,
> 
> On 7/11/24 12:20 PM, Bryan O'Donoghue wrote:
>> The ov5675 specification says that the gap between XSHUTDN deassert 
>> and the
>> first I2C transaction should be a minimum of 8192 XVCLK cycles.
>>
>> Right now we use a usleep_rage() that gives a sleep time of between about
>> 430 and 860 microseconds.
>>
>> On the Lenovo X13s we have observed that in about 1/20 cases the current
>> timing is too tight and we start transacting before the ov5675's reset
>> cycle completes, leading to I2C bus transaction failures.
>>
>> The reset racing is sometimes triggered at initial chip probe but, more
>> usually on a subsequent power-off/power-on cycle e.g.
>>
>> [   71.451662] ov5675 24-0010: failed to write reg 0x0103. error = -5
>> [   71.451686] ov5675 24-0010: failed to set plls
>>
>> The current quiescence period we have is too tight, doubling the minimum
>> appears to fix the issue observed on X13s.
>>
>> Fixes: 49d9ad719e89 ("media: ov5675: add device-tree support and 
>> support runtime PM")
>> Cc: stable@...r.kernel.org
>> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@...aro.org>
>> ---
>>   drivers/media/i2c/ov5675.c | 9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/media/i2c/ov5675.c b/drivers/media/i2c/ov5675.c
>> index 92bd35133a5d..0498f8f3064d 100644
>> --- a/drivers/media/i2c/ov5675.c
>> +++ b/drivers/media/i2c/ov5675.c
>> @@ -1018,8 +1018,13 @@ static int ov5675_power_on(struct device *dev)
>>       gpiod_set_value_cansleep(ov5675->reset_gpio, 0);
>> -    /* 8192 xvclk cycles prior to the first SCCB transation */
>> -    usleep_range(delay_us, delay_us * 2);
>> +    /* The spec calls for a minimum delay of 8192 XVCLK cycles prior to
>> +     * transacting on the I2C bus, which translates to about 430
>> +     * microseconds at 19.2 MHz.
>> +     * Testing shows the range 8192 - 16384 cycles to be unreliable.
>> +     * Grant a more liberal 2x -3x clock cycle grace time.
>> +     */
>> +    usleep_range(delay_us * 2, delay_us * 3);
> 
> Would it make sense to have power_off have the same logic? We do a 
> usleep_range of those same values currently, so keeping them in sync 
> seems to make sense to me.

I have no evidence to suggest there's a problem on the shutdown path, 
that's why I left the quiescence period as-is there.

> Also, I'm wondering if it isn't an issue with the gpio not being high 
> right after gpoiod_set_value_cansleep() returns, i.e. the time it 
> actually takes for the HW to reach the IO level that means "high" for 
> the camera. And that this increased sleep is just a way to mitigate that?

No, that's not what I found.

I tried changing

         usleep_range(2000, 2200);

to
         usleep_range(200000, 220000);

but could still elicit the I2C transaction failure. If the time it took 
for the GPIO to hit logical 1 were the issue then multiplying the reset 
time by 100 would certainly account for that.

// BOD set the chip into reset
         gpiod_set_value_cansleep(ov5675->reset_gpio, 1);

// BOD apply power
         ret = regulator_bulk_enable(OV5675_NUM_SUPPLIES, ov5675->supplies);
         if (ret) {
                 clk_disable_unprepare(ov5675->xvclk);
                 return ret;
         }

         /* Reset pulse should be at least 2ms and reset gpio released 
only once
          * regulators are stable.
          */

// BOD spec specifies 2 milliseconds here not a count of XVCLKs
         usleep_range(2000, 2200);

         gpiod_set_value_cansleep(ov5675->reset_gpio, 0);

// BOD spec calls for a _minimum_ of 8192 XVCLK cycles before I2C
         /* 8192 xvclk cycles prior to the first SCCB transation */
         usleep_range(delay_us, delay_us * 2);

The issue is initiating an I2C transaction too early _after_ reset 
completion not the duration of that reset.

As I stated in the cover letter, I tried a longer reset duration, a 
higher drive-strength on the GPIO as well as I didn't put in my cover 
letter, inverting the logic of the GPIO reset, which unsurprisingly 
didn't work.

No matter how long we hold the chip in reset, unless we give more grace 
time _subsequent_ to the reset before initiating an I2C transaction, we 
will encounter transaction failures.

This is a fairly common and logical fault if you think about it.

XVCLK is providing a clock to the ov5675 core to "do stuff" whatever 
that stuff is. Bring up an internal firmware, lock a fundamental PLL - 
whatever.

If we start an I2C transaction before the hypothetical internal core has 
booted up then - meh no bueno we'll get no transaction response.

That's the error - speaking too soon.

A little like myself in the mornings, cranky before I've had my coffee 
and unresponsive.

;)

> With this patch we essentially postpone the power_on by another 430ms 
> making it almost a full second before we can start using the camera. 
> That's quite a lot I think? We don't have a usecase right now that 
> requires this to be blazing fast (and we anyway would need at the very 
> least 430ms), so take this remark as what it is, a remark.

Not a full second, a millisecond.

8/10ths of 1 millisecond instead of 4/10ths of one millisecond.

19.2MHz is 52.083333333333 nanoseconds per clock

52.083333333333 * 8192 => 426666 nanoseconds => 0.426666 milliseconds or 
426.6 microseconds

So our post reset quiesence minimum @ 19.2MHz moves from 426.6 
microseconds to 853.

> The change looks fine to me even though it feels like a band-aid patch.

I mean it's not a second - if you feel very strongly that 426 
milliseconds * 2 is wrong, I guess I could add some more complex logic 
however I like this simple fix for backporting.

---
bod

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ