Message-ID: <CAFs7P=hZVfUnTKYgOUwfwT6S9TO72iuJMBPbBG0i+U-4Au=O=Q@mail.gmail.com>
Date:   Tue, 12 Sep 2023 12:23:14 -0400
From:   Joshua Pius <joshuapius@...gle.com>
To:     Ankit K Nautiyal <ankit.k.nautiyal@...el.com>
Cc:     Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
        intel-gfx@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        dri-devel@...ts.freedesktop.org,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Pablo Ceballos <pceballos@...gle.com>,
        Niko Tsirakis <ntsirakis@...gle.com>
Subject: Re: [v3] drm/i915/display/lspcon: Increase LSPCON mode settle timeout

Yes, we've proposed this change before, and the reasoning is still the
same. I've added it below so that it is part of this thread as well. Is
there a reason the explanation and test results below are not sufficient?

This issue affected several different CometLake-based Chrome OS device
designs. The details of the original report are in the Google partner
issue tracker (issue # 178169843), but I believe this requires a
Google partner account to access:
https://partnerissuetracker.corp.google.com/issues/178169843

The summary is that we were seeing these "*ERROR* LSPCON mode hasn't
settled" messages in the kernel logs, followed by the display not
working at all. We increased the timeout to 500ms while the
investigation continued, and this reduced the number of occurrences:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/7b2899fc1a6f9409e8075b3153baaf02c4d1fc75

The problem continued to occur on about 2% of devices even after
increasing the timeout to 500ms. The investigation continued in issue
# 188035814, with engineers from Parade and Intel involved.
Ultimately, the recommendation from Intel engineers was to increase
the timeout further:
https://partnerissuetracker.corp.google.com/issues/188035814

The timeout was then increased to 1000ms:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+/a16cfc2062e768c8e5ad8fa09b8ca127aa1ead9a

I recently ran 100 reboot trials on one device and found that the
median time for the LSPCON mode to settle was 440ms and the maximum was
444ms. But we know from the original reports that even after we set
the timeout to 500ms, the issue continued to happen on a small
percentage of devices (about 2%). That is why I picked the larger
value of 800ms.
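
To make the margin reasoning above concrete, here is a minimal sketch
that only restates the figures quoted in this thread as arithmetic. The
enum and helper names are hypothetical and exist purely for
illustration; they are not part of the driver or of the patch.

```c
#include <stdbool.h>

/* Figures quoted in this thread, in milliseconds. */
enum {
	SETTLE_MAX_OBSERVED_MS = 444, /* max over 100 reboot trials, one device */
	TIMEOUT_CURRENT_MS     = 400, /* value currently passed to wait_for() */
	TIMEOUT_FIRST_BUMP_MS  = 500, /* first Chrome OS bump; ~2% still failed */
	TIMEOUT_PROPOSED_MS    = 800, /* value proposed in this patch */
};

/* Hypothetical helper: headroom left after the slowest observed settle. */
static inline int settle_margin_ms(int timeout_ms, int observed_max_ms)
{
	return timeout_ms - observed_max_ms;
}

/* Hypothetical helper: would this timeout have covered the observed max? */
static inline bool timeout_covers(int timeout_ms, int observed_max_ms)
{
	return settle_margin_ms(timeout_ms, observed_max_ms) >= 0;
}
```

Against the observed maximum of 444ms, 400ms falls short by 44ms, 500ms
leaves 56ms of headroom, and 800ms leaves 356ms. The 444ms figure comes
from a single device, so it says nothing about unit-to-unit variation;
the 2% failure rate at 500ms suggests that 56ms is not enough.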

>> This is to eliminate all cases of "*ERROR* LSPCON mode hasn't settled",
>> followed by link training errors. Intel engineers recommended increasing
>> this timeout and that does resolve the issue.
>>
>> On some CometLake-based device designs the Parade PS175 takes more than
>> 400ms to settle in PCON mode. 100 reboot trials on one device resulted
>> in a median settle time of 440ms and a maximum of 444ms. Even after
>> increasing the timeout to 500ms, 2% of devices still had this error. So
>> this increases the timeout to 800ms.
>>
>> Signed-off-by: Pablo Ceballos <pceballos@...gle.com>
>
>I think we've been here before. Do you have a publicly available gitlab
>issue with the proper logs? If not, please file one at [1].
>
>BR,
>Jani.
>
>[1] https://gitlab.freedesktop.org/drm/intel/issues/new
>
>
>> ---
>>
>> V2: Added more details in the commit message
>> V3: Only apply the increased timeout if the vendor is Parade
>>
>> drivers/gpu/drm/i915/display/intel_lspcon.c | 21 ++++++++++++++++++++-
>>  1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/display/intel_lspcon.c b/drivers/gpu/drm/i915/display/intel_lspcon.c
>> index bb3b5355a0d9..b07eab84cc63 100644
>> --- a/drivers/gpu/drm/i915/display/intel_lspcon.c
>> +++ b/drivers/gpu/drm/i915/display/intel_lspcon.c
>> @@ -153,6 +153,24 @@ static enum drm_lspcon_mode lspcon_get_current_mode(struct intel_lspcon *lspcon)
>>          return current_mode;
>>  }
>>
>> +static u32 lspcon_get_mode_settle_timeout(struct intel_lspcon *lspcon)
>> +{
>> +        u32 timeout_ms = 400;
>> +
>> +        /*
>> +         * On some CometLake-based device designs the Parade PS175 takes more
>> +         * than 400ms to settle in PCON mode. 100 reboot trials on one device
>> +         * resulted in a median settle time of 440ms and a maximum of 444ms.
>> +         * Even after increasing the timeout to 500ms, 2% of devices still had
>> +         * this error. So this sets the timeout to 800ms.
>> +         */
>> +        if (lspcon->vendor == LSPCON_VENDOR_PARADE)
>> +                timeout_ms = 800;
>> +
>> +        return timeout_ms;
>> +}
>> +
>> +
>>  static enum drm_lspcon_mode lspcon_wait_mode(struct intel_lspcon *lspcon,
>>                                               enum drm_lspcon_mode mode)
>>  {
>> @@ -167,7 +185,8 @@ static enum drm_lspcon_mode lspcon_wait_mode(struct intel_lspcon *lspcon,
>>          drm_dbg_kms(&i915->drm, "Waiting for LSPCON mode %s to settle\n",
>>                      lspcon_mode_name(mode));
>>
>> -        wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 400);
>> +        wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode,
>> +                 lspcon_get_mode_settle_timeout(lspcon));
>>          if (current_mode != mode)
>>                  drm_err(&i915->drm, "LSPCON mode hasn't settled\n");
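
For readers who want to see the effect of the vendor-specific timeout
outside the kernel, the following is a rough userspace sketch of the
poll-until-timeout pattern that the patch adjusts. It is not the i915
wait_for() macro. Every type and function name here is a hypothetical
stand-in, and the clock is simulated so that the behaviour is
deterministic. Only the 400ms and 800ms values and the Parade check
come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's types; not the i915 definitions. */
enum fake_vendor { FAKE_VENDOR_OTHER, FAKE_VENDOR_PARADE };

struct fake_lspcon {
	enum fake_vendor vendor;
	uint32_t settles_at_ms; /* simulated time at which the mode settles */
	uint32_t clock_ms;      /* simulated clock, advanced by each "sleep" */
};

/* Same selection as the patch: 800ms for Parade, 400ms otherwise. */
static inline uint32_t settle_timeout_ms(const struct fake_lspcon *l)
{
	return l->vendor == FAKE_VENDOR_PARADE ? 800 : 400;
}

/* Simulated mode read: settled once the clock reaches settles_at_ms. */
static inline bool mode_settled(const struct fake_lspcon *l)
{
	return l->clock_ms >= l->settles_at_ms;
}

/*
 * Poll every poll_ms (must be non-zero) until the mode settles or the
 * vendor-specific timeout expires. The condition is checked one last
 * time after expiry, so a settle that lands exactly on the deadline
 * still counts as success.
 */
static inline bool wait_for_settle(struct fake_lspcon *l, uint32_t poll_ms)
{
	const uint32_t deadline = l->clock_ms + settle_timeout_ms(l);

	for (;;) {
		const bool expired = l->clock_ms >= deadline;

		if (mode_settled(l))
			return true;
		if (expired)
			return false;
		l->clock_ms += poll_ms; /* stands in for a real sleep */
	}
}
```

With a simulated device that settles at 444ms (the maximum from the
trials above) and a 10ms poll interval, the non-Parade path gives up at
400ms, while the Parade path sees the mode settle at the 450ms poll.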
