lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <512f9f0d-a399-27fb-08d0-7311b73fd2a1@quicinc.com>
Date:   Wed, 14 Dec 2022 15:46:46 -0800
From:   Abhinav Kumar <quic_abhinavk@...cinc.com>
To:     Doug Anderson <dianders@...omium.org>,
        Kuogee Hsieh <quic_khsieh@...cinc.com>
CC:     <robdclark@...il.com>, <sean@...rly.run>, <swboyd@...omium.org>,
        <vkoul@...nel.org>, <daniel@...ll.ch>, <airlied@...il.com>,
        <agross@...nel.org>, <dmitry.baryshkov@...aro.org>,
        <andersson@...nel.org>, <quic_sbillaka@...cinc.com>,
        <freedreno@...ts.freedesktop.org>,
        <dri-devel@...ts.freedesktop.org>, <linux-arm-msm@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq
 is not for aux transfer

Hi Doug

On 12/14/2022 2:29 PM, Doug Anderson wrote:
> Hi,
> 
> On Wed, Dec 14, 2022 at 1:21 PM Kuogee Hsieh <quic_khsieh@...cinc.com> wrote:
>>
>> There are 3 possible interrupt sources are handled by DP controller,
>> HPDstatus, Controller state changes and Aux read/write transaction.
>> At every irq, DP controller have to check isr status of every interrupt
>> sources and service the interrupt if its isr status bits shows interrupts
>> are pending. There is potential race condition may happen at current aux
>> isr handler implementation since it is always complete dp_aux_cmd_fifo_tx()
>> even irq is not for aux read or write transaction. This may cause aux read
>> transaction return premature if host aux data read is in the middle of
>> waiting for sink to complete transferring data to host while irq happen.
>> This will cause host's receiving buffer contains unexpected data. This
>> patch fixes this problem by checking aux isr and return immediately at
>> aux isr handler if there are no any isr status bits set.
>>
>> Follows are the signature at kernel logs when problem happen,
>> EDID has corrupt header
>> panel-simple-dp-aux aux-aea0000.edp: Couldn't identify panel via EDID
>> panel-simple-dp-aux aux-aea0000.edp: error -EIO: Couldn't detect panel nor find a fallback
>>
>> Signed-off-by: Kuogee Hsieh <quic_khsieh@...cinc.com>
>> ---
>>   drivers/gpu/drm/msm/dp/dp_aux.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c b/drivers/gpu/drm/msm/dp/dp_aux.c
>> index d030a93..8f8b12a 100644
>> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
>> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
>> @@ -423,6 +423,13 @@ void dp_aux_isr(struct drm_dp_aux *dp_aux)
>>
>>          isr = dp_catalog_aux_get_irq(aux->catalog);
>>
>> +       /*
>> +        * if this irq is not for aux transfer,
>> +        * then return immediately
>> +        */
> 
> Why do you need 4 lines for a comment that fits on one line?
Yes, we can fit this to one line.
> 
>> +       if (!isr)
>> +               return;
> 
> I can confirm that this works for me. I could reproduce the EDID
> problems in the past and I can't after this patch. ...so I could give
> a:
> 
> Tested-by: Douglas Anderson <dianders@...omium.org>
> 
> I'm not an expert on this part of the code, so feel free to ignore my
> other comments if everyone else thinks this patch is fine as-is, but
> to me something here feels a little fragile. It feels a little weird
> that we'll "complete" for _any_ interrupt that comes through now
> rather than relying on dp_aux_native_handler() / dp_aux_i2c_handler()
> to specifically identify interrupts that caused the end of the
> transfer. I guess that idea is that every possible interrupt we get
> causes the end of the transfer?
> 
> -Doug

So this turned out to be more tricky and was a good finding from kuogee.

In the bad EDID case, it was technically not bad EDID.

What was happening was, the VIDEO_READY interrupt was continuously 
firing. Ideally, this should fire only once but due to some error 
condition it kept firing. We dont exactly know why yet what was the 
error condition making it continuously fire.

In the DP ISR, the dp_aux_isr() gets called even if it was not an aux 
interrupt which fired (so the call flow in this case was 
dp_display_irq_handler (triggered for VIDEO_READY) ---> dp_aux_isr()
So we should certainly have some protection to return early from this 
routine if there was no aux interrupt which fired.

Which is what this fix is doing.

Its not completing any interrupt, its just returning early if no aux 
interrupt fired.

So based on the logs I have seen and given that its fixing this 
stability issue.

Fixes: c943b4948b58 ("drm/msm/dp: add displayPort driver support")

Reviewed-by: Abhinav Kumar <quic_abhinavk@...cinc.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ