linux-kernel - Re: [PATCH v1 2/3] drm/msm/dp: decoupling dp->id out of dp controller_id at scxxxx_dp

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ba89c17a-bf8d-0eda-d975-26e88d43e0aa@quicinc.com>
Date:   Tue, 28 Jun 2022 08:22:47 -0700
From:   Kuogee Hsieh <quic_khsieh@...cinc.com>
To:     Doug Anderson <dianders@...omium.org>,
        Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
CC:     Abhinav Kumar <quic_abhinavk@...cinc.com>,
        Stephen Boyd <swboyd@...omium.org>,
        Andy Gross <agross@...nel.org>,
        David Airlie <airlied@...ux.ie>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        "Daniel Vetter" <daniel@...ll.ch>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        "Rob Clark" <robdclark@...il.com>, Sean Paul <sean@...rly.run>,
        Vinod Koul <vkoul@...nel.org>,
        "Aravind Venkateswaran (QUIC)" <quic_aravindh@...cinc.com>,
        Sankeerth Billakanti <quic_sbillaka@...cinc.com>,
        freedreno <freedreno@...ts.freedesktop.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 2/3] drm/msm/dp: decoupling dp->id out of dp
 controller_id at scxxxx_dp_cfg table


On 6/27/2022 4:20 PM, Doug Anderson wrote:
> Hi,
>
> On Sat, Jun 25, 2022 at 1:48 AM Dmitry Baryshkov
> <dmitry.baryshkov@...aro.org> wrote:
>> On Sat, 25 Jun 2022 at 04:23, Abhinav Kumar <quic_abhinavk@...cinc.com> wrote:
>>> On 6/24/2022 5:11 PM, Dmitry Baryshkov wrote:
>>>> On Sat, 25 Jun 2022 at 03:03, Abhinav Kumar <quic_abhinavk@...cinc.com> wrote:
>>>>> On 6/24/2022 4:56 PM, Kuogee Hsieh wrote:
>>>>> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
>>>>> b/drivers/gpu/drm/msm/dp/dp_display.c
>>>>> index dcd80c8a794c..7816e82452ca 100644
>>>>> --- a/drivers/gpu/drm/msm/dp/dp_display.c
>>>>> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
>>>>> @@ -140,8 +140,8 @@ static const struct msm_dp_config sc7180_dp_cfg = {
>>>>>
>>>>>     static const struct msm_dp_config sc7280_dp_cfg = {
>>>>>            .descs = (const struct msm_dp_desc[]) {
>>>>> -               [MSM_DP_CONTROLLER_0] = { .io_start = 0x0ae90000,
>>>>> .connector_type = DRM_MODE_CONNECTOR_DisplayPort, .wide_bus_en = true },
>>>>>                    [MSM_DP_CONTROLLER_1] = { .io_start = 0x0aea0000,
>>>>> .connector_type = DRM_MODE_CONNECTOR_eDP, .wide_bus_en = true },
>>>>> +               [MSM_DP_CONTROLLER_0] = { .io_start = 0x0ae90000,
>>>>> .connector_type = DRM_MODE_CONNECTOR_DisplayPort, .wide_bus_en = true },
>>>>>            },
>>>>>            .num_descs = 2,
>>>>>     };
>>>>>
>>>>>
>>>>> The reason order is important is because  in this function below, even
>>>>> though it matches the address to find which one to use it loops through
>>>>> the array and so the value of *id will change depending on which one is
>>>>> located where.
>>>>>
>>>>> static const struct msm_dp_desc *dp_display_get_desc(struct
>>>>> platform_device *pdev,
>>>>>                                 unsigned int *id)
>>>>> {
>>>>>        const struct msm_dp_config *cfg = of_device_get_match_data(&pdev->dev);
>>>>>        struct resource *res;
>>>>>        int i;
>>>>>
>>>>>        res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>>>>>        if (!res)
>>>>>            return NULL;
>>>>>
>>>>>        for (i = 0; i < cfg->num_descs; i++) {
>>>>>            if (cfg->descs[i].io_start == res->start) {
>>>>>                *id = i;
>>>> The id is set to the index of the corresponding DP instance in the
>>>> descs array, which is MSM_DP_CONTROLLER_n. Correct up to now.
>>> Right, this is where I misunderstood his explanation.
>>>
>>> Even if we swap the order, but retain the index correctly it will still
>>> work today.
>>>
>>> Hes not sure of the root-cause of why turning on the primary display
>>> first fixes the issue.
>>>
>>> I think till we root-cause that, lets put this on hold.
>> Agreed. Let's find the root cause.
> FWIW, I was poking a little bit about the glitch that Kuogee was
> trying to fix here. Through a bunch of hacky heuristics I think the
> dpu_hw_ctl_trigger_flush_v1() is the function that "causes" the
> corruption. Specifically I managed to do something like:
>
> if (hacky_heuristic)
>    pr_info("About to call flush\n);
>    mdelay(2000);
> }
> ctl->ops.trigger_flush(ctl)
> if (hacky_heuristic)
>    pr_info("Finished calling flush\n);
>    mdelay(2000);
>    pr_info("Finished calling flush delay done\n);
> }

flush bit need to up update at real time.

otherwise unexpected side effects will happen.

i try same thing, but I got fence timeout error.

Anyway, I had submit new patch to fix corruption issue.

Thanks for your efforts and helps.

> I then watched my display and reproduced the problem. When I saw the
> problem I looked over at the console and saw "Finished calling flush"
> was the last thing printed.
>
> I don't know if this helps much. Presumably we're flushing a bunch of
> previous operations so whatever we had queued up probably matters
> more, but maybe it'll give a clue?
>
>
> Other notes FWIW:
>
> * If you increase the amount of time of the glitching, you can
> actually see that we are glitching both the internal and external
> displays.
>
> * You can actually make the glitch stay on the screen "permanently" by
> unplugging the external display while the internal screen is off. I
> don't know why it doesn't always happen, but it seems to happen often
> enough. Presumably if someone knew the display controller well they
> could try to figure out what state it was in and debug the problem.
>
>
> -Doug