Message-ID: <96e2e13c-f01c-4baf-a9a3-cbaa48fb10c7@amd.com>
Date:   Mon, 20 Nov 2023 17:24:14 +0100
From:   Christian König <christian.koenig@....com>
To:     Alex Deucher <alexdeucher@...il.com>,
        Christian König <ckoenig.leichtzumerken@...il.com>
Cc:     Dave Airlie <airlied@...il.com>,
        Linux regressions mailing list <regressions@...ts.linux.dev>,
        linux-kernel@...r.kernel.org,
        "amd-gfx@...ts.freedesktop.org" <amd-gfx@...ts.freedesktop.org>,
        Luben Tuikov <luben.tuikov@....com>,
        dri-devel@...ts.freedesktop.org, Phillip Susi <phill@...susis.net>,
        Alex Deucher <alexander.deucher@....com>
Subject: Re: Radeon regression in 6.6 kernel

On 20.11.23 at 17:08, Alex Deucher wrote:
> On Mon, Nov 20, 2023 at 10:57 AM Christian König
> <ckoenig.leichtzumerken@...il.com> wrote:
>> On 19.11.23 at 07:47, Dave Airlie wrote:
>>>> On 12.11.23 01:46, Phillip Susi wrote:
>>>>> I had been testing some things on a post 6.6-rc5 kernel for a week or
>>>>> two and then when I pulled to a post 6.6 release kernel, I found that
>>>>> system suspend was broken.  It seems that the radeon driver failed to
>>>>> suspend, leaving the display dead, the wayland display server hung, and
>>>>> the system still running.  I have been trying to bisect it for the last
>>>>> few days and have only been able to narrow it down to the following 3
>>>>> commits:
>>>>>
>>>>> There are only 'skip'ped commits left to test.
>>>>> The first bad commit could be any of:
>>>>> 56e449603f0ac580700621a356d35d5716a62ce5
>>>>> c07bf1636f0005f9eb7956404490672286ea59d3
>>>>> b70438004a14f4d0f9890b3297cd66248728546c
>>>>> We cannot bisect more!
>>>> Hmm, not a single reply from the amdgpu folks. Wondering how we can
>>>> encourage them to look into this.
>>>>
>>>> Phillip, reporting issues by mail should still work, but you might have
>>>> more luck here, as that's where the amdgpu developers afaics prefer to track bugs:
>>>> https://gitlab.freedesktop.org/drm/amd/-/issues
>>>>
>>>> When you file an issue there, please mention it here.
>>>>
>>>> Furthermore it might help if you could verify if 6.7-rc1 (or rc2, which
>>>> comes out later today) or 6.6.2-rc1 improve things.
>>> It would also be good to test if reverting any of these is possible or not.
>> Well none of the commits mentioned can affect radeon in any way. Radeon
>> simply doesn't use the scheduler.
>>
>> My suspicion is that the user is actually using amdgpu instead of
>> radeon. The switch potentially occurred accidentally, for example by
>> compiling amdgpu support for SI/CIK.
>>
>> Those amdgpu problems for older ASICs have already been worked on and
>> should be fixed by now.
> In this case it's a navi23 (so radeon in the marketing sense).

Thanks, couldn't find that in the mail thread.

In that case those are the already known problems with the scheduler 
changes, aren't they?

Christian.

>
> Alex
>
>> Regards,
>> Christian.
>>
>>> File the gitlab issue and we should poke amd a bit more to take a look.
>>>
>>> Dave.
