Message-ID: <5eb58748-72e0-3eb4-593a-6e482133af17@amd.com>
Date:   Thu, 25 Mar 2021 09:29:18 +0100
From:   Christian König <christian.koenig@....com>
To:     Oleksandr Natalenko <oleksandr@...alenko.name>,
        linux-kernel@...r.kernel.org
Cc:     Ilkka Prusi <ilkka.prusi@...inet.fi>,
        Chris Rankin <rankincj@...il.com>,
        Huang Rui <ray.huang@....com>, David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        dri-devel@...ts.freedesktop.org, linux-media@...r.kernel.org,
        linaro-mm-sig@...ts.linaro.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: WARNING: AMDGPU DRM warning in 5.11.9

Hi,

Am 25.03.21 um 09:17 schrieb Oleksandr Natalenko:
> Hello.
>
> On Thu, Mar 25, 2021 at 07:57:33AM +0200, Ilkka Prusi wrote:
>> On 24.3.2021 16.16, Chris Rankin wrote:
>>> Hi,
>>>
>>> These warnings are not present in my dmesg log from 5.11.8:
>>>
>>> [   43.390159] ------------[ cut here ]------------
>>> [   43.393574] WARNING: CPU: 2 PID: 1268 at
>>> drivers/gpu/drm/ttm/ttm_bo.c:517 ttm_bo_release+0x172/0x282 [ttm]
>>> [   43.401940] Modules linked in: nf_nat_ftp nf_conntrack_ftp cfg80211
>> Changing WARN_ON to WARN_ON_ONCE in drivers/gpu/drm/ttm/ttm_bo.c
>> ttm_bo_release() reduces the flood of messages to a single splat.
>>
>> This warning appears to come from 57fcd550eb15bce ("drm/ttm: Warn on pinning
>> without holding a reference") and reverting it might be one choice.
>>
>>
>>> There are others, but I am assuming there is a common cause here.
>>>
>>> Cheers,
>>> Chris
>>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index a76eb2c14e8c..50b53355b265 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -514,7 +514,7 @@ static void ttm_bo_release(struct kref *kref)
>>                   * shrinkers, now that they are queued for
>>                   * destruction.
>>                   */
>> -               if (WARN_ON(bo->pin_count)) {
>> +               if (WARN_ON_ONCE(bo->pin_count)) {
>>                          bo->pin_count = 0;
>>                          ttm_bo_del_from_lru(bo);
>>                          ttm_bo_add_mem_to_lru(bo, &bo->mem);
>>
>>
>>
>> --
>>   - Ilkka
>>
> WARN_ON_ONCE() will just hide the underlying problem. Do we know why
> this happens at all?

The patch was incorrectly backported to 5.11 without also backporting the
driver changes that keep this warning from triggering.

We are probably going to revert it for 5.11.10.

Regards,
Christian.

>
> Same for me, BTW, with v5.11.9:
>
> ```
> [~]> lspci | grep VGA
> 0a:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Lexa PRO [Radeon 540/540X/550/550X / RX 540X/550/550X] (rev c7)
>
> [ 3676.033140] ------------[ cut here ]------------
> [ 3676.033153] WARNING: CPU: 7 PID: 1318 at drivers/gpu/drm/ttm/ttm_bo.c:517 ttm_bo_release+0x375/0x500 [ttm]
> …
> [ 3676.033340] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 3302 03/05/2021
> …
> [ 3676.033469] Call Trace:
> [ 3676.033473]  ttm_bo_move_accel_cleanup+0x1ab/0x3a0 [ttm]
> [ 3676.033478]  amdgpu_bo_move+0x334/0x860 [amdgpu]
> [ 3676.033580]  ttm_bo_validate+0x1f1/0x2d0 [ttm]
> [ 3676.033585]  amdgpu_cs_bo_validate+0x9b/0x1c0 [amdgpu]
> [ 3676.033665]  amdgpu_cs_list_validate+0x115/0x150 [amdgpu]
> [ 3676.033743]  amdgpu_cs_ioctl+0x873/0x20a0 [amdgpu]
> [ 3676.033960]  drm_ioctl_kernel+0xb8/0x140 [drm]
> [ 3676.033977]  drm_ioctl+0x222/0x3c0 [drm]
> [ 3676.034071]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
> [ 3676.034145]  __x64_sys_ioctl+0x83/0xb0
> [ 3676.034149]  do_syscall_64+0x33/0x40
> …
> [ 3676.034171] ---[ end trace 66e9865b027112f3 ]---
> ```
>
> Thanks.
>
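
For readers following the WARN_ON vs. WARN_ON_ONCE point above, here is a
minimal userspace sketch of the difference. It is not kernel code: warn_on(),
warn_on_once(), struct fake_bo and release_all() are hypothetical stand-ins
invented for illustration, and the real kernel macros do more (stack dump,
taint). The sketch only shows that the "once" variant limits the reporting
while the condition is still returned, so the recovery branch runs each time
and the underlying problem is still there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Number of warnings printed by the most recent release_all() run. */
static int splats;

/* Hypothetical stand-in for WARN_ON(): reports every time cond is true. */
static bool warn_on(bool cond, const char *what)
{
	if (cond) {
		splats++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Hypothetical stand-in for WARN_ON_ONCE(): reports only the first time
 * cond is true, but still returns cond on every call.
 */
static bool warn_on_once(bool cond, const char *what, bool *warned)
{
	if (cond && !*warned) {
		*warned = true;
		splats++;
		fprintf(stderr, "WARNING (once): %s\n", what);
	}
	return cond;
}

/* Simplified buffer object: only the field the warning looks at. */
struct fake_bo {
	unsigned int pin_count;
};

/*
 * Release n objects, resetting pin_count on any that are still pinned.
 * Returns how many objects needed that fix-up.
 */
static int release_all(struct fake_bo *bos, size_t n, bool once)
{
	bool warned = false;
	int fixed = 0;

	assert(bos != NULL);
	splats = 0;

	for (size_t i = 0; i < n; i++) {
		bool pinned = bos[i].pin_count != 0;
		bool hit = once
			? warn_on_once(pinned, "pinned BO released", &warned)
			: warn_on(pinned, "pinned BO released");

		if (hit) {
			bos[i].pin_count = 0;
			fixed++;
		}
	}
	return fixed;
}
```

With two still-pinned objects, both variants perform two fix-ups; only the
number of messages differs (two versus one).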
