Date:   Thu, 10 Jun 2021 08:43:06 +0200
From:   Christian König <christian.koenig@....com>
To:     Ondrej Zary <linux@...y.sk>
Cc:     Ben Skeggs <bskeggs@...hat.com>, dri-devel@...ts.freedesktop.org,
        nouveau@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer
 dereference in nouveau_bo_sync_for_device



Am 09.06.21 um 22:00 schrieb Ondrej Zary:
> On Wednesday 09 June 2021 11:21:05 Christian König wrote:
>> Am 09.06.21 um 09:10 schrieb Ondrej Zary:
>>> On Wednesday 09 June 2021, Christian König wrote:
>>>> Am 09.06.21 um 08:57 schrieb Ondrej Zary:
>>>>> [SNIP]
>>>>>> Thanks for the heads up. So the problem with my patch is already fixed,
>>>>>> isn't it?
>>>>> The NULL pointer dereference in nouveau_bo_wr16 introduced in
>>>>> 141b15e59175aa174ca1f7596188bd15a7ca17ba was fixed by
>>>>> aea656b0d05ec5b8ed5beb2f94c4dd42ea834e9d.
>>>>>
>>>>> That's the bug I hit when bisecting the original problem:
>>>>> NULL pointer dereference in nouveau_bo_sync_for_device
>>>>> It's caused by:
>>>>> # first bad commit: [e34b8feeaa4b65725b25f49c9b08a0f8707e8e86] drm/ttm: merge ttm_dma_tt back into ttm_tt
>>>> Good that I've asked :)
>>>>
>>>> Ok that's a bit strange. e34b8feeaa4b65725b25f49c9b08a0f8707e8e86 was
>>>> mostly generated automatically.
>>>>
>>>> Do you have the original backtrace of that NULL pointer deref once more?
>>> The original backtrace is here: https://lkml.org/lkml/2021/6/5/350
>> And the problem is that ttm_dma->dma_address is NULL, right? Mhm, I
>> don't see how that can happen since nouveau is using ttm_sg_tt_init().
>>
>> Apart from that what nouveau does here is rather questionable since you
>> need a coherent architecture for most things anyway, but that's not what
>> we are trying to fix here.
>>
>> Can you try to narrow down if ttm_sg_tt_init is called before calling
>> this function for the tt object in question?
> ttm_sg_tt_init is not called:
> [   12.150124] nouveau 0000:01:00.0: DRM: VRAM: 31 MiB
> [   12.150133] nouveau 0000:01:00.0: DRM: GART: 128 MiB
> [   12.150143] nouveau 0000:01:00.0: DRM: BMP version 5.6
> [   12.150151] nouveau 0000:01:00.0: DRM: No DCB data found in VBIOS
> [   12.151362] ttm_tt_init
> [   12.151370] ttm_tt_init_fields
> [   12.151374] ttm_tt_alloc_page_directory
> [   12.151615] BUG: kernel NULL pointer dereference, address: 00000000

Please add dump_stack(); to ttm_tt_init() and report back with the 
backtrace.

I can't see how ttm_tt_init() gets called from the nouveau code. The only
possibility I see is that it is called through the AGP code somehow.
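
To illustrate the idea: printing a backtrace from inside the function tells
you which path reached it. In the kernel this is just a dump_stack(); call at
the top of ttm_tt_init(). The following is only a userspace sketch of the same
technique, assuming glibc's backtrace() facility; dump_stack_userspace(),
tt_init_demo(), driver_path() and agp_path() are hypothetical names made up
for this example and are not kernel or nouveau functions.

```c
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 32

/*
 * Userspace stand-in for the kernel's dump_stack() (assumption: glibc).
 * Writes the current call chain to stderr and returns the number of
 * frames that were captured.
 */
int dump_stack_userspace(void)
{
	void *frames[MAX_FRAMES];
	int n = backtrace(frames, MAX_FRAMES);

	backtrace_symbols_fd(frames, n, STDERR_FILENO);
	return n;
}

/* Hypothetical function whose caller we want to identify. */
int tt_init_demo(void)
{
	fprintf(stderr, "tt_init_demo called from:\n");
	return dump_stack_userspace();
}

/* Two hypothetical call paths; the printed backtrace shows which one ran. */
int driver_path(void)
{
	return tt_init_demo();
}

int agp_path(void)
{
	return tt_init_demo();
}
```

Calling driver_path() or agp_path() prints one backtrace each. Without
-rdynamic (and with optimization or inlining) the output may contain only raw
addresses rather than function names, so build with -O0 -rdynamic when trying
this out.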

Christian.
