Message-ID: <51160271-197b-035f-34f6-22de5468d5b9@amd.com>
Date:   Mon, 12 Jul 2021 21:22:40 +0200
From:   Christian König <christian.koenig@....com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Jon Masters <jcm@...masters.org>,
        Matthew Auld <matthew.auld@...el.com>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>
Subject: Re: Linux 5.14-rc1

Hi guys,

Am 12.07.21 um 21:14 schrieb Linus Torvalds:
> On Mon, Jul 12, 2021 at 12:08 AM Jon Masters <jcm@...masters.org> wrote:
>> I happened to be installing a Fedora 34 (x86) VM for something and did a
>> test kernel compile that hung on boot. Setting up a serial console I get
>> the below backtrace from ttm but I have not had chance to look at it.
> It's a NULL pointer in qxl_bo_delete_mem_notify(), with the code
> disassembling to
>
>    16: 55                    push   %rbp
>    17: 48 89 fd              mov    %rdi,%rbp
>    1a: e8 a2 02 00 00        callq  0x2c1
>    1f: 84 c0                test   %al,%al
>    21: 74 0d                je     0x30
>    23: 48 8b 85 68 01 00 00 mov    0x168(%rbp),%rax
>    2a:* 83 78 10 03          cmpl   $0x3,0x10(%rax) <-- trapping instruction
>    2e: 74 02                je     0x32
>    30: 5d                    pop    %rbp
>    31: c3                    retq
>
> and that "cmpl $3" looks exactly like that
>
>          if (bo->resource->mem_type == TTM_PL_PRIV
>
> and the bug is almost certainly from commit d3116756a710 ("drm/ttm:
> rename bo->mem and make it a pointer"), which did
>
> -       if (bo->mem.mem_type == TTM_PL_PRIV ...
> +       if (bo->resource->mem_type == TTM_PL_PRIV ...
>
> and claimed "No functional change".
>
> But clearly the "bo->resource" pointer is NULL.
>
> Added guilty parties and dri-devel mailing list.
>
> Christian? Full report at
>
>     https://lore.kernel.org/lkml/a9473821-1d53-0037-7590-aeaf8e85e72a@jonmasters.org/
>
> but there's not a whole lot else there that is interesting except for
> the call trace:
>
>    ttm_bo_cleanup_memtype_use+0x22/0x60 [ttm]
>    ttm_bo_release+0x1a1/0x300 [ttm]
>    ttm_bo_delayed_delete+0x1be/0x220 [ttm]
>    ttm_device_delayed_workqueue+0x18/0x40 [ttm]
>    process_one_work+0x1ec/0x390
>    worker_thread+0x53/0x3e0
>
> so it's presumably the cleanup phase and perhaps "bo->resource" has
> been deallocated and cleared?

That's a known issue. Fixed by:

commit 3efe180d5105d367ae1dfadb97892ab93a89a783
Author: Christian König <christian.koenig@....com>
Date:   Tue Jul 6 08:51:25 2021 +0200

     drm/qxl: add NULL check for bo->resource

     When allocations fails that can be NULL now.

Previously the structure was embedded in the buffer object, so when
allocation failed (or never happened, as with a temporary buffer) the
structure was simply zeroed.
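
To illustrate the difference, here is a minimal standalone sketch, not the
actual TTM or qxl code. The struct and function names are made up for the
example; only the value 3 for the private placement is taken from the
"cmpl $0x3" in the disassembly above. With the embedded structure a zeroed
object simply compares unequal, while with the pointer the same check has
to test for NULL first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TTM_PL_PRIV; 3 matches "cmpl $0x3" above. */
#define PL_PRIV 3

/* Hypothetical, reduced version of the placement structure. */
struct resource {
	unsigned int mem_type;
};

/* Old layout: the structure is embedded in the buffer object. */
struct bo_embedded {
	struct resource mem;
};

/* New layout: the buffer object only holds a pointer, which may be NULL. */
struct bo_pointer {
	struct resource *resource;
};

/* Safe even if nothing was ever allocated: a zeroed mem has mem_type 0. */
static bool is_priv_embedded(const struct bo_embedded *bo)
{
	return bo->mem.mem_type == PL_PRIV;
}

/* Needs the NULL check, otherwise this dereferences a NULL pointer. */
static bool is_priv_pointer(const struct bo_pointer *bo)
{
	return bo->resource && bo->resource->mem_type == PL_PRIV;
}
```

Dropping the "bo->resource &&" part of the second helper gives the same
pattern that faults in the trace above.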

I'll double-check tomorrow why that hasn't shown up in your tree yet.

Christian.


>
>                    Linus
