Message-ID: <129403.1691660102@turing-police>
Date:   Thu, 10 Aug 2023 05:35:02 -0400
From:   "Valdis Klētnieks" <valdis.kletnieks@...edu>
To:     Srinivasan Shanmugam <srinivasan.shanmugam@....com>
Cc:     Alex Deucher <alexander.deucher@....com>,
        David Airlie <airlied@...il.com>,
        amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
        linux-kernel@...r.kernel.org
Subject: next-20230726 and later - crash in radeon module during init

I am seeing the following consistent crash at boot:

[   61.211213][  T819] [drm] radeon kernel modesetting enabled.
[   61.584870][  T819] vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
[   61.667507][  T819] ATPX version 1, functions 0x00000033
[   61.748228][  T819] general protection fault, probably for non-canonical address 0x54080068930549a0: 0000 [#1] PREEMPT SMP
[   61.829840][  T819] CPU: 3 PID: 819 Comm: (udev-worker) Tainted: G          I     T  6.5.0-rc4-next-20230804 #58 5cce04b101a5bb4a6c0368bfff037f6f096b3d3e
[   61.911411][  T819] Hardware name: Dell Inc. Inspiron 5559/052K07, BIOS 1.9.0 09/07/2020
[   61.993285][  T819] RIP: 0010:strnlen+0x21/0x40
[   62.074885][  T819] Code: 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 55 48 89 e5 48 8d 14 37 31 c0 48 85 f6 74 16 48 89 f8 eb 09 48 83 c0 01 48 39 c2 74 0e <80> 38 00 75 f2 48 29 f8 5d c3 cc cc cc cc 48 89 d0 5d 48 29 f8 c3
[   62.156529][  T819] RSP: 0018:ffffa310419979b8 EFLAGS: 00010202
[   62.318407][  T819] RAX: 54080068930549a0 RBX: ffffa31041997a20 RCX: 0000000000000000
[   62.400015][  T819] RDX: 54080068930549b0 RSI: 0000000000000010 RDI: 54080068930549a0
[   62.481624][  T819] RBP: ffffa310419979b8 R08: ffff937b85579990 R09: ffffa31041997ad8
[   62.563644][  T819] R10: ffff937b86ddae00 R11: 0000000000000000 R12: 54080068930549a0
[   62.645194][  T819] R13: ffff937b814291b8 R14: 0000000000000001 R15: ffffa31041997b81
[   62.726753][  T819] FS:  00007efd50479600(0000) GS:ffff937ef2e00000(0000) knlGS:0000000000000000
[   62.808312][  T819] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   62.889830][  T819] CR2: 00007f125d30ee70 CR3: 0000000105644002 CR4: 00000000003706e0
[   62.971390][  T819] Call Trace:
[   63.052954][  T819]  <TASK>
[   63.134501][  T819]  ? show_regs+0x64/0x70
[   63.216058][  T819]  ? die_addr+0x36/0x90
[   63.297594][  T819]  ? exc_general_protection+0x1c1/0x440
[   63.379112][  T819]  ? asm_exc_general_protection+0x2b/0x30
[   63.460650][  T819]  ? strnlen+0x21/0x40
[   63.542209][  T819]  set_dev_info+0x40/0x170
[   63.623762][  T819]  dev_printk_emit+0xa8/0xe0
[   63.705308][  T819]  __dev_printk+0x34/0x80
[   63.786806][  T819]  _dev_info+0x7a/0xa0
[   63.868304][  T819]  radeon_atpx_validate.constprop.0.isra.0+0xbc/0x100 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[   63.949952][  T819]  radeon_atpx_detect+0x17b/0x190 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[   64.031547][  T819]  ? __pfx_radeon_module_init+0x10/0x10 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[   64.113102][  T819]  radeon_register_atpx_handler+0xd/0x30 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[   64.194721][  T819]  radeon_module_init+0x84/0xff0 [radeon f030e9a708043a486415a94978106b28cd7cb9a2]
[   64.276365][  T819]  do_one_initcall+0x86/0x380
[   64.357865][  T819]  do_init_module+0x63/0x220
[   64.439342][  T819]  load_module+0x99d/0xa90
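
For readers unfamiliar with the "non-canonical address" wording in the trace above, here is a minimal sketch of the check, assuming 48-bit virtual addresses (4-level paging, which is an assumption about this machine). The helper name is_canonical_48 is made up for illustration and is not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses. An address is canonical when
 * bits 63..47 are all zeros or all ones (sign extension of bit 47).
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits 63..47 */

	return top == 0 || top == 0x1ffff;
}
```

Under that assumption, the RDI value 0x54080068930549a0 fails the check, while the stack pointer 0xffffa310419979b8 passes it, which fits strnlen() having been handed a garbage pointer rather than a stale but valid one.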

Some quick digging indicates the most likely culprit is:

commit cbd0606e6a776bf2ba10d4a6957bb7628c0da947
Author: Srinivasan Shanmugam <srinivasan.shanmugam@....com>
Date:   Thu Jul 20 15:39:24 2023 +0530

    drm/radeon: Prefer dev_* variant over printk

    Changed from pr_err/info to dev_* variants so that
    we get better debug info when there are multiple GPUs
    in the system.

Looks like this is the failure point because 'dev' is trashed:

+               dev_info(dev, "ATPX Hybrid Graphics\n");

But I admit I don't know the ACPI stuff well enough to see what, if
anything, is wrong with this:

+       struct acpi_device *adev = container_of(atpx->handle, struct acpi_device, handle);
+       struct device *dev = &adev->dev;

Any ideas?
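
One possible reading, offered as an assumption rather than a confirmed diagnosis: container_of() expects the address of a member embedded in the enclosing structure. If atpx->handle is an opaque handle value (a pointer to something else) and not the address of the handle field inside a struct acpi_device, subtracting the member offset from it yields a pointer into unrelated memory. The user-space sketch below shows the difference. Every type and function name in it is a hypothetical stand-in, not the real kernel definition, and the macro is a simplified version without the kernel's type checks.

```c
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

typedef void *fake_handle_t;		/* hypothetical opaque handle type */

struct fake_device {
	const char *name;
};

struct fake_acpi_device {		/* NOT the real struct acpi_device */
	int device_type;
	fake_handle_t handle;		/* opaque value stored in the struct */
	struct fake_device dev;
};

/* Unrelated storage that the opaque handle value points into. */
static void *node_storage[16];

/* Valid use: the argument is the address of the embedded member. */
static struct fake_acpi_device *from_member_address(fake_handle_t *member)
{
	return container_of(member, struct fake_acpi_device, handle);
}

/*
 * Suspect pattern: the argument is the value held in the member.
 * The result points just before whatever the handle refers to,
 * not at the structure that stores the handle.
 */
static struct fake_acpi_device *from_member_value(fake_handle_t value)
{
	return container_of(value, struct fake_acpi_device, handle);
}
```

With a struct fake_acpi_device whose handle is set to &node_storage[8], from_member_address(&adev.handle) returns &adev, while from_member_value(adev.handle) returns an address inside node_storage. Reading a device name through such a pointer would be consistent with the strnlen() fault above, though that has not been verified against the actual driver.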

