Message-ID: <CABQX2QM09V=+G=9T6Ax8Ad3F85hU0Cg4WqD82hTN=yhktXNDaQ@mail.gmail.com>
Date: Thu, 15 Aug 2024 14:40:09 -0400
From: Zack Rusin <zack.rusin@...adcom.com>
To: Christian Heusel <christian@...sel.eu>
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@...adcom.com>,
Martin Krastev <martin.krastev@...adcom.com>,
Maaz Mombasawala <maaz.mombasawala@...adcom.com>, dri-devel@...ts.freedesktop.org,
Brad Spengler <spender@...ecurity.net>, rdkehn@...il.com, linux-kernel@...r.kernel.org,
regressions@...ts.linux.dev
Subject: Re: [REGRESSION][BISECTED] vmwgfx crashes with command buffer error after update

On Thu, Aug 15, 2024 at 1:48 PM Christian Heusel <christian@...sel.eu> wrote:
>
> Hello Zack,
>
> the user rdkehn (in CC) on the Arch Linux Forums reports that after
> updating to the 6.10.4 stable kernel inside of their VM Workstation the
> driver crashes with the error attached below. This error is also present
> on the latest mainline release 6.11-rc3.
>
> We have bisected the issue together down to the following commit:
>
> d6667f0ddf46 ("drm/vmwgfx: Fix handling of dumb buffers")
>
> Reverting this commit on top of 6.11-rc3 fixes the issue.
>
> While we were still debugging the issue Brad (also CC'ed) messaged me
> that they were seeing similar failures in their ESXi based test
> pipelines except for one box that was running on legacy BIOS (so maybe
> that is relevant). They noticed this because they had set panic_on_warn.
>
> Cheers,
> Chris
>
> ---
>
> #regzbot introduced: d6667f0ddf46
> #regzbot title: drm/vmwgfx: driver crashes due to command buffer error
> #regzbot link: https://bbs.archlinux.org/viewtopic.php?id=298491
>
> ---
>
> dmesg snippet:
> [ 13.297084] ------------[ cut here ]------------
> [ 13.297086] Command buffer error.
> [ 13.297139] WARNING: CPU: 0 PID: 186 at drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c:399 vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx]
> [ 13.297160] Modules linked in: uas usb_storage hid_generic usbhid mptspi sr_mod cdrom scsi_transport_spi vmwgfx serio_raw mptscsih ata_generic atkbd drm_ttm_helper libps2 pata_acpi vivaldi_fmap ttm mptbase crc32c_intel xhci_pci intel_agp xhci_pci_renesas ata_piix intel_gtt i8042 serio
> [ 13.297172] CPU: 0 PID: 186 Comm: irq/16-vmwgfx Not tainted 6.10.4-arch2-1 #1 517ed45cc9c4492ee5d5bfc2d2fe6ef1f2e7a8eb
> [ 13.297174] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> [ 13.297175] RIP: 0010:vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx]
> [ 13.297186] Code: 01 00 01 e8 ba 8c 4f f9 0f 0b 4c 89 ff e8 40 fb ff ff e9 9d fe ff ff 48 c7 c7 99 d9 3f c0 c6 05 52 2f 01 00 01 e8 98 8c 4f f9 <0f> 0b e9 1f fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> [ 13.297187] RSP: 0018:ffffb9c1805e3d78 EFLAGS: 00010282
> [ 13.297188] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000003
> [ 13.297189] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
> [ 13.297190] RBP: ffff907fc8274c98 R08: 0000000000000000 R09: ffffb9c1805e3bf8
> [ 13.297191] R10: ffff9086dbdfffa8 R11: 0000000000000003 R12: ffff907fc4db5b00
> [ 13.297192] R13: ffff907fc83fd318 R14: ffff907fc8274c88 R15: ffff907fc83fd300
> [ 13.297193] FS: 0000000000000000(0000) GS:ffff9086dbe00000(0000) knlGS:0000000000000000
> [ 13.297194] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 13.297194] CR2: 0000774dc57671ca CR3: 00000006b9e20005 CR4: 00000000003706f0
> [ 13.297196] Call Trace:
> [ 13.297198] <TASK>
> [ 13.297199] ? vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297209] ? __warn.cold+0x8e/0xe8
> [ 13.297211] ? vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297221] ? report_bug+0xff/0x140
> [ 13.297222] ? console_unlock+0x84/0x130
> [ 13.297225] ? handle_bug+0x3c/0x80
> [ 13.297226] ? exc_invalid_op+0x17/0x70
> [ 13.297227] ? asm_exc_invalid_op+0x1a/0x20
> [ 13.297230] ? vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297238] ? vmw_cmdbuf_ctx_process+0x268/0x270 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297245] vmw_cmdbuf_man_process+0x5d/0x100 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297253] vmw_cmdbuf_irqthread+0x25/0x30 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297261] vmw_thread_fn+0x3a/0x70 [vmwgfx a4fe13044bca4eda782d964fb8c4ca15afb325e9]
> [ 13.297271] irq_thread_fn+0x20/0x60
> [ 13.297273] irq_thread+0x18a/0x270
> [ 13.297274] ? __pfx_irq_thread_fn+0x10/0x10
> [ 13.297276] ? __pfx_irq_thread_dtor+0x10/0x10
> [ 13.297277] ? __pfx_irq_thread+0x10/0x10
> [ 13.297278] kthread+0xcf/0x100
> [ 13.297281] ? __pfx_kthread+0x10/0x10
> [ 13.297282] ret_from_fork+0x31/0x50
> [ 13.297285] ? __pfx_kthread+0x10/0x10
> [ 13.297286] ret_from_fork_asm+0x1a/0x30
> [ 13.297288] </TASK>
> [ 13.297289] ---[ end trace 0000000000000000 ]---
Hi, Christian.

Thanks for the report! Just to be clear: vmwgfx itself doesn't crash. It
shows a warning, and the kernel has been set to panic on warning
(panic_on_warn), which is what actually panics, right?

I haven't seen this on any of our systems, so I'm guessing the affected
systems aren't running GNOME/KDE? Is there any chance I could see the
full "journalctl -b" log and the vmware.log file associated with those
warnings? They could give me some clues on how to reproduce this.

z