Message-ID: <CY8PR11MB713430B4E2F925B88430784B89D52@CY8PR11MB7134.namprd11.prod.outlook.com>
Date: Tue, 25 Jun 2024 13:49:43 +0000
From: "Zhuo, Qiuxu" <qiuxu.zhuo@...el.com>
To: "maarten.lankhorst@...ux.intel.com" <maarten.lankhorst@...ux.intel.com>,
"mripard@...nel.org" <mripard@...nel.org>, "tzimmermann@...e.de"
<tzimmermann@...e.de>, "airlied@...il.com" <airlied@...il.com>,
"daniel@...ll.ch" <daniel@...ll.ch>
CC: "dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Luck, Tony"
<tony.luck@...el.com>, "Wang, Yudong" <yudong.wang@...el.com>
Subject: RE: [PATCH 1/1] drm/fb-helper: Don't schedule_work() to flush frame
buffer during panic()
Hello,
> From: Zhuo, Qiuxu <qiuxu.zhuo@...el.com>
> Sent: Friday, May 31, 2024 3:45 PM
> To: maarten.lankhorst@...ux.intel.com; mripard@...nel.org;
> tzimmermann@...e.de; airlied@...il.com; daniel@...ll.ch
> Cc: dri-devel@...ts.freedesktop.org; linux-kernel@...r.kernel.org; Luck, Tony
> <tony.luck@...el.com>; Zhuo, Qiuxu <qiuxu.zhuo@...el.com>; Wang, Yudong
> <yudong.wang@...el.com>
> Subject: [PATCH 1/1] drm/fb-helper: Don't schedule_work() to flush frame
> buffer during panic()
>
> Sometimes the system [1] hangs on x86 I/O machine checks. However, the
> expected behavior is to reboot the system, as the machine check handler
> ultimately triggers a panic(), initiating a reboot in the last step.
>
> The root cause is that panic() is sometimes blocked when
> drm_fb_helper_damage() invokes schedule_work() to flush the frame buffer.
> This occurs during the process of flushing all messages to the frame buffer
> driver as shown in the following call trace:
>
> Machine check occurs [2]:
> panic()
> console_flush_on_panic()
> console_flush_all()
> console_emit_next_record()
> con->write()
> vt_console_print()
> hide_cursor()
> vc->vc_sw->con_cursor()
> fbcon_cursor()
> ops->cursor()
> bit_cursor()
> soft_cursor()
> info->fbops->fb_imageblit()
> drm_fbdev_generic_defio_imageblit()
> drm_fb_helper_damage_area()
> drm_fb_helper_damage()
> schedule_work() // <--- blocked here
> ...
> emergency_restart() // wasn't invoked, so no reboot.
>
> During panic(), all CPUs except the panic CPU are stopped.
> In schedule_work(), the panic CPU needs the worker_pool lock to queue
> the work on that pool, but that lock may already have been taken by one of
> the stopped CPUs. So schedule_work() is blocked.
>
> Additionally, during a panic(), since there is no opportunity to execute any
> scheduled work, it's safe to fix this issue by skipping schedule_work() on
> 'oops_in_progress' in drm_fb_helper_damage().
>
> [1] Enable the kernel option CONFIG_FRAMEBUFFER_CONSOLE,
> CONFIG_DRM_FBDEV_EMULATION, and boot with the 'console=tty0'
> kernel command line parameter.
>
> [2] Set 'panic_timeout' to a non-zero value before calling panic().
>
> Reported-by: Yudong Wang <yudong.wang@...el.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@...el.com>
> ---
> drivers/gpu/drm/drm_fb_helper.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index d612133e2cf7..6d7b6f038821 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -628,6 +628,9 @@ static void drm_fb_helper_add_damage_clip(struct drm_fb_helper *helper, u32 x, u
> static void drm_fb_helper_damage(struct drm_fb_helper *helper, u32 x, u32 y,
> u32 width, u32 height)
> {
> + if (oops_in_progress)
> + return;
> +
> drm_fb_helper_add_damage_clip(helper, x, y, width, height);
>
> schedule_work(&helper->damage_work);
> --
A gentle ping on this patch.
Here are the results of our recent error injection tests:
- Without the patch, we typically reproduced the issue [1] about once in every 100 cycles.
- With the patch, we tested on 3 systems, and all 1500 cycles in total passed.
[1] The system blocked in drm_fb_helper_damage() -> schedule_work() and did not reboot.
For details, please see the commit message.
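For anyone skimming the thread, here is a rough user-space sketch of what the guard does. It is only a model, not the kernel code: struct fb_helper_sketch, sketch_init(), sketch_schedule_work() and sketch_damage() are hypothetical names made up for illustration, and oops_in_progress is a plain local int standing in for the kernel flag. The point is simply that once the flag is set, the damage path returns before it reaches the step that would need the worker_pool lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: local stand-in for the kernel's oops_in_progress flag. */
static int oops_in_progress;

/* Hypothetical, much simplified stand-in for struct drm_fb_helper. */
struct fb_helper_sketch {
	uint32_t x1, y1, x2, y2;	/* accumulated damage rectangle */
	unsigned int queued;		/* times the flush work was "scheduled" */
};

static void sketch_init(struct fb_helper_sketch *h)
{
	h->x1 = h->y1 = UINT32_MAX;
	h->x2 = h->y2 = 0;
	h->queued = 0;
}

/*
 * Models schedule_work(). In the kernel this is the call that needs the
 * worker_pool lock; here it only counts invocations.
 */
static void sketch_schedule_work(struct fb_helper_sketch *h)
{
	h->queued++;
}

/* Models drm_fb_helper_damage() with the proposed early return. */
static void sketch_damage(struct fb_helper_sketch *h, uint32_t x, uint32_t y,
			  uint32_t width, uint32_t height)
{
	assert(h != NULL);

	/* During an oops/panic, skip both the bookkeeping and the queueing. */
	if (oops_in_progress)
		return;

	/* Grow the accumulated damage rectangle to cover the new area. */
	if (x < h->x1)
		h->x1 = x;
	if (y < h->y1)
		h->y1 = y;
	if (x + width > h->x2)
		h->x2 = x + width;
	if (y + height > h->y2)
		h->y2 = y + height;

	sketch_schedule_work(h);
}
```

In this model, a call made while oops_in_progress is zero records the damage and queues one flush; a call made after the flag is set changes nothing, so the panic path never gets as far as the queueing step.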
Thanks!
-Qiuxu