Message-ID: <aN5m_TRNsWSQVtnD@intel.com>
Date: Thu, 2 Oct 2025 14:50:21 +0300
From: Ville Syrjälä <ville.syrjala@...ux.intel.com>
To: Chintan Patel <chintanlike@...il.com>
Cc: maarten.lankhorst@...ux.intel.com, maxime.ripard@...nel.org,
	tzimmermann@...e.de, airlied@...il.com, simona@...ll.ch,
	dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
	syzbot+147ba789658184f0ce04@...kaller.appspotmail.com
Subject: Re: [PATCH v2] drm/vblank: downgrade vblank wait timeout from WARN
 to debug

On Thu, Oct 02, 2025 at 02:40:05PM +0300, Ville Syrjälä wrote:
> On Wed, Oct 01, 2025 at 07:57:23PM -0700, Chintan Patel wrote:
> > When wait_event_timeout() in drm_wait_one_vblank() times out, the
> > current WARN can cause unnecessary kernel panics in environments
> > with panic_on_warn set (e.g. CI, fuzzing). These timeouts can happen
> > under scheduler pressure or from invalid userspace calls, so they are
> > not always a kernel bug.
> 
> "invalid userspace calls" should never reach this far.
> That would be a kernel bug.

I was also wondering how you could get this due to some scheduler
screwup, but I suppose that could theoretically happen with threaded 
irqs, or whatever work/etc is used to update the vblank count on
drivers that don't have hardware interrupts for it. 100+ msec
hw interrupt latency sounds excessive to me though.
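
For a sense of scale, the 100 msec passed to wait_event_timeout() in the
patch below spans several refresh cycles at common refresh rates. A minimal
userspace sketch of that arithmetic, assuming a fixed refresh rate; the
helper names are hypothetical and are not part of drm_vblank.c:

```c
#include <assert.h>
#include <stdint.h>

/* Duration of one refresh cycle in microseconds, rounded down.
 * refresh_hz is an assumed, fixed refresh rate (hypothetical input). */
uint32_t vblank_period_us(uint32_t refresh_hz)
{
	assert(refresh_hz > 0);
	return 1000000u / refresh_hz;
}

/* Number of whole vblank periods that fit into timeout_ms. */
uint32_t vblank_periods_in_timeout(uint32_t timeout_ms, uint32_t refresh_hz)
{
	return (timeout_ms * 1000u) / vblank_period_us(refresh_hz);
}
```

With a 60 Hz mode, vblank_periods_in_timeout(100, 60) evaluates to 6, so
the wait only times out after roughly six consecutive vblanks went
unnoticed.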

But since you reference some syzbot reports below, are you
actually trying to hide real kernel bugs that syzbot found?

> 
> > 
> > Replace the WARN with drm_dbg_kms() messages that provide useful
> > context (last and current vblank counters) without crashing the
> > system. Developers can still enable drm.debug to diagnose genuine
> > problems.
> > 
> > Reported-by: syzbot+147ba789658184f0ce04@...kaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=147ba789658184f0ce04
> > Tested-by: syzbot+147ba789658184f0ce04@...kaller.appspotmail.com
> > 
> > Signed-off-by: Chintan Patel <chintanlike@...il.com>
> > 
> > v2:
> >  - Drop unnecessary in-code comment (suggested by Thomas Zimmermann)
> >  - Remove else branch, only log timeout case
> > ---
> >  drivers/gpu/drm/drm_vblank.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> > index 46f59883183d..a94570668cba 100644
> > --- a/drivers/gpu/drm/drm_vblank.c
> > +++ b/drivers/gpu/drm/drm_vblank.c
> > @@ -1289,7 +1289,7 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
> >  {
> >  	struct drm_vblank_crtc *vblank = drm_vblank_crtc(dev, pipe);
> >  	int ret;
> > -	u64 last;
> > +	u64 last, curr;
> >  
> >  	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
> >  		return;
> > @@ -1305,7 +1305,12 @@ void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
> >  				 last != drm_vblank_count(dev, pipe),
> >  				 msecs_to_jiffies(100));
> >  
> > -	drm_WARN(dev, ret == 0, "vblank wait timed out on crtc %i\n", pipe);
> > +	curr = drm_vblank_count(dev, pipe);
> > +
> > +	if (ret == 0) {
> > +		drm_dbg_kms(dev, "WAIT_VBLANK: timeout crtc=%d, last=%llu, curr=%llu\n",
> > +			pipe, last, curr);
> 
> It should at the very least be a drm_err(). Though the backtrace can
> be useful in figuring out where the problem is coming from, so not
> too happy about this change.
> 
> > +	}
> >  
> >  	drm_vblank_put(dev, pipe);
> >  }
> > -- 
> > 2.43.0
> 
> -- 
> Ville Syrjälä
> Intel

-- 
Ville Syrjälä
Intel