lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20170106100956.2zy2gx7ld3kqjwcf@phenom.ffwll.local>
Date:   Fri, 6 Jan 2017 11:09:56 +0100
From:   Daniel Vetter <daniel@...ll.ch>
To:     Peter Ujfalusi <peter.ujfalusi@...com>
Cc:     daniel.vetter@...el.com, jani.nikula@...ux.intel.com,
        seanpaul@...omium.org, airlied@...ux.ie, gleb@...tmail.com,
        linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        freedesktop-bugs@...lSD.com
Subject: Re: [PATCH] drm: Schedule the output_poll_work with 1s delay if we
 have delayed event

On Thu, Jan 05, 2017 at 12:40:29PM +0200, Peter Ujfalusi wrote:
> On 01/05/2017 10:43 AM, Daniel Vetter wrote:
> > On Wed, Jan 04, 2017 at 02:00:53PM +0200, Peter Ujfalusi wrote:
> >> Instead of scheduling the work to handle the initial delayed event, use 1s
> >> delay.
> >>
> >> When the delayed event is handled w/o delay - in a similar matter when the
> >> poll had been initialized before drm_helper_probe_single_connector_modes()
> >> is called - it triggers a race in Optimus setups.
> >>
> >> Fixes: 339fd36238dd ("drm: drm_probe_helper: Fix output_poll_work scheduling")
> >> Cc: stable@...r.kernel.org   # v4.9
> >> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@...com>
> >> ---
> >> Hi,
> >>
> >> related bug report: https://bugs.freedesktop.org/show_bug.cgi?id=98690
> >>
> >> Regards,
> >> Peter
> >>
> >>  drivers/gpu/drm/drm_probe_helper.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
> >> index 98ed110e28ed..f30c14b0a72f 100644
> >> --- a/drivers/gpu/drm/drm_probe_helper.c
> >> +++ b/drivers/gpu/drm/drm_probe_helper.c
> >> @@ -146,8 +146,9 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
> >>  	drm_connector_list_iter_put(&conn_iter);
> >>  
> >>  	if (dev->mode_config.delayed_event) {
> >> +		/* Use short (1s) delay to handle the initial delayed event */
> >>  		poll = true;
> >> -		delay = 0;
> >> +		delay = HZ;
> > 
> > This smells a bit like duct-tape papering over some other race.
> 
> Yes, I agree. We could revert 339fd36238dd also to put back the duct-tape.

Since the revert and this patch are the same kind of duct-tape (massive
delay) I prefer something like this patch ...

> 
> > Lack of
> > locking or something else. Most likely some drivers enable polling a bit
> > too early in their load sequence.
> 
> All is pointing to Optimus. Intel and nouveau alone works, but we have
> failure on Optimus laptops.
> I have tried to narrow it down with another patch attached to the
> bugzilla: https://bugs.freedesktop.org/attachment.cgi?id=128742
> 
> I expected it to fix the problem. But it did not. So I'm puzzled.
> 
> >And if we can't figure this out, then we
> > need some really big FIXME: here that this papers over driver races.
> 
> Yeah, now we know that there is something wrong in some driver(s). We
> can hide it with delaying the poll_work. It would be great if
> nouveau/optimus guys would be able to take a look at this.

... but with all the above information (optimus only, what blows up)
condensed into both the commit message and a big WARNING comment in the
code. Then I think this is reasonable to merge (but still a decent wtf),
since it addresses a rather serious regression.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ