Message-Id: <b94cdc$6rvjl8@fmsmga001.fm.intel.com>
Date:	Sat, 06 Oct 2012 09:04:34 +0100
From:	Chris Wilson <chris@...is-wilson.co.uk>
To:	Willy Tarreau <w@....eu>, Daniel Vetter <daniel.vetter@...ll.ch>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: 3.5 regression on i915

On Sat, 6 Oct 2012 01:42:18 +0200, Willy Tarreau <w@....eu> wrote:
> Chris, Daniel,
> 
> since version 3.5, my Asus EeePC 1005HA bugs during startx. I didn't
> have the time to investigate until this evening.
> 
> I could bisect the commits and found that the following one was merged
> in 3.5-rc1 and is responsible for these bugs that can reliably be
> triggered :
> 
>   1b50247a8ddde4af5aaa0e6bc125615372ce6c16 is the first bad commit
>   commit 1b50247a8ddde4af5aaa0e6bc125615372ce6c16
>   Author: Chris Wilson <chris@...is-wilson.co.uk>
>   Date:   Tue Apr 24 15:47:30 2012 +0100
> 
>     drm/i915: Remove the list of pinned inactive objects
>     
>     Simplify object tracking by removing the inactive but pinned list. The
>     only place where this was used is for counting the available memory,
>     which is just as easy performed by checking all objects on the rare
>     occasions it is required (application startup). For ease of debugging,
>     we keep the reporting of pinned objects through the error-state and
>     debugfs.
>     
>     Signed-off-by: Chris Wilson <chris@...is-wilson.co.uk>
>     Signed-off-by: Daniel Vetter <daniel.vetter@...ll.ch>
> 
> I tried to revert it from 3.5.6-rc1 but it does not revert cleanly at all
> and I'm too unfamiliar with this code to attempt anything sane at this
> time of the night.
> 
> The crash happens here in i915_gem_entervt_ioctl() :
> 
>     3659          BUG_ON(!list_empty(&dev_priv->mm.active_list));
>     3660          BUG_ON(!list_empty(&dev_priv->mm.flushing_list));
>  -> 3661          BUG_ON(!list_empty(&dev_priv->mm.inactive_list));
>     3662          mutex_unlock(&dev->struct_mutex);

That BUG_ON is silly and can simply be removed. The checks are meant to
verify that no batches were submitted to the kernel whilst the UMS/GEM
client was suspended, and the BUG_ONs are only a crude approximation of
that. Furthermore, the checks come too late: if they fire, we have already
attempted to program the hardware whilst it was in an invalid state, and
the BUG_ONs are the least of your concerns at that point.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
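
As an illustration of the point made in the reply, here is a minimal
userspace sketch in C of swapping a fatal list-emptiness assertion for a
non-fatal warning. It is not the actual i915 change. Every name in it
(struct mock_mm, list_init, list_add_node, warn_if_busy,
count_stale_lists) is a hypothetical stand-in, and struct list_head is
re-declared locally rather than taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty circular list points back at its own head. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Insert node directly after head. */
static void list_add_node(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Hypothetical mock of the three lists checked in the quoted code. */
struct mock_mm {
	struct list_head active_list;
	struct list_head flushing_list;
	struct list_head inactive_list;
};

static void mock_mm_init(struct mock_mm *mm)
{
	list_init(&mm->active_list);
	list_init(&mm->flushing_list);
	list_init(&mm->inactive_list);
}

/* Report a non-empty list and carry on, instead of halting like BUG_ON. */
static int warn_if_busy(const struct list_head *list, const char *name)
{
	if (list_is_empty(list))
		return 0;
	fprintf(stderr, "warning: %s not empty on VT entry\n", name);
	return 1;
}

/* Returns how many of the three lists still hold objects. */
static int count_stale_lists(const struct mock_mm *mm)
{
	return warn_if_busy(&mm->active_list, "active_list") +
	       warn_if_busy(&mm->flushing_list, "flushing_list") +
	       warn_if_busy(&mm->inactive_list, "inactive_list");
}
```

On a freshly initialised mock_mm, count_stale_lists() returns 0. After one
node is added to inactive_list it returns 1 and prints a warning, and
execution continues instead of stopping.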