Message-ID: <4F17E56B.1050501@broadcom.com>
Date:	Thu, 19 Jan 2012 10:42:03 +0100
From:	"Arend van Spriel" <arend@...adcom.com>
To:	"Linus Torvalds" <torvalds@...ux-foundation.org>
cc:	Rafał Miłecki <zajec5@...il.com>,
	"Larry Finger" <Larry.Finger@...inger.net>,
	"Network Development" <netdev@...r.kernel.org>,
	dri-devel@...ts.freedesktop.org, "David Airlie" <airlied@...ux.ie>,
	"Jerome Glisse" <jglisse@...hat.com>
Subject: Re: [0/5] bcma/brcmsmac suspend/resume cleanups and fixes

On 01/17/2012 02:12 AM, Linus Torvalds wrote:
> 2012/1/16 Arend van Spriel <arend@...adcom.com>:
>>
>> I built a new kernel with MCE enabled. Same issue. I did not load bcma
>> or brcmsmac yet. Attached is the trace I could pull from the kernel log
>> (str-test-*).
> 
> Oh well. Everything looks fine in the test traces - the warnings are
> annoying and nasty, but a known issue and not dangerous (and I have a
> patch in my tree to fix them now).
> 
> So if the real suspend fails, it's some other subsystem that has
> gotten broken. I don't think I have any other reports like that yet,
> and there is not a lot to go on. If you could try to bisect it (I
> assume plain Linux-3.2 works fine?) that would be wonderful, otherwise
> I think we're stuck waiting for somebody else to hit it and figure it
> out.
> 
>                  Linus
> 

Hi Linus,

The bisect was not trivial. I wish I had a faster build machine, but
alas. I suspected some issue in DRM, and the bisect took me into the
drm-core-next branch. I ended up at the following commit:

dc97b3409a790d2a21aac6e5cdb99558b5944119 is the first bad commit
commit dc97b3409a790d2a21aac6e5cdb99558b5944119
Author: Jerome Glisse <jglisse@...hat.com>
Date:   Fri Nov 18 11:47:03 2011 -0500

    drm/ttm: callback move_notify any time bo placement change v4

    Previously we were calling back move_notify in error path when the
    bo is returned to it's original position or when destroy the bo.
    When destroying the bo set the new mem placement as NULL when calling
    back in the driver.

    Updating nouveau to deal with NULL placement properly.

    v2: reserve the object before calling move_notify in bo destroy path
        at that point ttm should be the only piece of code interacting
        with the object so atomic_set is safe here.
    v3: callback move notify only once the bo is in its new position
        call move notify want swaping out the buffer
    v4:- don't call move_notify when swapin out bo, assume driver should
         do what is appropriate in swap notify
       - move move_notify call back to ttm_bo_cleanup_memtype_use for
         destroy path

    Signed-off-by: Jerome Glisse <jglisse@...hat.com>
    Reviewed-by: Thomas Hellstrom <thellstrom@...are.com>

Actually, this commit had already been getting in the way of my bisect,
as it gave me a NULL pointer dereference on system startup. I patched it
locally so the bisect could proceed (see attached diff). The NULL
dereference was fixed in the drm-nouveau-next branch (f7b24c4
drm/nouveau/ttm: fix crash as a result of a recent ttm change). However,
that fix did not resolve the suspend/resume issue.

Gr. AvS
---------------------
01:00.0 VGA compatible controller: nVidia Corporation GT218 [NVS 3100M] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: Dell Device 040a
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at e2000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
        Region 3: Memory at e0000000 (64-bit, prefetchable) [size=32M]
        Region 5: I/O ports at 7000 [size=128]
        Expansion ROM at e3000000 [disabled] [size=512K]
        Capabilities: <access denied>
        Kernel driver in use: nouveau
        Kernel modules: nouveau, nvidiafb

View attachment "nouveau_bo.diff" of type "text/plain" (816 bytes)

View attachment "dmesg-3.2.0-rc1-00070-gdc97b34-dirty.txt" of type "text/plain" (59179 bytes)
