lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a86cba0b-4513-e7c3-ae75-bb331433f664@molgen.mpg.de>
Date:   Fri, 24 Jul 2020 09:45:18 +0200
From:   Paul Menzel <pmenzel@...gen.mpg.de>
To:     Kees Cook <keescook@...omium.org>,
        Mazin Rezk <mnrzk@...tonmail.com>
Cc:     linux-kernel@...r.kernel.org, amd-gfx@...ts.freedesktop.org,
        dri-devel@...ts.freedesktop.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Christian König <christian.koenig@....com>,
        Harry Wentland <Harry.Wentland@....com>,
        Nicholas Kazlauskas <nicholas.kazlauskas@....com>,
        sunpeng.li@....com, Alexander Deucher <Alexander.Deucher@....com>,
        1i5t5.duncan@....net, mphantomx@...oo.com.br,
        regressions@...mhuis.info, anthony.ruhier@...il.com
Subject: Re: [PATCH] amdgpu_dm: fix nonblocking atomic commit use-after-free

Dear Kees,


Am 24.07.20 um 00:32 schrieb Kees Cook:
> On Thu, Jul 23, 2020 at 09:10:15PM +0000, Mazin Rezk wrote:
>> When amdgpu_dm_atomic_commit_tail is running in the workqueue,
>> drm_atomic_state_put will get called while amdgpu_dm_atomic_commit_tail is
>> running, causing a race condition where state (and then dm_state) is
>> sometimes freed while amdgpu_dm_atomic_commit_tail is running. This bug has
>> occurred since 5.7-rc1 and is well documented among polaris11 users [1].
>>
>> Prior to 5.7, this was not a noticeable issue since the freelist pointer
>> was stored at the beginning of dm_state (base), which was unused. After
>> changing the freelist pointer to be stored in the middle of the struct, the
>> freelist pointer overwrote the context, causing dc_state to become garbage
>> data and made the call to dm_enable_per_frame_crtc_master_sync dereference
>> a freelist pointer.
>>
>> This patch fixes the aforementioned issue by calling drm_atomic_state_get
>> in amdgpu_dm_atomic_commit before drm_atomic_helper_commit is called and
>> drm_atomic_state_put after amdgpu_dm_atomic_commit_tail is complete.
>>
>> According to my testing on 5.8.0-rc6, this should fix bug 207383 on
>> Bugzilla [1].
>>
>> [1] https://bugzilla.kernel.org/show_bug.cgi?id=207383
> 
> Nice work tracking this down!
> 
>> Fixes: 3202fa62f ("slub: relocate freelist pointer to middle of object")
> 
> I do, however, object to this Fixes tag. :) The flaw appears to have
> been with amdgpu_dm's reference tracking of "state" in the nonblocking
> case. (How this reference counting is supposed to work correctly, though,
> I'm not sure.) If I look at where the drm helper was split from being
> the default callback, it looks like this was what introduced the bug:
> 
> da5c47f682ab ("drm/amd/display: Remove acrtc->stream")
> 
> ? 3202fa62f certainly exposed it much more quickly, but there was a race
> even without 3202fa62f where something could have realloced the memory
> and written over it.

I understand the Fixes tag mainly a help when backporting commits.

As Linux 5.8-rc7 is going to be released this Sunday, I wonder, if 
commit 3202fa62f ("slub: relocate freelist pointer to middle of object") 
should be reverted for now to fix the regression for the users according 
to Linux’ no regression policy. Once the AMDGPU/DRM driver issue is 
fixed, it can be reapplied. I know it’s not optimal, but as some testing 
is going to be involved for the fix, I’d argue it’s the best option for 
the users.


Kind regards,

Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ