Message-ID: <56E04FFA.7070906@amd.com>
Date:	Wed, 9 Mar 2016 11:31:54 -0500
From:	Nicolai Hähnle <nicolai.haehnle@....com>
To:	Luis Henriques <luis.henriques@...onical.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
CC:	Christian König <christian.koenig@....com>,
	<andersen@...epoet.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	<stable@...r.kernel.org>, Sasha Levin <sasha.levin@...cle.com>,
	Jiri Slaby <jslaby@...e.cz>,
	Kamal Mostafa <kamal@...onical.com>
Subject: Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref

On 09.03.2016 08:56, Luis Henriques wrote:
> On Mon, Mar 07, 2016 at 02:58:51PM -0800, Greg Kroah-Hartman wrote:
>> On Mon, Mar 07, 2016 at 10:06:47PM +0100, Christian König wrote:
>>> Am 07.03.2016 um 21:46 schrieb Greg Kroah-Hartman:
>>>> On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
>>>>> The following patch to radeon_sa_bo_new that
>>>>> went into 3.10.99
>>>>>
>>>>>    commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
>>>>>    Author: Nicolai Hähnle <nicolai.haehnle@....com>
>>>>>    Date:   Fri Feb 5 14:35:53 2016 -0500
>>>>>      drm/radeon: hold reference to fences in radeon_sa_bo_new
>>>>>      commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
>>>>>
>>>>> is triggering an Oops for me right when xscreensaver
>>>>> first began doing 3D stuff.  After reverting this
>>>>> patch, xscreensaver has been happily running 3D stuff.
>>>>>
>>>>> Mar  6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>>>>> Mar  6 18:00:43 sage kernel: IP: [<ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
>>>>> Mar  6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
>>>>> Mar  6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
>>>>>
>>>>> Mar  6 18:00:43 sage kernel: Stack:
>>>>> Mar  6 18:00:43 sage kernel:  ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
>>>>> Mar  6 18:00:43 sage kernel:  ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
>>>>> Mar  6 18:00:43 sage kernel:  00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
>>>>> Mar  6 18:00:43 sage kernel: Call Trace:
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa01607ec>] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa005fc9d>] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa0117c49>] radeon_ib_get+0x39/0x110 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa011a4ea>] radeon_cs_ioctl+0x69a/0xa70 [radeon]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffffa008e2d2>] drm_ioctl+0x512/0x650 [drm]
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff810a46e1>] ? do_futex+0x111/0xc30
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff81182a45>] do_vfs_ioctl+0x305/0x520
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff8107cd39>] ? vtime_account_user+0x69/0x80
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff81182ce1>] SyS_ioctl+0x81/0xa0
>>>>> Mar  6 18:00:43 sage kernel:  [<ffffffff8178210f>] tracesys+0xe1/0xe6
>>>>>
>>>>> $ lspci | grep VGA
>>>>> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]
>>>> Next time, please cc: the people responsible for that patch as well...
>>>>
>>>> I can revert it, but maybe something else is going on here?  Do you have
>>>> this same problem on 3.14, and 4.5-rc7?
>>>
>>> Hi Greg,
>>>
>>> Yes, that's an already known issue. Feel free to revert that one for now.
>>>
>>> I have it on my TODO list to provide a fixed patch for older kernels, but
>>> that can take a while.
>>>
>>> For background: Nicolai's patch is correct, but it assumes that
>>> radeon_fence_unref() can safely take NULL as the fence, which is not the
>>> case for older kernels.

Actually, the call to radeon_fence_ref() is the culprit.

>>
>> Ok, thanks, now reverted.
>>
>
> And looks like a few more kernels may be affected as well.  I'll
> revert it from 3.16 kernel, and I'm adding Kamal, Sasha and Jiri to
> the CC list.

Kernels that contain commit 954605ca "drm/radeon: use common fence 
implementation for fences, v4" are safe; older kernels require a 
NULL-pointer check around the call to radeon_fence_ref.
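
As a rough illustration of the kind of guard meant here, the following is a
standalone mock, not the attached patch and not the actual radeon code. All
names in it (mock_fence, mock_fence_ref, mock_fence_ref_guarded,
mock_ref_all) are hypothetical and exist only for this sketch. It assumes a
ref helper that dereferences its argument unconditionally, and it skips the
call for empty slots:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence object; NOT the real struct radeon_fence. */
struct mock_fence {
    void *owner;   /* placeholder field so refcount is not at offset 0 */
    int refcount;
};

/* Mimics a ref helper that dereferences its argument unconditionally,
 * so calling it with NULL would fault. */
struct mock_fence *mock_fence_ref(struct mock_fence *fence)
{
    fence->refcount++;
    return fence;
}

/* Guarded variant: only take a reference when there actually is a fence. */
struct mock_fence *mock_fence_ref_guarded(struct mock_fence *fence)
{
    if (fence)
        return mock_fence_ref(fence);
    return NULL;
}

/* Takes a reference on every non-NULL slot and returns how many were taken. */
int mock_ref_all(struct mock_fence **fences, size_t n)
{
    int taken = 0;

    for (size_t i = 0; i < n; ++i) {
        if (mock_fence_ref_guarded(fences[i]))
            taken++;
    }
    return taken;
}
```

With an array in which only some slots hold a fence, mock_ref_all() bumps the
refcount of the populated slots and leaves the NULL ones alone, instead of
faulting on the first empty entry.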

This means kernels 3.17 and older are affected and need the additional 
NULL pointer check that I've sent out already on a different thread (I'm 
attaching it again, hoping that Erik gets a chance to test it).

It would be nice to get a confirmation that this really does fix the 
observed bug, then I can prepare a fixed version of the patch for 3.17 
and older (i.e. squash the original bad commit with the attached patch).

Cheers,
Nicolai

>
> Cheers,
> --
> Luís
>
>> greg k-h
>> --
>> To unsubscribe from this list: send the line "unsubscribe stable" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

View attachment "0001-drm-radeon-guard-call-to-radeon_fence_ref-against-NU.patch" of type "text/x-patch" (1448 bytes)
