lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1a1cc125-9314-f569-a6c4-40fc4509a377@amd.com>
Date:   Thu, 4 Nov 2021 08:39:18 +0100
From:   Christian König <christian.koenig@....com>
To:     Karol Herbst <kherbst@...hat.com>, Sven Joachim <svenjoac@....de>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Erhard F." <erhard_f@...lbox.org>,
        nouveau <nouveau@...ts.freedesktop.org>,
        LKML <linux-kernel@...r.kernel.org>, stable@...r.kernel.org,
        Huang Rui <ray.huang@....com>
Subject: Re: [Nouveau] [PATCH 5.10 32/77] drm/ttm: fix memleak in
 ttm_transfered_destroy

Am 03.11.21 um 22:25 schrieb Karol Herbst:
> On Wed, Nov 3, 2021 at 9:47 PM Sven Joachim <svenjoac@....de> wrote:
>> On 2021-11-03 21:32 +0100, Karol Herbst wrote:
>>
>>> On Wed, Nov 3, 2021 at 9:29 PM Karol Herbst <kherbst@...hat.com> wrote:
>>>> On Wed, Nov 3, 2021 at 8:52 PM Sven Joachim <svenjoac@....de> wrote:
>>>>> On 2021-11-01 10:17 +0100, Greg Kroah-Hartman wrote:
>>>>>
>>>>>> From: Christian König <christian.koenig@....com>
>>>>>>
>>>>>> commit 0db55f9a1bafbe3dac750ea669de9134922389b5 upstream.
>>>>>>
>>>>>> We need to cleanup the fences for ghost objects as well.
>>>>>>
>>>>>> Signed-off-by: Christian König <christian.koenig@....com>
>>>>>> Reported-by: Erhard F. <erhard_f@...lbox.org>
>>>>>> Tested-by: Erhard F. <erhard_f@...lbox.org>
>>>>>> Reviewed-by: Huang Rui <ray.huang@....com>
>>>>>> Bug: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D214029&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C9b70f83c53c74b35fee808d99f1091b3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637715715806624439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=UIo0hw0OHeLlGL%2Bcj%2Fjt%2FgTwniaJoNmhgDHSFvymhCc%3D&amp;reserved=0
>>>>>> Bug: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D214447&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C9b70f83c53c74b35fee808d99f1091b3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637715715806634433%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=TIAUb6AdYm2Bo0%2BvFZUFPS8yu55orjnfxMLCmUgC%2FDk%3D&amp;reserved=0
>>>>>> CC: <stable@...r.kernel.org>
>>>>>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fpatch%2Fmsgid%2F20211020173211.2247-1-christian.koenig%40amd.com&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C9b70f83c53c74b35fee808d99f1091b3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637715715806634433%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=c9i7AR44MVUyZuXHZkLOCBx2%2BZeetq8alGtbz0Wgqzk%3D&amp;reserved=0
>>>>>> Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
>>>>>> ---
>>>>>>   drivers/gpu/drm/ttm/ttm_bo_util.c |    1 +
>>>>>>   1 file changed, 1 insertion(+)
>>>>>>
>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>>>> @@ -322,6 +322,7 @@ static void ttm_transfered_destroy(struc
>>>>>>        struct ttm_transfer_obj *fbo;
>>>>>>
>>>>>>        fbo = container_of(bo, struct ttm_transfer_obj, base);
>>>>>> +     dma_resv_fini(&fbo->base.base._resv);
>>>>>>        ttm_bo_put(fbo->bo);
>>>>>>        kfree(fbo);
>>>>>>   }
>>>>> Alas, this innocuous looking commit causes one of my systems to lock up
>>>>> as soon as run startx.  This happens with the nouveau driver, two other
>>>>> systems with radeon and intel graphics are not affected.  Also I only
>>>>> noticed it in 5.10.77.  Kernels 5.15 and 5.14.16 are not affected, and I
>>>>> do not use 5.4 anymore.
>>>>>
>>>>> I am not familiar with nouveau's ttm management and what has changed
>>>>> there between 5.10 and 5.14, but maybe one of their developers can shed
>>>>> a light on this.
>>>>>
>>>>> Cheers,
>>>>>         Sven
>>>>>
>>>> could be related to 265ec0dd1a0d18f4114f62c0d4a794bb4e729bc1
>>> maybe not.. but I did remember there being a few tmm related patches
>>> which only hurt nouveau :/  I guess one could do a git bisect to
>>> figure out what change "fixes" it.
>> Maybe, but since the memory leaks reported by Erhard only started to
>> show up in 5.14 (if I read the bugzilla reports correctly), perhaps the
>> patch should simply be reverted on earlier kernels?
>>
> Yeah, I think this is probably the right approach.

I agree. The problem is this memory leak could potentially happen with 
5.10 as wel, just much much much less likely.

But my guess is that 5.10 is so buggy that when the leak does NOT happen 
we double free and obviously causing a crash.

So for the sake of stability please don't apply this patch to 5.10. I'm 
going to comment on the original bug report as well.

Thanks,
Christian.

>
>>> On which GPU do you see this problem?
>> On an old GeForce 8500 GT, the whole PC is rather ancient.
>>
>> Cheers,
>>         Sven
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ