linux-kernel - Re: [PATCH 1/4] mm/vmalloc: allow to call vfree() in atomic context

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d7d8bbc9-9f46-8c84-dcf3-c564dcccc7d7@virtuozzo.com>
Date:   Fri, 31 Mar 2017 12:26:53 +0300
From:   Andrey Ryabinin <aryabinin@...tuozzo.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
CC:     <penguin-kernel@...ove.SAKURA.ne.jp>,
        <linux-kernel@...r.kernel.org>, <mhocko@...nel.org>,
        <linux-mm@...ck.org>, <hpa@...or.com>, <chris@...is-wilson.co.uk>,
        <hch@....de>, <mingo@...e.hu>, <jszhang@...vell.com>,
        <joelaf@...gle.com>, <joaodias@...gle.com>, <willy@...radead.org>,
        <tglx@...utronix.de>, <thellstrom@...are.com>,
        <stable@...r.kernel.org>
Subject: Re: [PATCH 1/4] mm/vmalloc: allow to call vfree() in atomic context



On 03/31/2017 01:22 AM, Andrew Morton wrote:
> On Thu, 30 Mar 2017 13:27:16 +0300 Andrey Ryabinin <aryabinin@...tuozzo.com> wrote:
> 
>> Commit 5803ed292e63 ("mm: mark all calls into the vmalloc subsystem
>> as potentially sleeping") added might_sleep() to remove_vm_area() from
>> vfree(), and commit 763b218ddfaf ("mm: add preempt points into
>> __purge_vmap_area_lazy()") actually made vfree() potentially sleeping.
>>
>> This broke vmwgfx driver which calls vfree() under spin_lock().
>>
>>     BUG: sleeping function called from invalid context at mm/vmalloc.c:1480
>>     in_atomic(): 1, irqs_disabled(): 0, pid: 341, name: plymouthd
>>     2 locks held by plymouthd/341:
>>      #0:  (drm_global_mutex){+.+.+.}, at: [<ffffffffc01c274b>] drm_release+0x3b/0x3b0 [drm]
>>      #1:  (&(&tfile->lock)->rlock){+.+...}, at: [<ffffffffc0173038>] ttm_object_file_release+0x28/0x90 [ttm]
>>
>>     Call Trace:
>>      dump_stack+0x86/0xc3
>>      ___might_sleep+0x17d/0x250
>>      __might_sleep+0x4a/0x80
>>      remove_vm_area+0x22/0x90
>>      __vunmap+0x2e/0x110
>>      vfree+0x42/0x90
>>      kvfree+0x2c/0x40
>>      drm_ht_remove+0x1a/0x30 [drm]
>>      ttm_object_file_release+0x50/0x90 [ttm]
>>      vmw_postclose+0x47/0x60 [vmwgfx]
>>      drm_release+0x290/0x3b0 [drm]
>>      __fput+0xf8/0x210
>>      ____fput+0xe/0x10
>>      task_work_run+0x85/0xc0
>>      exit_to_usermode_loop+0xb4/0xc0
>>      do_syscall_64+0x185/0x1f0
>>      entry_SYSCALL64_slow_path+0x25/0x25
>>
>> This can be fixed in vmgfx, but it would be better to make vfree()
>> non-sleeping again because we may have other bugs like this one.
> 
> I tend to disagree: adding yet another schedule_work() introduces
> additional overhead and adds some risk of ENOMEM errors which wouldn't
> occur with a synchronous free.
> 
>> __purge_vmap_area_lazy() is the only function in the vfree() path that
>> wants to be able to sleep. So it make sense to schedule
>> __purge_vmap_area_lazy() via schedule_work() so it runs only in sleepable
>> context.
> 
> vfree() already does
> 
> 	if (unlikely(in_interrupt()))
> 		__vfree_deferred(addr);
> 
> so it seems silly to introduce another defer-to-kernel-thread thing
> when we already have one.
> 
>> This will have a minimal effect on the regular vfree() path.
>> since __purge_vmap_area_lazy() is rarely called.
> 
> hum, OK, so perhaps the overhead isn't too bad.
> 
> Remind me: where does __purge_vmap_area_lazy() sleep?

It's cond_resched_lock(&vmap_area_lock);

> 
> Seems to me that a better fix would be to make vfree() atomic, if poss.
>
> Otherwise, to fix callers so they call vfree from sleepable context. 
> That will reduce kernel latencies as well.
> 

Currently we know only two such callers: ttm_object_file_release() and ttm_object_device_release().
Both of them call vfree() under spin_lock() for no reason, Thomas said that he
has patch to fix this: http://lkml.kernel.org/r/f1c0b9ec-c0c8-502c-c7f0-fe692c73ab04@vmware.com

So this patch is more like an attempt to address other similar bugs possibly introduced by
commit 5803ed292e63 ("mm: mark all calls into the vmalloc subsystem as potentially sleeping")
It's quite possible that we overlooked something.