linux-kernel - Re: [PATCH] mm: Free per cpu pages async to shorten program exit time

Message-ID: <0ab1dd42-8869-5b41-3af5-e16a49335df2@redhat.com>
Date:   Fri, 8 Oct 2021 14:55:45 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Vlastimil Babka <vbabka@...e.cz>, ultrachin@....com,
        akpm@...ux-foundation.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Cc:     brookxu.cn@...il.com, chen xiaoguang <xiaoggchen@...cent.com>,
        zeng jingxiang <linuszeng@...cent.com>,
        lu yihui <yihuilu@...cent.com>,
        Claudio Imbrenda <imbrenda@...ux.ibm.com>
Subject: Re: [PATCH] mm: Free per cpu pages async to shorten program exit time

On 08.10.21 14:38, Vlastimil Babka wrote:
> On 10/8/21 10:17, David Hildenbrand wrote:
>> On 08.10.21 08:39, ultrachin@....com wrote:
>>> From: chen xiaoguang <xiaoggchen@...cent.com>
>>>
>>> The exit time is long when a program has allocated a lot of memory, and
>>> the most time-consuming part is freeing that memory, which takes 99.9%
>>> of the total exit time. By freeing asynchronously we can save 25% of
>>> the exit time.
>>>
>>> Signed-off-by: chen xiaoguang <xiaoggchen@...cent.com>
>>> Signed-off-by: zeng jingxiang <linuszeng@...cent.com>
>>> Signed-off-by: lu yihui <yihuilu@...cent.com>
>>
>> I recently discussed with Claudio whether it would be possible to defer
>> tearing down the process MM, because for some use cases (secure/encrypted
>> virtualization, very large mmaps) tearing down the page tables is already
>> the much more expensive operation.
> 
> OK, but what exactly is the benefit here? The cpu time will have to be spent
> in any case, but we move it to a context that's not accounted to the exiting
> process. Is that good? Also if it's a large process and restarts
> immediately, allocating all the memory back again, it might not be available
> as it's still being freed in the background, leading to a risk of OOM?

One use case I was told about: if you have a large (secure/encrypted)
VM and shut it down, it might take quite a long time until you can
actually start that very VM again, because tooling assumes that the VM
isn't shut down until the process is gone (closed all files, sockets, etc.).

I also discussed the risk of OOM with Claudio. In some cases we don't
care: for example, we could start the VM on a different node in the
cluster, or there is sufficient memory available to start it on the same
node. But there was also the idea to stop the OOM killer from firing as
long as some MM is still getting cleaned up, which would make sense to
some degree.

-- 
Thanks,

David / dhildenb