Date:   Fri, 12 Nov 2021 10:34:39 +0100
From:   Claudio Imbrenda <imbrenda@...ux.ibm.com>
To:     ebiederm@...ssion.com (Eric W. Biederman)
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org, thuth@...hat.com,
        frankja@...ux.ibm.com, borntraeger@...ibm.com,
        Ulrich.Weigand@...ibm.com, david@...hat.com, ultrachin@....com,
        akpm@...ux-foundation.org, vbabka@...e.cz, brookxu.cn@...il.com,
        xiaoggchen@...cent.com, linuszeng@...cent.com, yihuilu@...cent.com,
        mhocko@...e.com, daniel.m.jordan@...cle.com, axboe@...nel.dk,
        legion@...nel.org, peterz@...radead.org, aarcange@...hat.com,
        christian@...uner.io, tglx@...utronix.de
Subject: Re: [RFC v1 2/4] kernel/fork.c: implement new process_mmput_async
 syscall

On Thu, 11 Nov 2021 13:20:11 -0600
ebiederm@...ssion.com (Eric W. Biederman) wrote:

> Claudio Imbrenda <imbrenda@...ux.ibm.com> writes:
> 
> > The goal of this new syscall is to be able to asynchronously free the
> > mm of a dying process. This is especially useful for processes that use
> > huge amounts of memory (e.g. databases or KVM guests). The process is
> > allowed to terminate immediately, while its mm is cleaned/reclaimed
> > asynchronously.
> >
> > A separate process needs to use the process_mmput_async syscall to attach
> > itself to the mm of a running target process. The process will then
> > sleep until the last user of the target mm has gone.
> >
> > When the last user of the mm has gone, instead of synchronously freeing
> > the mm, the attached process is awoken. The syscall will then continue
> > and clean up the target mm.
> >
> > This solution has the advantage that the cleanup of the target mm can
> > be both asynchronous and properly accounted for (e.g. cgroups).
> >
> > Tested on s390x.
> >
> > A separate patch will actually wire up the syscall.  
> 
> I am a bit confused.
> 
> You want the process to report that it has finished immediately,
> and you want the cleanup work to continue on in the background.
> 
> Why do you need a separate process?
> 
> Why not just modify the process cleanup code to keep the task_struct
> running while allowing waitpid to reap the process (aka allowing
> release_task to run)?  All tasks can already be reaped after
> exit_notify in do_exit.
> 
> I can see some reasons for wanting an opt-in.  It is nice to know all of
> a process's resources have been freed when waitpid succeeds.
> 
> Still I don't see why this whole thing isn't exit_mm returning
> the mm_struct when a flag is set, and then having an exit_mm_late
> being called and passed the returned mm after exit_notify.

Never mind: exit_notify is done after cgroup_exit, so the teardown would
then not be accounted properly.
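
To make the ordering issue concrete, here is a toy userspace model of it.
This is only a sketch, not kernel code: every toy_* type and function is
made up for illustration, and "charged_pages" is just a stand-in for
whatever accounting the cgroup does. The only thing taken from the real
do_exit is the order exit_mm, cgroup_exit, exit_notify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
struct toy_cgroup {
	long charged_pages;	/* teardown work accounted to this cgroup */
};

struct toy_task {
	struct toy_cgroup *cgrp;	/* NULL once the task left its cgroup */
	long mm_pages;			/* pages still held by the mm */
	bool reapable;			/* set once the parent may reap the task */
};

/* Tear down the mm; the work is accounted only if a cgroup is attached. */
static void toy_teardown_mm(struct toy_task *t)
{
	if (t->cgrp)
		t->cgrp->charged_pages += t->mm_pages;
	t->mm_pages = 0;
}

static void toy_cgroup_exit(struct toy_task *t)
{
	t->cgrp = NULL;
}

static void toy_exit_notify(struct toy_task *t)
{
	t->reapable = true;
}

/*
 * Run the modelled exit path and return how much teardown work ended up
 * accounted to the cgroup. With late_teardown set, the mm is torn down
 * after toy_exit_notify, i.e. where an exit_mm_late would sit.
 */
long toy_do_exit(long mm_pages, bool late_teardown)
{
	struct toy_cgroup cg = { .charged_pages = 0 };
	struct toy_task t = { .cgrp = &cg, .mm_pages = mm_pages, .reapable = false };

	assert(mm_pages >= 0);

	if (!late_teardown)
		toy_teardown_mm(&t);	/* position of exit_mm */
	toy_cgroup_exit(&t);
	toy_exit_notify(&t);
	if (late_teardown)
		toy_teardown_mm(&t);	/* position of a late teardown */

	return cg.charged_pages;
}
```

In this model toy_do_exit(1000, false) accounts all 1000 pages to the
cgroup, while toy_do_exit(1000, true) accounts none, because the task has
already left the cgroup by the time the late teardown runs.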

> 
> Or maybe something with schedule_work or task_work, instead of an
> exit_mm_late.  I don't see any practical difference.
> 
> I really don't see why this needs a whole other process to connect to
> the process you care about asynchronously.
> 
> This whole thing seems an exercise in spending lots of resources to free
> resources much later.
> 
> Eric
