lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fb713388-661a-46e0-8925-6d169b46ff9c@kernel.dk>
Date:   Tue, 5 Dec 2023 14:58:46 -0700
From:   Jens Axboe <axboe@...nel.dk>
To:     NeilBrown <neilb@...e.de>, Christian Brauner <brauner@...nel.org>
Cc:     Al Viro <viro@...iv.linux.org.uk>, Oleg Nesterov <oleg@...hat.com>,
        Chuck Lever <chuck.lever@...cle.com>,
        Jeff Layton <jlayton@...nel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-nfs@...r.kernel.org
Subject: Re: [PATCH 1/2] Allow a kthread to declare that it calls
 task_work_run()

On 12/5/23 2:28 PM, NeilBrown wrote:
> On Tue, 05 Dec 2023, Christian Brauner wrote:
>> On Mon, Dec 04, 2023 at 03:09:44PM -0700, Jens Axboe wrote:
>>> On 12/4/23 2:02 PM, NeilBrown wrote:
>>>> It isn't clear to me what _GPL is appropriate, but maybe the rules
>>>> changed since last I looked..... are there rules?
>>>>
>>>> My reasoning was that the call is effectively part of the user-space
>>>> ABI.  A user-space process can call this trivially by invoking any
>>>> system call.  The user-space ABI is explicitly a boundary which the GPL
>>>> does not cross.  So it doesn't seem appropriate to prevent non-GPL
>>>> kernel code from doing something that non-GPL user-space code can
>>>> trivially do.
>>>
>>> By that reasoning, basically everything in the kernel should be non-GPL
>>> marked. And while task_work can get used by the application, it happens
>>> only indirectly or implicitly. So I don't think this reasoning is sound
>>> at all, it's not an exported ABI or API by itself.
>>>
>>> For me, the more core of an export it is, the stronger the reason it
>>> should be GPL. FWIW, I don't think exporting task_work functionality is

>>
>> Yeah, I'm not too fond of that part as well. I don't think we want to
>> give modules the ability to mess with task work. This is just asking for
>> trouble.
>>
> 
> Ok, maybe we need to reframe the problem then.
> 
> Currently fput(), and hence filp_close(), take control away from kernel
> threads in that they cannot be sure that a "close" has actually
> completed.
> 
> This is already a problem for nfsd.  When renaming a file, nfsd needs to
> ensure any cached "open" that it has on the file is closed (else when
> re-exporting an NFS filesystem it can result in a silly-rename).
> 
> nfsd currently handles this case by calling flush_delayed_fput().  I
> suspect you are no more happy about exporting that than you are about
> exporting task_work_run(), but this solution isn't actually 100%
> reliable.  If some other thread calls flush_delayed_fput() between nfsd
> calling filp_close() and that same nfsd calling flush_delayed_fput(),
> then the second flush can return before the first flush (in the other
> thread) completes all the work it took on.
> 
> What we really need - both for handling renames and for avoiding
> possible memory exhaustion - is for nfsd to be able to reliably wait for
> any fput() that it initiated to complete.
> 
> How would you like the VFS to provide that service?

Since task_work happens in the context of your task already, why not
just have a way to get it stashed into a list when final fput is done?
This avoids all of this "let's expose task_work" and using the task list
for that, which seems kind of pointless as you're just going to run it
later on manually anyway.

In semi pseudo code:

bool fput_put_ref(struct file *file)
{
	return atomic_dec_and_test(&file->f_count);
}

void fput(struct file *file)
{
	if (fput_put_ref(file)) {
		...
	}
}

and then your nfsd_file_free() could do:

ret = filp_flush(file, id);
if (fput_put_ref(file))
	llist_add(&file->f_llist, &l->to_free_llist);

or something like that, where l->to_free_llist is where ever you'd
otherwise punt the actual freeing to.

-- 
Jens Axboe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ