linux-kernel - Re: [PATCH 2/2] kernel: rerun task_work while freezing in get

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <658da3fe-fa02-423b-aff0-52f54e1332ee@gmail.com>
Date: Tue, 9 Jul 2024 15:05:21 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: io-uring@...r.kernel.org, Jens Axboe <axboe@...nel.dk>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Christian Brauner <brauner@...nel.org>,
 Tycho Andersen <tandersen@...flix.com>, Thomas Gleixner
 <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
 Julian Orth <ju.orth@...il.com>, Tejun Heo <tj@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 2/2] kernel: rerun task_work while freezing in
 get_signal()

On 7/9/24 11:36, Oleg Nesterov wrote:
> On 07/08, Pavel Begunkov wrote:
>>
>> On 7/8/24 11:42, Oleg Nesterov wrote:
>>> I don't think we should blame io_uring even if so far it is the only user
>>> of TWA_SIGNAL.
>>
>> And it's not entirely correct even for backporting purposes,
>> I'll pin it to when freezing was introduced then.
> 
> This is another problem introduced by 12db8b690010 ("entry: Add support for
> TIF_NOTIFY_SIGNAL")

Ah, yes, I forgot NOTIFY_SIGNAL was split out of SIGPENDING

> We need much more changes. Say, zap_threads() does the same and assumes
> that only SIGKILL or freezeing can make dump_interrupted() true.
> 
> There are more similar problems. I'll try to think, so far I do not see
> a simple solution...

Thanks. And there was some patching done before against dumping
being interrupted by task_work, indeed a reoccurring issue.


> As for this particular problem, I agree it needs a simple/backportable fix.
> 
>>>>   relock:
>>>> +	clear_notify_signal();
>>>> +	if (unlikely(task_work_pending(current)))
>>>> +		task_work_run();
>>>> +
>>>>   	spin_lock_irq(&sighand->siglock);
>>>
>>> Well, but can't we kill the same code at the start of get_signal() then?
>>> Of course, in this case get_signal() should check signal_pending(), not
>>> task_sigpending().
>>
>> Should be fine,
> 
> Well, not really at least performance-wise... get_signal() should return
> asap if TIF_NOTIFY_SIGNAL was the only reason to call get_signal().
> 
>> but I didn't want to change the
>> try_to_freeze() -> __refrigerator() path, which also reschedules.
> 
> Could you spell please?

Let's say it calls get_signal() for freezing with a task_work pending.
Currently, it executes task_work and calls try_to_freeze(), which
puts the task to sleep. If we remove that task_work_run() before
try_to_freeze(), it would not be able to sleep. Sounds like it should
be fine, it races anyway, but I'm trying to avoid side effect for fixes.

>>> Or perhaps something like the patch below makes more sense? I dunno...
>>
>> It needs a far backporting, I'd really prefer to keep it
>> lean and without more side effects if possible, unless
>> there is a strong opinion on that.
> 
> Well, I don't think my patch is really worse in this sense. Just it
> is buggy ;) it needs another recalc_sigpending() before goto start,
> so lets forget it.
> 
> So I am starting to agree with your change as a workaround until we
> find a clean solution (if ever ;).
> 
> But can I ask you to add this additional clear_notify_signal() +
> task_work_run() to the end of do_freezer_trap() ? get_signal() is
> already a mess...

Will change

> -----------------------------------------------------------------------
> Either way I have no idea whether a cgroup_task_frozen() task should
> react to task_work_add(TWA_SIGNAL) or not.
> 
> Documentation/admin-guide/cgroup-v2.rst says
> 
> 	Writing "1" to the file causes freezing of the cgroup and all
> 	descendant cgroups. This means that all belonging processes will
> 	be stopped and will not run until the cgroup will be explicitly
> 	unfrozen.
> 
> AFAICS this is not accurate, they can run but can't return to user-mode.
> So I guess task_work_run() is fine.

IIUC it's a user facing doc, so maybe it's accurate enough from that
perspective. But I do agree that the semantics around task_work is
not exactly clear.

-- 
Pavel Begunkov