Message-ID: <878rw9u6fb.fsf@email.froward.int.ebiederm.org>
Date: Fri, 24 Dec 2021 13:52:24 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Pavel Begunkov <asml.silence@...il.com>
Cc: Olivier Langlois <olivier@...llion01.com>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
io-uring@...r.kernel.org, Alexander Viro <viro@...iv.linux.org.uk>,
Jens Axboe <axboe@...nel.dk>, Oleg Nesterov <oleg@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC] coredump: Do not interrupt dump for TIF_NOTIFY_SIGNAL
Pavel Begunkov <asml.silence@...il.com> writes:
> On 12/24/21 01:34, Olivier Langlois wrote:
>> On Fri, 2021-10-22 at 15:13 +0100, Pavel Begunkov wrote:
>>> On 6/9/21 21:17, Eric W. Biederman wrote:
>>> In short, a task creates an io_uring worker thread, then the worker
>>> submits a task_work item to the creator task and won't die until
>>> the item is executed/cancelled. And I found that the creator task is
>>> sleeping in do_coredump() -> wait_for_completion()
>>>
> [...]
>>> A hack executing tws there helps (see diff below).
>>> Any chance anyone knows what this is and how to fix it?
>>>
> [...]
>> Pavel,
>>
>> I cannot comment on the merit of the proposed hack but my proposed
>> patch to fix the coredump truncation issue when a process using
>> io_uring core dumps that I submitted back in August is still
>> unreviewed!
>
> That's unfortunate. Not like I can help in any case, but I assumed
> it was dealt with by
>
> commit 06af8679449d4ed282df13191fc52d5ba28ec536
> Author: Eric W. Biederman <ebiederm@...ssion.com>
> Date: Thu Jun 10 15:11:11 2021 -0500
>
> coredump: Limit what can interrupt coredumps
> Olivier Langlois has been struggling with coredumps being incompletely
> written in processes using io_uring.
> ...

I thought it had been too.

>> https://lore.kernel.org/lkml/1625bc89782bf83d9d8c7c63e8ffcb651ccb15fa.1629655338.git.olivier@trillion01.com/
>>
>> I have been using it since then. I must have generated many dozens of
>> perfect core dump files with it, and I have not seen a single truncated
>> core dump file like I used to have prior to the patch.
>>
>> I am bringing back my patch to your attention because one nice side
>> effect of it is that it would have avoided totally the problem that you
>> have encountered in coredump_wait() since it does cancel io_uring
>> resources before calling coredump_wait()!
>
> FWIW, I worked it around in io_uring back then by breaking the
> dependency.

I am in the middle of untangling the dependencies between ptrace,
coredump, signal handling and maybe a few related things.

Do folks have a reproducer I can look at? Pavel, especially if you have
something that reproduces on current kernels.

As part of that I am in the process of guaranteeing that all of the
coredump work happens in get_signal, so that no io_uring code or any
other cleanup runs until the coredump completes.

I haven't posted the code for review yet because it's the holidays.
But I am aiming at v5.17 or possibly v5.18, as the code is just about
ready.

Eric