[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <1b519092-2ebf-3800-306d-c354c24a9ad1@gmail.com>
Date: Fri, 22 Oct 2021 15:13:08 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>,
linux-kernel@...r.kernel.org
Cc: linux-fsdevel@...r.kernel.org, io-uring@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>,
Olivier Langlois <olivier@...llion01.com>,
Jens Axboe <axboe@...nel.dk>, Oleg Nesterov <oleg@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC] coredump: Do not interrupt dump for TIF_NOTIFY_SIGNAL
On 6/9/21 21:17, Eric W. Biederman wrote:
>
> Folks,
>
> Olivier Langlois has been struggling with coredumps getting truncated in
> tasks using io_uring. He has also apparently been struggling with
> the some of his email messages not making it to the lists.
Looks syzbot hit something relevant, see
https://lore.kernel.org/io-uring/0000000000000012fb05cee99477@google.com/
In short, a task creates an io_uring worker thread, then the worker
submits a task_work item to the creator task and won't die until
the item is executed/cancelled. And I found that the creator task is
sleeping in do_coredump() -> wait_for_completion()
0xffffffff81343ccb is in do_coredump (fs/coredump.c:469).
464
465 if (core_waiters > 0) {
466 struct core_thread *ptr;
467
468 freezer_do_not_count();
469 wait_for_completion(&core_state->startup);
470 freezer_count();
A hack executing tws there helps (see diff below).
Any chance anyone knows what this is and how to fix it?
diff --git a/fs/coredump.c b/fs/coredump.c
index 3224dee44d30..f6f9dfb02296 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -466,7 +466,8 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
struct core_thread *ptr;
freezer_do_not_count();
- wait_for_completion(&core_state->startup);
+ while (wait_for_completion_interruptible(&core_state->startup))
+ tracehook_notify_signal();
freezer_count();
/*
* Wait for all the threads to become inactive, so that
>
> We were talking about some of his struggles and questions in this area
> and he pointed me to this patch he thought he had posted but I could not
> find in the list archives.
>
> In short the coredump code deliberately supports being interrupted by
> SIGKILL, and depends upon prepare_signal to filter out all other
> signals. With the io_uring code comes an extra test in signal_pending
> for TIF_NOTIFY_SIGNAL (which is something about asking a task to run
> task_work_run).
>
> I am baffled why the dumper thread would be getting interrupted by
> TIF_NOTIFY_SIGNAL but apparently it is. Perhaps it is an io_uring
> thread that is causing the dump.
>
> Now that we know the problem the question becomes how to fix this issue.
>
> Is there any chance all of this TWA_SIGNAL logic could simply be removed
> now that io_uring threads are normal process threads?
>
> There are only the two call sites so I perhaps the could test
> signal->flags & SIGNAL_FLAG_COREDUMP before scheduling a work on
> a process that is dumping core?
>
> Perhaps the coredump code needs to call task_work_run before it does
> anything?
>
> -----
>
> From: Olivier Langlois <olivier@...llion01.com>
> Subject: [PATCH] coredump: Do not interrupt dump for TIF_NOTIFY_SIGNAL
> Date: Mon, 07 Jun 2021 16:25:06 -0400
>
> io_uring is a big user of task_work and any event that io_uring made a
> task waiting for that occurs during the core dump generation will
> generate a TIF_NOTIFY_SIGNAL.
>
> Here are the detailed steps of the problem:
> 1. io_uring calls vfs_poll() to install a task to a file wait queue
> with io_async_wake() as the wakeup function cb from io_arm_poll_handler()
> 2. wakeup function ends up calling task_work_add() with TWA_SIGNAL
> 3. task_work_add() sets the TIF_NOTIFY_SIGNAL bit by calling
> set_notify_signal()
>
> Signed-off-by: Olivier Langlois <olivier@...llion01.com>
> ---
> fs/coredump.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 2868e3e171ae..79c6e3f114db 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -519,7 +519,7 @@ static bool dump_interrupted(void)
> * but then we need to teach dump_write() to restart and clear
> * TIF_SIGPENDING.
> */
> - return signal_pending(current);
> + return task_sigpending(current);
> }
>
> static void wait_for_dump_helpers(struct file *file)
>
--
Pavel Begunkov
Powered by blists - more mailing lists