lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <1b519092-2ebf-3800-306d-c354c24a9ad1@gmail.com>
Date:   Fri, 22 Oct 2021 15:13:08 +0100
From:   Pavel Begunkov <asml.silence@...il.com>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        linux-kernel@...r.kernel.org
Cc:     linux-fsdevel@...r.kernel.org, io-uring@...r.kernel.org,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Olivier Langlois <olivier@...llion01.com>,
        Jens Axboe <axboe@...nel.dk>, Oleg Nesterov <oleg@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC] coredump: Do not interrupt dump for TIF_NOTIFY_SIGNAL

On 6/9/21 21:17, Eric W. Biederman wrote:
> 
> Folks,
> 
> Olivier Langlois has been struggling with coredumps getting truncated in
> tasks using io_uring.  He has also apparently been struggling with
> the some of his email messages not making it to the lists.

Looks syzbot hit something relevant, see
https://lore.kernel.org/io-uring/0000000000000012fb05cee99477@google.com/

In short, a task creates an io_uring worker thread, then the worker
submits a task_work item to the creator task and won't die until
the item is executed/cancelled. And I found that the creator task is
sleeping in do_coredump() -> wait_for_completion()

0xffffffff81343ccb is in do_coredump (fs/coredump.c:469).
464
465             if (core_waiters > 0) {
466                     struct core_thread *ptr;
467
468                     freezer_do_not_count();
469                     wait_for_completion(&core_state->startup);
470                     freezer_count();


A hack executing tws there helps (see diff below).
Any chance anyone knows what this is and how to fix it?


diff --git a/fs/coredump.c b/fs/coredump.c
index 3224dee44d30..f6f9dfb02296 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -466,7 +466,8 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
          struct core_thread *ptr;
  
          freezer_do_not_count();
-        wait_for_completion(&core_state->startup);
+        while (wait_for_completion_interruptible(&core_state->startup))
+            tracehook_notify_signal();
          freezer_count();
          /*
           * Wait for all the threads to become inactive, so that



> 
> We were talking about some of his struggles and questions in this area
> and he pointed me to this patch he thought he had posted but I could not
> find in the list archives.
> 
> In short the coredump code deliberately supports being interrupted by
> SIGKILL, and depends upon prepare_signal to filter out all other
> signals.  With the io_uring code comes an extra test in signal_pending
> for TIF_NOTIFY_SIGNAL (which is something about asking a task to run
> task_work_run).
> 
> I am baffled why the dumper thread would be getting interrupted by
> TIF_NOTIFY_SIGNAL but apparently it is.  Perhaps it is an io_uring
> thread that is causing the dump.
> 
> Now that we know the problem the question becomes how to fix this issue.
> 
> Is there any chance all of this TWA_SIGNAL logic could simply be removed
> now that io_uring threads are normal process threads?
> 
> There are only the two call sites so I perhaps the could test
> signal->flags & SIGNAL_FLAG_COREDUMP before scheduling a work on
> a process that is dumping core?
> 
> Perhaps the coredump code needs to call task_work_run before it does
> anything?
> 
> -----
> 
> From: Olivier Langlois <olivier@...llion01.com>
> Subject: [PATCH] coredump: Do not interrupt dump for TIF_NOTIFY_SIGNAL
> Date: Mon, 07 Jun 2021 16:25:06 -0400
> 
> io_uring is a big user of task_work and any event that io_uring made a
> task waiting for that occurs during the core dump generation will
> generate a TIF_NOTIFY_SIGNAL.
> 
> Here are the detailed steps of the problem:
> 1. io_uring calls vfs_poll() to install a task to a file wait queue
>     with io_async_wake() as the wakeup function cb from io_arm_poll_handler()
> 2. wakeup function ends up calling task_work_add() with TWA_SIGNAL
> 3. task_work_add() sets the TIF_NOTIFY_SIGNAL bit by calling
>     set_notify_signal()
> 
> Signed-off-by: Olivier Langlois <olivier@...llion01.com>
> ---
>   fs/coredump.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 2868e3e171ae..79c6e3f114db 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -519,7 +519,7 @@ static bool dump_interrupted(void)
>   	 * but then we need to teach dump_write() to restart and clear
>   	 * TIF_SIGPENDING.
>   	 */
> -	return signal_pending(current);
> +	return task_sigpending(current);
>   }
>   
>   static void wait_for_dump_helpers(struct file *file)
> 

-- 
Pavel Begunkov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ