Message-Id: <20220114180135.83308-1-npache@redhat.com>
Date:   Fri, 14 Jan 2022 13:01:35 -0500
From:   Nico Pache <npache@...hat.com>
To:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        akpm@...ux-foundation.org, jsavitz@...hat.com, mhocko@...e.com
Cc:     peterz@...radead.org, tglx@...utronix.de, mingo@...hat.com,
        dvhart@...radead.org, dave@...olabs.net, andrealmeid@...labora.com,
        longman@...hat.com
Subject: [PATCH v3] mm/oom: do not oom reap task with an unresolved robust futex

In the case that two or more processes share a futex located within
a shared mmapped region, such as a process that shares a lock between
itself and its child processes, we have observed that when a process
holding the lock is oom killed, at least one waiter is never alerted
to this new development and simply continues to wait.

This is visible via pthreads by checking the __owner field of the
pthread_mutex_t structure within a waiting process, perhaps with gdb.

We identify reproduction of this issue by running a test program,
inspecting the pthread_mutex_t of a waiting process, taking note of
the value in the __owner field, and then checking dmesg to see whether
that owner has already been killed.
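
For reference, the shared-lock setup described above looks roughly
like the following (a simplified sketch for illustration only, not
the linked reproducer; the memory pressure needed to actually trigger
the oom kill is omitted):

/*
 * Sketch: a robust, process-shared mutex placed in a shared anonymous
 * mapping. The lock-holding child is assumed to be oom killed while
 * the parent waits on the lock.
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t *lock;
	int ret;

	lock = mmap(NULL, sizeof(*lock), PROT_READ | PROT_WRITE,
		    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (lock == MAP_FAILED)
		exit(1);

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
	pthread_mutex_init(lock, &attr);

	if (fork() == 0) {
		/* Child takes the lock; the real reproducer would now
		 * allocate enough memory to draw the oom killer. */
		pthread_mutex_lock(lock);
		pause();
		_exit(0);
	}

	sleep(1);	/* let the child grab the lock first */

	/*
	 * If the robust list of the killed child is processed, this
	 * returns EOWNERDEAD; in the failure described above the
	 * parent simply waits here forever.
	 */
	ret = pthread_mutex_lock(lock);
	if (ret == EOWNERDEAD) {
		pthread_mutex_consistent(lock);
		printf("lock owner died, recovered lock\n");
	}

	wait(NULL);
	return 0;
}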

As mentioned by Michal in his patchset introducing the oom reaper,
commit aac4536355496 ("mm, oom: introduce oom reaper"), the purpose of
the oom reaper is to try and free memory more quickly; however, in the
case that a robust futex is being used, we want to avoid utilizing the
concurrent oom reaper. This is due to a race that can occur between
the SIGKILL handling of the robust futex and the oom reaper freeing
the memory needed to maintain the robust list.

In the case that the oom victim is utilizing a robust futex, and the
SIGKILL has not yet handled the futex death, tsk->robust_list should
be non-NULL. This issue can be tricky to reproduce, but with the
modifications in this patch we have found it to be impossible to
reproduce.

Add a check in wake_oom_reaper() for a non-NULL tsk->robust_list, and
return early to prevent waking the oom reaper.

Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer

Co-developed-by: Joel Savitz <jsavitz@...hat.com>
Signed-off-by: Joel Savitz <jsavitz@...hat.com>
Signed-off-by: Nico Pache <npache@...hat.com>
---
 mm/oom_kill.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 1ddabefcfb5a..3cdaac9c7de5 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -667,6 +667,21 @@ static void wake_oom_reaper(struct task_struct *tsk)
 	if (test_and_set_bit(MMF_OOM_REAP_QUEUED, &tsk->signal->oom_mm->flags))
 		return;
 
+#ifdef CONFIG_FUTEX
+	/*
+	 * If the ooming task's SIGKILL has not finished handling the
+	 * robust futex it is not correct to reap the mm concurrently.
+	 * Do not wake the oom reaper when the task still contains a
+	 * robust list.
+	 */
+	if (tsk->robust_list)
+		return;
+#ifdef CONFIG_COMPAT
+	if (tsk->compat_robust_list)
+		return;
+#endif
+#endif
+
 	get_task_struct(tsk);
 
 	spin_lock(&oom_reaper_lock);
-- 
2.33.1
