linux-kernel - [PATCH] Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com>
Date:   Wed, 11 Dec 2019 19:13:18 +0000
From:   Ioanna Alifieraki <ioanna-maria.alifieraki@...onical.com>
To:     akpm@...ux-foundation.org, manfred@...orfullife.com,
        herton@...hat.com, arnd@...db.de, catalin.marinas@....com,
        malat@...ian.org, joel@...lfernandes.org, gustavo@...eddedor.com,
        linux-kernel@...r.kernel.org, jay.vosburgh@...onical.com,
        ioanna.alifieraki@...il.com
Subject: [PATCH] Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"

This reverts commit a97955844807e327df11aa33869009d14d6b7de0.

Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage
in exit_sem()") removes a lock that is needed. This leads to a process
looping infinitely in exit_sem() and can also lead to a crash.  There is
a reproducer available in [1] and with the commit reverted the issue
does not reproduce anymore.

Using the reproducer found in [1] is fairly easy to reach a point where
one of the child processes is looping infinitely in exit_sem between
for(;;) and if (semid == -1) block, while it's trying to free its last
sem_undo structure which has already been freed by freeary().

Each sem_undo struct is on two lists: one per semaphore set (list_id)
and one per process (list_proc).  The list_id list tracks undos by
semaphore set, and the list_proc by process.

Undo structures are removed either by freeary() or by exit_sem().  The
freeary function is invoked when the user invokes a syscall to remove a
semaphore set.  During this operation freeary() traverses the list_id
associated with the semaphore set and removes the undo structures from
both the list_id and list_proc lists.

For this case, exit_sem() is called at process exit.  Each process
contains a struct sem_undo_list (referred to as "ulp") which contains
the head for the list_proc list. When the process exits, exit_sem()
traverses this list to remove each sem_undo struct. As in freeary(),
whenever a sem_undo struct is removed from list_proc, it is also removed
from the list_id list.

Removing elements from list_id is safe for both exit_sem() and freeary()
due to sem_lock().  Removing elements from list_proc is not safe;
freeary() locks &un->ulp->lock when it performs
list_del_rcu(&un->list_proc) but exit_sem() does not (locking was
removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list
lock usage in exit_sem()").

This can result in the following situation while executing the
reproducer [1] : Consider a child process in exit_sem() and the parent
in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)).  The list_proc
for the child contains the last two undo structs A and B (the rest have
been removed either by exit_sem() or freeary()).  The semid for A is 1
and semid for B is 2.  exit_sem() removes A and at the same time
freeary() removes B. Since A and B have different semid sem_lock() will
acquire different locks for each process and both can proceed.  The bug
is that they remove A and B from the same list_proc at the same time
because only freeary() acquires the ulp lock.  When exit_sem() removes A
it makes ulp->list_proc.next to point at B and at the same time
freeary() removes B setting B->semid=-1.  At the next iteration of
for(;;) loop exit_sem() will try to remove B.  The only way to break
from for(;;) is for (&un->list_proc == &ulp->list_proc) to be true which
is not.  Then exit_sem() will check if B->semid=-1 which is and will
continue looping in for(;;) until the memory for B is reallocated and
the value at B->semid is changed. At that point, exit_sem() will crash
attempting to unlink B from the lists (this can be easily triggered by
running the reproducer [1] a second time).

To prove this scenario instrumentation was added to keep information
about each sem_undo (un) struct that is removed per process and per
semaphore set (sma).

        CPU0                                CPU1
[caller holds sem_lock(sma for A)]      ...
freeary()                               exit_sem()
...                                     ...
...                                     sem_lock(sma for B)
spin_lock(A->ulp->lock)                 ...
list_del_rcu(un_A->list_proc)           list_del_rcu(un_B->list_proc)

Undo structures A and B have different semid and sem_lock() operations
proceed. However they belong to the same list_proc list and they are
removed at the same time. This results into ulp->list_proc.next pointing
to the address of B which is already removed.

After reverting commit a97955844807 ("ipc,sem: remove uneeded
sem_undo_list lock usage in exit_sem()") the issue was no longer
reproducible.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779

Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()")
Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@...onical.com>
---
 ipc/sem.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index ec97a7072413..fe12ea8dd2b3 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2368,11 +2368,9 @@ void exit_sem(struct task_struct *tsk)
 		ipc_assert_locked_object(&sma->sem_perm);
 		list_del(&un->list_id);
 
-		/* we are the last process using this ulp, acquiring ulp->lock
-		 * isn't required. Besides that, we are also protected against
-		 * IPC_RMID as we hold sma->sem_perm lock now
-		 */
+		spin_lock(&ulp->lock);
 		list_del_rcu(&un->list_proc);
+		spin_unlock(&ulp->lock);
 
 		/* perform adjustments registered in un */
 		for (i = 0; i < sma->sem_nsems; i++) {
-- 
2.17.1