Message-Id: <1566357985-97781-1-git-send-email-joseph.qi@linux.alibaba.com>
Date:   Wed, 21 Aug 2019 11:26:25 +0800
From:   Joseph Qi <joseph.qi@...ux.alibaba.com>
To:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Jason Xing <kerneljasonxing@...ux.alibaba.com>,
        Caspar Zhang <caspar@...ux.alibaba.com>,
        Joseph Qi <joseph.qi@...ux.alibaba.com>
Subject: [PATCH v3] psi: get poll_work to run when calling poll syscall next time

From: Jason Xing <kerneljasonxing@...ux.alibaba.com>

The user receives POLLPRI correctly only the first time the poll
syscall is used. After that, the user never receives the event
again.

Reproduce case:
1. Get the monitor code from Documentation/accounting/psi.txt
2. Run it and wait for the event to trigger.
3. Kill and restart the process.
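
For reference, the monitor in step 1 boils down to roughly the
following. This is a minimal sketch and not the exact code from
psi.txt: the helper name wait_for_psi_event() and its return
convention are made up for illustration, and the path and trigger
string are supplied by the caller.

```c
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: register a PSI trigger on `path` and wait up to
 * timeout_ms for one event.
 * Returns 1 if POLLPRI fired, 0 on timeout, -1 on any error.
 */
static int wait_for_psi_event(const char *path, const char *trigger,
			      int timeout_ms)
{
	struct pollfd pfd;
	int ret;

	pfd.fd = open(path, O_RDWR | O_NONBLOCK);
	if (pfd.fd < 0)
		return -1;
	pfd.events = POLLPRI;
	pfd.revents = 0;

	/* Writing the trigger string (with its NUL) arms the monitor. */
	if (write(pfd.fd, trigger, strlen(trigger) + 1) < 0) {
		close(pfd.fd);
		return -1;
	}

	ret = poll(&pfd, 1, timeout_ms);
	if (ret > 0)
		ret = (pfd.revents & POLLERR) ? -1 : !!(pfd.revents & POLLPRI);

	/* Closing the fd releases the trigger, as killing the process does. */
	close(pfd.fd);
	return ret;
}
```

A call such as wait_for_psi_event("/proc/pressure/memory",
"some 150000 1000000", -1) would block until memory pressure crosses
the threshold. Running it a second time corresponds to step 3.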

The question is why we can end up with poll_scheduled = 1 while the
work is not running (running it would reset the flag to 0). The answer
is that the scheduling side sees group->poll_kworker under RCU
protection and schedules the work, but psi_trigger_destroy() then
cancels the work and destroys the worker. The cancel needs to be
paired with a reset of the poll_scheduled flag.
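
The stuck flag can be illustrated with a small user-space model. This
is a simplified sketch, not kernel code: struct model_group and the
model_*() helpers are hypothetical names, C11 atomics stand in for the
kernel's atomic_t, and the scheduling side is assumed to queue work
only when it flips the flag from 0 to 1.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant parts of struct psi_group. */
struct model_group {
	atomic_int poll_scheduled;	/* 1 while work is believed queued */
	bool work_queued;		/* models the pending delayed work */
};

/* Scheduling side: queue work only if the flag flips from 0 to 1. */
static bool model_schedule(struct model_group *g)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1))
		return false;	/* flag already set, nothing is queued */
	g->work_queued = true;
	return true;
}

/* Worker side: running the work is what normally resets the flag. */
static void model_run_work(struct model_group *g)
{
	if (g->work_queued) {
		g->work_queued = false;
		atomic_store(&g->poll_scheduled, 0);
	}
}

/*
 * Destroy path: cancel pending work. With reset_flag == false the flag
 * stays at 1 although the work will never run (the behaviour before
 * this patch); with reset_flag == true the cancel is paired with a
 * reset (the behaviour after it).
 */
static void model_destroy_worker(struct model_group *g, bool reset_flag)
{
	g->work_queued = false;
	if (reset_flag)
		atomic_store(&g->poll_scheduled, 0);
}
```

In this model, calling model_schedule() and then
model_destroy_worker(g, false) leaves every later model_schedule()
returning false, which matches the symptom above. Passing true lets
the next schedule succeed.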

Signed-off-by: Jason Xing <kerneljasonxing@...ux.alibaba.com>
Reviewed-by: Caspar Zhang <caspar@...ux.alibaba.com>
Reviewed-by: Suren Baghdasaryan <surenb@...gle.com>
Acked-by: Johannes Weiner <hannes@...xchg.org>
Signed-off-by: Joseph Qi <joseph.qi@...ux.alibaba.com>
---
v3: Change the description as Johannes Weiner suggested.

 kernel/sched/psi.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 23fbbcc..6e52b67 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1131,7 +1131,15 @@ static void psi_trigger_destroy(struct kref *ref)
 	 * deadlock while waiting for psi_poll_work to acquire trigger_lock
 	 */
 	if (kworker_to_destroy) {
+		/*
+		 * After the RCU grace period has expired, the worker
+		 * can no longer be found through group->poll_kworker.
+		 * But it might have been already scheduled before
+		 * that - deschedule it cleanly before destroying it.
+		 */
 		kthread_cancel_delayed_work_sync(&group->poll_work);
+		atomic_set(&group->poll_scheduled, 0);
+
 		kthread_destroy_worker(kworker_to_destroy);
 	}
 	kfree(t);
-- 
1.8.3.1
