lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 27 Apr 2015 14:48:34 +0800
From:	Xunlei Pang <xlpang@....com>
To:	linux-kernel@...r.kernel.org
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Juri Lelli <juri.lelli@...il.com>,
	Ingo Molnar <mingo@...hat.com>,
	Xunlei Pang <pang.xunlei@...aro.org>
Subject: [RFC PATCH RESEND 0/4] Maintain the FIFO order for same priority RT tasks

From: Xunlei Pang <pang.xunlei@...aro.org>

1. TWO FIFO QUEUES
We know, there are two main queues each cpu for RT SMP scheduler. Let's
name them "run queue" and "pushable queue" respectively. 

"run queue" is from the intra-cpu's perspective, while "pushable queue" is
from the inter-cpu's perspective. "pushable queue" also has a very close 
relationship with task affinity. The two queues separately maintain their
FIFO order for tasks queued at the same priority.

If we got a wrong FIFO order of any of the two queues, we say it broke FIFO.


2. THE WAY THEY WORK
Currently, when an rt task gets queued, it can be put to the head or
tail of its "run queue" at the same priority according to different
scenarios. Further if it is migratable, it will also and always be put
to the tail of its "pushable queue" at the same priority because "plist"
is used to manage this queue in strict FIFO order.

a) If a task got enqueued or dequeued, it got added to or removed from the
"run queue", and also "pushable queue"(added to tail) if it is migratable.

b) If a runnable task acquires cpu(current task), it stays in the front of
the "run queue", but got removed from the "pushable queue". 

c) If a runnable task releases cpu, it stays in the "run queue"(if it 
releases cpu involuntarily, stays just behind new current in this queue; 
if it releases cpu voluntarily like yield, usually was requeued behind 
other tasks queued at the same priority in this queue), and got added to 
the tail of its "pushable queue" at the same priority if it is migratable.

d) If a tasks's affinity is changed from one cpu to multiple cpus, it got
added to the tail of "pushable queue" at the same priority.

e) If a tasks's affinity is changed from multiple cpus to one cpu, it got
removed from the "pushable queue".


3. PROBLEMATIC CASES
Unfortunately, in current kernel implementations, as far as I know, c) 
described in "2. THE WAY THEY WORK" have some cases that broke FIFO of 
one of the two queues for tasks queued at the same priority.

- Case 1: current task gets preempted by a higher priority task
current will be enqueued behind other tasks with the same priority in the
"pushable queue", while it actually stays ahead of these tasks in the "run
queue". Afterwards, if there comes a pull from other cpu or a push from
local cpu, the task behind current in the "run queue" will be removed from
the "pushable queue" and gets running, as the global rt scheduler fetches
tasks from the head of the "pushable queue" to do the pulling or pushing.
The kernel broke FIFO in this case.

current task gets preempted by an equal priority task:
This is done in check_preempt_equal_prio(), and can be divided into 3 sub cases.
  - Case 2 : p is the only one has the same priority as current.
             p preempts current successfully, but the kernel still had the 
             same problem as in Case 1 during the preempting.
  - Case 3 : p is not the only one has the same priority as current, let's
             say q is already queued before p. so, in this case we should
             choose q to do the preempting, not p. So the kernel broke FIFO in 
             this case.
  - Case 4 : p is going to preempt current, but when doing the actual preempting,
             current isn't pushed away due to the change of system status. so
             eventually p becomes the new current and gets running, while 
             current is queued back waiting in the same "run queue". Obviously, 
             the kernel broke FIFO in this case.

NOTE: when a task's affinity is changed and it becomes migratable(see d) described 
in "2. THE WAY THEY WORK"), in current kernel implementation it will be queued 
behind other tasks with the same priority in the "pushable queue". This may seems 
contratry to the corresponding order in the "run queue". But from "pushable queue"'s 
prospective, when a task's affinity is changed and further got removed from or 
added to the "pushable queue" because of that, it means requeuing. In my view, I 
don't think this broke FIFO of the "pushable queue". Any discussion?

But for "case 1" and "case 2", the task eventually wasn't got removed from or added 
to the "pushable queue" due to affinity changing, so its "pushable queue" FIFO 
order should also stays unchanged, and must be the same as its "run queue" order.
So we say "case 1" and "case 2" broke FIFO of its "pushable queue".

4. SUMMARY
case 1 : broke its "pushable queue" FIFO order.
case 2 : broke its "pushable queue" FIFO order.
case 3 : broke its "run queue" FIFO order.
case 4 : broke its "run queue" FIFO order.

5. PATCHSET
Patch 1 : Fix "Case 3".
Patch 2 : Prerequisite for Patch 3 - Provide plist_add_head() for "pushable queue".
Patch 3 : Fix "Case 1" and "Case 2".
Patch 4 : Fix "Case 4".

Xunlei Pang (4):
  sched/rt: Modify check_preempt_equal_prio() for multiple tasks queued
    at the same priority
  lib/plist: Provide plist_add_head() for nodes with the same prio
  sched/rt: Fix wrong SMP scheduler behavior for equal prio cases
  sched/rt: Requeue p back if the preemption initiated by
    check_preempt_equal_prio_common() failed

 include/linux/plist.h    |  34 ++++++-
 include/linux/sched.h    |   5 +
 include/linux/sched/rt.h |  16 +++
 kernel/sched/core.c      |   6 +-
 kernel/sched/rt.c        | 253 ++++++++++++++++++++++++++++++++++++++++-------
 lib/plist.c              |  28 +++++-
 6 files changed, 299 insertions(+), 43 deletions(-)

-- 
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ