Message-ID: <3888d7c8-660e-479c-8c10-8295204e5f36@arm.com>
Date: Mon, 15 Apr 2024 18:07:24 +0100
From: Luis Machado <luis.machado@....com>
To: Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, vschneid@...hat.com, linux-kernel@...r.kernel.org
Cc: kprateek.nayak@....com, wuyun.abel@...edance.com, tglx@...utronix.de,
efault@....de, nd <nd@....com>
Subject: Re: [RFC][PATCH 08/10] sched/fair: Implement delayed dequeue
Hi Peter,
On 4/5/24 11:28, Peter Zijlstra wrote:
> Extend / fix 86bfbb7ce4f6 ("sched/fair: Add lag based placement") by
> noting that lag is fundamentally a temporal measure. It should not be
> carried around indefinitely.
>
> OTOH it should also not be instantly discarded, doing so will allow a
> task to game the system by purposefully (micro) sleeping at the end of
> its time quantum.
>
> Since lag is intimately tied to the virtual time base, a wall-time
> based decay is also insufficient, notably competition is required for
> any of this to make sense.
>
> Instead, delay the dequeue and keep the 'tasks' on the runqueue,
> competing until they are eligible.
>
> Strictly speaking, we only care about keeping them until the 0-lag
> point, but that is a difficult proposition, instead carry them around
> until they get picked again, and dequeue them at that point.
>
> Since we should have dequeued them at the 0-lag point, truncate lag
> (eg. don't let them earn positive lag).
>
> XXX test the cfs-throttle stuff
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
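To check my understanding of the mechanism, here is a toy user-space model
of the behaviour this patch introduces (my own simplification for
illustration only, not kernel code): an entity that goes to sleep while
ineligible (negative lag) stays on the runqueue as sched_delayed and is
only really dequeued once it gets picked again, at which point DELAY_ZERO
clips any positive lag it has earned in the meantime.

#include <stdbool.h>
#include <stdio.h>

struct toy_se {
	bool on_rq;
	bool sched_delayed;
	long vlag;		/* negative => not eligible, as with EEVDF lag */
};

/* Sleep-time dequeue: an ineligible entity is only marked delayed. */
static bool toy_dequeue_sleep(struct toy_se *se)
{
	if (se->vlag < 0) {		/* entity_eligible() == false */
		se->sched_delayed = true;
		return false;		/* kept on the runqueue, keeps competing */
	}
	se->on_rq = false;		/* normal dequeue */
	return true;
}

/* Pick: a delayed entity is finally dequeued and its lag truncated. */
static void toy_pick(struct toy_se *se)
{
	if (se->sched_delayed) {
		se->sched_delayed = false;
		se->on_rq = false;
		if (se->vlag > 0)	/* DELAY_ZERO: no positive carry-over */
			se->vlag = 0;
	}
}

int main(void)
{
	struct toy_se se = { .on_rq = true, .vlag = -5 };

	toy_dequeue_sleep(&se);			/* sleeps while ineligible */
	printf("after sleep: on_rq=%d delayed=%d vlag=%ld\n",
	       se.on_rq, se.sched_delayed, se.vlag);

	se.vlag = 3;				/* debt burned off while competing */
	toy_pick(&se);				/* picked -> real dequeue, lag clipped */
	printf("after pick:  on_rq=%d delayed=%d vlag=%ld\n",
	       se.on_rq, se.sched_delayed, se.vlag);
	return 0;
}

Running it shows the entity still on_rq after the "sleep" and fully
dequeued with vlag clipped to 0 after the "pick", which matches my reading
of the dequeue_entity() and pick_next_entity() hunks below.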
Playing around with a Pixel 6 running a 6.6-based kernel with this
series backported on top, I spotted a noticeable performance improvement
in the Speedometer benchmark:
- m6.6-stock-* is the unmodified 6.6 mainline Android kernel.
- m6.6-eevdf-complete-* is the 6.6 mainline Android kernel with
this series applied on top (along with a few required backported
patches).
In the tables below, perc_diff is relative to the first stock run
(m6.6-stock-1).
+-------------------+-----------------------+-----------+
| metric | tag | perc_diff |
+-------------------+-----------------------+-----------+
| Speedometer Score | m6.6-stock-1 | 0.0% |
| Speedometer Score | m6.6-stock-2 | 1.23% |
| Speedometer Score | m6.6-stock-3 | -0.22% |
| Speedometer Score | m6.6-eevdf-complete-1 | 4.54% |
| Speedometer Score | m6.6-eevdf-complete-2 | 4.95% |
| Speedometer Score | m6.6-eevdf-complete-3 | 6.07% |
+-------------------+-----------------------+-----------+
There are also some interesting improvements in frame timing for the
uibenchjanktests benchmark, in particular in the missed-deadline and
jank (late frame) metrics, which seem to indicate better latencies
(negative perc_diff means lower, i.e. better, frame times and counts).
+-----------------------+-----------------------+-----------+
| metric | tag | perc_diff |
+-----------------------+-----------------------+-----------+
| gfx-avg-frame-time-50 | m6.6-stock-1 | 0.0 |
| gfx-avg-frame-time-90 | m6.6-stock-1 | 0.0 |
| gfx-avg-frame-time-95 | m6.6-stock-1 | 0.0 |
| gfx-avg-frame-time-99 | m6.6-stock-1 | 0.0 |
| gfx-avg-frame-time-50 | m6.6-stock-2 | 3.46 |
| gfx-avg-frame-time-90 | m6.6-stock-2 | 1.19 |
| gfx-avg-frame-time-95 | m6.6-stock-2 | 0.24 |
| gfx-avg-frame-time-99 | m6.6-stock-2 | 0.48 |
| gfx-avg-frame-time-50 | m6.6-eevdf-complete-1 | -30.45 |
| gfx-avg-frame-time-90 | m6.6-eevdf-complete-1 | -48.44 |
| gfx-avg-frame-time-95 | m6.6-eevdf-complete-1 | -51.32 |
| gfx-avg-frame-time-99 | m6.6-eevdf-complete-1 | -52.48 |
| gfx-avg-frame-time-50 | m6.6-eevdf-complete-2 | -30.32 |
| gfx-avg-frame-time-90 | m6.6-eevdf-complete-2 | -48.16 |
| gfx-avg-frame-time-95 | m6.6-eevdf-complete-2 | -51.08 |
| gfx-avg-frame-time-99 | m6.6-eevdf-complete-2 | -51.7 |
+-----------------------+-----------------------+-----------+
+-----------------------------------+-----------------------+-----------+
| metric | tag | perc_diff |
+-----------------------------------+-----------------------+-----------+
| gfx-avg-num-frame-deadline-missed | m6.6-stock-1 | 0.0 |
| gfx-max-num-frame-deadline-missed | m6.6-stock-1 | 0.0 |
| gfx-avg-num-frame-deadline-missed | m6.6-stock-2 | -3.21 |
| gfx-max-num-frame-deadline-missed | m6.6-stock-2 | -3.21 |
| gfx-avg-num-frame-deadline-missed | m6.6-eevdf-complete-1 | -85.29 |
| gfx-max-num-frame-deadline-missed | m6.6-eevdf-complete-1 | -85.29 |
| gfx-avg-num-frame-deadline-missed | m6.6-eevdf-complete-2 | -84.8 |
| gfx-max-num-frame-deadline-missed | m6.6-eevdf-complete-2 | -84.8 |
+-----------------------------------+-----------------------+-----------+
+----------------------------+-----------------------+-----------+
| metric | tag | perc_diff |
+----------------------------+-----------------------+-----------+
| gfx-avg-high-input-latency | m6.6-stock-1 | 0.0 |
| gfx-max-high-input-latency | m6.6-stock-1 | 0.0 |
| gfx-avg-high-input-latency | m6.6-stock-2 | 0.93 |
| gfx-max-high-input-latency | m6.6-stock-2 | 0.93 |
| gfx-avg-high-input-latency | m6.6-eevdf-complete-1 | -18.35 |
| gfx-max-high-input-latency | m6.6-eevdf-complete-1 | -18.35 |
| gfx-avg-high-input-latency | m6.6-eevdf-complete-2 | -18.05 |
| gfx-max-high-input-latency | m6.6-eevdf-complete-2 | -18.05 |
+----------------------------+-----------------------+-----------+
+--------------+-----------------------+-----------+
| metric | tag | perc_diff |
+--------------+-----------------------+-----------+
| gfx-avg-jank | m6.6-stock-1 | 0.0 |
| gfx-max-jank | m6.6-stock-1 | 0.0 |
| gfx-avg-jank | m6.6-stock-2 | 1.56 |
| gfx-max-jank | m6.6-stock-2 | 1.56 |
| gfx-avg-jank | m6.6-eevdf-complete-1 | -82.81 |
| gfx-max-jank | m6.6-eevdf-complete-1 | -82.81 |
| gfx-avg-jank | m6.6-eevdf-complete-2 | -78.12 |
| gfx-max-jank | m6.6-eevdf-complete-2 | -78.12 |
+--------------+-----------------------+-----------+
Bisecting through the patches in this series, I found patch 08/10 to be
the one that improves things overall for these benchmarks.
I'd like to investigate this further to understand the reason behind some of
these dramatic improvements.
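One way to narrow that down could be to flip the new scheduler features at
runtime between benchmark runs. A minimal helper along those lines is
below -- just a sketch on my side, assuming a CONFIG_SCHED_DEBUG kernel
with debugfs mounted at /sys/kernel/debug, not something taken from the
series:

#include <stdio.h>

/*
 * Toggle a scheduler feature at runtime, e.g. "NO_DELAY_ZERO" to disable
 * the lag clipping or "DELAY_ZERO" to re-enable it.  Needs root and a
 * CONFIG_SCHED_DEBUG kernel exposing the features file in debugfs.
 */
static int set_sched_feat(const char *feat)
{
	FILE *f = fopen("/sys/kernel/debug/sched/features", "w");

	if (!f) {
		perror("sched/features");
		return -1;
	}
	fprintf(f, "%s\n", feat);
	return fclose(f);
}

int main(int argc, char **argv)
{
	/* usage: ./schedfeat NO_DELAY_DEQUEUE|DELAY_DEQUEUE|NO_DELAY_ZERO|... */
	return argc > 1 ? (set_sched_feat(argv[1]) ? 1 : 0) : 1;
}

E.g. running it with NO_DELAY_ZERO before a set of runs, and DELAY_ZERO
afterwards, should show how much of the win comes from the lag clipping
as opposed to the delayed dequeue itself.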
> ---
> include/linux/sched.h | 1
> kernel/sched/core.c | 22 +++++--
> kernel/sched/fair.c | 148 +++++++++++++++++++++++++++++++++++++++++++-----
> kernel/sched/features.h | 12 +++
> kernel/sched/sched.h | 2
> 5 files changed, 167 insertions(+), 18 deletions(-)
>
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -542,6 +542,7 @@ struct sched_entity {
>
> struct list_head group_node;
> unsigned int on_rq;
> + unsigned int sched_delayed;
>
> u64 exec_start;
> u64 sum_exec_runtime;
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2154,10 +2154,14 @@ void activate_task(struct rq *rq, struct
>
> void deactivate_task(struct rq *rq, struct task_struct *p, int flags)
> {
> - WRITE_ONCE(p->on_rq, (flags & DEQUEUE_SLEEP) ? 0 : TASK_ON_RQ_MIGRATING);
> - ASSERT_EXCLUSIVE_WRITER(p->on_rq);
> + bool sleep = flags & DEQUEUE_SLEEP;
>
> - dequeue_task(rq, p, flags);
> + if (dequeue_task(rq, p, flags)) {
> + WRITE_ONCE(p->on_rq, sleep ? 0 : TASK_ON_RQ_MIGRATING);
> + ASSERT_EXCLUSIVE_WRITER(p->on_rq);
> + } else {
> + SCHED_WARN_ON(!sleep); /* only sleep can fail */
> + }
> }
>
> static inline int __normal_prio(int policy, int rt_prio, int nice)
> @@ -3858,12 +3862,17 @@ static int ttwu_runnable(struct task_str
>
> rq = __task_rq_lock(p, &rf);
> if (task_on_rq_queued(p)) {
> + update_rq_clock(rq);
> + if (p->se.sched_delayed) {
> + /* mustn't run a delayed task */
> + SCHED_WARN_ON(task_on_cpu(rq, p));
> + enqueue_task(rq, p, ENQUEUE_DELAYED);
> + }
> if (!task_on_cpu(rq, p)) {
> /*
> * When on_rq && !on_cpu the task is preempted, see if
> * it should preempt the task that is current now.
> */
> - update_rq_clock(rq);
> wakeup_preempt(rq, p, wake_flags);
> }
> ttwu_do_wakeup(p);
> @@ -4243,11 +4252,16 @@ int try_to_wake_up(struct task_struct *p
> * case the whole 'p->on_rq && ttwu_runnable()' case below
> * without taking any locks.
> *
> + * Specifically, given current runs ttwu() we must be before
> + * schedule()'s deactivate_task(), as such this must not
> + * observe sched_delayed.
> + *
> * In particular:
> * - we rely on Program-Order guarantees for all the ordering,
> * - we're serialized against set_special_state() by virtue of
> * it disabling IRQs (this allows not taking ->pi_lock).
> */
> + SCHED_WARN_ON(p->se.sched_delayed);
> if (!ttwu_state_match(p, state, &success))
> goto out;
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5270,6 +5270,10 @@ static inline int cfs_rq_throttled(struc
> static inline bool cfs_bandwidth_used(void);
>
> static void
> +requeue_delayed_entity(struct sched_entity *se);
> +
> +/* XXX bool and pull in the requeue_delayed_entity thing */
> +static void
> enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> {
> bool curr = cfs_rq->curr == se;
> @@ -5356,18 +5360,33 @@ static void clear_buddies(struct cfs_rq
>
> static __always_inline void return_cfs_rq_runtime(struct cfs_rq *cfs_rq);
>
> -static void
> +static bool
> dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> {
> - int action = UPDATE_TG;
> -
> - if (entity_is_task(se) && task_on_rq_migrating(task_of(se)))
> - action |= DO_DETACH;
> -
> /*
> * Update run-time statistics of the 'current'.
> */
> - update_curr(cfs_rq);
> + if (flags & DEQUEUE_DELAYED) {
> + SCHED_WARN_ON(!se->sched_delayed);
> + se->sched_delayed = 0;
> + } else {
> + bool sleep = flags & DEQUEUE_SLEEP;
> +
> + SCHED_WARN_ON(sleep && se->sched_delayed);
> + update_curr(cfs_rq);
> +
> + if (sched_feat(DELAY_DEQUEUE) && sleep &&
> + !entity_eligible(cfs_rq, se)) {
> + if (cfs_rq->next == se)
> + cfs_rq->next = NULL;
> + se->sched_delayed = 1;
> + return false;
> + }
> + }
> +
> + int action = UPDATE_TG;
> + if (entity_is_task(se) && task_on_rq_migrating(task_of(se)))
> + action |= DO_DETACH;
>
> /*
> * When dequeuing a sched_entity, we must:
> @@ -5407,6 +5426,8 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
>
> if (cfs_rq->nr_running == 0)
> update_idle_cfs_rq_clock_pelt(cfs_rq);
> +
> + return true;
> }
>
> static void
> @@ -5432,6 +5453,7 @@ set_next_entity(struct cfs_rq *cfs_rq, s
> }
>
> update_stats_curr_start(cfs_rq, se);
> + SCHED_WARN_ON(cfs_rq->curr);
> cfs_rq->curr = se;
>
> /*
> @@ -5452,6 +5474,8 @@ set_next_entity(struct cfs_rq *cfs_rq, s
> se->prev_sum_exec_runtime = se->sum_exec_runtime;
> }
>
> +static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags);
> +
> /*
> * Pick the next process, keeping these things in mind, in this order:
> * 1) keep things fair between processes/task groups
> @@ -5460,16 +5484,29 @@ set_next_entity(struct cfs_rq *cfs_rq, s
> * 4) do not run the "skip" process, if something else is available
> */
> static struct sched_entity *
> -pick_next_entity(struct cfs_rq *cfs_rq)
> +pick_next_entity(struct rq *rq, struct cfs_rq *cfs_rq)
> {
> /*
> * Enabling NEXT_BUDDY will affect latency but not fairness.
> */
> if (sched_feat(NEXT_BUDDY) &&
> - cfs_rq->next && entity_eligible(cfs_rq, cfs_rq->next))
> + cfs_rq->next && entity_eligible(cfs_rq, cfs_rq->next)) {
> + /* ->next will never be delayed */
> + SCHED_WARN_ON(cfs_rq->next->sched_delayed);
> return cfs_rq->next;
> + }
>
> - return pick_eevdf(cfs_rq);
> + struct sched_entity *se = pick_eevdf(cfs_rq);
> + if (se->sched_delayed) {
> + dequeue_entities(rq, se, DEQUEUE_SLEEP | DEQUEUE_DELAYED);
> + SCHED_WARN_ON(se->sched_delayed);
> + SCHED_WARN_ON(se->on_rq);
> + if (sched_feat(DELAY_ZERO) && se->vlag > 0)
> + se->vlag = 0;
> +
> + return NULL;
> + }
> + return se;
> }
>
> static bool check_cfs_rq_runtime(struct cfs_rq *cfs_rq);
> @@ -5493,6 +5530,7 @@ static void put_prev_entity(struct cfs_r
> /* in !on_rq case, update occurred at dequeue */
> update_load_avg(cfs_rq, prev, 0);
> }
> + SCHED_WARN_ON(cfs_rq->curr != prev);
> cfs_rq->curr = NULL;
> }
>
> @@ -5793,6 +5831,10 @@ static bool throttle_cfs_rq(struct cfs_r
> if (!se->on_rq)
> goto done;
>
> + /*
> + * XXX should be fine vs sched_delay; it won't run after this.
> + * Either pick dequeues it, or unthrottle. Double check!!
> + */
> dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
>
> if (cfs_rq_is_idle(group_cfs_rq(se)))
> @@ -5882,8 +5924,11 @@ void unthrottle_cfs_rq(struct cfs_rq *cf
> for_each_sched_entity(se) {
> struct cfs_rq *qcfs_rq = cfs_rq_of(se);
>
> - if (se->on_rq)
> + if (se->on_rq) {
> + if (se->sched_delayed)
> + requeue_delayed_entity(se);
> break;
> + }
> enqueue_entity(qcfs_rq, se, ENQUEUE_WAKEUP);
>
> if (cfs_rq_is_idle(group_cfs_rq(se)))
> @@ -6729,6 +6774,40 @@ static int sched_idle_cpu(int cpu)
> }
> #endif
>
> +static void
> +requeue_delayed_entity(struct sched_entity *se)
> +{
> + struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +
> + /*
> + * se->sched_delayed should imply both: se->on_rq == 1 and
> + * cfs_rq->curr != se. Because a delayed entity is one that is still on
> + * the runqueue competing until eligibility.
> + *
> + * Except for groups, consider current going idle and newidle pulling a
> + * task in the same group -- in that case 'cfs_rq->curr == se'.
> + */
> + SCHED_WARN_ON(!se->sched_delayed);
> + SCHED_WARN_ON(!se->on_rq);
> + SCHED_WARN_ON(entity_is_task(se) && cfs_rq->curr == se);
> +
> + if (sched_feat(DELAY_ZERO)) {
> + update_entity_lag(cfs_rq, se);
> + if (se->vlag > 0) {
> + cfs_rq->nr_running--;
> + if (se != cfs_rq->curr)
> + __dequeue_entity(cfs_rq, se);
> + se->vlag = 0;
> + place_entity(cfs_rq, se, 0);
> + if (se != cfs_rq->curr)
> + __enqueue_entity(cfs_rq, se);
> + cfs_rq->nr_running++;
> + }
> + }
> +
> + se->sched_delayed = 0;
> +}
> +
> /*
> * The enqueue_task method is called before nr_running is
> * increased. Here we update the fair scheduling stats and
> @@ -6742,6 +6821,11 @@ enqueue_task_fair(struct rq *rq, struct
> int idle_h_nr_running = task_has_idle_policy(p);
> int task_new = !(flags & ENQUEUE_WAKEUP);
>
> + if (flags & ENQUEUE_DELAYED) {
> + requeue_delayed_entity(se);
> + return;
> + }
> +
> /*
> * The code below (indirectly) updates schedutil which looks at
> * the cfs_rq utilization to select a frequency.
> @@ -6759,8 +6843,11 @@ enqueue_task_fair(struct rq *rq, struct
> cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
>
> for_each_sched_entity(se) {
> - if (se->on_rq)
> + if (se->on_rq) {
> + if (se->sched_delayed)
> + requeue_delayed_entity(se);
> break;
> + }
> cfs_rq = cfs_rq_of(se);
> enqueue_entity(cfs_rq, se, flags);
>
> @@ -6836,6 +6923,7 @@ static int dequeue_entities(struct rq *r
> {
> bool was_sched_idle = sched_idle_rq(rq);
> bool task_sleep = flags & DEQUEUE_SLEEP;
> + bool task_delayed = flags & DEQUEUE_DELAYED;
> struct task_struct *p = NULL;
> struct cfs_rq *cfs_rq;
> int idle_h_nr_running;
> @@ -6849,7 +6937,13 @@ static int dequeue_entities(struct rq *r
>
> for_each_sched_entity(se) {
> cfs_rq = cfs_rq_of(se);
> - dequeue_entity(cfs_rq, se, flags);
> +
> + if (!dequeue_entity(cfs_rq, se, flags)) {
> + if (p && &p->se == se)
> + return -1;
> +
> + break;
> + }
>
> + /* h_nr_running is the hierarchical count of tasks */
> if (p) {
> @@ -6877,6 +6971,7 @@ static int dequeue_entities(struct rq *r
> break;
> }
> flags |= DEQUEUE_SLEEP;
> + flags &= ~DEQUEUE_DELAYED;
> }
>
> for_each_sched_entity(se) {
> @@ -6906,6 +7001,18 @@ static int dequeue_entities(struct rq *r
> /* balance early to pull high priority tasks */
> if (unlikely(!was_sched_idle && sched_idle_rq(rq)))
> rq->next_balance = jiffies;
> +
> + if (task_delayed) {
> + SCHED_WARN_ON(!task_sleep);
> + SCHED_WARN_ON(p->on_rq != 1);
> +
> + /* Fix-up what dequeue_task_fair() skipped */
> + util_est_update(&rq->cfs, p, task_sleep);
> + hrtick_update(rq);
> +
> + /* Fix-up what deactivate_task() skipped. */
> + WRITE_ONCE(p->on_rq, 0);
> + }
> }
>
> return 1;
> @@ -6923,6 +7030,10 @@ static bool dequeue_task_fair(struct rq
> if (dequeue_entities(rq, &p->se, flags) < 0)
> return false;
>
> + /*
> + * It doesn't make sense to update util_est for the delayed dequeue
> + * case where ttwu will make it appear the sleep never happened.
> + */
> util_est_update(&rq->cfs, p, flags & DEQUEUE_SLEEP);
> hrtick_update(rq);
> return true;
> @@ -8463,7 +8574,9 @@ static struct task_struct *pick_task_fai
> if (unlikely(check_cfs_rq_runtime(cfs_rq)))
> goto again;
>
> - se = pick_next_entity(cfs_rq);
> + se = pick_next_entity(rq, cfs_rq);
> + if (!se)
> + goto again;
> cfs_rq = group_cfs_rq(se);
> } while (cfs_rq);
>
> @@ -12803,10 +12916,17 @@ static void attach_task_cfs_rq(struct ta
> static void switched_from_fair(struct rq *rq, struct task_struct *p)
> {
> detach_task_cfs_rq(p);
> + // XXX think harder on this
> + // this could actually be handled correctly I suppose; keep the whole
> + // se enqueued while boosted. horrible complexity though
> + p->se.sched_delayed = 0;
> + // XXX also vlag ?!?
> }
>
> static void switched_to_fair(struct rq *rq, struct task_struct *p)
> {
> + SCHED_WARN_ON(p->se.sched_delayed);
> +
> attach_task_cfs_rq(p);
>
> set_task_max_allowed_capacity(p);
> --- a/kernel/sched/features.h
> +++ b/kernel/sched/features.h
> @@ -29,6 +29,18 @@ SCHED_FEAT(NEXT_BUDDY, false)
> SCHED_FEAT(CACHE_HOT_BUDDY, true)
>
> /*
> + * Delay dequeueing tasks until they get selected or woken.
> + *
> + * By delaying the dequeue for non-eligible tasks, they remain in the
> + * competition and can burn off their negative lag. When they get selected
> + * they'll have positive lag by definition.
> + *
> + * DELAY_ZERO clips the lag on dequeue (or wakeup) to 0.
> + */
> +SCHED_FEAT(DELAY_DEQUEUE, true)
> +SCHED_FEAT(DELAY_ZERO, true)
> +
> +/*
> * Allow wakeup-time preemption of the current task:
> */
> SCHED_FEAT(WAKEUP_PREEMPTION, true)
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2245,6 +2245,7 @@ extern const u32 sched_prio_to_wmult[40
> #define DEQUEUE_MOVE 0x04 /* Matches ENQUEUE_MOVE */
> #define DEQUEUE_NOCLOCK 0x08 /* Matches ENQUEUE_NOCLOCK */
> #define DEQUEUE_MIGRATING 0x100 /* Matches ENQUEUE_MIGRATING */
> +#define DEQUEUE_DELAYED 0x200 /* Matches ENQUEUE_DELAYED */
>
> #define ENQUEUE_WAKEUP 0x01
> #define ENQUEUE_RESTORE 0x02
> @@ -2260,6 +2261,7 @@ extern const u32 sched_prio_to_wmult[40
> #endif
> #define ENQUEUE_INITIAL 0x80
> #define ENQUEUE_MIGRATING 0x100
> +#define ENQUEUE_DELAYED 0x200
>
> #define RETRY_TASK ((void *)-1UL)
>
>
>
>