[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200819133320.bxwb3ikjswyhmsyg@linutronix.de>
Date: Wed, 19 Aug 2020 15:33:20 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: peterz@...radead.org
Cc: linux-kernel@...r.kernel.org, io-uring@...r.kernel.org,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Jens Axboe <axboe@...nel.dk>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC PATCH] sched: Invoke io_wq_worker_sleeping() with enabled
preemption
On 2020-08-19 15:15:07 [+0200], peterz@...radead.org wrote:
> > - if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
> > + if (tsk->flags & PF_WQ_WORKER) {
> > preempt_disable();
> > - if (tsk->flags & PF_WQ_WORKER)
> > - wq_worker_sleeping(tsk);
> > - else
> > - io_wq_worker_sleeping(tsk);
> > + wq_worker_sleeping(tsk);
> > preempt_enable_no_resched();
> > }
> >
> > if (tsk_is_pi_blocked(tsk))
> > return;
> >
> > + if (tsk->flags & PF_IO_WORKER)
> > + io_wq_worker_sleeping(tsk);
> > +
>
> Urgh, so this adds a branch in what is normally considered a fairly hot
> path.
>
> I'm thinking that the raw_spinlock_t option would permit leaving that
> single:
>
> if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER))
>
> branch intact?
The compiler generates code to test for both flags at once. If none of
both possible flags are set then there is one branch (get out and bring
me to tst_is_pi…).
And yes, with raw_spinlock_t we could keep that one branch.
If you want to optimize further, we could move PF_IO_WORKER to an lower
bit. x86 can test for both via
(gcc-10)
| testl $536870944, 44(%rbp) #, _11->flags
| jne .L1635 #,
(clang-9)
| testl $536870944, 44(%rbx) # imm = 0x20000020
| je .LBB112_6
but ARM can't and does
| ldr r1, [r5, #16] @ tsk_3->flags, tsk_3->flags
| mov r2, #32 @ tmp157,
| movt r2, 8192 @ tmp157,
| tst r2, r1 @ tmp157, tsk_3->flags
| beq .L998 @,
same ARM64
| ldr w0, [x20, 60] //, _11->flags
| and w0, w0, 1073741792 // tmp117, _11->flags,
| and w0, w0, -536870849 // tmp117, tmp117,
| cbnz w0, .L453 // tmp117,
using 0x10 for PF_IO_WORKER instead will turn this into:
| ldr w0, [x20, 60] //, _11->flags
| tst w0, 48 // _11->flags,
| bne .L453 //,
ARM:
| ldr r2, [r5, #16] @ tsk_3->flags, tsk_3->flags
| tst r2, #48 @ tsk_3->flags,
| beq .L998 @,
Sebastian
Powered by blists - more mailing lists