Message-ID: <20130814160632.GJ24092@twins.programming.kicks-ass.net>
Date:	Wed, 14 Aug 2013 18:06:32 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Mike Galbraith <bitbucket@...ine.de>
Cc:	"H. Peter Anvin" <hpa@...or.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...nel.org>,
	Andi Kleen <ak@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org
Subject: Re: [RFC][PATCH 0/5] preempt_count rework

On Wed, Aug 14, 2013 at 05:39:11PM +0200, Mike Galbraith wrote:
> On Wed, 2013-08-14 at 06:47 -0700, H. Peter Anvin wrote:
> 
> > On x86, you never want to take the address of a percpu variable if you
> > can avoid it, as you end up generating code like:
> > 
> > 	movq %fs:0,%rax
> > 	subl $1,(%rax)
> 
> Hmmm..
> 
> #define cpu_rq(cpu)             (&per_cpu(runqueues, (cpu)))
> #define this_rq()               (&__get_cpu_var(runqueues))
> 
> ffffffff81438c7f:       48 c7 c3 80 11 01 00    mov    $0x11180,%rbx
>         /*
>          * this_rq must be evaluated again because prev may have moved
>          * CPUs since it called schedule(), thus the 'rq' on its stack
>          * frame will be invalid.
>          */
>         finish_task_switch(this_rq(), prev);
> ffffffff81438c86:       e8 25 b4 c0 ff          callq  ffffffff810440b0 <finish_task_switch>
>                  * The context switch have flipped the stack from under us
>                  * and restored the local variables which were saved when
>                  * this task called schedule() in the past. prev == current
>                  * is still correct, but it can be moved to another cpu/rq.
>                  */
>                 cpu = smp_processor_id();
> ffffffff81438c8b:       65 8b 04 25 b8 c5 00    mov    %gs:0xc5b8,%eax
> ffffffff81438c92:       00
>                 rq = cpu_rq(cpu);
> ffffffff81438c93:       48 98                   cltq
> ffffffff81438c95:       48 03 1c c5 00 f3 bb    add    -0x7e440d00(,%rax,8),%rbx
> 
> ..so could the rq = cpu_rq(cpu) sequence be improved cycle expenditure
> wise by squirreling rq pointer away in a percpu this_rq, and replacing
> cpu_rq(cpu) above with a __this_cpu_read(this_rq) version of this_rq()?

Well, this_rq() should already get you that. The above code sucks for
using cpu_rq() when we know cpu == smp_processor_id().
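
To make the difference concrete, here is a rough userspace sketch in C,
not kernel code. Every model_* name, the per_cpu_offset[] table and
this_cpu_off are stand-ins invented for illustration; they only mimic
the shape of the real percpu machinery. model_cpu_rq() pays for an
offset-table lookup indexed by cpu (the scaled-index add in the
disassembly above), while model_this_rq() reads one value that already
belongs to the current CPU.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

struct rq {
	int cpu;
	unsigned long nr_running;
};

/* One runqueue per simulated CPU; stands in for the percpu area. */
static struct rq runqueues[NR_CPUS];

/* Hypothetical stand-in for a per-CPU offset table. */
static size_t per_cpu_offset[NR_CPUS];

/* Hypothetical stand-ins for state the real code reaches via %gs. */
static int current_cpu;
static size_t this_cpu_off;

static void model_setup(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		runqueues[cpu].cpu = cpu;
		runqueues[cpu].nr_running = 0;
		per_cpu_offset[cpu] = (size_t)cpu * sizeof(struct rq);
	}
}

/* Pretend the running task now executes on 'cpu'. */
static void model_migrate_to(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	current_cpu = cpu;
	this_cpu_off = per_cpu_offset[cpu];
}

static int model_smp_processor_id(void)
{
	return current_cpu;
}

/* Generic path: read the CPU number, then index the offset table. */
static struct rq *model_cpu_rq(int cpu)
{
	return (struct rq *)((char *)runqueues + per_cpu_offset[cpu]);
}

/* Local path: use the current CPU's offset directly, no table lookup. */
static struct rq *model_this_rq(void)
{
	return (struct rq *)((char *)runqueues + this_cpu_off);
}

/* Both paths must yield the same rq whenever cpu == smp_processor_id(). */
static int model_count_mismatches(void)
{
	int mismatches = 0;

	model_setup();
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		model_migrate_to(cpu);
		if (model_cpu_rq(model_smp_processor_id()) != model_this_rq())
			mismatches++;
	}
	return mismatches;
}
```

Both paths return the same pointer on the local CPU, so the only
difference in this model is the extra indexed load that the cpu_rq()
form performs.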