linux-kernel - Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount


Message-ID: <20130903191950.GC30757@gmail.com>
Date:	Tue, 3 Sep 2013 21:19:50 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Al Viro <viro@...iv.linux.org.uk>,
	Sedat Dilek <sedat.dilek@...il.com>,
	Waiman Long <waiman.long@...com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Jeff Layton <jlayton@...hat.com>,
	Miklos Szeredi <mszeredi@...e.cz>,
	Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Andi Kleen <andi@...stfloor.org>,
	"Chandramouleeswaran, Aswin" <aswin@...com>,
	"Norton, Scott J" <scott.norton@...com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Arnaldo Carvalho de Melo <acme@...radead.org>
Subject: Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless
 update of refcount


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Tue, Sep 3, 2013 at 8:41 AM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > I've done that, and it matches the PEBS runs, except obviously with
> > the instruction skew (so then depending on run it's 95% the
> > instruction after the xadd). So the PEBS profiles are entirely
> > consistent with other data.
> 
> So one thing that strikes me about our lg-locks is that they are 
> designed to be cheap, but they force this insane 3-deep memory access 
> chain to lock them.
> 
> That may be a large part of why lg_local_lock shows up so clearly on my 
> profiles: the single "lock xadd" instruction ends up not just being 
> serializing, but it is what actually consumes the previous memory reads.
> 
> The core of the lg_local_lock sequence ends up being this 
> four-instruction sequence:
> 
>     mov    (%rdi),%rdx
>     add    %gs:0xcd48,%rdx
>     mov    $0x100,%eax
>     lock   xadd   %ax,(%rdx)
> 
> and that's a nasty chain of dependent memory loads. First we load the 
> percpu address, then we add the percpu offset to that, and then we do 
> the xadd on the result.
> 
> It's kind of sad, because in *theory* we could get rid of that whole 
> thing entirely, and just do it as one single
> 
>     mov    $0x100,%eax
>     lock xadd %ax,%gs:vfsmount_lock
> 
> that only has one single memory access, not three dependent ones.
> 
> But the two extra memory accesses come from:
> 
>  - the lglock data structure isn't a percpu data structure, it's this 
> stupid global data structure that has a percpu pointer in it.  So that 
> first "mov (%rdi),%rdx" is purely to load what is effectively a constant 
> address (per lglock).
> 
>    And that's not because it wants to be, but because we associate 
> global lockdep data with it. Ugh. If it wasn't for that, we could just 
> make them percpu.

I don't think that's fundamental - the per-CPU lock itself was percpu before:

 #define DEFINE_LGLOCK(name)                                            \
-                                                                       \
- DEFINE_SPINLOCK(name##_cpu_lock);                                     \
- DEFINE_PER_CPU(arch_spinlock_t, name##_lock);                         \
- DEFINE_LGLOCK_LOCKDEP(name);                                          \


but AFAICS got converted to a pointer via this commit:

 commit eea62f831b8030b0eeea8314eed73b6132d1de26
 Author: Andi Kleen <ak@...ux.intel.com>
 Date:   Tue May 8 13:32:24 2012 +0930

    brlocks/lglocks: turn into functions
    
    lglocks and brlocks are currently generated with some complicated 
    macros in lglock.h.  But there's no reason to not just use common 
    utility functions and put all the data into a common data structure.
    
    Since there are at least two users it makes sense to share this code 
    in a library.  This is also easier maintainable than a macro forest.
    
    This will also make it later possible to dynamically allocate lglocks 
    and also use them in modules (this would both still need some 
    additional, but now straightforward, code)

Which was a rather misguided premise IMHO.
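
To illustrate the difference between the two layouts, here is a minimal userspace sketch, not kernel code. It assumes a fixed NR_CPUS, a made-up fake_spinlock_t standing in for arch_spinlock_t, and an explicit cpu argument in place of the %gs-relative percpu offset; all names in it are hypothetical. The 0x100 increment only mirrors the constant in the quoted assembly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical fixed CPU count for this sketch */

/* Hypothetical stand-in for arch_spinlock_t: a single 16-bit counter. */
typedef struct {
	atomic_ushort slock;
} fake_spinlock_t;

/*
 * Layout A (roughly the post-conversion shape): a global structure that
 * only holds a pointer to the per-CPU storage. Taking the lock first has
 * to load that pointer, even though it never changes for a given lock.
 */
struct lglock_ptr {
	fake_spinlock_t *lock;
};

static fake_spinlock_t indirect_storage[NR_CPUS];
static struct lglock_ptr indirect = { .lock = indirect_storage };

/*
 * Layout B (roughly the pre-conversion shape): the per-CPU storage is a
 * named object, so its base address is a link-time constant and no
 * pointer load is needed.
 */
static fake_spinlock_t direct_lock[NR_CPUS];

/* Layout A: pointer load, then per-CPU indexing, then the atomic add. */
static unsigned short lock_via_pointer(struct lglock_ptr *lg, int cpu)
{
	fake_spinlock_t *base = lg->lock;	/* the extra dependent load */

	return atomic_fetch_add(&base[cpu].slock, 0x100);
}

/* Layout B: per-CPU indexing on a constant base, then the atomic add. */
static unsigned short lock_direct(int cpu)
{
	return atomic_fetch_add(&direct_lock[cpu].slock, 0x100);
}
```

Both helpers return the counter value seen before the increment. The only point of the sketch is that lock_via_pointer() must read lg->lock before it can compute the address for the atomic operation, while lock_direct() can form that address from a constant.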

Thanks,

	Ingo