Message-ID: <496D6300.9070402@kernel.org>
Date:	Wed, 14 Jan 2009 12:58:56 +0900
From:	Tejun Heo <tj@...nel.org>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
CC:	Christoph Lameter <cl@...ux-foundation.org>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Ingo Molnar <mingo@...e.hu>, travis@....com,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>, steiner@....com,
	Hugh Dickins <hugh@...itas.com>
Subject: Re: regarding the x86_64 zero-based percpu patches

Hello, Eric.

Eric W. Biederman wrote:
> Tejun Heo <tj@...nel.org> writes:
>> I don't know.  I think it's a dangerous thing which can be avoided.
>> If there's no other solution, then we might have to live with it,
>> but I don't see the winning benefit of such a design over a per-cpu
>> virtual mapping.
> 
> It isn't incompatible with a per-cpu virtual mapping.  It allows the
> possibility of each cpu reusing the same chunk of virtual address
> space for per cpu memory.
> 
> On x86_64 and other architectures with enough address-space bits, it
> allows us to share the large pages that we use for the normal memory
> mapping with the ones used for per-cpu access.
> 
> I definitely think the work of combining the pda and the percpu areas
> into a common area is worthwhile.

Yeah, it's gonna be necessary regardless of which way we go.
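
To make the offset arithmetic of a combined area concrete, here's a
minimal userland sketch (names and sizes are made up; this is not the
actual kernel code, just the congruence math):

    /* Illustrative only: each cpu gets a congruent copy of the percpu
     * data, so a single section offset plus a per-cpu base yields the
     * right address on every cpu. */
    #include <stdlib.h>

    #define NR_CPUS_DEMO 4                    /* made-up cpu count */
    #define UNIT_SIZE    (64UL * 1024)        /* made-up unit size */

    static char *percpu_base[NR_CPUS_DEMO];

    /* one contiguous block, carved into congruent per-cpu units */
    static int percpu_init(void)
    {
            char *block = malloc(NR_CPUS_DEMO * UNIT_SIZE);

            if (!block)
                    return -1;
            for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
                    percpu_base[cpu] = block + cpu * UNIT_SIZE;
            return 0;
    }

    /* @offset is what the linker would assign to a percpu variable;
     * the same offset is valid on every cpu -- that's the congruence
     * requirement discussed below. */
    static void *per_cpu_ptr_demo(size_t offset, int cpu)
    {
            return percpu_base[cpu] + offset;
    }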

> I think it would be nice if the percpu area could grow and would not
> be a fixed size at boot time, but I'm not particularly convinced it
> has to.

The main problem is that the per-cpu areas need to be congruent,
which basically mandates that they be contiguous.  The three
alternatives on the table are...

1. Just reserve memory from the get-go.  Simplest.  No additional TLB
   pressure, but memory is likely to be wasted and, more importantly,
   scalability suffers.

2. Reserve address space and map memory as necessary.  We can be much
   more generous about reserving address space, especially on 64bit
   machines, and there we can probably mostly forget about the
   scalability issue (see the sketch after this list).  However,
   getting things just right for address-space constrained 32bit might
   not be easy, but then again nothing really is scalable on 32bit
   these days, so we can probably live with a boot-time parameter or
   something.

   Another issue is the added TLB pressure, as this scheme is likely
   to consume 4K TLB entries in addition to the 2M TLB entries of the
   default kernel mapping.  That pressure could mostly be avoided if
   the percpu area were large enough to justify a 2MB page allocation,
   but it isn't.

3. Do realloc().  This doesn't impose scalability issues or add to TLB
   pressure, but it does constrain how percpu variables can be used
   and introduces a certain possibility of scary, once-in-a-blue-moon,
   never-reproducible bugs.  Maybe that possibility can be reduced by
   putting some restrictions on the interface, but I don't know.  It
   still scares me.
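
For reference, here's a userland analogue of #2 (mmap/mprotect stand
in for populating a reserved vmalloc-style area; the size is
arbitrary):

    /* Userland analogue of alternative #2: reserve address space
     * cheaply up front, commit backing pages only as the percpu
     * allocator actually grows.  Illustrative only. */
    #include <sys/mman.h>
    #include <stddef.h>

    #define RESERVE_SIZE (256UL * 1024 * 1024)  /* generous on 64bit */

    static char *area;
    static size_t mapped;            /* bytes actually backed so far */

    static int percpu_reserve(void)
    {
            /* PROT_NONE: consumes address space, not memory */
            area = mmap(NULL, RESERVE_SIZE, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            return area == MAP_FAILED ? -1 : 0;
    }

    /* back more of the area with real pages; each newly touched 4k
     * page is where the extra TLB entries come from */
    static int percpu_grow(size_t new_size)  /* page-aligned, > mapped */
    {
            if (mprotect(area + mapped, new_size - mapped,
                         PROT_READ | PROT_WRITE))
                    return -1;
            mapped = new_size;
            return 0;
    }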

Hmm... IIUC, the biggest drawback of #2 is the added TLB pressure,
right?  What if we reserve percpu allocations in 2MB chunks?  I.e.,
use 4k mappings but always allocate the percpu pages from aligned 2MB
chunks.  That way it won't waste 2MB per cpu, and although it will use
additional 4K TLB entries, it will free up 2MB TLB entries.
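
Roughly, as a sketch (helper names are made up, illustrative only):

    /* Sketch of the aligned-chunk idea: back the percpu area with
     * 2MB-aligned, 2MB-sized chunks.  With 4k mappings each chunk
     * costs extra 4K TLB entries but displaces exactly one 2M entry
     * in the linear kernel mapping, rather than fragmenting many. */
    #include <stdlib.h>
    #include <stddef.h>

    #define CHUNK_SIZE   (2UL * 1024 * 1024)  /* 2MB */
    #define PAGE_SIZE_4K 4096UL

    static char *cur_chunk;
    static size_t cur_used = CHUNK_SIZE;      /* force initial refill */

    /* hand out one 4k page at a time from the current aligned chunk */
    static void *percpu_alloc_page(void)
    {
            if (cur_used == CHUNK_SIZE) {
                    /* aligned_alloc guarantees 2MB alignment (C11) */
                    cur_chunk = aligned_alloc(CHUNK_SIZE, CHUNK_SIZE);
                    if (!cur_chunk)
                            return NULL;
                    cur_used = 0;
            }
            cur_used += PAGE_SIZE_4K;
            return cur_chunk + cur_used - PAGE_SIZE_4K;
    }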

Thanks.

-- 
tejun
