Date:	Tue, 3 Feb 2009 08:40:56 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Lee Schermerhorn <Lee.Schermerhorn@...com>
cc:	Hugh Dickins <hugh@...itas.com>, Greg KH <gregkh@...e.de>,
	Maksim Yevmenkin <maksim.yevmenkin@...il.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Nick Piggin <npiggin@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	will@...wder-design.com, Rik van Riel <riel@...hat.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Miklos Szeredi <miklos@...redi.hu>
Subject: Re: [PATCH] Fix OOPS in mmap_region() when merging adjacent VM_LOCKED
 file segments



On Tue, 3 Feb 2009, Lee Schermerhorn wrote:
> 
> This reminded me of something I'd seen recently looking
> at /proc/<pid>/[numa]_maps for <a large commercial database> on
> Linux/x86_64: 
> 
> 2adadf2b9000-2adadf2c0000 rwxp 00000000 00:0e 4072                       /dev/zero
> 
> For portability between Linux and various Unix-like systems that don't
> support MAP_ANON*, perhaps?

Odd. 

At first I thought that it is just that Linux will turn a MAP_SHARED | 
MAP_ANON into that /dev/zero thing, so you won't be able to tell by looking 
at /proc/maps. So it would be very possible that the application did not 
actually open /dev/zero at all, and used MAP_ANON instead (see the whole 
shmem_zero_setup() and shmem_file_setup() thing).

But those mappings have that 'p' for private there, so it's not 
MAP_SHARED. And yes, that means that your large commercial database really 
did open /dev/zero and mapped it privately. They must be living in the 
past.
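
To make the two options concrete, here is a minimal userspace sketch of 
both ways to get private zero-filled memory. The helper names 
map_anon_private() and map_dev_zero_private() are made up for 
illustration; only the mmap(), open() and close() calls are real. It 
assumes a Linux-like system where MAP_ANONYMOUS is available and 
/dev/zero exists.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: private anonymous memory, no file involved.
 * Returns MAP_FAILED on error, like mmap() itself.
 */
void *map_anon_private(size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/*
 * Hypothetical helper: the older idiom, a private mapping of /dev/zero
 * at file offset 0. This is what shows up in /proc/<pid>/maps as a
 * 'p' mapping of /dev/zero with offset 00000000.
 */
void *map_dev_zero_private(size_t len)
{
    int fd = open("/dev/zero", O_RDWR);
    void *p;

    if (fd < 0)
        return MAP_FAILED;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping remains valid after the fd is closed */
    return p;
}
```

Both helpers hand back zero-filled, writable, private pages. The 
difference that matters for this thread is the file and offset that the 
kernel records for the mapping.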

> Anyway, from the addresses and permissions, these all look potentially
> mergeable.  The offset is preventing merging, right?  I guess that's one
> of the downsides of mapping /dev/zero rather than using MAP_ANONYMOUS?

Yeah. The MAP_ANON code has a total hack:

                case MAP_PRIVATE:
                        /*
                         * Set pgoff according to addr for anon_vma.
                         */
                        pgoff = addr >> PAGE_SHIFT;
                        break;

where the whole point is to allow sharing: since pgoff doesn't matter, we 
can make it be something that will merge _if_ you don't play games (of 
course, if you then start using mremap to move things around, that all 
breaks, and you lose the merging ;)
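
The effect of that pgoff choice can be sketched with a toy model. This is 
not the kernel's merge code: struct toy_vma, toy_anon_pgoff() and 
toy_can_merge() are hypothetical names, 4 KiB pages are assumed, and the 
real checks also compare the file, flags and anon_vma. It only models the 
offset condition: two adjacent areas can merge when the second one's page 
offset continues exactly where the first one's ends.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Hypothetical, stripped-down stand-in for a VMA. */
struct toy_vma {
    uint64_t start;  /* first byte of the mapping */
    uint64_t end;    /* one past the last byte */
    uint64_t pgoff;  /* offset recorded for the mapping, in pages */
};

/* Mirrors the quoted hack: private anonymous pgoff follows the address. */
uint64_t toy_anon_pgoff(uint64_t addr)
{
    return addr >> TOY_PAGE_SHIFT;
}

/*
 * Offset part of the merge rule only: the areas must be adjacent in
 * address space and contiguous in page offset.
 */
bool toy_can_merge(const struct toy_vma *prev, const struct toy_vma *next)
{
    uint64_t prev_pages = (prev->end - prev->start) >> TOY_PAGE_SHIFT;

    return prev->end == next->start &&
           prev->pgoff + prev_pages == next->pgoff;
}
```

With the 7-page mapping from the quoted maps line (2adadf2b9000-2adadf2c0000) 
and a second, made-up mapping starting right at 2adadf2c0000: if both are 
/dev/zero mappings at offset 0, the second has pgoff 0 rather than 7, so the 
check fails. If both are anonymous, the pgoffs are 0x2adadf2b9 and 
0x2adadf2c0, which differ by exactly 7 pages, so the check passes. Moving 
one of them with mremap changes its address but not its pgoff, which is 
why the merging is lost in that case.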

That said, if it's just a hundred segments, nobody really cares. It's 
going to make vma lookup fractionally slower, but not so anybody would 
likely ever notice even in benchmarks. And if it's just this one db, it's 
certainly not going to use any noticeable amount of memory either.

Merging is important, but it's important for the _really_ common 
cases, and to make /proc/maps more readable etc. It's not like it matters 
for the occasional crazy setup.

But you could still try to teach the DB people to use MAP_ANON.

			Linus