Message-ID: <87bnebk51f.fsf@rasmusvillemoes.dk>
Date:	Thu, 13 Aug 2015 09:42:52 +0200
From:	Rasmus Villemoes <linux@...musvillemoes.dk>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: get_vmalloc_info() and /proc/meminfo insanely expensive

On Thu, Aug 13 2015, Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Wed, Aug 12, 2015 at 9:00 PM, Andrew Morton
> <akpm@...ux-foundation.org> wrote:
>>
>> Do your /proc/meminfo vmalloc numbers actually change during that build?
>> Mine don't.  Perhaps we can cache the most recent vmalloc_info and
>> invalidate that cache whenever someone does a vmalloc/vfree/etc.
>
> Sure, that works too.
>
> Looking at that mm/vmalloc.c file, the locking is pretty odd. It looks
> pretty strange in setup_vmalloc_vm(), for example. If that newly
> allocated "va" that we haven't even exposed to anybody yet has its
> address or size changed, we're screwed in so many ways.
>
> I get the feeling this file should be rewritten. But that's not going
> to happen. The "let's just cache the last value for one jiffy" seemed
> to be the minimal fixup to it.
>
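
To make the "cache the last value for one jiffy" idea concrete, I'd
imagine something like the sketch below (invented names, not an actual
patch; calc_vmalloc_info() stands in for the current slow walk over
the vmap area list):

static DEFINE_SPINLOCK(vmap_info_lock);
static struct vmalloc_info vmap_info_cache;
static unsigned long vmap_info_stamp;
static bool vmap_info_valid;

void get_vmalloc_info(struct vmalloc_info *vmi)
{
	spin_lock(&vmap_info_lock);
	/* Redo the expensive walk at most once per jiffy. */
	if (!vmap_info_valid || vmap_info_stamp != jiffies) {
		calc_vmalloc_info(&vmap_info_cache);
		vmap_info_stamp = jiffies;
		vmap_info_valid = true;
	}
	*vmi = vmap_info_cache;
	spin_unlock(&vmap_info_lock);
}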

I think it's simpler and better to fix glibc. Looking at the history,
the code for get_[av]phys_pages was added to glibc in 1996 (commit
845dcb57 in the git repo), with comments such as

 /* Return the number of pages of physical memory in the system.  There
    is currently (as of version 2.0.21) no system call to determine the
    number.  It is planned for the 2.1.x series to add this, though.

and

    /* XXX Here will come a test for the new system call.  */
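
The fallback behind those comments is to open and parse /proc/meminfo
on every call. Roughly (paraphrased, not the actual glibc source):

#include <stdio.h>
#include <unistd.h>

/* The shape of the glibc fallback: open /proc/meminfo and scan for
   the MemTotal: line on every single call. */
static long phys_pages_from_meminfo(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[128];
	unsigned long kb = 0;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f))
		if (sscanf(line, "MemTotal: %lu kB", &kb) == 1)
			break;
	fclose(f);
	/* page size is a power of two >= 1024, so this is exact */
	return (long)(kb / (sysconf(_SC_PAGESIZE) / 1024));
}

So every call pays for a file open plus the kernel generating all of
/proc/meminfo, including the get_vmalloc_info() walk this thread is
about.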


And that system call seems to be sysinfo(). So even though sysinfo()
returns much more information than is needed here, it is still more
than an order of magnitude faster than opening and parsing
/proc/meminfo, at least in the quick microbenchmark I threw together:

#include <stdio.h>
#include <sys/sysinfo.h>
#include <x86intrin.h>

/* Read the TSC via the gcc/clang intrinsic. */
static inline unsigned long rdtsc(void)
{
	return __rdtsc();
}

void do_get_phys_pages(void)
{
	get_phys_pages();
}
void do_get_avphys_pages(void)
{
	get_avphys_pages();
}


void do_sysinfo(void)
{
	struct sysinfo info;

	sysinfo(&info);
}

void time_this(const char *name, void (*f)(void), int rep)
{
	int i;
	unsigned long start, stop;

	start = rdtsc();
	for (i = 0; i < rep; ++i)
		f();
	stop = rdtsc();

	printf("%-20s\t%d\t%lu\t%.1f\n", name, rep, stop-start, (double)(stop-start)/rep);
}

/* Stringify the name and dispatch to the matching do_ wrapper; inside
   the expansion, time_this refers to the function above. */
#define time_this(f, rep) time_this(#f, do_ ## f, rep)

int main(void)
{
	time_this(sysinfo, 1);
	time_this(get_phys_pages, 1);
	time_this(get_avphys_pages, 1);

	time_this(sysinfo, 1);
	time_this(get_phys_pages, 1);
	time_this(get_avphys_pages, 1);

	time_this(sysinfo, 10);
	time_this(get_phys_pages, 10);
	time_this(get_avphys_pages, 10);
	
	return 0;
}

$ ./sysinfo 
sysinfo                 1       6056    6056.0
get_phys_pages          1       226744  226744.0
get_avphys_pages        1       84480   84480.0
sysinfo                 1       2288    2288.0
get_phys_pages          1       73216   73216.0
get_avphys_pages        1       76692   76692.0
sysinfo                 10      6856    685.6
get_phys_pages          10      626936  62693.6
get_avphys_pages        10      604440  60444.0
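
(Columns: name, repetitions, total TSC cycles, cycles per call.)

On the glibc side the replacement could then be as simple as something
like this (a sketch, not a proposed patch; the mem_unit conversion is
my reading of sysinfo(2), and 32-bit overflow is ignored):

#include <sys/sysinfo.h>
#include <unistd.h>

/* totalram/freeram are counted in units of mem_unit bytes, so
   convert to pages explicitly. */
long phys_pages_via_sysinfo(void)
{
	struct sysinfo info;

	if (sysinfo(&info) != 0)
		return -1;
	return (long)(info.totalram * info.mem_unit / sysconf(_SC_PAGESIZE));
}

long avphys_pages_via_sysinfo(void)
{
	struct sysinfo info;

	if (sysinfo(&info) != 0)
		return -1;
	return (long)(info.freeram * info.mem_unit / sysconf(_SC_PAGESIZE));
}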

Rasmus
