Message-ID: <Pine.LNX.4.64.0902031150110.5290@blonde.anvils>
Date:	Tue, 3 Feb 2009 12:18:28 +0000 (GMT)
From:	Hugh Dickins <hugh@...itas.com>
To:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
cc:	Pekka Enberg <penberg@...helsinki.fi>,
	Nick Piggin <npiggin@...e.de>,
	Linux Memory Management List <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Lin Ming <ming.m.lin@...el.com>,
	Christoph Lameter <cl@...ux-foundation.org>
Subject: Re: [patch] SLQB slab allocator

On Tue, 3 Feb 2009, Zhang, Yanmin wrote:
> On Mon, 2009-02-02 at 11:00 +0200, Pekka Enberg wrote:
> > On Mon, 2009-02-02 at 11:38 +0800, Zhang, Yanmin wrote:
> > > Can we add a check on the number/percentage of free memory pages in
> > > function allocate_slab, so that we can bypass the first try of
> > > alloc_pages when memory is tight?
> > 
> > If the check isn't too expensive, I don't see any reason not to. How
> > would you go about checking how many free pages there are, though? Is
> > there something in the page allocator that we can use for this?
> 
> We can use nr_free_pages(), totalram_pages and hugetlb_total_pages(). The
> patch below is an attempt. I tested it with hackbench and tbench on my
> stoakley (2 quad-core processors) and tigerton (4 quad-core processors)
> machines. There is almost no regression.
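
As a rough illustration of the sort of check under discussion (not the
patch itself, which is not quoted here), a user-space model might look
like the sketch below. The struct, the function name and the
min_free_percent tunable are all hypothetical; in the kernel the three
inputs would come from nr_free_pages(), totalram_pages and
hugetlb_total_pages().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the three counters named in the quoted text.
 * These are plain numbers here; nothing is read from a real kernel. */
struct mem_snapshot {
	unsigned long free_pages;	/* stands in for nr_free_pages() */
	unsigned long total_pages;	/* stands in for totalram_pages */
	unsigned long hugetlb_pages;	/* stands in for hugetlb_total_pages() */
};

/* Return true if the first, higher-order allocation attempt looks
 * worthwhile, i.e. free memory is at least min_free_percent of the
 * non-hugetlb memory. min_free_percent is an assumed tunable. */
bool try_high_order_first(const struct mem_snapshot *m,
			  unsigned int min_free_percent)
{
	unsigned long usable;

	assert(m->hugetlb_pages <= m->total_pages);
	usable = m->total_pages - m->hugetlb_pages;

	/* free / usable >= percent / 100, written without a division */
	return m->free_pages * 100UL >= usable * (unsigned long)min_free_percent;
}
```

A caller modelling allocate_slab would skip straight to the minimum
order whenever this returns false. Whether a free-page percentage is
the right signal at all is exactly what is questioned below.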

May I repeat what I said yesterday?  Certainly I'm oversimplifying,
but if I'm plain wrong, please correct me.

Having lots of free memory is a temporary accident following process
exit (when lots of anonymous memory has suddenly been freed), before
it has been put to use for page cache.  The kernel tries to run with
a certain amount of free memory in reserve, and the rest of memory
put to (potentially) good use.  I don't think we have the number
you're looking for there, though perhaps some approximation could
be devised (or I'm looking at the problem the wrong way round).

Perhaps feedback from vmscan.c, on how much it's having to write back,
would provide a good clue.  There's plenty of stats maintained there.
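
A minimal sketch of what such feedback might look like, modelled in
user space: sample a cumulative "pages written back by reclaim" counter
twice and treat a large delta as a sign of pressure. The struct, the
function name and the max_written threshold are hypothetical, not
existing vmscan.c interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cumulative reclaim counter, sampled at some point in
 * time. In a real kernel this would be fed from reclaim statistics. */
struct reclaim_sample {
	unsigned long pages_written;	/* pages reclaim had to write back */
};

/* Return true if reclaim wrote back more than max_written pages between
 * the two samples, suggesting higher orders should not be tried first. */
bool reclaim_under_pressure(const struct reclaim_sample *prev,
			    const struct reclaim_sample *now,
			    unsigned long max_written)
{
	/* the counter is cumulative, so it must not go backwards */
	assert(now->pages_written >= prev->pages_written);
	return now->pages_written - prev->pages_written > max_written;
}
```

Unlike a free-page percentage, this looks at how hard reclaim is
working rather than at how much memory happens to be idle.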

> 
> Besides this patch, I have another patch that tries to reduce the
> calculation of "totalram_pages - hugetlb_total_pages()", but it touches
> many files. So I'm just posting the first, simple patch here for review.
> 
> 
> Hugh,
> 
> Would you like to test it on your machines?

Indeed I shall, starting in a few hours when I've finished with trying
the script I promised yesterday to send you.  And I won't be at all
surprised if your patch eliminates my worst cases, because I don't
expect to have any significant amount of free memory during my testing,
and my swap testing suffers from slub's thirst for higher orders.

But I don't believe the kind of check you're making is appropriate,
and I do believe that when you try more extensive testing, you'll find
regressions in other tests which were relying on the higher orders.
If all of your testing happens to have lots of free memory around,
I'm surprised; but perhaps I'm naive about how things actually work,
especially on the larger machines.

Or maybe your tests are relying crucially on the slabs allocated at
system startup, when of course there should be plenty of free memory
around.

By the way, when I went to remind myself of what nr_free_pages()
actually does, my grep immediately hit this remark in mm/mmap.c:
		 * nr_free_pages() is very expensive on large systems,
I hope that's just a stale comment from before it was converted
to global_page_state(NR_FREE_PAGES)!

Hugh
