Message-ID: <20110308005706.GB5169@us.ibm.com>
Date: Mon, 7 Mar 2011 16:57:06 -0800
From: Nishanth Aravamudan <nacc@...ibm.com>
To: David Rientjes <rientjes@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Dave Hansen <dave@...ux.vnet.ibm.com>,
Petr Holasek <pholasek@...hat.com>,
linux-kernel@...r.kernel.org, emunson@...bm.net, anton@...hat.com,
Andi Kleen <ak@...ux.intel.com>, Mel Gorman <mel@....ul.ie>,
Wu Fengguang <fengguang.wu@...el.com>, linux-mm@...ck.org
Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of hugepages
Hi David,
On 07.03.2011 [15:47:23 -0800], David Rientjes wrote:
> On Mon, 7 Mar 2011, Andrew Morton wrote:
>
> > > > > On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > > > > > + for_each_hstate(h)
> > > > > > + seq_printf(m,
> > > > > > + "HugePages_Total: %5lu\n"
> > > > > > + "HugePages_Free: %5lu\n"
> > > > > > + "HugePages_Rsvd: %5lu\n"
> > > > > > + "HugePages_Surp: %5lu\n"
> > > > > > + "Hugepagesize: %8lu kB\n",
> > > > > > + h->nr_huge_pages,
> > > > > > + h->free_huge_pages,
> > > > > > + h->resv_huge_pages,
> > > > > > + h->surplus_huge_pages,
> > > > > > + 1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > > > > > }
> > > > >
> > > > > It sounds like now we'll get a meminfo that looks like:
> > > > >
> > > > > ...
> > > > > AnonHugePages: 491520 kB
> > > > > HugePages_Total: 5
> > > > > HugePages_Free: 2
> > > > > HugePages_Rsvd: 3
> > > > > HugePages_Surp: 1
> > > > > Hugepagesize: 2048 kB
> > > > > HugePages_Total: 2
> > > > > HugePages_Free: 1
> > > > > HugePages_Rsvd: 1
> > > > > HugePages_Surp: 1
> > > > > Hugepagesize: 1048576 kB
> > > > > DirectMap4k: 12160 kB
> > > > > DirectMap2M: 2082816 kB
> > > > > DirectMap1G: 2097152 kB
> > > > >
> > > > > At best, that's a bit confusing. There aren't any other entries in
> > > > > meminfo that occur more than once. Plus, this information is available
> > > > > in the sysfs interface. Why isn't that sufficient?
> > > > >
> > > > > Could we do something where we keep the default hpage_size looking like
> > > > > it does now, but append the size explicitly for the new entries?
> > > > >
> > > > > HugePages_Total(1G): 2
> > > > > HugePages_Free(1G): 1
> > > > > HugePages_Rsvd(1G): 1
> > > > > HugePages_Surp(1G): 1
> > > > >
> > > >
> > > > Let's not change the existing interface, please.
> > > >
> > > > Adding new fields: OK.
> > > > Changing the way in which existing fields are calculated: OKish.
> > > > Renaming existing fields: not OK.
> > >
> > > How about lining up multiple values in each field like this?
> > >
> > > HugePages_Total: 5 2
> > > HugePages_Free: 2 1
> > > HugePages_Rsvd: 3 1
> > > HugePages_Surp: 1 1
> > > Hugepagesize: 2048 1048576 kB
> > > ...
> > >
> > > This doesn't change the field names, so the impact on user space
> > > should still be small?
> >
> > It might break some existing parsers, dunno.
> >
> > It was a mistake to assume that all hugepages will have the same size
> > for all time, and we just have to live with that mistake.
> >
>
> I'm not sure it was a mistake: the kernel has a default hugepage size and
> that's what the global /proc/sys/vm/nr_hugepages tunable uses, so it seems
> appropriate that its statistics are exported in the global /proc/meminfo.
Yep, the intent was for meminfo to (continue to) document the default
hugepage size's usage, and for any other size's statistics to be
accessed by the appropriate sysfs entries.
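For illustration, a minimal sketch of how those per-size sysfs entries are laid out, assuming the layout in Documentation/ABI/testing/sysfs-kernel-mm-hugepages (the directory name encodes the page size in kB; nr_hugepages, free_hugepages, resv_hugepages and surplus_hugepages are the per-size counters):

```python
# Sketch only: compose per-size sysfs paths as described in
# Documentation/ABI/testing/sysfs-kernel-mm-hugepages.
SYSFS_BASE = "/sys/kernel/mm/hugepages"

def hstate_path(size_kb, field):
    """Path to one counter for one hugepage size."""
    return f"{SYSFS_BASE}/hugepages-{size_kb}kB/{field}"

# The two sizes from the /proc/meminfo excerpt quoted above.
print(hstate_path(2048, "nr_hugepages"))
print(hstate_path(1048576, "free_hugepages"))
```

So a tool wanting the 1G pool statistics reads those files rather than /proc/meminfo, which stays pinned to the default size.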
> > I'd suggest that we leave meminfo alone, just ensuring that its output
> > makes some sense. Instead create a new interface which presents all
> > the required info in a sensible fashion and migrate userspace reporting
> > tools over to that interface. Just let the meminfo field die a slow
> > death.
> >
>
> (Adding Nishanth to the cc)
>
> It's already there, all this data is available for all the configured
> hugepage sizes via /sys/kernel/mm/hugepages/hugepages-<size>kB/ as
> described by Documentation/ABI/testing/sysfs-kernel-mm-hugepages.
>
> It looks like Nishanth and others put quite a bit of effort into
> making as stable of an API as possible for this information.
I'm not sure if libhugetlbfs already has a tool for parsing the values
there (i.e., to give an end-user a quick'n'dirty snapshot of overall
current hugepage usage). Eric? If not, probably something worth having.
I believe we also have the per-node information in sysfs too, in case
that's relevant to tooling.
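As a hypothetical starting point for such a tool (not an existing libhugetlbfs utility), the default-size counters can be scraped out of /proc/meminfo-format text; the sample below matches the excerpt quoted earlier in the thread:

```python
import re

def meminfo_hugepages(text):
    """Return {field: value} for the HugePages_*/Hugepagesize lines."""
    fields = {}
    for name, value in re.findall(r"^(HugePages_\w+|Hugepagesize):\s+(\d+)",
                                  text, re.MULTILINE):
        fields[name] = int(value)
    return fields

sample = """\
HugePages_Total:       5
HugePages_Free:        2
HugePages_Rsvd:        3
HugePages_Surp:        1
Hugepagesize:       2048 kB
"""

print(meminfo_hugepages(sample))
# {'HugePages_Total': 5, 'HugePages_Free': 2, 'HugePages_Rsvd': 3,
#  'HugePages_Surp': 1, 'Hugepagesize': 2048}
```

A real snapshot tool would combine this with the per-size sysfs counters for the non-default sizes.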
Thanks,
Nish
--
Nishanth Aravamudan <nacc@...ibm.com>
IBM Linux Technology Center