lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110308093723.GA19206@csn.ul.ie>
Date:	Tue, 8 Mar 2011 09:37:23 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	Nishanth Aravamudan <nacc@...ibm.com>
Cc:	David Rientjes <rientjes@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	Petr Holasek <pholasek@...hat.com>,
	linux-kernel@...r.kernel.org, emunson@...bm.net, anton@...hat.com,
	Andi Kleen <ak@...ux.intel.com>,
	Wu Fengguang <fengguang.wu@...el.com>, linux-mm@...ck.org
Subject: Re: [PATCH] hugetlb: /proc/meminfo shows data for all sizes of
	hugepages

On Mon, Mar 07, 2011 at 04:57:06PM -0800, Nishanth Aravamudan wrote:
> > > > > > On Mon, 2011-03-07 at 14:05 +0100, Petr Holasek wrote:
> > > > > > > +       for_each_hstate(h)
> > > > > > > +               seq_printf(m,
> > > > > > > +                               "HugePages_Total:   %5lu\n"
> > > > > > > +                               "HugePages_Free:    %5lu\n"
> > > > > > > +                               "HugePages_Rsvd:    %5lu\n"
> > > > > > > +                               "HugePages_Surp:    %5lu\n"
> > > > > > > +                               "Hugepagesize:   %8lu kB\n",
> > > > > > > +                               h->nr_huge_pages,
> > > > > > > +                               h->free_huge_pages,
> > > > > > > +                               h->resv_huge_pages,
> > > > > > > +                               h->surplus_huge_pages,
> > > > > > > +                               1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
> > > > > > >  }
> > > > > >
> > > > > > It sounds like now we'll get a meminfo that looks like:
> > > > > >
> > > > > > ...
> > > > > > AnonHugePages:    491520 kB
> > > > > > HugePages_Total:       5
> > > > > > HugePages_Free:        2
> > > > > > HugePages_Rsvd:        3
> > > > > > HugePages_Surp:        1
> > > > > > Hugepagesize:       2048 kB
> > > > > > HugePages_Total:       2
> > > > > > HugePages_Free:        1
> > > > > > HugePages_Rsvd:        1
> > > > > > HugePages_Surp:        1
> > > > > > Hugepagesize:    1048576 kB
> > > > > > DirectMap4k:       12160 kB
> > > > > > DirectMap2M:     2082816 kB
> > > > > > DirectMap1G:     2097152 kB
> > > > > >
> > > > > > At best, that's a bit confusing.  There aren't any other entries in
> > > > > > meminfo that occur more than once.  Plus, this information is available
> > > > > > in the sysfs interface.  Why isn't that sufficient?
> > > > > >
> > > > > > Could we do something where we keep the default hpage_size looking like
> > > > > > it does now, but append the size explicitly for the new entries?
> > > > > >
> > > > > > HugePages_Total(1G):       2
> > > > > > HugePages_Free(1G):        1
> > > > > > HugePages_Rsvd(1G):        1
> > > > > > HugePages_Surp(1G):        1
> > > > > >
> > > > >
> > > > > Let's not change the existing interface, please.
> > > > >
> > > > > Adding new fields: OK.
> > > > > Changing the way in whcih existing fields are calculated: OKish.
> > > > > Renaming existing fields: not OK.
> > > > 
> > > > How about lining up multiple values in each field like this?
> > > > 
> > > >   HugePages_Total:       5 2
> > > >   HugePages_Free:        2 1
> > > >   HugePages_Rsvd:        3 1
> > > >   HugePages_Surp:        1 1
> > > >   Hugepagesize:       2048 1048576 kB
> > > >   ...
> > > > 
> > > > This doesn't change the field names and the impact for user space
> > > > is still small?
> > > 
> > > It might break some existing parsers, dunno.
> > > 
> > > It was a mistake to assume that all hugepages will have the same size
> > > for all time, and we just have to live with that mistake.
> > > 
> > 
> > I'm not sure it was a mistake: the kernel has a default hugepage size and 
> > that's what the global /proc/sys/vm/nr_hugepages tunable uses, so it seems 
> > appropriate that its statistics are exported in the global /proc/meminfo.
> 
> Yep, the intent was for meminfo to (continue to) document the default
> hugepage size's usage, and for any other size's statistics to be
> accessed by the appropriate sysfs entries.
> 

Agreed. The suggested changes to the interface here is very likely to
break libhugetlbfs.

> > > I'd suggest that we leave meminfo alone, just ensuring that its output
> > > makes some sense.  Instead create a new interface which presents all
> > > the required info in a sensible fashion and migrate usersapce reporting
> > > tools over to that interface.  Just let the meminfo field die a slow
> > > death.
> > > 
> > 
> > (Adding Nishanth to the cc)
> > 
> > It's already there, all this data is available for all the configured
> > hugepage sizes via /sys/kernel/mm/hugepages/hugepages-<size>kB/ as
> > described by Documentation/ABI/testing/sysfs-kernel-mm-hugepages.
> > 
> > It looks like Nishanth and others put quite a bit of effort into
> > making as stable of an API as possible for this information.
> 
> I'm not sure if libhugetlbfs already has a tool for parsing the values
> there (i.e., to give an end-user a quick'n'dirty snapshot of overall
> current hugepage usage). Eric?

I'm not Eric, but it does. It's called hugeadm and here is an example of
its output

hydra:~# hugeadm --pool-list
      Size  Minimum  Current  Maximum  Default
   2097152       16       16       16        *
1073741824        2        2        2         

> If not, probably something worth having.
> I believe we also have the per-node information in sysfs too, in case
> that's relevant to tooling.
> 

The kernel interfaces are sufficient at the moment at exporting all the
information. If hugeadm is providing insufficient information, I'd
prefer to see it enhanced than the sysfs or meminfo interfaces changed.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ