Message-ID: <20170803104417.GI12521@dhcp22.suse.cz>
Date:   Thu, 3 Aug 2017 12:44:18 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Wei Wang <wei.w.wang@...el.com>
Cc:     linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
        linux-mm@...ck.org, mst@...hat.com, mawilcox@...rosoft.com,
        akpm@...ux-foundation.org, virtio-dev@...ts.oasis-open.org,
        david@...hat.com, cornelia.huck@...ibm.com,
        mgorman@...hsingularity.net, aarcange@...hat.com,
        amit.shah@...hat.com, pbonzini@...hat.com,
        liliang.opensource@...il.com, yang.zhang.wz@...il.com,
        quan.xu@...yun.com
Subject: Re: [PATCH v13 4/5] mm: support reporting free page blocks

On Thu 03-08-17 18:42:15, Wei Wang wrote:
> On 08/03/2017 05:11 PM, Michal Hocko wrote:
> >On Thu 03-08-17 14:38:18, Wei Wang wrote:
[...]
> >>+static int report_free_page_block(struct zone *zone, unsigned int order,
> >>+				  unsigned int migratetype, struct page **page)
> >This is just too ugly and actually wrong. Never provide struct page
> >pointers outside of the zone->lock. What I had in mind was to simply
> >walk the free lists of the suitable order and call the callback for
> >each one. Something as simple as:
> >
> >	for (i = 0; i < MAX_NR_ZONES; i++) {
> >		struct zone *zone = &pgdat->node_zones[i];
> >		unsigned long pfn, flags;
> >		unsigned int order;
> >
> >		if (!populated_zone(zone))
> >			continue;
> >		spin_lock_irqsave(&zone->lock, flags);
> >		for (order = min_order; order < MAX_ORDER; ++order) {
> >			struct free_area *free_area = &zone->free_area[order];
> >			enum migratetype mt;
> >			struct page *page;
> >
> >			if (!free_area->nr_free)
> >				continue;
> >
> >			/* for_each_migratetype_order() would restart the
> >			 * outer order loop, so iterate migratetypes
> >			 * directly */
> >			for (mt = 0; mt < MIGRATE_TYPES; mt++) {
> >				list_for_each_entry(page,
> >						&free_area->free_list[mt], lru) {
> >					/* report block as (start pfn, pages) */
> >					pfn = page_to_pfn(page);
> >					visit(opaque2, pfn, 1 << order);
> >				}
> >			}
> >		}
> >
> >		spin_unlock_irqrestore(&zone->lock, flags);
> >	}
> >
> >[...]
> 
> 
> I think the above would hold the lock for too long. That's why we
> prefer to take one free page block at a time; taking them one by one
> also makes no difference for the performance we need.

I think you should start with the simple approach and improve
incrementally if it turns out to be suboptimal. I really detest taking
struct pages outside of the lock. You never know what might happen after
the lock is dropped. E.g., can you race with memory hot-remove?
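
For instance, one incremental refinement of the walk above that bounds
the lock hold time without ever publishing struct pages would be to take
the zone lock once per free list rather than once per zone. A minimal
sketch, reusing visit()/opaque2 from the snippet above (the lock scoping
and comments are illustrative, not from the patch):

	unsigned long flags;
	unsigned int order;
	enum migratetype mt;

	for (order = min_order; order < MAX_ORDER; ++order) {
		for (mt = 0; mt < MIGRATE_TYPES; mt++) {
			struct free_area *free_area = &zone->free_area[order];
			struct page *page;

			/* nr_free is only a hint here; the authoritative
			 * walk happens under the lock below */
			if (!free_area->nr_free)
				continue;

			/* lock scope shrunk to one free list: each list
			 * is still walked atomically, but the lock is
			 * dropped between lists, so no struct page is
			 * ever touched after the lock protecting it */
			spin_lock_irqsave(&zone->lock, flags);
			list_for_each_entry(page,
					&free_area->free_list[mt], lru)
				visit(opaque2, page_to_pfn(page), 1 << order);
			spin_unlock_irqrestore(&zone->lock, flags);
		}
	}

The trade-off is that the reported blocks are no longer one atomic
snapshot of the zone, which a consumer of free-page hints has to
tolerate anyway, since any page can be allocated the moment the lock is
dropped.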

> The struct page is used as "state" to get the next free page block. It
> is only passed to an internal implementation of a function in mm (not
> seen by the outside caller). Would this be OK?
> If not, how about a pfn? We could pass a pfn into the function, do
> pfn_to_page each time the function starts, and then do page_to_pfn
> when it returns.
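
Concretely, the pfn-based variant would look something like the sketch
below. The helper name and the exact checks are hypothetical (not from
the patch); the point is that a saved pfn still has to be revalidated
under the zone lock (pfn_valid, PageBuddy, unchanged order) before the
struct page behind it may be trusted, and the walk has to restart
whenever that validation fails:

	static int report_next_free_block(struct zone *zone, unsigned int order,
					  unsigned int migratetype,
					  unsigned long *pfn)
	{
		struct free_area *area = &zone->free_area[order];
		struct list_head *head = &area->free_list[migratetype];
		struct list_head *pos = head;
		unsigned long flags;
		int ret = -ENOENT;

		spin_lock_irqsave(&zone->lock, flags);
		if (*pfn) {	/* 0 means: start from the list head */
			struct page *page;

			/*
			 * Revalidate the saved pfn under the lock: the block
			 * may have been allocated, split, merged, or
			 * hot-removed since the lock was last dropped.
			 */
			if (!pfn_valid(*pfn)) {
				spin_unlock_irqrestore(&zone->lock, flags);
				return -EAGAIN;	/* resume point is gone */
			}
			page = pfn_to_page(*pfn);
			if (!PageBuddy(page) || page_order(page) != order) {
				spin_unlock_irqrestore(&zone->lock, flags);
				return -EAGAIN;	/* block changed; restart */
			}
			pos = &page->lru;
		}
		pos = pos->next;
		if (pos != head) {
			*pfn = page_to_pfn(list_entry(pos, struct page, lru));
			ret = 0;
		}
		spin_unlock_irqrestore(&zone->lock, flags);
		return ret;
	}

Even with these checks the resume is fragile (e.g. the block may
meanwhile sit on a different migratetype's free list, so the walk could
follow the wrong list), which is essentially the objection below.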

No, just do not try to play tricks with struct pages which might have
gone away.
-- 
Michal Hocko
SUSE Labs
