linux-kernel - Re: [PATCH v2 19/20] mm, hugetlb: retry if failed to allocate and there is concurrent user

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1386606983.2723.2.camel@buesod1.americas.hpqcorp.net>
Date:	Mon, 09 Dec 2013 08:36:23 -0800
From:	Davidlohr Bueso <davidlohr@...com>
To:	Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:	David Gibson <david@...son.dropbear.id.au>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>, Mel Gorman <mgorman@...e.de>,
	Michal Hocko <mhocko@...e.cz>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Hugh Dickins <hughd@...gle.com>,
	Davidlohr Bueso <davidlohr.bueso@...com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Hillf Danton <dhillf@...il.com>
Subject: Re: [PATCH v2 19/20] mm, hugetlb: retry if failed to allocate and
 there is concurrent user

On Mon, 2013-09-30 at 16:47 +0900, Joonsoo Kim wrote:
> On Mon, Sep 16, 2013 at 10:09:09PM +1000, David Gibson wrote:
> > > > 
> > > > > +		*do_dequeue = false;
> > > > >  		spin_unlock(&hugetlb_lock);
> > > > >  		page = alloc_buddy_huge_page(h, NUMA_NO_NODE);
> > > > >  		if (!page) {
> > > > 
> > > > I think the counter also needs to be incremented in the case where we
> > > > call alloc_buddy_huge_page() from alloc_huge_page().  Even though it's
> > > > new, it gets added to the hugepage pool at this point and could still
> > > > be a contended page for the last allocation, unless I'm missing
> > > > something.
> > > 
> > > Your comment has reasonable point to me, but I have a different opinion.
> > > 
> > > As I already mentioned, the point is that we want to avoid the race
> > > which kill the legitimate users of hugepages by out of resources.
> > > I increase 'h->nr_dequeue_users' when the hugepage allocated by
> > > administrator is dequeued. It is because what the hugepage I want to
> > > protect from the race is the one allocated by administrator via
> > > kernel param or /proc interface. Administrator may already know how many
> > > hugepages are needed for their application so that he may set nr_hugepage
> > > to reasonable value. I want to guarantee that these hugepages can be used
> > > for his application without any race, since he assume that the application
> > > would work fine with these hugepages.
> > > 
> > > To protect hugepages returned from alloc_buddy_huge_page() from the race
> > > is different for me. Although it will be added to the hugepage pool, this
> > > doesn't guarantee certain application's success more. If certain
> > > application's success depends on the race of this new hugepage, it's death
> > > by the race doesn't matter, since nobody assume that it works fine.
> > 
> > Hrm.  I still think this path should be included.  Although I'll agree
> > that failing in this case is less bad.
> > 
> > However, it can still lead to a situation where with two processes or
> > threads, faulting on exactly the same shared page we have one succeed
> > and the other fail.  That's a strange behaviour and I think we want to
> > avoid it in this case too.
> 
> Hello, David.
> 
> I don't think it is a strange behaviour. Similar situation can occur
> even though we use the mutex. Hugepage allocation can be failed when
> the first process try to allocate the hugepage while second process is blocked
> by the mutex. And then, second process will go into the fault handler. And
> at this time, it can succeed. So result is that we have one succeed and
> the other fail.
> 
> It is slightly different from the case you mentioned, but I think that
> effect for user is same. We cannot avoid this kind of race completely and
> I think that avoiding the race for administrator managed hugepage pool is
> good enough to use.

What was the final decision on this issue? Is Joonsoo's approach to
removing this mutex viable, or are we stuck with it?

Thanks,
Davidlohr

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/