Message-ID: <ec2426bc-d817-f645-b868-9edb9b4c54ca@oracle.com>
Date: Mon, 8 Apr 2019 20:30:14 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Michal Hocko <mhocko@...nel.org>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 0/2] A couple hugetlbfs fixes
On 4/8/19 12:48 PM, Davidlohr Bueso wrote:
> On Thu, 28 Mar 2019, Mike Kravetz wrote:
>
>> - A BUG can be triggered (not easily) due to temporarily mapping a
>> page before doing a COW.
>
> But you actually _have_ seen it? Do you have the traces? I ask
> not because of the patches per se, but because it would be nice
> to have a real snippet in the Changelog for patch 2.

Yes, I actually saw this problem. It happened while I was debugging and
testing some patches for hugetlb migration. The BUG I hit was in
unaccount_page_cache_page(): VM_BUG_ON_PAGE(page_mapped(page), page).
The stack trace was something like:

    unaccount_page_cache_page
    __delete_from_page_cache
    delete_from_page_cache
    remove_huge_page
    remove_inode_hugepages
    hugetlbfs_punch_hole
    hugetlbfs_fallocate

When I hit that, it took me a while to figure out how it could happen,
i.e. how could a page be mapped at that point in remove_inode_hugepages()?
That code checks page_mapped() and we are holding the fault mutex. With
some additional debug code (strategic udelays) I could hit the issue on a
somewhat regular basis, and I verified that another thread was in the
hugetlb_no_page/hugetlb_cow path for the same page at the same time.

Unfortunately, I did not save the traces. I am trying to recreate them
now. However, my test system was recently updated, so it might take a
little time.
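
To make the interleaving described above concrete, here is a minimal
user-space sketch. It is not kernel code: struct toy_page and the toy_*
helpers are hypothetical names invented for illustration, the steps are
replayed in a fixed order in a single thread instead of racing, and the
sketch does not model why the fault mutex failed to serialize the two
paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a huge page; not the kernel's struct page. */
struct toy_page {
	int mapcount;	/* number of page table mappings */
};

static bool toy_page_mapped(const struct toy_page *page)
{
	return page->mapcount > 0;
}

/*
 * Replay the hole punch steps in a fixed order. If fault_in_window is
 * true, a hypothetical concurrent fault maps the page after the
 * page_mapped-style check but before the page leaves the page cache.
 * Returns true if a VM_BUG_ON_PAGE(page_mapped(page), page) style check
 * would have fired at removal time.
 */
static bool toy_punch_hole_hits_bug(bool fault_in_window)
{
	struct toy_page page = { .mapcount = 0 };

	/* Remover: the page is not mapped, so nothing is unmapped. */
	assert(!toy_page_mapped(&page));

	/* Faulting thread: temporarily maps the page before doing the COW. */
	if (fault_in_window)
		page.mapcount++;

	/* Remover: the page would be deleted from the page cache here. */
	return toy_page_mapped(&page);
}
```

Calling toy_punch_hole_hits_bug(true) reports the condition and
toy_punch_hole_hits_bug(false) does not, which mirrors why the added
udelays made the problem easier to hit: they widen the window between
the check and the removal.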
--
Mike Kravetz