Open Source and information security mailing list archives
Message-ID: <20130109114413.GA13475@suse.de>
Date: Wed, 9 Jan 2013 11:44:13 +0000
From: Mel Gorman <mgorman@...e.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill@...temov.name>,
    Hillf Danton <dhillf@...il.com>, Hugh Dickins <hughd@...gle.com>,
    Dave Jones <davej@...hat.com>,
    Linux Kernel <linux-kernel@...r.kernel.org>,
    Andrea Arcangeli <aarcange@...hat.com>,
    Andrew Morton <akpm@...ux-foundation.org>,
    Linux-MM <linux-mm@...ck.org>, Rik van Riel <riel@...hat.com>
Subject: Re: oops in copy_page_rep()

On Tue, Jan 08, 2013 at 08:52:14AM -0800, Linus Torvalds wrote:
> On Tue, Jan 8, 2013 at 8:31 AM, Kirill A. Shutemov <kirill@...temov.name> wrote:
> >>
> >> Heh. I was more thinking about why do_huge_pmd_wp_page() needs it, but
> >> do_huge_pmd_numa_page() does not.
> >
> > It does. The check should be moved up.
> >
> >> Also, do we actually need it for huge_pmd_set_accessed()? The
> >> *placement* of that thing confuses me. And because it confuses me, I'd
> >> like to understand it.
> >
> > We need it for huge_pmd_set_accessed() too.
> >
> > Looks like a mis-merge. The original patch for huge_pmd_set_accessed() was
> > correct: http://lkml.org/lkml/2012/10/25/402
>
> Not a merge error: the pmd_trans_splitting() check was removed by
> commit d10e63f29488 ("mm: numa: Create basic numa page hinting
> infrastructure").
>
> Now, *why* it was removed, I can't tell. And it's not clear why the
> original code just had it in a conditional, while the suggested patch
> has that "goto repeat" thing.

It was a mistake by me to remove it, and as I screwed up in October I no
longer remember how I managed it.

The retry versus "goto repeat" is a detail. By retrying the full fault
there is a possibility that the split will still be in progress on fault
retry, or that a new THP is collapsed underneath and a new split started
while the mmap_sem is released, but both are unlikely. On the other hand,
taking the anon_vma rwsem for write in wait_split_huge_page() could cause
delays elsewhere that would be almost impossible to detect, so it is not
necessarily better. Retrying the fault as your patch does is reasonable.

> I suspect re-trying the fault (which I
> assume the original code did) is actually better, because that way you
> go through all the "should I reschedule as I return through the
> exception" stuff. I dunno.
>
> Mel, that original patch came from you, although it was based on
> previous work by Peter/Ingo/Andrea. Can you walk us through the
> history and thinking about the loss of pmd_trans_splitting(). Was it
> purely a mistake? It looks intentional.
>

Mistake. Andrea, Peter and Ingo did not make similar mistakes.

Looking at your patch, I also think that the check needs to be made before
the call to do_huge_pmd_numa_page() so it can rely on a pmd_same() check
to make sure a split did not start before the page table lock was taken.

In response to Andrea you said

	Also, and more fundamentally, since do_pmd_numa_page() doesn't
	take the orig_pmd thing as an argument (and re-check it under the
	page-table lock), testing pmd_trans_splitting() on it is
	pointless, since it can change later.

do_pmd_numa_page() is called for a normal PMD that is marked pmd_numa(),
not a THP PMD. As the mmap_sem is held, it cannot collapse to a THP
underneath us after the pmd_trans_huge() check, so it should be unnecessary
to check pmd_trans_splitting() there.

-- 
Mel Gorman
SUSE Labs
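
The ordering argued for above (test for an in-progress split on a lockless
snapshot, then have the handler re-check that snapshot under the page table
lock) can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not the kernel code: every toy_* name,
the flag bit layout and the spinlock are hypothetical stand-ins for the real
pmd helpers and locking. In particular, the model reports a failed re-check
as a retry, which is a simplification of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy PMD value; not the kernel's layout. */
#define TOY_PMD_HUGE      (UINT64_C(1) << 0)
#define TOY_PMD_NUMA      (UINT64_C(1) << 1)
#define TOY_PMD_SPLITTING (UINT64_C(1) << 2)

enum toy_fault_result { TOY_FAULT_HANDLED, TOY_FAULT_RETRY };

/* Toy stand-in for an mm: one PMD entry plus a lock protecting it. */
struct toy_mm {
    atomic_flag lock;        /* models the page table lock */
    _Atomic uint64_t pmd;    /* models a single PMD entry */
};

static void toy_mm_init(struct toy_mm *mm, uint64_t pmd)
{
    atomic_flag_clear(&mm->lock);
    atomic_init(&mm->pmd, pmd);
}

static void toy_lock(struct toy_mm *mm)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&mm->lock, memory_order_acquire))
        ;
}

static void toy_unlock(struct toy_mm *mm)
{
    atomic_flag_clear_explicit(&mm->lock, memory_order_release);
}

/*
 * Models the locked re-check: act only if the entry still equals the
 * snapshot taken before the lock (the analogue of a pmd_same() test).
 */
static enum toy_fault_result toy_huge_numa_fault(struct toy_mm *mm,
                                                 uint64_t orig_pmd)
{
    enum toy_fault_result ret = TOY_FAULT_RETRY;

    toy_lock(mm);
    uint64_t cur = atomic_load(&mm->pmd);
    if (cur == orig_pmd) {
        /* Entry unchanged since the snapshot: clear the hinting bit. */
        atomic_store(&mm->pmd, cur & ~TOY_PMD_NUMA);
        ret = TOY_FAULT_HANDLED;
    }
    toy_unlock(mm);
    return ret;
}

/* Models the fault path: splitting check first, then the NUMA handler. */
static enum toy_fault_result toy_handle_fault(struct toy_mm *mm)
{
    uint64_t orig_pmd = atomic_load(&mm->pmd);   /* lockless snapshot */

    if (orig_pmd & TOY_PMD_HUGE) {
        if (orig_pmd & TOY_PMD_SPLITTING)
            return TOY_FAULT_RETRY;              /* split in progress */
        if (orig_pmd & TOY_PMD_NUMA)
            return toy_huge_numa_fault(mm, orig_pmd);
    }
    return TOY_FAULT_HANDLED;
}
```

Because the snapshot is known not to have the splitting bit set, a split
that begins after the snapshot changes the entry and so fails the equality
test under the lock. A caller of this model would simply call
toy_handle_fault() again whenever it returns TOY_FAULT_RETRY.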