Date:   Wed, 05 Sep 2018 08:49:20 -0400
From:   "Zi Yan" <zi.yan@...rutgers.edu>
To:     "Peter Xu" <peterx@...hat.com>
Cc:     "Kirill A. Shutemov" <kirill@...temov.name>,
        linux-kernel@...r.kernel.org,
        "Andrea Arcangeli" <aarcange@...hat.com>,
        "Andrew Morton" <akpm@...ux-foundation.org>,
        "Michal Hocko" <mhocko@...e.com>,
        "Huang Ying" <ying.huang@...el.com>,
        "Dan Williams" <dan.j.williams@...el.com>,
        "Naoya Horiguchi" <n-horiguchi@...jp.nec.com>,
        "Jérôme Glisse" <jglisse@...hat.com>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
        "Konstantin Khlebnikov" <khlebnikov@...dex-team.ru>,
        "Souptick Joarder" <jrdr.linux@...il.com>, linux-mm@...ck.org
Subject: Re: [PATCH] mm: hugepage: mark splitted page dirty when needed

On 5 Sep 2018, at 3:30, Peter Xu wrote:

> On Tue, Sep 04, 2018 at 10:00:28AM -0400, Zi Yan wrote:
>> On 4 Sep 2018, at 4:01, Kirill A. Shutemov wrote:
>>
>>> On Tue, Sep 04, 2018 at 03:55:10PM +0800, Peter Xu wrote:
>>>> When splitting a huge page, we should set all small pages as dirty if
>>>> the original huge page has the dirty bit set before.  Otherwise we'll
>>>> lose the original dirty bit.
>>>
>>> We don't lose it. It got transferred to the struct page flag:
>>>
>>> 	if (pmd_dirty(old_pmd))
>>> 		SetPageDirty(page);
>>>
>>
>> Plus, when split_huge_page_to_list() splits a THP, its subroutine __split_huge_page()
>> propagates the dirty bit in the head page flag to all subpages in __split_huge_page_tail().
>
> Hi, Kirill, Zi,
>
> Thanks for your responses!
>
> Though in my test the huge page seems to be split not by
> split_huge_page_to_list() but by explicit calls to
> change_protection().  The stack looks like this (again, this is a
> customized kernel, and I added an explicit dump_stack() there):
>
>   kernel:  dump_stack+0x5c/0x7b
>   kernel:  __split_huge_pmd+0x192/0xdc0
>   kernel:  ? update_load_avg+0x8b/0x550
>   kernel:  ? update_load_avg+0x8b/0x550
>   kernel:  ? account_entity_enqueue+0xc5/0xf0
>   kernel:  ? enqueue_entity+0x112/0x650
>   kernel:  change_protection+0x3a2/0xab0
>   kernel:  mwriteprotect_range+0xdd/0x110
>   kernel:  userfaultfd_ioctl+0x50b/0x1210
>   kernel:  ? do_futex+0x2cf/0xb20
>   kernel:  ? tty_write+0x1d2/0x2f0
>   kernel:  ? do_vfs_ioctl+0x9f/0x610
>   kernel:  do_vfs_ioctl+0x9f/0x610
>   kernel:  ? __x64_sys_futex+0x88/0x180
>   kernel:  ksys_ioctl+0x70/0x80
>   kernel:  __x64_sys_ioctl+0x16/0x20
>   kernel:  do_syscall_64+0x55/0x150
>   kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> At the very time the userspace is sending an UFFDIO_WRITEPROTECT ioctl
> to kernel space, which is handled by mwriteprotect_range().  In case
> you'd like to refer to the kernel, it's basically this one from
> Andrea's (with very trivial changes):
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git userfault
>
> So... do we have two paths to split the huge pages separately?
>
> Another (possibly very naive) question is: could any of you hint me
> how the page dirty bit is finally applied to the PTEs?  These two
> dirty flags confused me for a few days already (the SetPageDirty() one
> which sets the page dirty flag, and the pte_mkdirty() which sets that
> onto the real PTEs).

change_protection() only splits the PMD entry into multiple PTEs; it does not
split the physical compound page, so my answer does not apply to your case.
It is unclear how the dirty bit leads to your QEMU getting a SIGBUS. I think you
need to describe your problem in more detail.

AFAIK, the PageDirty bit is never applied back to any PTEs. So in your case,
some kernel function that reports a page's dirty state may be checking only
the PTE's dirty bit and not the dirty bit in the struct page flags, which
would give a wrong answer.
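
To make the two dirty bits concrete, here is a toy userspace model in C. It is
not kernel code: every toy_* name, the 8-entry table size and the
propagate_to_ptes switch are made up for illustration. It only mirrors the
behaviour described in this thread: a PMD split transfers the PMD dirty bit to
the struct page flag, the compound page itself stays intact, and the new PTEs
are dirty only if the split copies the bit into them (roughly what the patch
proposes, as I understand it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PTES_PER_PMD 8 /* hypothetical; a real x86-64 PMD covers 512 PTEs */

/* Models the dirty flag kept in struct page (what SetPageDirty() sets). */
struct toy_page { bool page_dirty; };

/* Model the hardware-visible dirty bit in a huge (PMD) or small (PTE) entry. */
struct toy_pmd { bool dirty; struct toy_page *page; };
struct toy_pte { bool dirty; struct toy_page *page; };

/*
 * Split one huge mapping into small mappings of the same page.
 * The PMD dirty bit always moves to the page flag, as in the snippet
 * Kirill quoted. It reaches the PTEs only when propagate_to_ptes is true.
 */
static void toy_split_pmd(const struct toy_pmd *pmd,
                          struct toy_pte ptes[TOY_PTES_PER_PMD],
                          bool propagate_to_ptes)
{
    assert(pmd && pmd->page && ptes);

    if (pmd->dirty)
        pmd->page->page_dirty = true;

    for (size_t i = 0; i < TOY_PTES_PER_PMD; i++) {
        ptes[i].page = pmd->page; /* same physical page, not split */
        ptes[i].dirty = propagate_to_ptes && pmd->dirty;
    }
}

/* A reporter that looks only at the PTE bit can miss the dirty state. */
static bool toy_dirty_pte_only(const struct toy_pte *pte)
{
    return pte->dirty;
}

/* A reporter that also consults the page flag still sees it. */
static bool toy_dirty_pte_or_page(const struct toy_pte *pte)
{
    return pte->dirty || pte->page->page_dirty;
}
```

Splitting a dirty toy_pmd with propagate_to_ptes set to false leaves
page_dirty set while toy_dirty_pte_only() returns false for every PTE, which
is the kind of mismatch I suspect in your case.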


--
Best Regards,
Yan Zi
