Message-ID: <5b56e76b-cf73-4cfd-b4c5-03fdece234fd@vivo.com>
Date: Wed, 28 Aug 2024 09:14:40 +0800
From: zhiguojiang <justinjiang@...o.com>
To: David Hildenbrand <david@...hat.com>,
 Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, oe-lkp@...ts.linux.dev, oliver.sang@...el.com
Cc: opensource.kernel@...o.com
Subject: Re: [PATCH v2] vma remove the unneeded avc bound with non-CoWed folio



On 2024/8/28 1:35, David Hildenbrand wrote:
> On 27.08.24 03:50, zhiguojiang wrote:
>>
>>
>> On 2024/8/27 1:24, David Hildenbrand wrote:
>>> On 23.08.24 16:01, Zhiguo Jiang wrote:
>>>> After CoW by do_wp_page(), the vma establishes a new mapping
>>>> relationship with the CoWed folio instead of the non-CoWed folio.
>>>> However, when vma->anon_vma and the non-CoWed folio's anon_vma are
>>>> not the same, the avc binding relationship between them is no longer
>>>> needed, so it is a problem that this avc binding still exists.
>>>>
>>>> This patch removes the avc binding relationship between the vma and
>>>> the non-CoWed folio's anon_vma, since the vma and the folio each have
>>>> their own independent anon_vma. It also reduces rmap overhead.
>>>>
>>>> Signed-off-by: Zhiguo Jiang <justinjiang@...o.com>
>>>> ---
>>>> -v2:
>>>>    * Solve the kernel test robot noticed "WARNING"
>>>>      Reported-by: kernel test robot <oliver.sang@...el.com>
>>>>      Closes:
>>>> https://lore.kernel.org/oe-lkp/202408230938.43f55b4-lkp@intel.com 
>>>>
>>>>    * Update comments to more accurately describe this patch.
>>>>
>>>> -v1:
>>>> https://lore.kernel.org/linux-mm/20240820143359.199-1-justinjiang@vivo.com/ 
>>>>
>>>>
>>>>    include/linux/rmap.h |  1 +
>>>>    mm/memory.c          |  8 +++++++
>>>>    mm/rmap.c            | 53 ++++++++++++++++++++++++++++++++++++++++++++
>>>>    3 files changed, 62 insertions(+)
>>>>
>>>> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>>>> index 91b5935e8485..8607d28a3146
>>>> --- a/include/linux/rmap.h
>>>> +++ b/include/linux/rmap.h
>>>> @@ -257,6 +257,7 @@ void folio_remove_rmap_ptes(struct folio *, struct page *, int nr_pages,
>>>>        folio_remove_rmap_ptes(folio, page, 1, vma)
>>>>    void folio_remove_rmap_pmd(struct folio *, struct page *,
>>>>            struct vm_area_struct *);
>>>> +void folio_remove_anon_avc(struct folio *, struct vm_area_struct *);
>>>>      void hugetlb_add_anon_rmap(struct folio *, struct vm_area_struct *,
>>>>            unsigned long address, rmap_t flags);
>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>> index 93c0c25433d0..4c89cb1cb73e
>>>> --- a/mm/memory.c
>>>> +++ b/mm/memory.c
>>>> @@ -3428,6 +3428,14 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
>>>>                 * old page will be flushed before it can be reused.
>>>>                 */
>>>>                folio_remove_rmap_pte(old_folio, vmf->page, vma);
>>>> +
>>>> +            /*
>>>> +             * If the new_folio's anon_vma is different from the
>>>> +             * old_folio's anon_vma, the avc binding relationship
>>>> +             * between vma and the old_folio's anon_vma is removed,
>>>> +             * avoiding rmap redundant overhead.
>>>> +             */
>>>> +            folio_remove_anon_avc(old_folio, vma);
>>>
>>> ... by increasing write fault latency, introducing an RMAP walk (!)? 
>>> Hmm?
>>>
>>> On the reuse path, we do a folio_move_anon_rmap(), to optimize that.
>>>
>> Thanks for your comments. This may not be a good fixup patch. The
>> reuse path, folio_move_anon_rmap(), seems to apply only to exclusive or
>> _refcount == 1 folios. The fork() path seems to clear the exclusive flag
>> in copy_page_range() --> ... --> __folio_try_dup_anon_rmap(). However,
>> I observed lots of orphan avcs in the above debug trace logs from
>> wp_page_copy(). After discussing with Mika, though, it seems they may
>> not be removed.
>
> Was this patch ever tested? I cannot even boot a simple VM without an 
> endless stream of
>
> [    5.804598] ------------[ cut here ]------------
> [    5.805494] WARNING: CPU: 11 PID: 595 at mm/rmap.c:443 unlink_anon_vmas+0x19b/0x1d0
> [    5.806962] Modules linked in: qemu_fw_cfg
> [    5.807762] CPU: 11 UID: 0 PID: 595 Comm: dracut-rootfs-g Tainted: G        W          6.11.0-rc4+ #72
> [    5.809546] Tainted: [W]=WARN
> [    5.810127] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
> [    5.811753] RIP: 0010:unlink_anon_vmas+0x19b/0x1d0
> [    5.812680] Code: b0 00 00 00 00 75 1f f0 ff 8f a0 00 00 00 75 a2 e8 8a fd ff ff eb 9b 5b 5d 41 5c 41 5d 41 5e 41 5f e9 d4 82 d0 00 0f 0b eb dd <0f> 0b eb cf 0f 0b 48 83 c7 08 e8 16 40 d7 ff e9 ea fe ff ff 48 8b
> [    5.816247] RSP: 0018:ffffa19f43bb78d0 EFLAGS: 00010286
> [    5.817258] RAX: ffff8a71c1bdd2d0 RBX: ffff8a71c1bdd2c0 RCX: ffff8a71c27a86c8
> [    5.818624] RDX: 0000000000000001 RSI: ffff8a71c2771b28 RDI: ffff8a71c27a9e60
> [    5.820011] RBP: dead000000000122 R08: 0000000000000000 R09: 0000000000000001
> [    5.821380] R10: 0000000000000200 R11: 0000000000000001 R12: ffff8a71c2771b28
> [    5.822748] R13: dead000000000100 R14: ffff8a71c2771b18 R15: ffff8a71c27a9e60
> [    5.824122] FS:  0000000000000000(0000) GS:ffff8a7337980000(0000) knlGS:0000000000000000
> [    5.825665] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    5.826775] CR2: 00007fca7f70ac58 CR3: 00000001027b2004 CR4: 0000000000770ef0
> [    5.828146] PKRU: 55555554
> [    5.828686] Call Trace:
> [    5.829169]  <TASK>
> [    5.829594]  ? __warn.cold+0xb1/0x13e
> [    5.830312]  ? unlink_anon_vmas+0x19b/0x1d0
> [    5.831118]  ? report_bug+0xff/0x140
> [    5.831840]  ? handle_bug+0x3c/0x80
> [    5.832524]  ? exc_invalid_op+0x17/0x70
> [    5.833262]  ? asm_exc_invalid_op+0x1a/0x20
> [    5.834086]  ? unlink_anon_vmas+0x19b/0x1d0
> [    5.834908]  free_pgtables+0x130/0x290
> [    5.835661]  exit_mmap+0x19a/0x460
> [    5.836351]  __mmput+0x4b/0x120
> [    5.836965]  do_exit+0x2e1/0xac0
> [    5.837601]  ? lock_release+0xd5/0x2c0
> [    5.838343]  do_group_exit+0x36/0xa0
> [    5.839035]  __x64_sys_exit_group+0x18/0x20
> [    5.839866]  x64_sys_call+0x14b4/0x14c0
I tested it on an arm64 machine and no crashes were detected. You may
try the attached modification provided by Lorenzo Stoakes. Could you
please check whether there is any room for further improvement?
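
To make the reuse-versus-copy distinction discussed above concrete, here
is a minimal user-space toy model in C. It is not kernel code: struct
toy_folio, toy_fork_share() and toy_can_reuse() are hypothetical names,
and the condition only reflects my reading of this thread (reuse seems to
apply to exclusive or single-reference folios, and fork() seems to clear
the exclusive flag). The real checks in mm/memory.c are more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an anonymous folio; not the kernel's struct folio. */
struct toy_folio {
	int refcount;        /* number of references to the folio */
	bool anon_exclusive; /* set while a single process owns the folio */
};

/* Assumed model of fork(): the child shares the folio and exclusivity is lost. */
static void toy_fork_share(struct toy_folio *f)
{
	assert(f->refcount > 0); /* only a live folio can be shared */
	f->refcount++;
	f->anon_exclusive = false;
}

/*
 * Assumed model of the write-fault decision: reuse the folio in place when it
 * is exclusive or has a single reference; otherwise the fault has to copy it.
 */
static bool toy_can_reuse(const struct toy_folio *f)
{
	return f->anon_exclusive || f->refcount == 1;
}
```

In this model a folio that is still exclusive is reused on a write fault,
while a folio that went through toy_fork_share() takes the copy path,
which corresponds to the wp_page_copy() path that the patch touches.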
>
>
> Andrew, please remove this from mm-unstable.

Thanks
Zhiguo
View attachment "0001-mm-fixup-orphan-avc-cleanup-logic.patch" of type "text/plain" (6291 bytes)
