Message-ID: <25d25633-0bf4-452c-b665-354a5aaa5d0c@redhat.com>
Date: Wed, 28 Aug 2024 06:51:29 +0300
From: Mika Penttilä <mpenttil@...hat.com>
To: zhiguojiang <justinjiang@...o.com>, David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, oe-lkp@...ts.linux.dev, oliver.sang@...el.com
Cc: opensource.kernel@...o.com
Subject: Re: [PATCH v2] vma remove the unneeded avc bound with non-CoWed folio
Hi,
On 8/28/24 04:14, zhiguojiang wrote:
>
>
> On 2024/8/28 1:35, David Hildenbrand wrote:
>> On 27.08.24 03:50, zhiguojiang wrote:
>>>
>>>
>>> On 2024/8/27 1:24, David Hildenbrand wrote:
>>>> On 23.08.24 16:01, Zhiguo Jiang wrote:
>>>>> After a CoW in do_wp_page, the vma maps the CoWed folio instead of
>>>>> the non-CoWed folio. However, when vma->anon_vma and the non-CoWed
>>>>> folio's anon_vma are not the same, the avc binding between them is
>>>>> no longer needed, so it is a problem that this avc binding still
>>>>> exists.
>>>>>
>>>>> This patch removes the avc binding between the vma and the
>>>>> non-CoWed folio's anon_vma in the case where each has its own
>>>>> independent anon_vma. It also alleviates rmap overhead.
>>>>>
>>>>> Signed-off-by: Zhiguo Jiang <justinjiang@...o.com>
>>>>> ---
>>>>> -v2:
>>>>> * Solve the kernel test robot noticed "WARNING"
>>>>> Reported-by: kernel test robot <oliver.sang@...el.com>
>>>>> Closes:
>>>>> https://lore.kernel.org/oe-lkp/202408230938.43f55b4-lkp@intel.com
>>>>> * Update comments to more accurately describe this patch.
>>>>>
>>>>> -v1:
>>>>> https://lore.kernel.org/linux-mm/20240820143359.199-1-justinjiang@vivo.com/
>>>>>
>>>>>
>>>>> include/linux/rmap.h | 1 +
>>>>> mm/memory.c | 8 +++++++
>>>>> mm/rmap.c | 53
>>>>> ++++++++++++++++++++++++++++++++++++++++++++
>>>>> 3 files changed, 62 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>>>>> index 91b5935e8485..8607d28a3146
>>>>> --- a/include/linux/rmap.h
>>>>> +++ b/include/linux/rmap.h
>>>>> @@ -257,6 +257,7 @@ void folio_remove_rmap_ptes(struct folio *,
>>>>> struct page *, int nr_pages,
>>>>> folio_remove_rmap_ptes(folio, page, 1, vma)
>>>>> void folio_remove_rmap_pmd(struct folio *, struct page *,
>>>>> struct vm_area_struct *);
>>>>> +void folio_remove_anon_avc(struct folio *, struct vm_area_struct *);
>>>>> void hugetlb_add_anon_rmap(struct folio *, struct
>>>>> vm_area_struct *,
>>>>> unsigned long address, rmap_t flags);
>>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>>> index 93c0c25433d0..4c89cb1cb73e
>>>>> --- a/mm/memory.c
>>>>> +++ b/mm/memory.c
>>>>> @@ -3428,6 +3428,14 @@ static vm_fault_t wp_page_copy(struct vm_fault
>>>>> *vmf)
>>>>> * old page will be flushed before it can be reused.
>>>>> */
>>>>> folio_remove_rmap_pte(old_folio, vmf->page, vma);
>>>>> +
>>>>> + /*
>>>>> + * If the new_folio's anon_vma is different from the
>>>>> + * old_folio's anon_vma, the avc binding relationship
>>>>> + * between vma and the old_folio's anon_vma is removed,
>>>>> + * avoiding rmap redundant overhead.
>>>>> + */
>>>>> + folio_remove_anon_avc(old_folio, vma);
>>>>
>>>> ... by increasing write fault latency, introducing an RMAP walk
>>>> (!)? Hmm?
>>>>
>>>> On the reuse path, we do a folio_move_anon_rmap(), to optimize that.
>>>>
>>> Thanks for your comments. This may not be a good fixup patch. The
>>> reuse path's folio_move_anon_rmap() seems to be for exclusive or
>>> _refcount = 1 folios. The fork() path seems to clear the exclusive flag
>>> in copy_page_range() --> ... --> __folio_try_dup_anon_rmap(). However,
>>> I observed lots of orphan avcs in the above debug trace logs from
>>> wp_page_copy(). But after discussing with Mika, it seems they may not
>>> be removable.
>>
>> Was this patch ever tested? I cannot even boot a simple VM without an
>> endless stream of
>>
>> [ 5.804598] ------------[ cut here ]------------
>> [ 5.805494] WARNING: CPU: 11 PID: 595 at mm/rmap.c:443
>> unlink_anon_vmas+0x19b/0x1d0
>> [ 5.806962] Modules linked in: qemu_fw_cfg
>> [ 5.807762] CPU: 11 UID: 0 PID: 595 Comm: dracut-rootfs-g Tainted:
>> G W 6.11.0-rc4+ #72
>> [ 5.809546] Tainted: [W]=WARN
>> [ 5.810127] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
>> BIOS 1.16.3-2.fc40 04/01/2014
>> [ 5.811753] RIP: 0010:unlink_anon_vmas+0x19b/0x1d0
>> [ 5.812680] Code: b0 00 00 00 00 75 1f f0 ff 8f a0 00 00 00 75 a2
>> e8 8a fd ff ff eb 9b 5b 5d 41 5c 41 5d 41 5e 41 5f e9 d4 82 d0 00 0f
>> 0b eb dd <0f> 0b eb cf 0f 0b 48 83 c7 08 e8 16 40 d7 ff e9 ea fe ff
>> ff 48 8b
>> [ 5.816247] RSP: 0018:ffffa19f43bb78d0 EFLAGS: 00010286
>> [ 5.817258] RAX: ffff8a71c1bdd2d0 RBX: ffff8a71c1bdd2c0 RCX:
>> ffff8a71c27a86c8
>> [ 5.818624] RDX: 0000000000000001 RSI: ffff8a71c2771b28 RDI:
>> ffff8a71c27a9e60
>> [ 5.820011] RBP: dead000000000122 R08: 0000000000000000 R09:
>> 0000000000000001
>> [ 5.821380] R10: 0000000000000200 R11: 0000000000000001 R12:
>> ffff8a71c2771b28
>> [ 5.822748] R13: dead000000000100 R14: ffff8a71c2771b18 R15:
>> ffff8a71c27a9e60
>> [ 5.824122] FS: 0000000000000000(0000) GS:ffff8a7337980000(0000)
>> knlGS:0000000000000000
>> [ 5.825665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 5.826775] CR2: 00007fca7f70ac58 CR3: 00000001027b2004 CR4:
>> 0000000000770ef0
>> [ 5.828146] PKRU: 55555554
>> [ 5.828686] Call Trace:
>> [ 5.829169] <TASK>
>> [ 5.829594] ? __warn.cold+0xb1/0x13e
>> [ 5.830312] ? unlink_anon_vmas+0x19b/0x1d0
>> [ 5.831118] ? report_bug+0xff/0x140
>> [ 5.831840] ? handle_bug+0x3c/0x80
>> [ 5.832524] ? exc_invalid_op+0x17/0x70
>> [ 5.833262] ? asm_exc_invalid_op+0x1a/0x20
>> [ 5.834086] ? unlink_anon_vmas+0x19b/0x1d0
>> [ 5.834908] free_pgtables+0x130/0x290
>> [ 5.835661] exit_mmap+0x19a/0x460
>> [ 5.836351] __mmput+0x4b/0x120
>> [ 5.836965] do_exit+0x2e1/0xac0
>> [ 5.837601] ? lock_release+0xd5/0x2c0
>> [ 5.838343] do_group_exit+0x36/0xa0
>> [ 5.839035] __x64_sys_exit_group+0x18/0x20
>> [ 5.839866] x64_sys_call+0x14b4/0x14c0
> I tested it on an arm64 machine and no crashes were detected. You may
> try the attached modification provided by Lorenzo Stoakes. Can you please
> check if there are any opportunities for further improvement?
As far as I can see, this patch is still wrong in its main logic: you
cannot remove the avc, because the child's non-CoWed folios would then
no longer be reachable.
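To make the reachability concern concrete, here is a toy userspace model, not kernel code. All toy_* names and structures are hypothetical simplifications. It assumes a parent and child vma that share two folios after fork(), both folios belonging to the parent's anon_vma, and it treats an rmap walk as "visit every vma linked to the folio's anon_vma". If the child's link (avc) to the parent's anon_vma is dropped after CoW of one folio, the walk for the other, still-shared folio misses the child's mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX 4

/* Toy stand-ins only; these are NOT the kernel's structures. */
struct toy_vma { int id; };

struct toy_anon_vma {
    struct toy_vma *linked[TOY_MAX]; /* one slot per avc-style link */
    size_t nr_links;
};

struct toy_folio {
    struct toy_anon_vma *anon_vma;      /* anon_vma the folio belongs to */
    struct toy_vma *mapped_in[TOY_MAX]; /* vmas that currently map it */
    size_t nr_mapped;
};

static void toy_link(struct toy_anon_vma *av, struct toy_vma *vma)
{
    assert(av->nr_links < TOY_MAX);
    av->linked[av->nr_links++] = vma;
}

static void toy_unlink(struct toy_anon_vma *av, struct toy_vma *vma)
{
    for (size_t i = 0; i < av->nr_links; i++) {
        if (av->linked[i] == vma) {
            /* Order does not matter here: move the last link into the gap. */
            av->linked[i] = av->linked[--av->nr_links];
            return;
        }
    }
}

/* Count the mappings of the folio that a walk over its anon_vma's links finds. */
static size_t toy_rmap_walk(const struct toy_folio *folio)
{
    size_t found = 0;

    for (size_t i = 0; i < folio->anon_vma->nr_links; i++)
        for (size_t j = 0; j < folio->nr_mapped; j++)
            if (folio->anon_vma->linked[i] == folio->mapped_in[j])
                found++;
    return found;
}

/* Returns how many mappings of the still-shared folio the walk fails to find. */
size_t toy_missed_mappings(bool drop_child_avc)
{
    struct toy_vma parent = { .id = 0 }, child = { .id = 1 };
    struct toy_anon_vma parent_av = { .nr_links = 0 };

    /* After fork(): both vmas are linked to the parent's anon_vma. */
    toy_link(&parent_av, &parent);
    toy_link(&parent_av, &child);

    struct toy_folio folio_a = { &parent_av, { &parent, &child }, 2 };
    struct toy_folio folio_b = { &parent_av, { &parent, &child }, 2 };

    /* The child write-faults on folio_a and gets its own copy,
     * so only the parent still maps folio_a. */
    folio_a.nr_mapped = 1;

    /* The step under discussion: drop the child's link after that CoW. */
    if (drop_child_avc)
        toy_unlink(&parent_av, &child);

    /* folio_b was never CoWed and is still mapped by the child. */
    return folio_b.nr_mapped - toy_rmap_walk(&folio_b);
}
```

In this model, toy_missed_mappings(false) finds every mapping of the shared folio, while toy_missed_mappings(true) leaves the child's mapping unreachable from the folio's anon_vma.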
>>
>>
>> Andrew, please remove this from mm-unstable.
>
> Thanks
> Zhiguo
Thanks,
Mika