Message-ID: <1e95a6e4-9993-40ae-b563-44b7024da25c@redhat.com>
Date: Tue, 27 Aug 2024 19:35:48 +0200
From: David Hildenbrand <david@...hat.com>
To: zhiguojiang <justinjiang@...o.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, oe-lkp@...ts.linux.dev, oliver.sang@...el.com
Cc: opensource.kernel@...o.com
Subject: Re: [PATCH v2] vma remove the unneeded avc bound with non-CoWed folio
On 27.08.24 03:50, zhiguojiang wrote:
>
>
> On 2024/8/27 1:24, David Hildenbrand wrote:
>> On 23.08.24 16:01, Zhiguo Jiang wrote:
>>> After CoW via do_wp_page(), the vma establishes a new mapping
>>> relationship with the CoWed folio instead of the non-CoWed folio.
>>> However, when vma->anon_vma and the non-CoWed folio's anon_vma are
>>> not the same, the avc binding between them is no longer needed, so
>>> it is a problem that this avc binding still exists.
>>>
>>> This patch removes the avc binding between the vma and the non-CoWed
>>> folio's anon_vma, where each has its own independent anon_vma. It
>>> also alleviates rmap overhead.
>>>
>>> Signed-off-by: Zhiguo Jiang <justinjiang@...o.com>
>>> ---
>>> -v2:
>>> * Solve the kernel test robot noticed "WARNING"
>>> Reported-by: kernel test robot <oliver.sang@...el.com>
>>> Closes: https://lore.kernel.org/oe-lkp/202408230938.43f55b4-lkp@intel.com
>>> * Update comments to more accurately describe this patch.
>>>
>>> -v1:
>>> https://lore.kernel.org/linux-mm/20240820143359.199-1-justinjiang@vivo.com/
>>>
>>> include/linux/rmap.h | 1 +
>>> mm/memory.c | 8 +++++++
>>> mm/rmap.c | 53 ++++++++++++++++++++++++++++++++++++++++++++
>>> 3 files changed, 62 insertions(+)
>>>
>>> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>>> index 91b5935e8485..8607d28a3146
>>> --- a/include/linux/rmap.h
>>> +++ b/include/linux/rmap.h
>>> @@ -257,6 +257,7 @@ void folio_remove_rmap_ptes(struct folio *, struct page *, int nr_pages,
>>> folio_remove_rmap_ptes(folio, page, 1, vma)
>>> void folio_remove_rmap_pmd(struct folio *, struct page *,
>>> struct vm_area_struct *);
>>> +void folio_remove_anon_avc(struct folio *, struct vm_area_struct *);
>>> void hugetlb_add_anon_rmap(struct folio *, struct vm_area_struct *,
>>> unsigned long address, rmap_t flags);
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index 93c0c25433d0..4c89cb1cb73e
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -3428,6 +3428,14 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
>>> * old page will be flushed before it can be reused.
>>> */
>>> folio_remove_rmap_pte(old_folio, vmf->page, vma);
>>> +
>>> + /*
>>> + * If the new_folio's anon_vma is different from the
>>> + * old_folio's anon_vma, the avc binding relationship
>>> + * between vma and the old_folio's anon_vma is removed,
>>> + * avoiding rmap redundant overhead.
>>> + */
>>> + folio_remove_anon_avc(old_folio, vma);
>>
>> ... by increasing write fault latency, introducing an RMAP walk (!)? Hmm?
>>
>> On the reuse path, we do a folio_move_anon_rmap(), to optimize that.
>>
> Thanks for your comments. This may not be a good fixup patch. The
> reuse path's folio_move_anon_rmap() seems to apply only to exclusive
> or _refcount == 1 folios. The fork() path seems to clear the exclusive
> flag in copy_page_range() --> ... --> __folio_try_dup_anon_rmap().
> However, I observed lots of orphan avcs via the above debug trace logs
> in wp_page_copy(), though per the discussion with Mika they may not be
> removable.
Was this patch ever tested? I cannot even boot a simple VM without an endless stream of
[ 5.804598] ------------[ cut here ]------------
[ 5.805494] WARNING: CPU: 11 PID: 595 at mm/rmap.c:443 unlink_anon_vmas+0x19b/0x1d0
[ 5.806962] Modules linked in: qemu_fw_cfg
[ 5.807762] CPU: 11 UID: 0 PID: 595 Comm: dracut-rootfs-g Tainted: G W 6.11.0-rc4+ #72
[ 5.809546] Tainted: [W]=WARN
[ 5.810127] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
[ 5.811753] RIP: 0010:unlink_anon_vmas+0x19b/0x1d0
[ 5.812680] Code: b0 00 00 00 00 75 1f f0 ff 8f a0 00 00 00 75 a2 e8 8a fd ff ff eb 9b 5b 5d 41 5c 41 5d 41 5e 41 5f e9 d4 82 d0 00 0f 0b eb dd <0f> 0b eb cf 0f 0b 48 83 c7 08 e8 16 40 d7 ff e9 ea fe ff ff 48 8b
[ 5.816247] RSP: 0018:ffffa19f43bb78d0 EFLAGS: 00010286
[ 5.817258] RAX: ffff8a71c1bdd2d0 RBX: ffff8a71c1bdd2c0 RCX: ffff8a71c27a86c8
[ 5.818624] RDX: 0000000000000001 RSI: ffff8a71c2771b28 RDI: ffff8a71c27a9e60
[ 5.820011] RBP: dead000000000122 R08: 0000000000000000 R09: 0000000000000001
[ 5.821380] R10: 0000000000000200 R11: 0000000000000001 R12: ffff8a71c2771b28
[ 5.822748] R13: dead000000000100 R14: ffff8a71c2771b18 R15: ffff8a71c27a9e60
[ 5.824122] FS: 0000000000000000(0000) GS:ffff8a7337980000(0000) knlGS:0000000000000000
[ 5.825665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.826775] CR2: 00007fca7f70ac58 CR3: 00000001027b2004 CR4: 0000000000770ef0
[ 5.828146] PKRU: 55555554
[ 5.828686] Call Trace:
[ 5.829169] <TASK>
[ 5.829594] ? __warn.cold+0xb1/0x13e
[ 5.830312] ? unlink_anon_vmas+0x19b/0x1d0
[ 5.831118] ? report_bug+0xff/0x140
[ 5.831840] ? handle_bug+0x3c/0x80
[ 5.832524] ? exc_invalid_op+0x17/0x70
[ 5.833262] ? asm_exc_invalid_op+0x1a/0x20
[ 5.834086] ? unlink_anon_vmas+0x19b/0x1d0
[ 5.834908] free_pgtables+0x130/0x290
[ 5.835661] exit_mmap+0x19a/0x460
[ 5.836351] __mmput+0x4b/0x120
[ 5.836965] do_exit+0x2e1/0xac0
[ 5.837601] ? lock_release+0xd5/0x2c0
[ 5.838343] do_group_exit+0x36/0xa0
[ 5.839035] __x64_sys_exit_group+0x18/0x20
[ 5.839866] x64_sys_call+0x14b4/0x14c0
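For anyone trying to make sense of the splat: the warning fires in
unlink_anon_vmas() while it tears down the vma's AVC list at exit,
presumably because AVCs were removed from under a live vma. An AVC
(anon_vma_chain) is the object linking a vma to each anon_vma it can
be rmapped through; abbreviated from include/linux/rmap.h, with the
field comments mine:

struct anon_vma_chain {
	struct vm_area_struct *vma;	/* the vma this link belongs to */
	struct anon_vma *anon_vma;	/* the anon_vma it is bound to */
	struct list_head same_vma;	/* entry in the vma's list of AVCs */
	struct rb_node rb;		/* node in anon_vma's interval tree */
	unsigned long rb_subtree_last;
};

Removing entries from those two linkages behind the back of the normal
anon_vma code is exactly the kind of thing these consistency checks
exist to catch.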
Andrew, please remove this from mm-unstable.
--
Cheers,
David / dhildenb