lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <B88A05D7-718B-4E6B-80E7-349EF5CDD4BA@nvidia.com>
Date: Wed, 27 Dec 2023 16:41:53 -0500
From: Zi Yan <ziy@...dia.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: akpm@...ux-foundation.org, hannes@...xchg.org, mhocko@...nel.org,
 roman.gushchin@...ux.dev, shakeelb@...gle.com, muchun.song@...ux.dev,
 nphamcs@...il.com, david@...hat.com, ying.huang@...el.com,
 shy828301@...il.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 cgroups@...r.kernel.org
Subject: Re: [PATCH] mm: memcg: fix split queue list crash when large folio
 migration

On 20 Dec 2023, at 1:51, Baolin Wang wrote:

> When running autonuma with enabling multi-size THP, I encountered the following
> kernel crash issue:
>
> [  134.290216] list_del corruption. prev->next should be fffff9ad42e1c490,
> but was dead000000000100. (prev=fffff9ad42399890)
> [  134.290877] kernel BUG at lib/list_debug.c:62!
> [  134.291052] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [  134.291210] CPU: 56 PID: 8037 Comm: numa01 Kdump: loaded Tainted:
> G            E      6.7.0-rc4+ #20
> [  134.291649] RIP: 0010:__list_del_entry_valid_or_report+0x97/0xb0
> ......
> [  134.294252] Call Trace:
> [  134.294362]  <TASK>
> [  134.294440]  ? die+0x33/0x90
> [  134.294561]  ? do_trap+0xe0/0x110
> ......
> [  134.295681]  ? __list_del_entry_valid_or_report+0x97/0xb0
> [  134.295842]  folio_undo_large_rmappable+0x99/0x100
> [  134.296003]  destroy_large_folio+0x68/0x70
> [  134.296172]  migrate_folio_move+0x12e/0x260
> [  134.296264]  ? __pfx_remove_migration_pte+0x10/0x10
> [  134.296389]  migrate_pages_batch+0x495/0x6b0
> [  134.296523]  migrate_pages+0x1d0/0x500
> [  134.296646]  ? __pfx_alloc_misplaced_dst_folio+0x10/0x10
> [  134.296799]  migrate_misplaced_folio+0x12d/0x2b0
> [  134.296953]  do_numa_page+0x1f4/0x570
> [  134.297121]  __handle_mm_fault+0x2b0/0x6c0
> [  134.297254]  handle_mm_fault+0x107/0x270
> [  134.300897]  do_user_addr_fault+0x167/0x680
> [  134.304561]  exc_page_fault+0x65/0x140
> [  134.307919]  asm_exc_page_fault+0x22/0x30
>
> The reason for the crash is that, the commit 85ce2c517ade ("memcontrol: only
> transfer the memcg data for migration") removed the charging and uncharging
> operations of the migration folios and cleared the memcg data of the old folio.
>
> During the subsequent release process of the old large folio in destroy_large_folio(),
> if the large folio needs to be removed from the split queue, an incorrect split
> queue can be obtained (which is pgdat->deferred_split_queue) because the old
> folio's memcg is NULL now. This can lead to list operations being performed
> under the wrong split queue lock protection, resulting in a list crash as above.
>
> After the migration, the old folio is going to be freed, so we can remove it
> from the split queue in mem_cgroup_migrate() a bit earlier before clearing the
> memcg data to avoid getting incorrect split queue.
>
> Fixes: 85ce2c517ade ("memcontrol: only transfer the memcg data for migration")
> Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
> ---
>  mm/huge_memory.c |  2 +-
>  mm/memcontrol.c  | 11 +++++++++++
>  2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 6be1a380a298..c50dc2e1483f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3124,7 +3124,7 @@ void folio_undo_large_rmappable(struct folio *folio)
>  	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>  	if (!list_empty(&folio->_deferred_list)) {
>  		ds_queue->split_queue_len--;
> -		list_del(&folio->_deferred_list);
> +		list_del_init(&folio->_deferred_list);
>  	}
>  	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>  }
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ae8c62c7aa53..e66e0811cccc 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -7575,6 +7575,17 @@ void mem_cgroup_migrate(struct folio *old, struct folio *new)
>
>  	/* Transfer the charge and the css ref */
>  	commit_charge(new, memcg);
> +	/*
> +	 * If the old folio a large folio and is in the split queue, it needs

should be:
If the old folio is a large folio and in the split queue


> +	 * to be removed from the split queue now, in case getting an incorrect
> +	 * split queue in destroy_large_folio() after the memcg of the old folio
> +	 * is cleared.
> +	 *
> +	 * In addition, the old folio is about to be freed after migration, so
> +	 * removing from the split queue a bit earlier seems reasonable.
               ^
               it
> +	 */
> +	if (folio_test_large(old) && folio_test_large_rmappable(old))
> +		folio_undo_large_rmappable(old);
>  	old->memcg_data = 0;
>  }
>
> -- 
> 2.39.3

Otherwise, LGTM. Thanks. Reviewed-by: Zi Yan <ziy@...dia.com>

--
Best Regards,
Yan, Zi

Download attachment "signature.asc" of type "application/pgp-signature" (855 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ