lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <337c721a.70d1.1936753c377.Coremail.00107082@163.com>
Date: Tue, 26 Nov 2024 15:16:03 +0800 (CST)
From: "David Wang" <00107082@....com>
To: "Suren Baghdasaryan" <surenb@...gle.com>
Cc: kent.overstreet@...ux.dev, linux-kernel@...r.kernel.org, yuzhao@...gle.com
Subject: Re: Abnormal values show up in /proc/allocinfo

Hi, 


At 2024-11-26 09:14:50, "David Wang" <00107082@....com> wrote:
>
>Hi, 
>
>
>
>At 2024-11-26 04:31:39, "Suren Baghdasaryan" <surenb@...gle.com> wrote:
>>
>>Hi David,
>>Could you please check if you have this fix:
>>
>>ed265529d39a "mm/codetag: fix arg in pgalloc_tag_copy alloc_tag_sub"
>>
>>It was merged after v6.12-rc6 and it fixes an accounting bug inside
>>pgalloc_tag_copy(), which is used during compaction.
>>Thanks,
>>Suren.
>>
>>
>>https://lore.kernel.org/all/20240906042108.1150526-3-yuzhao@google.com/
>>https://lore.kernel.org/all/20241022232440.334820-1-souravpanda@google.com/
>
>
>No, ed265529d39a is not in 6.12.0; It is now in Linus' tree.
>I will pull the code and make some tests.
>
>Will update later.


I build a kernel based on a tree with top commit 9f16d5e6f220, 
no abnormal value observed for compaction_alloc:

$ sudo cat /proc/allocinfo  | grep compaction_alloc
           0        0 mm/compaction.c:1880 func:compaction_alloc 250098
(The last column is accumulative call counters I patched with my system, meaning compaction_alloc do happen)


But,  still got underflowed values:

       -4096 18446744073709551615 mm/filemap.c:3788 func:do_read_cache_folio 18 
   -86241280 18446744073709530561 mm/filemap.c:1951 func:__filemap_get_folio 8691015 

Finally a procedure to reproduce it:  (thanks for the tip about compact_memory)
1. populate file caches, e.g. grep something in kernel source.
2. echo 1 >/proc/sys/vm/compact_memory
3. echo 3 > /proc/sys/vm/drop_caches

There would be negative values show up.
A simple python script to check abnormal values:

with open("/proc/allocinfo") as f:
    for l in f:
        try:
            vs  = l.split()
            v1, v2 = int(vs[0]), int(vs[1])
            if v1<0 or v2<0 or (v1==0 and v2!=0) or (v1!=0 and v2==0): print l,
        except: pass


Most likely, memory release is accounted *twice* for those memory.
Reading through the code, I feel the most suspected code would be:
    clear_page_tag_ref(&old->page);
This line of code may not work as expected.

The code in pgalloc_tag_copy involves too many low level plumbing details, I think it 
is simpler to just implement a *swap* logic: just swap the tags.

I made following changes and it works, no abnormal values detected, so far.
(pgalloc_tag_copy should be renamed to pgalloc_tag_swap)


diff --git a/lib/alloc_tag.c b/lib/alloc_tag.c
index 2414a7ee7ec7..7d6d1015f4b1 100644
--- a/lib/alloc_tag.c
+++ b/lib/alloc_tag.c
@@ -191,24 +192,30 @@ void pgalloc_tag_split(struct folio *folio, int old_order, int new_order)
 
 void pgalloc_tag_copy(struct folio *new, struct folio *old)
 {
-	union pgtag_ref_handle handle;
-	union codetag_ref ref;
-	struct alloc_tag *tag;
+	union pgtag_ref_handle handle_new, handle_old;
+	union codetag_ref ref_new, ref_old;
+	struct alloc_tag *tag_new, *tag_old;
 
-	tag = pgalloc_tag_get(&old->page);
-	if (!tag)
+	tag_old = pgalloc_tag_get(&old->page);
+	if (!tag_old)
+		return;
+	tag_new = pgalloc_tag_get(&new->page);
+	if (!tag_new)
 		return;
 
-	if (!get_page_tag_ref(&new->page, &ref, &handle))
+	if (!get_page_tag_ref(&old->page, &ref_old, &handle_old))
+		return;
+	if (!get_page_tag_ref(&new->page, &ref_new, &handle_new))
 		return;
 
-	/* Clear the old ref to the original allocation tag. */
-	clear_page_tag_ref(&old->page);
-	/* Decrement the counters of the tag on get_new_folio. */
-	alloc_tag_sub(&ref, folio_size(new));
-	__alloc_tag_ref_set(&ref, tag);
-	update_page_tag_ref(handle, &ref);
-	put_page_tag_ref(handle);
+	/* swap tag */
+	__alloc_tag_ref_set(&ref_new, tag_old);
+	update_page_tag_ref(handle_new, &ref_new);
+	put_page_tag_ref(handle_new);
+
+	__alloc_tag_ref_set(&ref_old, tag_new);
+	update_page_tag_ref(handle_old, &ref_old);
+	put_page_tag_ref(handle_old);
 }
 
 static void shutdown_mem_profiling(bool remove_file)



FYI
David


>
>Thanks~
>David
>
>>
>>> > >Thanks,
>>> > >Suren.
>>> > >
>>> > >>
>>> > >>
>>> > >> Thanks
>>> > >> David
>>> > >>
>>> > >>
>>> > >>
>>> > >>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ