linux-kernel - Memory leaks due to "locking/percpu-rwsem: Remove the embedded rwsem"

Message-Id: <C1CCBDAC-A453-4FF2-908F-0B6E356223D1@lca.pw>
Date:   Fri, 27 Mar 2020 16:47:58 -0400
From:   Qian Cai <cai@....pw>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        dbueso@...e.de, juri.lelli@...hat.com, longman@...hat.com,
        linux-kernel@...r.kernel.org
Subject: Memory leaks due to "locking/percpu-rwsem: Remove the embedded rwsem"



> On Mar 27, 2020, at 6:19 AM, Qian Cai <cai@....pw> wrote:
> 
> 
> 
>> On Mar 27, 2020, at 5:37 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>> 
>> If the trylock fails, someone else got the lock and we remain on the
>> waitqueue. It seems like a very bad idea to put the task while it
>> remains on the waitqueue, no?
> 
> Interesting, I thought this was more straightforward to see, but I may be wrong as always. At the beginning of percpu_rwsem_wake_function() it calls get_task_struct(), but if the trylock failed, it will remain in the waitqueue. However, it will run percpu_rwsem_wake_function() again with get_task_struct() to increase the refcount. Can you enlighten me where it will call put_task_struct() in waitqueue or elsewhere to balance the refcount in this case?
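
To make the imbalance I am describing concrete, here is a small userspace model. It is only a sketch, not kernel code: struct task, get_task(), put_task(), wake_leaky(), wake_balanced() and simulate() are hypothetical stand-ins, and I am assuming the success path drops the reference once the waiter has been woken. The second variant, which takes the reference only after the trylock succeeds, is just one possible way to keep the count balanced.

```c
#include <stdbool.h>

/* Hypothetical stand-in for task_struct: only the reference count matters. */
struct task {
	int usage;
};

static void get_task(struct task *t) { t->usage++; }
static void put_task(struct task *t) { t->usage--; }

/*
 * Model of the wake function as described above: the reference is taken
 * first, and the early return on a failed trylock never drops it.
 * Returns true when the waiter was dequeued and woken.
 */
static bool wake_leaky(struct task *t, bool trylock_ok)
{
	get_task(t);
	if (!trylock_ok)
		return false;	/* waiter stays queued, reference is kept */
	/* ... dequeue and wake the waiter (assumed) ... */
	put_task(t);
	return true;
}

/* Variant (an assumption, not a merged fix): take the reference only on success. */
static bool wake_balanced(struct task *t, bool trylock_ok)
{
	if (!trylock_ok)
		return false;
	get_task(t);
	/* ... dequeue and wake the waiter (assumed) ... */
	put_task(t);
	return true;
}

/*
 * Run `failures` failed wake attempts followed by one successful attempt,
 * starting from a count of 1, and return the final count.
 */
static int simulate(bool (*wake)(struct task *, bool), int failures)
{
	struct task t = { .usage = 1 };

	for (int i = 0; i < failures; i++)
		wake(&t, false);
	wake(&t, true);
	return t.usage;
}
```

With three failed attempts, simulate(wake_leaky, 3) ends at a count of 4, while simulate(wake_balanced, 3) ends back at 1. In this model every failed trylock leaves one extra reference behind, so the count can never reach zero and the task is never freed.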

I am pretty confident that the linux-next commit

7f26482a872c ("locking/percpu-rwsem: Remove the embedded rwsem")

introduced memory leaks. I used this debugging patch:

diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index a008a1ba21a7..857602ef54f1 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -123,8 +123,10 @@ static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
 	struct percpu_rw_semaphore *sem = key;
 
 	/* concurrent against percpu_down_write(), can get stolen */
-	if (!__percpu_rwsem_trylock(sem, reader))
+	if (!__percpu_rwsem_trylock(sem, reader)) {
+		printk("KK __percpu_rwsem_trylock\n");
 		return 1;
+	}
 
 	list_del_init(&wq_entry->entry);
 	smp_store_release(&wq_entry->private, NULL);

Once those printk()s trigger, the system ends up with task_struct leaks:

unreferenced object 0xc000200df1422280 (size 8192):
  comm "read_all", pid 12975, jiffies 4297309144 (age 5351.480s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000f5c5fa2d>] copy_process+0x26c/0x1920
    [<0000000099229290>] _do_fork+0xac/0xb20
    [<00000000d40a7825>] __do_sys_clone+0x98/0xe0
    [<00000000c7cd06a4>] ppc_clone+0x8/0xc
unreferenced object 0xc00020047ef8eb80 (size 120):
  comm "read_all", pid 12975, jiffies 4297309144 (age 5351.480s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000004def8a44>] prepare_creds+0x38/0x110
    [<0000000037a68116>] copy_creds+0xbc/0x1d0
    [<0000000016b7471c>] copy_process+0x454/0x1920
    [<0000000099229290>] _do_fork+0xac/0xb20
    [<00000000d40a7825>] __do_sys_clone+0x98/0xe0
    [<00000000c7cd06a4>] ppc_clone+0x8/0xc
unreferenced object 0xc000200d96f80800 (size 1384):
  comm "read_all", pid 12975, jiffies 4297309144 (age 5351.480s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    10 08 f8 96 0d 20 00 c0 10 08 f8 96 0d 20 00 c0  ..... ....... ..
  backtrace:
    [<000000008894d13b>] copy_process+0xa40/0x1920
    [<0000000099229290>] _do_fork+0xac/0xb20
    [<00000000d40a7825>] __do_sys_clone+0x98/0xe0
    [<00000000c7cd06a4>] ppc_clone+0x8/0xc
unreferenced object 0xc000001e91ba4000 (size 16384):
  comm "read_all", pid 12982, jiffies 4297309462 (age 5348.300s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000009689397b>] kzalloc.constprop.48+0x1c/0x30
    [<000000001753eb18>] task_numa_fault+0xac8/0x1260
    [<0000000047bb80b1>] __handle_mm_fault+0x12cc/0x1b00
    [<00000000c0a4c8ba>] handle_mm_fault+0x298/0x450
    [<000000003465b20d>] __do_page_fault+0x2b8/0xf90
    [<000000005037fec9>] handle_page_fault+0x10/0x30
unreferenced object 0xc0002015fe4aaa80 (size 8192):
  comm "read_all", pid 13157, jiffies 4297353979 (age 4903.130s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000f5c5fa2d>] copy_process+0x26c/0x1920
    [<0000000099229290>] _do_fork+0xac/0xb20
    [<00000000d40a7825>] __do_sys_clone+0x98/0xe0
    [<00000000c7cd06a4>] ppc_clone+0x8/0xc
unreferenced object 0xc00020047ef8f080 (size 120):
  comm "read_all", pid 13157, jiffies 4297353979 (age 4903.130s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000004def8a44>] prepare_creds+0x38/0x110
    [<0000000037a68116>] copy_creds+0xbc/0x1d0
    [<0000000016b7471c>] copy_process+0x454/0x1920
    [<0000000099229290>] _do_fork+0xac/0xb20
    [<00000000d40a7825>] __do_sys_clone+0x98/0xe0
    [<00000000c7cd06a4>] ppc_clone+0x8/0xc
unreferenced object 0xc0002012a9388f00 (size 1384):
  comm "read_all", pid 13157, jiffies 4297353979 (age 4903.130s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    10 8f 38 a9 12 20 00 c0 10 8f 38 a9 12 20 00 c0  ..8.. ....8.. ..
  backtrace:
    [<000000008894d13b>] copy_process+0xa40/0x1920
    [<0000000099229290>] _do_fork+0xac/0xb20
    [<00000000d40a7825>] __do_sys_clone+0x98/0xe0
    [<00000000c7cd06a4>] ppc_clone+0x8/0xc
unreferenced object 0xc000001c86704000 (size 16384):
  comm "read_all", pid 13164, jiffies 4297354081 (age 4902.110s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<000000009689397b>] kzalloc.constprop.48+0x1c/0x30
    [<000000001753eb18>] task_numa_fault+0xac8/0x1260
    [<0000000047bb80b1>] __handle_mm_fault+0x12cc/0x1b00
    [<00000000c0a4c8ba>] handle_mm_fault+0x298/0x450
    [<000000003465b20d>] __do_page_fault+0x2b8/0xf90
    [<000000005037fec9>] handle_page_fault+0x10/0x30