Message-Id: <20231130152047.200169-1-dongyun.liu@transsion.com>
Date: Thu, 30 Nov 2023 23:20:47 +0800
From: Dongyun Liu <dongyun.liu3@...il.com>
To: minchan@...nel.org, senozhatsky@...omium.org, axboe@...nel.dk
Cc: linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
lincheng.yang@...nssion.com, jiajun.ling@...nssion.com,
ldys2014@...mail.com, Dongyun Liu <dongyun.liu@...nssion.com>
Subject: [PATCH] zram: Use GFP_ATOMIC instead of GFP_KERNEL to allocate bitmap memory in backing_dev_store

We encountered a deadlock while using zram on a 5.10 kernel.
backing_dev_store() first takes zram->init_lock (down_write) and then
allocates the writeback bitmap with kvmalloc_node(). If the system is
short on memory at that moment, the allocation enters direct reclaim
and starts swapping pages out. On the swap write path,
__zram_bvec_write() wakes our own thread to move data from zram to the
backing device and free up more memory; this is the work done in
try_wakeup_zram_wbd(). That function takes zram->init_lock a second
time to read zram->bd_wb_limit, so the task deadlocks on the lock it
already holds, as the call trace below shows.
The latest mainline kernel does not have this bug, but the risk remains
for the future: if the zram write path ever needs to take
zram->init_lock again (for example, to read zram->bd_wb_limit), the
same deadlock can occur. The bug is quite elusive; it appeared only
with low probability after a week-long stress test on 50 devices. I
therefore suggest allocating the bitmap with GFP_ATOMIC in
backing_dev_store() to avoid similar issues. Please consider it, thank
you.
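
For reference, the gfp flag definitions in current mainline
(include/linux/gfp_types.h) show why a GFP_ATOMIC allocation cannot
recurse into reclaim: it carries no __GFP_DIRECT_RECLAIM bit, so the
allocator never calls try_to_free_pages() and never re-enters zram
through swap writeback:

	#define GFP_ATOMIC	(__GFP_HIGH|__GFP_KSWAPD_RECLAIM)
	#define GFP_KERNEL	(__GFP_RECLAIM | __GFP_IO | __GFP_FS)
	/* __GFP_RECLAIM == __GFP_DIRECT_RECLAIM|__GFP_KSWAPD_RECLAIM */

Such an allocation either succeeds immediately (it may dip into
reserves via __GFP_HIGH) or fails with NULL; it can wake kswapd but
never blocks on reclaim itself.
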
INFO: task init:331 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:init            state:D stack:    0 pid:    1 ppid:     0 flags:0x04000000
Call trace:
__switch_to+0x244/0x4e4
__schedule+0x5bc/0xc48
schedule+0x80/0x164
rwsem_down_read_slowpath+0x4fc/0xf9c
__down_read+0x140/0x188
down_read+0x14/0x24
try_wakeup_wbd_thread+0x78/0x1ec [zram]
__zram_bvec_write+0x720/0x878 [zram]
zram_bvec_rw+0xa8/0x234 [zram]
zram_submit_bio+0x16c/0x268 [zram]
submit_bio_noacct+0x128/0x3c8
submit_bio+0x1cc/0x3d0
__swap_writepage+0x5c4/0xd4c
swap_writepage+0x130/0x158
pageout+0x1f4/0x478
shrink_page_list+0x9b4/0x1eb8
shrink_inactive_list+0x2f4/0xaa8
shrink_lruvec+0x184/0x340
shrink_node_memcgs+0x84/0x3a0
shrink_node+0x2c4/0x6c4
shrink_zones+0x16c/0x29c
do_try_to_free_pages+0xe4/0x2b4
try_to_free_pages+0x388/0x7b4
__alloc_pages_direct_reclaim+0x88/0x278
__alloc_pages_slowpath+0x4ec/0xf6c
__alloc_pages_nodemask+0x1f4/0x3dc
kmalloc_order+0x54/0x338
kmalloc_order_trace+0x34/0x1bc
__kmalloc+0x5e8/0x9c0
kvmalloc_node+0xa8/0x264
backing_dev_store+0x1a4/0x818 [zram]
dev_attr_store+0x38/0x8c
sysfs_kf_write+0x64/0xc4
kernfs_fop_write_iter+0x168/0x2ac
vfs_write+0x300/0x37c
ksys_write+0x7c/0xf0
__arm64_sys_write+0x20/0x30
el0_svc_common+0xd4/0x270
el0_svc+0x28/0x98
el0_sync_handler+0x8c/0xf0
el0_sync+0x1b4/0x1c0
Signed-off-by: Dongyun Liu <dongyun.liu@...nssion.com>
---
drivers/block/zram/zram_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d77d3664ca08..ee6c22c50e09 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -514,7 +514,7 @@ static ssize_t backing_dev_store(struct device *dev,
 	nr_pages = i_size_read(inode) >> PAGE_SHIFT;
 	bitmap_sz = BITS_TO_LONGS(nr_pages) * sizeof(long);
-	bitmap = kvzalloc(bitmap_sz, GFP_KERNEL);
+	bitmap = kzalloc(bitmap_sz, GFP_ATOMIC);
 	if (!bitmap) {
 		err = -ENOMEM;
 		goto out;
 	}
--
2.25.1