Message-ID: <kaprll3vfmf4vxy4zigkooytykz7my2pyf4jajyxf2m7fpygyl@wgs735yktho5>
Date: Sun, 9 Feb 2025 01:20:39 +0900
From: Sergey Senozhatsky <senozhatsky@...omium.org>
To: Yosry Ahmed <yosry.ahmed@...ux.dev>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, 
	Kairui Song <ryncsn@...il.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	Minchan Kim <minchan@...nel.org>, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv4 02/17] zram: do not use per-CPU compression streams

On (25/02/07 21:07), Yosry Ahmed wrote:
> I assume this problem is unique to zram and not zswap because zram can
> be used with normal IO (and then recurse through reclaim), while zswap
> is only reachable through reclaim (which cannot recurse)?

I think I figured it out.  It appears that the problem was in the lockdep
class keys.  Both in zram and in zsmalloc I made the keys static:

 static void zram_slot_lock_init(struct zram *zram, u32 index)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
        /* a single static key, shared by all slots of all zram devices */
        static struct lock_class_key key;

        lockdep_init_map(&zram->table[index].lockdep_map, "zram-entry->lock",
                         &key, 0);
 #endif
 }

This puts the locks of every device into the same class from lockdep's point
of view.  That means the lock chains from zram0 (mounted ext4) and the lock
chains from zram1 (swap device) would interleave, leading to reports that made
no sense: e.g. ext4 writeback, blkdev_read and handle_mm_fault->do_swap_page()
would all show up as parts of the same lock chain *.

So I moved the lockdep class keys to be per zram device and per zsmalloc pool,
to separate the lockdep chains (a rough sketch of the zram side is below).
Looks like that did the trick.
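
For illustration only, a minimal sketch of the per-device variant; the
lock_class_key member name and its placement in struct zram below are my
assumption, not necessarily what the actual patch does:

 static void zram_slot_lock_init(struct zram *zram, u32 index)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
        /* key embedded in struct zram (assumed member name), so each
         * device gets its own lockdep class and the chains of e.g.
         * zram0 (ext4) and zram1 (swap) no longer interleave */
        lockdep_init_map(&zram->table[index].lockdep_map, "zram-entry->lock",
                         &zram->lock_class_key, 0);
 #endif
 }

The same idea applies to zsmalloc, with a key embedded in the pool.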


*

[ 1714.787676] [    T172] ======================================================
[ 1714.788905] [    T172] WARNING: possible circular locking dependency detected
[ 1714.790114] [    T172] 6.14.0-rc1-next-20250207+ #936 Not tainted
[ 1714.791150] [    T172] ------------------------------------------------------
[ 1714.792356] [    T172] kworker/u96:4/172 is trying to acquire lock:
[ 1714.793421] [    T172] ffff888114cf0598 (ptlock_ptr(ptdesc)#2){+.+.}-{3:3}, at: page_vma_mapped_walk+0x5c0/0x960
[ 1714.795174] [    T172]
                          but task is already holding lock:
[ 1714.796453] [    T172] ffffe8ffff981cf8 (&zstrm->lock){+.+.}-{4:4}, at: zcomp_stream_get+0x20/0x40 [zram]
[ 1714.798098] [    T172]
                          which lock already depends on the new lock.

[ 1714.799901] [    T172]
                          the existing dependency chain (in reverse order) is:
[ 1714.801469] [    T172]
                          -> #3 (&zstrm->lock){+.+.}-{4:4}:
[ 1714.802750] [    T172]        lock_acquire.part.0+0x63/0x1a0
[ 1714.803712] [    T172]        __mutex_lock+0xaa/0xd40
[ 1714.804574] [    T172]        zcomp_stream_get+0x20/0x40 [zram]
[ 1714.805578] [    T172]        zram_read_from_zspool+0x84/0x140 [zram]
[ 1714.806673] [    T172]        zram_bio_read+0x56/0x2c0 [zram]
[ 1714.807641] [    T172]        __submit_bio+0x12d/0x1c0
[ 1714.808511] [    T172]        __submit_bio_noacct+0x7f/0x200
[ 1714.809468] [    T172]        mpage_readahead+0xdd/0x110
[ 1714.810360] [    T172]        read_pages+0x7a/0x1b0
[ 1714.811182] [    T172]        page_cache_ra_unbounded+0x19a/0x210
[ 1714.812215] [    T172]        force_page_cache_ra+0x92/0xb0
[ 1714.813161] [    T172]        filemap_get_pages+0x11f/0x440
[ 1714.814098] [    T172]        filemap_read+0xf6/0x400
[ 1714.814945] [    T172]        blkdev_read_iter+0x66/0x130
[ 1714.815860] [    T172]        vfs_read+0x266/0x370
[ 1714.816674] [    T172]        ksys_read+0x66/0xe0
[ 1714.817477] [    T172]        do_syscall_64+0x64/0x130
[ 1714.818344] [    T172]        entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 1714.819444] [    T172]
                          -> #2 (zram-entry->lock){+.+.}-{0:0}:
[ 1714.820769] [    T172]        lock_acquire.part.0+0x63/0x1a0
[ 1714.821734] [    T172]        zram_slot_free_notify+0x5c/0x80 [zram]
[ 1714.822811] [    T172]        swap_entry_range_free+0x115/0x1a0
[ 1714.823812] [    T172]        cluster_swap_free_nr+0xb9/0x150
[ 1714.824787] [    T172]        do_swap_page+0x80d/0xea0
[ 1714.825661] [    T172]        __handle_mm_fault+0x538/0x7a0
[ 1714.826592] [    T172]        handle_mm_fault+0xdf/0x240
[ 1714.827485] [    T172]        do_user_addr_fault+0x152/0x700
[ 1714.828432] [    T172]        exc_page_fault+0x66/0x1f0
[ 1714.829317] [    T172]        asm_exc_page_fault+0x22/0x30
[ 1714.830235] [    T172]        do_sys_poll+0x213/0x260
[ 1714.831090] [    T172]        __x64_sys_poll+0x44/0x190
[ 1714.831972] [    T172]        do_syscall_64+0x64/0x130
[ 1714.832846] [    T172]        entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 1714.833949] [    T172]
                          -> #1 (&cluster_info[i].lock){+.+.}-{3:3}:
[ 1714.835354] [    T172]        lock_acquire.part.0+0x63/0x1a0
[ 1714.836307] [    T172]        _raw_spin_lock+0x2c/0x40
[ 1714.837194] [    T172]        __swap_duplicate+0x5e/0x150
[ 1714.838123] [    T172]        swap_duplicate+0x1c/0x40
[ 1714.838980] [    T172]        try_to_unmap_one+0x6c4/0xd60
[ 1714.839901] [    T172]        rmap_walk_anon+0xe7/0x210
[ 1714.840774] [    T172]        try_to_unmap+0x76/0x80
[ 1714.841613] [    T172]        shrink_folio_list+0x487/0xad0
[ 1714.842546] [    T172]        evict_folios+0x247/0x800
[ 1714.843404] [    T172]        try_to_shrink_lruvec+0x1cd/0x2b0
[ 1714.844382] [    T172]        lru_gen_shrink_node+0xc3/0x190
[ 1714.845335] [    T172]        do_try_to_free_pages+0xee/0x4b0
[ 1714.846292] [    T172]        try_to_free_pages+0xea/0x280
[ 1714.847208] [    T172]        __alloc_pages_slowpath.constprop.0+0x296/0x970
[ 1714.848391] [    T172]        __alloc_frozen_pages_noprof+0x2b3/0x300
[ 1714.849475] [    T172]        __folio_alloc_noprof+0x10/0x30
[ 1714.850422] [    T172]        do_anonymous_page+0x69/0x4b0
[ 1714.851337] [    T172]        __handle_mm_fault+0x557/0x7a0
[ 1714.852265] [    T172]        handle_mm_fault+0xdf/0x240
[ 1714.853153] [    T172]        do_user_addr_fault+0x152/0x700
[ 1714.854099] [    T172]        exc_page_fault+0x66/0x1f0
[ 1714.854976] [    T172]        asm_exc_page_fault+0x22/0x30
[ 1714.855897] [    T172]        rep_movs_alternative+0x3a/0x60
[ 1714.856851] [    T172]        _copy_to_iter+0xe2/0x7a0
[ 1714.857719] [    T172]        get_random_bytes_user+0x95/0x150
[ 1714.858712] [    T172]        vfs_read+0x266/0x370
[ 1714.859512] [    T172]        ksys_read+0x66/0xe0
[ 1714.860301] [    T172]        do_syscall_64+0x64/0x130
[ 1714.861167] [    T172]        entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 1714.862270] [    T172]
                          -> #0 (ptlock_ptr(ptdesc)#2){+.+.}-{3:3}:
[ 1714.863656] [    T172]        check_prev_add+0xeb/0xca0
[ 1714.864532] [    T172]        __lock_acquire+0xf56/0x12c0
[ 1714.865446] [    T172]        lock_acquire.part.0+0x63/0x1a0
[ 1714.866399] [    T172]        _raw_spin_lock+0x2c/0x40
[ 1714.867258] [    T172]        page_vma_mapped_walk+0x5c0/0x960
[ 1714.868235] [    T172]        folio_referenced_one+0xd0/0x4a0
[ 1714.869205] [    T172]        __rmap_walk_file+0xbe/0x1b0
[ 1714.870119] [    T172]        folio_referenced+0x10b/0x140
[ 1714.871039] [    T172]        shrink_folio_list+0x72c/0xad0
[ 1714.871975] [    T172]        evict_folios+0x247/0x800
[ 1714.872851] [    T172]        try_to_shrink_lruvec+0x1cd/0x2b0
[ 1714.873842] [    T172]        lru_gen_shrink_node+0xc3/0x190
[ 1714.874806] [    T172]        do_try_to_free_pages+0xee/0x4b0
[ 1714.875779] [    T172]        try_to_free_pages+0xea/0x280
[ 1714.876699] [    T172]        __alloc_pages_slowpath.constprop.0+0x296/0x970
[ 1714.877897] [    T172]        __alloc_frozen_pages_noprof+0x2b3/0x300
[ 1714.878977] [    T172]        __alloc_pages_noprof+0xa/0x20
[ 1714.879907] [    T172]        alloc_zspage+0xe6/0x2c0 [zsmalloc]
[ 1714.880924] [    T172]        zs_malloc+0xd2/0x2b0 [zsmalloc]
[ 1714.881881] [    T172]        zram_write_page+0xfc/0x300 [zram]
[ 1714.882873] [    T172]        zram_bio_write+0xd1/0x1c0 [zram]
[ 1714.883845] [    T172]        __submit_bio+0x12d/0x1c0
[ 1714.884712] [    T172]        __submit_bio_noacct+0x7f/0x200
[ 1714.885667] [    T172]        ext4_io_submit+0x20/0x40
[ 1714.886532] [    T172]        ext4_do_writepages+0x3e3/0x8b0
[ 1714.887482] [    T172]        ext4_writepages+0xe8/0x280
[ 1714.888377] [    T172]        do_writepages+0xcf/0x260
[ 1714.889247] [    T172]        __writeback_single_inode+0x56/0x350
[ 1714.890273] [    T172]        writeback_sb_inodes+0x227/0x550
[ 1714.891239] [    T172]        __writeback_inodes_wb+0x4c/0xe0
[ 1714.892202] [    T172]        wb_writeback+0x2f2/0x3f0
[ 1714.893071] [    T172]        wb_do_writeback+0x227/0x2a0
[ 1714.893976] [    T172]        wb_workfn+0x56/0x1b0
[ 1714.894777] [    T172]        process_one_work+0x1eb/0x570
[ 1714.895698] [    T172]        worker_thread+0x1d1/0x3b0
[ 1714.896571] [    T172]        kthread+0xf9/0x200
[ 1714.897356] [    T172]        ret_from_fork+0x2d/0x50
[ 1714.898214] [    T172]        ret_from_fork_asm+0x11/0x20
[ 1714.899142] [    T172]
                          other info that might help us debug this:

[ 1714.900906] [    T172] Chain exists of:
                            ptlock_ptr(ptdesc)#2 --> zram-entry->lock --> &zstrm->lock

[ 1714.903183] [    T172]  Possible unsafe locking scenario:

[ 1714.904463] [    T172]        CPU0                    CPU1
[ 1714.905380] [    T172]        ----                    ----
[ 1714.906293] [    T172]   lock(&zstrm->lock);
[ 1714.907006] [    T172]                                lock(zram-entry->lock);
[ 1714.908204] [    T172]                                lock(&zstrm->lock);
[ 1714.909347] [    T172]   lock(ptlock_ptr(ptdesc)#2);
[ 1714.910179] [    T172]
                           *** DEADLOCK ***

[ 1714.911570] [    T172] 7 locks held by kworker/u96:4/172:
[ 1714.912472] [    T172]  #0: ffff88810165d548 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x433/0x570
[ 1714.914273] [    T172]  #1: ffffc90000683e40 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x1ad/0x570
[ 1714.916339] [    T172]  #2: ffff88810b93d0e0 (&type->s_umount_key#28){++++}-{4:4}, at: super_trylock_shared+0x16/0x50
[ 1714.918141] [    T172]  #3: ffff88810b93ab50 (&sbi->s_writepages_rwsem){.+.+}-{0:0}, at: do_writepages+0xcf/0x260
[ 1714.919877] [    T172]  #4: ffffe8ffff981cf8 (&zstrm->lock){+.+.}-{4:4}, at: zcomp_stream_get+0x20/0x40 [zram]
[ 1714.921573] [    T172]  #5: ffff888106809900 (&mapping->i_mmap_rwsem){++++}-{4:4}, at: __rmap_walk_file+0x161/0x1b0
[ 1714.923347] [    T172]  #6: ffffffff82347d40 (rcu_read_lock){....}-{1:3}, at: ___pte_offset_map+0x26/0x1b0
[ 1714.924981] [    T172]
                          stack backtrace:
[ 1714.925998] [    T172] CPU: 6 UID: 0 PID: 172 Comm: kworker/u96:4 Not tainted 6.14.0-rc1-next-20250207+ #936
[ 1714.926005] [    T172] Workqueue: writeback wb_workfn (flush-251:0)
[ 1714.926009] [    T172] Call Trace:
[ 1714.926013] [    T172]  <TASK>
[ 1714.926015] [    T172]  dump_stack_lvl+0x57/0x80
[ 1714.926018] [    T172]  print_circular_bug.cold+0x38/0x45
[ 1714.926021] [    T172]  check_noncircular+0x12e/0x150
[ 1714.926025] [    T172]  check_prev_add+0xeb/0xca0
[ 1714.926027] [    T172]  ? add_chain_cache+0x10c/0x480
[ 1714.926029] [    T172]  __lock_acquire+0xf56/0x12c0
[ 1714.926032] [    T172]  lock_acquire.part.0+0x63/0x1a0
[ 1714.926035] [    T172]  ? page_vma_mapped_walk+0x5c0/0x960
[ 1714.926036] [    T172]  ? page_vma_mapped_walk+0x5c0/0x960
[ 1714.926037] [    T172]  _raw_spin_lock+0x2c/0x40
[ 1714.926040] [    T172]  ? page_vma_mapped_walk+0x5c0/0x960
[ 1714.926041] [    T172]  page_vma_mapped_walk+0x5c0/0x960
[ 1714.926043] [    T172]  folio_referenced_one+0xd0/0x4a0
[ 1714.926046] [    T172]  __rmap_walk_file+0xbe/0x1b0
[ 1714.926047] [    T172]  folio_referenced+0x10b/0x140
[ 1714.926050] [    T172]  ? page_mkclean_one+0xc0/0xc0
[ 1714.926051] [    T172]  ? folio_get_anon_vma+0x220/0x220
[ 1714.926052] [    T172]  ? __traceiter_remove_migration_pte+0x50/0x50
[ 1714.926054] [    T172]  shrink_folio_list+0x72c/0xad0
[ 1714.926060] [    T172]  evict_folios+0x247/0x800
[ 1714.926064] [    T172]  try_to_shrink_lruvec+0x1cd/0x2b0
[ 1714.926066] [    T172]  lru_gen_shrink_node+0xc3/0x190
[ 1714.926068] [    T172]  ? mark_usage+0x61/0x110
[ 1714.926071] [    T172]  do_try_to_free_pages+0xee/0x4b0
[ 1714.926073] [    T172]  try_to_free_pages+0xea/0x280
[ 1714.926077] [    T172]  __alloc_pages_slowpath.constprop.0+0x296/0x970
[ 1714.926079] [    T172]  ? __lock_acquire+0x3d1/0x12c0
[ 1714.926081] [    T172]  ? get_page_from_freelist+0xd9/0x680
[ 1714.926083] [    T172]  ? match_held_lock+0x30/0xa0
[ 1714.926085] [    T172]  __alloc_frozen_pages_noprof+0x2b3/0x300
[ 1714.926088] [    T172]  __alloc_pages_noprof+0xa/0x20
[ 1714.926090] [    T172]  alloc_zspage+0xe6/0x2c0 [zsmalloc]
[ 1714.926092] [    T172]  ? zs_malloc+0xc5/0x2b0 [zsmalloc]
[ 1714.926094] [    T172]  ? __lock_release.isra.0+0x5e/0x180
[ 1714.926096] [    T172]  zs_malloc+0xd2/0x2b0 [zsmalloc]
[ 1714.926099] [    T172]  zram_write_page+0xfc/0x300 [zram]
[ 1714.926102] [    T172]  zram_bio_write+0xd1/0x1c0 [zram]
[ 1714.926105] [    T172]  __submit_bio+0x12d/0x1c0
[ 1714.926107] [    T172]  ? jbd2_journal_stop+0x145/0x320
[ 1714.926109] [    T172]  ? kmem_cache_free+0xb5/0x3e0
[ 1714.926112] [    T172]  ? lock_release+0x6b/0x130
[ 1714.926115] [    T172]  ? __submit_bio_noacct+0x7f/0x200
[ 1714.926116] [    T172]  __submit_bio_noacct+0x7f/0x200
[ 1714.926118] [    T172]  ext4_io_submit+0x20/0x40
[ 1714.926120] [    T172]  ext4_do_writepages+0x3e3/0x8b0
[ 1714.926122] [    T172]  ? lock_acquire.part.0+0x63/0x1a0
[ 1714.926124] [    T172]  ? do_writepages+0xcf/0x260
[ 1714.926127] [    T172]  ? ext4_writepages+0xe8/0x280
[ 1714.926128] [    T172]  ext4_writepages+0xe8/0x280
[ 1714.926130] [    T172]  do_writepages+0xcf/0x260
[ 1714.926133] [    T172]  ? find_held_lock+0x2b/0x80
[ 1714.926134] [    T172]  ? writeback_sb_inodes+0x1b8/0x550
[ 1714.926136] [    T172]  __writeback_single_inode+0x56/0x350
[ 1714.926138] [    T172]  writeback_sb_inodes+0x227/0x550
[ 1714.926143] [    T172]  __writeback_inodes_wb+0x4c/0xe0
[ 1714.926145] [    T172]  wb_writeback+0x2f2/0x3f0
[ 1714.926147] [    T172]  wb_do_writeback+0x227/0x2a0
[ 1714.926150] [    T172]  wb_workfn+0x56/0x1b0
[ 1714.926151] [    T172]  process_one_work+0x1eb/0x570
[ 1714.926154] [    T172]  worker_thread+0x1d1/0x3b0
[ 1714.926157] [    T172]  ? bh_worker+0x250/0x250
[ 1714.926159] [    T172]  kthread+0xf9/0x200
[ 1714.926161] [    T172]  ? kthread_fetch_affinity.isra.0+0x40/0x40
[ 1714.926163] [    T172]  ret_from_fork+0x2d/0x50
[ 1714.926165] [    T172]  ? kthread_fetch_affinity.isra.0+0x40/0x40
[ 1714.926166] [    T172]  ret_from_fork_asm+0x11/0x20
[ 1714.926170] [    T172]  </TASK>

