Message-ID: <kldodwnbi5ab5nostpqrbhxtolyzn5vqvmyjdwgehpkzknyrv4@u5y6ewg6hnon>
Date: Tue, 15 Apr 2025 13:25:44 +0200
From: Jan Kara <jack@...e.cz>
To: Davidlohr Bueso <dave@...olabs.net>
Cc: Luis Chamberlain <mcgrof@...nel.org>, brauner@...nel.org, jack@...e.cz,
tytso@....edu, adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
riel@...riel.com, willy@...radead.org, hannes@...xchg.org, oliver.sang@...el.com,
david@...hat.com, axboe@...nel.dk, hare@...e.de, david@...morbit.com,
djwong@...nel.org, ritesh.list@...il.com, linux-fsdevel@...r.kernel.org,
linux-block@...r.kernel.org, linux-mm@...ck.org, gost.dev@...sung.com, p.raghav@...sung.com,
da.gomez@...sung.com, syzbot+f3c6fda1297c748a7076@...kaller.appspotmail.com
Subject: Re: [PATCH v2 1/8] migrate: fix skipping metadata buffer heads on
migration
On Mon 14-04-25 18:36:41, Davidlohr Bueso wrote:
> On Wed, 09 Apr 2025, Luis Chamberlain wrote:
>
> > corruption can still happen even with the spin lock held. A test was
> > done using vanilla Linux, adding a udelay(2000) right before the
> > spin_lock(&bd_mapping->i_private_lock) in __find_get_block_slow();
> > with that delay, generic/750 reproduces the exact same filesystem
> > corruption issues observed without the spinlock [1].
>
> FYI, I was actually able to trigger this on a vanilla 6.15-rc1 kernel,
> without even having to add the artificial delay.
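
For reference, the delay injection described above would amount to
something like the following sketch against fs/buffer.c (illustrative
only, not the actual test patch; the surrounding context line is the
one named in the description):

```diff
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ static struct buffer_head *__find_get_block_slow(struct block_device *bdev,
+	/*
+	 * Artificially widen the window between the page lookup and
+	 * taking i_private_lock, so the race with buffer head
+	 * migration hits reliably under generic/750.
+	 */
+	udelay(2000);
 	spin_lock(&bd_mapping->i_private_lock);
```

The point of the experiment is that the corruption is a pre-existing
race, not one introduced by the series: enlarging the window merely
makes it reproducible on demand.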
OK, so this is using generic/750, isn't it? How long did you have to run it
to trigger this? Because I've never seen it trip...
Honza
>
> [336534.157119] ------------[ cut here ]------------
> [336534.158911] WARNING: CPU: 3 PID: 87221 at fs/jbd2/transaction.c:1552 jbd2_journal_dirty_metadata+0x21c/0x230 [jbd2]
> [336534.160771] Modules linked in: loop sunrpc 9p kvm_intel nls_iso8859_1 nls_cp437 vfat fat crc32c_generic kvm ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel gf128mul crypto_simd 9pnet_virtio cryptd virtio_balloon virtio_console evdev joydev button nvme_fabrics nvme_core dm_mod drm nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vsock autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 md_mod virtio_net net_failover virtio_blk failover psmouse serio_raw virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring
> [336534.173218] CPU: 3 UID: 0 PID: 87221 Comm: kworker/u36:8 Not tainted 6.15.0-rc1 #2 PREEMPT(full)
> [336534.175146] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.02-5 03/28/2025
> [336534.176947] Workqueue: writeback wb_workfn (flush-7:5)
> [336534.178183] RIP: 0010:jbd2_journal_dirty_metadata+0x21c/0x230 [jbd2]
> [336534.179626] Code: 30 0f 84 5b fe ff ff 0f 0b 41 bc 8b ff ff ff e9 69 fe ff ff 48 8b 04 24 4c 8b 48 70 4d 39 cf 0f 84 53 ff ff ff e9 32 c3 00 00 <0f> 0b 41 bc e4 ff ff ff e9 41 ff ff ff 0f 0b 90 0f 1f 40 00 90 90
> [336534.183983] RSP: 0018:ffff9f168d38f548 EFLAGS: 00010246
> [336534.185194] RAX: 0000000000000001 RBX: ffff8c0ae8244e10 RCX: 00000000000000fd
> [336534.186810] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [336534.188399] RBP: ffff8c09db0d8618 R08: ffff8c09db0d8618 R09: 0000000000000000
> [336534.189977] R10: ffff8c0b2671a83c R11: 0000000000006989 R12: 0000000000000000
> [336534.191243] R13: ffff8c09cc3b33f0 R14: ffff8c0ae8244e18 R15: ffff8c0ad5e0ef00
> [336534.192469] FS: 0000000000000000(0000) GS:ffff8c0b95b8d000(0000) knlGS:0000000000000000
> [336534.193840] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [336534.194831] CR2: 00007f0ebab4f000 CR3: 000000011e616005 CR4: 0000000000772ef0
> [336534.196044] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [336534.197274] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [336534.198473] PKRU: 55555554
> [336534.198927] Call Trace:
> [336534.199350] <TASK>
> [336534.199701] __ext4_handle_dirty_metadata+0x5c/0x190 [ext4]
> [336534.200626] ext4_ext_insert_extent+0x575/0x1440 [ext4]
> [336534.201465] ? ext4_cache_extents+0x5a/0xd0 [ext4]
> [336534.202243] ? ext4_find_extent+0x37c/0x3a0 [ext4]
> [336534.203024] ext4_ext_map_blocks+0x50e/0x18d0 [ext4]
> [336534.203803] ? mpage_map_and_submit_buffers+0x23f/0x270 [ext4]
> [336534.204723] ext4_map_blocks+0x11a/0x4d0 [ext4]
> [336534.205442] ? ext4_alloc_io_end_vec+0x1f/0x70 [ext4]
> [336534.206239] ? kmem_cache_alloc_noprof+0x310/0x3d0
> [336534.206982] ext4_do_writepages+0x762/0xd40 [ext4]
> [336534.207706] ? __pfx_block_write_full_folio+0x10/0x10
> [336534.208451] ? ext4_writepages+0xc6/0x1a0 [ext4]
> [336534.209161] ext4_writepages+0xc6/0x1a0 [ext4]
> [336534.209834] do_writepages+0xdd/0x250
> [336534.210378] ? filemap_get_read_batch+0x170/0x310
> [336534.211069] __writeback_single_inode+0x41/0x330
> [336534.211738] writeback_sb_inodes+0x21b/0x4d0
> [336534.212375] __writeback_inodes_wb+0x4c/0xe0
> [336534.212998] wb_writeback+0x19c/0x320
> [336534.213546] wb_workfn+0x30e/0x440
> [336534.214039] process_one_work+0x188/0x340
> [336534.214650] worker_thread+0x246/0x390
> [336534.215196] ? _raw_spin_lock_irqsave+0x23/0x50
> [336534.215879] ? __pfx_worker_thread+0x10/0x10
> [336534.216522] kthread+0x104/0x250
> [336534.217004] ? __pfx_kthread+0x10/0x10
> [336534.217554] ? _raw_spin_unlock+0x15/0x30
> [336534.218140] ? finish_task_switch.isra.0+0x94/0x290
> [336534.218979] ? __pfx_kthread+0x10/0x10
> [336534.220347] ret_from_fork+0x2d/0x50
> [336534.221086] ? __pfx_kthread+0x10/0x10
> [336534.221703] ret_from_fork_asm+0x1a/0x30
> [336534.222415] </TASK>
> [336534.222775] ---[ end trace 0000000000000000 ]---
--
Jan Kara <jack@...e.com>
SUSE Labs, CR