lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <027267db-fa2e-4436-9bd5-ca1dfae2f2b3@ans.pl>
Date: Sun, 19 May 2024 08:59:36 -0700
From: Krzysztof Olędzki <ole@....pl>
To: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-ext4@...r.kernel.org
Cc: tytso@....edu, adilger.kernel@...ger.ca
Subject: Re: WARNING: CPU: 5 PID: 23226 at fs/ext4/ext4_jbd2.c:73
 ext4_journal_check_start+0x2c/0x82 [Linux 6.6.30, data=journal]

On 17.05.2024 at 22:24, Krzysztof Olędzki wrote:
> Hi,
> 
> I noticed that once a while my system generates a WARNING during shutdown and this has been ongoing for some time, perhaps since upgrading to 6.6-stable and/or migrating to "-o 64bit". However, I cannot be 100% sure if this is a relatively recent regression or not, as I think it does not happen very frequently and usually I don't look at the console during shutdown.
> 
> The WARNING happens after all filesystems except the root fs (253:0) get unmounted and root is already mounted read-only. Today, I was finally able to capture it with the serial console attached.
> 
> [ 5802.007527] EXT4-fs (dm-0): re-mounted 5a3b300a-6b1a-4ff1-9138-287dec4dcdd7 ro. Quota mode: none.
> [ 5803.382293] sd 0:0:7:0: [sdh] Synchronizing SCSI cache
> [ 5803.388868] sd 0:0:7:0: [sdh] Stopping disk
> [ 5821.086972] sd 0:0:6:0: [sdg] Synchronizing SCSI cache
> [ 5821.093581] sd 0:0:6:0: [sdg] Stopping disk
> [ 5832.944678] ------------[ cut here ]------------
> [ 5832.950507] WARNING: CPU: 5 PID: 23226 at fs/ext4/ext4_jbd2.c:73 ext4_journal_check_start+0x2c/0x82

Forgot to mention, the WARNING comes from:

	if (WARN_ON_ONCE(sb_rdonly(sb)))
		return -EROFS;

See: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/ext4/ext4_jbd2.c?h=v6.6.30#n73

So, the question is: why we are trying to write something about long (about 30.94s) after remounting the filesystem as read-only?

> [ 5832.961303] CPU: 5 PID: 23226 Comm: kworker/u16:28 Not tainted 6.6.30-o4 #2
> [ 5832.969772] Hardware name: Dell Inc. PowerEdge T330/06FW8M, BIOS 2.20.0 02/22/2024
> [ 5832.978872] Workqueue: writeback wb_workfn (flush-253:0)
> [ 5832.985498] RIP: 0010:ext4_journal_check_start+0x2c/0x82
> [ 5832.992079] Code: 89 3c 24 2e 2e 2e 31 c0 48 8b 3c 24 b8 fb ff ff ff 48 8b 97 78 03 00 00 48 8b 8a 30 02 00 00 80 e1 02 75 56 f6 47 50 01 74 04 <0f> 0b eb 47 66 83 bf 50 02 00 00 04 75 00
> [ 5833.014495] RSP: 0018:ffffc9000eff7908 EFLAGS: 00010202
> [ 5833.021000] RAX: 00000000fffffffb RBX: ffffc9000eff7b18 RCX: 0000000000000000
> [ 5833.029711] RDX: ffff8881021f5000 RSI: ffff888100f8c000 RDI: ffff888100f8c000
> [ 5833.038392] RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000000
> [ 5833.047028] R10: 8080808080808080 R11: fefefefefefefeff R12: ffff888100f8c000
> [ 5833.055698] R13: 0000000000000008 R14: 0000000000000959 R15: ffffffff95c7c06f
> [ 5833.064377] FS:  0000000000000000(0000) GS:ffff889030140000(0000) knlGS:0000000000000000
> [ 5833.074106] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5833.081239] CR2: 00007f8bac652120 CR3: 0000000309e44001 CR4: 00000000003706e0
> [ 5833.089883] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 5833.098573] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 5833.107255] Call Trace:
> [ 5833.110693]  <TASK>
> [ 5833.113727]  ? __warn+0x93/0x11a
> [ 5833.117961]  ? report_bug+0xeb/0x162
> [ 5833.122630]  ? ext4_journal_check_start+0x2c/0x82
> [ 5833.128636]  ? handle_bug+0x3c/0x63
> [ 5833.133212]  ? exc_invalid_op+0x17/0x64
> [ 5833.138176]  ? asm_exc_invalid_op+0x1a/0x20
> [ 5833.143530]  ? __ext4_journal_start+0x15/0x1b
> [ 5833.149031]  ? ext4_journal_check_start+0x2c/0x82
> [ 5833.154920]  __ext4_journal_start_sb+0xe5/0x152
> [ 5833.160753]  __ext4_journal_start+0x15/0x1b
> [ 5833.166046]  mpage_prepare_extent_to_map+0xd1/0x38b
> [ 5833.172142]  ext4_do_writepages+0x2a9/0x922
> [ 5833.177490]  ext4_writepages+0x83/0xda
> [ 5833.182332]  do_writepages+0xca/0x177
> [ 5833.187024]  ? lock_timer_base+0x3b/0x64
> [ 5833.192001]  __writeback_single_inode+0x83/0x301
> [ 5833.197760]  writeback_sb_inodes+0x251/0x418
> [ 5833.203168]  __writeback_inodes_wb+0x80/0xc8
> [ 5833.208574]  wb_writeback+0x142/0x266
> [ 5833.213307]  wb_workfn+0x212/0x3ac
> [ 5833.217733]  ? __raw_spin_unlock_irqrestore+0x14/0x29
> [ 5833.223973]  process_scheduled_works+0x196/0x29d
> [ 5833.229723]  worker_thread+0x1c5/0x21f
> [ 5833.234523]  ? __pfx_worker_thread+0x10/0x10
> [ 5833.239850]  kthread+0xf3/0xfe
> [ 5833.243826]  ? __pfx_kthread+0x10/0x10
> [ 5833.248633]  ret_from_fork+0x20/0x35
> [ 5833.253239]  ? __pfx_kthread+0x10/0x10
> [ 5833.257963]  ret_from_fork_asm+0x1b/0x30
> [ 5833.262874]  </TASK>
> [ 5833.265828] ---[ end trace 0000000000000000 ]---
> [ 5838.780155] sd 0:0:5:0: [sdf] Synchronizing SCSI cache
> [ 5838.788570] sd 0:0:5:0: [sdf] Stopping disk
> [ 5856.502606] sd 0:0:4:0: [sde] Synchronizing SCSI cache
> [ 5856.510119] sd 0:0:4:0: [sde] Stopping disk
> [ 5874.197657] sd 0:0:3:0: [sdd] Synchronizing SCSI cache
> [ 5874.204287] sd 0:0:3:0: [sdd] Stopping disk
> [ 5882.659434] sd 0:0:2:0: [sdc] Synchronizing SCSI cache
> [ 5882.666164] sd 0:0:2:0: [sdc] Stopping disk
> [ 5891.129810] sd 0:0:1:0: [sdb] Synchronizing SCSI cache
> [ 5891.136410] sd 0:0:1:0: [sdb] Stopping disk
> [ 5899.598645] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> [ 5899.605233] sd 0:0:0:0: [sda] Stopping disk
> [ 5908.184767] megaraid_sas 0000:02:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
> [ 5908.198623] mlx4_core 0000:01:00.0: mlx4_shutdown was called
> [ 5908.250311] bond0: (slave eth2): Releasing backup interface
> [ 5908.257384] bond0: (slave eth2): the permanent HWaddr of slave - XX:XX:XX:XX:XX:XX - is still in use by bond - set the HWaddr of slave to a different address to avoid conflicts
> [ 5908.290793] mlx4_en 0000:01:00.0: removed PHC
> [ 5908.327539] bond0: (slave eth3): Releasing backup interface
> [ 5909.421105] ACPI: PM: Preparing to enter system sleep state S5
> [ 5909.428720] kvm: exiting hardware virtualization
> [ 5909.434855] reboot: Power down
> 
> # tune2fs -l /dev/VG0/root |head -12
> tune2fs 1.47.0 (5-Feb-2023)
> Filesystem volume name:   /
> Last mounted on:          /
> Filesystem UUID:          5a3b300a-6b1a-4ff1-9138-287dec4dcdd7
> Filesystem magic number:  0xEF53
> Filesystem revision #:    1 (dynamic)
> Filesystem features:      has_journal ext_attr resize_inode dir_index orphan_file filetype needs_recovery extent 64bit flex_bg metadata_csum_seed sparse_super large_file huge_file dir_nlink extra_isize metadata_csum orphan_present
> Filesystem flags:         signed_directory_hash
> Default mount options:    user_xattr acl
> Filesystem state:         clean
> Errors behavior:          Remount read-only
> Filesystem OS type:       Linux
> 
> # grep /dev/mapper/VG0-root /proc/mounts
> /dev/mapper/VG0-root / ext4 rw,relatime,nodioread_nolock,nodelalloc,stripe=192,data=journal 0 0


Thanks,
 Krzysztof


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ