lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 17 Oct 2023 20:43:35 -0400
From:   "Theodore Ts'o" <tytso@....edu>
To:     Andres Freund <andres@...razel.de>
Cc:     Thorsten Leemhuis <regressions@...mhuis.info>,
        Shreeya Patel <shreeya.patel@...labora.com>,
        linux-ext4@...r.kernel.org,
        Ricardo CaƱuelo 
        <ricardo.canuelo@...labora.com>, gustavo.padovan@...labora.com,
        zsm@...gle.com, garrick@...gle.com,
        Linux regressions mailing list <regressions@...ts.linux.dev>
Subject: Re: task hung in ext4_fallocate #2

On Mon, Oct 16, 2023 at 08:37:25PM -0700, Andres Freund wrote:
> I just was able to reproduce the issue, after upgrading to 6.6-rc6 - this time
> it took ~55min of high load (io_uring using branch of postgres, running a
> write heavy transactional workload concurrently with concurrent bulk data
> load) to trigger the issue.
> 
> For now I have left the system running, in case there's something you would
> like me to check while the system is hung.
> 
> The first hanging task that I observed:
> 
> cat /proc/57606/stack
> [<0>] inode_dio_wait+0xd5/0x100
> [<0>] ext4_fallocate+0x12f/0x1040
> [<0>] vfs_fallocate+0x135/0x360
> [<0>] __x64_sys_fallocate+0x42/0x70
> [<0>] do_syscall_64+0x38/0x80
> [<0>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

This stack trace is from some process (presumably postgres) trying to
do a fallocate() system call:

	/* Wait all existing dio workers, newcomers will block on i_rwsem */
	inode_dio_wait(inode);

The reason for this is that we can't manipulate the extent tree until
any data block I/Os comlplete.  This will block until
iomap_dio_complete() in fs/iomap/direct-io.c calls inode_dio_end().

> [ 3194.579297] INFO: task iou-wrk-58004:58874 blocked for more than 122 seconds.
> [ 3194.579304]       Not tainted 6.6.0-rc6-andres-00001-g01edcfe38260 #77
> [ 3194.579310] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 3194.579314] task:iou-wrk-58004   state:D stack:0     pid:58874 ppid:52606  flags:0x00004000
> [ 3194.579325] Call Trace:
> [ 3194.579329]  <TASK>
> [ 3194.579334]  __schedule+0x388/0x13e0
> [ 3194.579349]  schedule+0x5f/0xe0
> [ 3194.579361]  schedule_preempt_disabled+0x15/0x20
> [ 3194.579374]  rwsem_down_read_slowpath+0x26e/0x4c0
> [ 3194.579385]  down_read+0x44/0xa0
> [ 3194.579393]  ext4_file_write_iter+0x432/0xa80
> [ 3194.579407]  io_write+0x129/0x420

This could potentially be a interesting stack trace; but this is where
we really need to map the stack address to line numbers.  Is that
something you could do?

> Once I hear that you don't want me to test something out on the running
> system, I think a sensible next step could be to compile with lockdep and see
> if that finds a problem?

That's certainly a possibiity.  But also please make sure that you can
compile with with debugging information enabled so that we can get
reliable line numbers.  I use:

CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_INFO_REDUCED=y

Cheers,

						- Ted

Powered by blists - more mailing lists