linux-ext4 - unexpected sync delays in dpkg for small pre-allocated files on ext4

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 24 May 2016 19:07:41 +0200
From:	Gernot Hillier <gernot.hillier@...mens.com>
To:	linux-ext4@...r.kernel.org
Subject: unexpected sync delays in dpkg for small pre-allocated files on ext4

Hi there!

We experience strange delays with kernel 4.1.18 during dpkg package
installation on an ext4 filesystem after switching from Ubuntu 14.04 to
16.04. We can reproduce the issue with kernel 4.6. Installation of the
same package takes 2s with ext3 and 31s with ext4 on the same partition.

Hardware is an Intel-based server with Supermicro X8DTH board and
Seagate ST973451SS disks connected to an LSI SAS2008 controller (PCI
0x1000:0x0072, mpt2sas driver).

We could track this down to the introduction of fallocate() in recent
dpkg versions and derived the following synthetic test case. First
sync_file_range() call takes ~5ms, 2nd call ~15ms.

	fd = open("test1.txt", 0xc1);
	ret = fallocate(fd, 0, 0, 20);
	ret = write(fd, "hallo", 6);
	ret = sync_file_range(fd, 0, 0, 2);
	ret = close(fd);

	fd = open("test2.txt", 0xc1);
	ret = fallocate(fd, 0, 0, 20);
	ret = write(fd, "hallo", 6);
	ret = sync_file_range(fd, 0, 0, 2);
	ret = close(fd);

The delay is caused by sb_issue_zeroout() in ext4_ext_zeroout(), called
by ext4_ext_convert_to_initialized().

Ftrace function_graph snippet with important events (recorded with
kernel 4.6):

  7)          |      ext4_ext_map_blocks() {
[...]
  7)          |        /* ext4_ext_handle_unwritten_extents: ... */
  7)          |        /* ext4_ext_convert_to_initialized_enter: ... */
  7)          |        ext4_ext_zeroout() {
  7)          |          ext4_issue_zeroout() {

[... CPU switches to kworker, then to <idle> ...]
[... 14 ms later, we continue...]

  3)          |  /* irq_handler_entry: irq=25 name=mpt2sas0-msix0 */
  3)          |  /* irq_handler_exit: irq=25 ret=handled */
  7)          |  /* softirq_raise: vec=4 [action=BLOCK] */
  7)          |  /* softirq_entry: vec=4 [action=BLOCK] */
  7)          |  /* block_rq_complete: 8,16 W () 114969912 + 8 [0] */
[...]
  7) * 14 ms  |          } /* ext4_issue_zeroout */

ANY of the following changes reduce the sync_file_range() runtime to <<
1 ms:

 * do not use fallocate()
 * fallocate() for more than extent_max_zeroout_kb, e.g. 40 kB
 * echo 0 > /sys/fs/ext4/sdb2/extent_max_zeroout_kb
 * use ext3 instead of ext4
 * mount ext4 with -o dioread_nolock

To me, the problem looks comparable to
https://bugzilla.kernel.org/show_bug.cgi?id=56821 (even if we don't see
a full hang and there's no RAID involved for us), so a closer look on
the SCSI layer or driver might be the next step?

Any suggestions how to continue analysis are greatly appreciated!

-- 
Gernot Hillier, Siemens AG
Corporate Technology, Corporate Competence Center Embedded Linux

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html