lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160531002152.GQ26977@dastard>
Date:	Tue, 31 May 2016 10:21:52 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Gernot Hillier <gernot.hillier@...mens.com>
Cc:	Theodore Ts'o <tytso@....edu>, linux-scsi@...r.kernel.org,
	MPT-FusionLinux.pdl@...adcom.com, linux-ext4@...r.kernel.org,
	sathya.prakash@...adcom.com, chaitra.basappa@...adcom.com,
	suganath-prabu.subramani@...adcom.com
Subject: Re: unexpected sync delays in dpkg for small pre-allocated files on
 ext4

On Mon, May 30, 2016 at 10:27:52AM +0200, Gernot Hillier wrote:
> Hi!
> 
> On 25.05.2016 01:13, Theodore Ts'o wrote:
> > On Tue, May 24, 2016 at 07:07:41PM +0200, Gernot Hillier wrote:
> >> We experience strange delays with kernel 4.1.18 during dpkg
> >> package installation on an ext4 filesystem after switching from
> >> Ubuntu 14.04 to 16.04. We can reproduce the issue with kernel 4.6.
> >> Installation of the same package takes 2s with ext3 and 31s with
> >> ext4 on the same partition.
> >>
> >> Hardware is an Intel-based server with Supermicro X8DTH board and
> >> Seagate ST973451SS disks connected to an LSI SAS2008 controller (PCI
> >> 0x1000:0x0072, mpt2sas driver).
> [...]
> >> To me, the problem looks comparable to
> >> https://bugzilla.kernel.org/show_bug.cgi?id=56821 (even if we don't see
> >> a full hang and there's no RAID involved for us), so a closer look on
> >> the SCSI layer or driver might be the next step?
> > 
> > What I would suggest is to create a small test case which compares the
> > time it takes to allocate 1 megabyte of memory, zero it, and then
> > write one megabytes of zeros using the write(2) system call.  Then try
> > writing one megabytes of zero using the BLKZEROOUT ioctl.
> 
> Ok, this is my test code:
> 
> 	const int SIZE = 1*1024*1024;
> 	char* buffer = malloc(SIZE);
> 	uint64_t range[2] = { 0, SIZE };
> 	int fd = open("/dev/sdb2", O_WRONLY);
> 
> 	bzero(buffer, SIZE);
> 	write(fd, buffer, SIZE);
> 	sync_file_range(fd, 0, 0, 2);
> 
> 	ioctl (fd, BLKZEROOUT, range);
> 
> 	close(fd);
> 	free(buffer);
> 
> # strace -tt ./test-tytso
> [...]
> 15:46:27.481636 open("/dev/sdb2", O_WRONLY) = 3
> 15:46:27.482004 write(3, "\0\0\0\0\0\0"..., 1048576) = 1048576
> 15:46:27.482438 sync_file_range(3, 0, 0, SYNC_FILE_RANGE_WRITE) = 0
> 15:46:27.482698 ioctl(3, BLKZEROOUT, [0, 100000]) = 0
> 15:46:27.546971 close(3)                = 0
> 
> So the write() and sync_file_range() in the first case takes ~400 us
> each while BLKZEROOUT takes... 60 ms. Wow.

Comparing apples to oranges.

Unlike the name implies, sync_file_range() does not provide any data
integrity semantics what-so-ever: SYNC_FILE_RANGE_WRITE only submits
IO to clean dirty pages - that only takes 400us of CPU time.  It
does not wait for completion, nor does it flush the drive cache and
so by the time the syscall returns to userspace the IO may not have
even been sent to the device (e.g. it could be queued by the IO
scheduler in the block layer). i.e. you're not timing IO, you're
timing CPU overhead of IO submission.

For an apples to apples comparison, you need to use fsync() to
physically force the written data to stable storage and wait for
completion. This is what BLKZEROOUT is effectively doing, so I think
you'll find fdatasync() also takes around 60ms...

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ