Message-ID: <20241213172243.GA30046@lst.de>
Date: Fri, 13 Dec 2024 18:22:43 +0100
From: Christoph Hellwig <hch@....de>
To: John Garry <john.g.garry@...cle.com>
Cc: Christoph Hellwig <hch@....de>, brauner@...nel.org, djwong@...nel.org,
	cem@...nel.org, dchinner@...hat.com, ritesh.list@...il.com,
	linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, martin.petersen@...cle.com
Subject: Re: [PATCH v2 0/7] large atomic writes for xfs

On Fri, Dec 13, 2024 at 05:15:55PM +0000, John Garry wrote:
> Sure, so some background is that we are using atomic writes for InnoDB 
> in MySQL so that we can stop relying on the double-write buffer for crash 
> protection. MySQL uses an internal 16K page size (so we want 16K atomic 
> writes).

Makes perfect sense so far.
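
For illustration, a 16K atomic write on Linux 6.11+ would look roughly
like the sketch below, using pwritev2() with RWF_ATOMIC on an O_DIRECT
file descriptor. The file name is a placeholder, error handling is
minimal, and the file is assumed to sit on a filesystem/device that
advertises a 16K atomic write unit:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/uio.h>

    #ifndef RWF_ATOMIC
    #define RWF_ATOMIC 0x00000040   /* value from the Linux 6.11 uapi */
    #endif

    int main(void)
    {
            void *buf;
            struct iovec iov;
            int fd;

            /* placeholder name; a real data file would go here */
            fd = open("datafile", O_RDWR | O_DIRECT);
            if (fd < 0)
                    return 1;

            /* O_DIRECT and RWF_ATOMIC both want an aligned buffer */
            if (posix_memalign(&buf, 16384, 16384))
                    return 1;
            memset(buf, 0, 16384);

            iov.iov_base = buf;
            iov.iov_len  = 16384;

            /*
             * All-or-nothing 16K write: the page can't be torn on
             * power failure, which is what lets InnoDB drop the
             * double-write buffer.
             */
            if (pwritev2(fd, &iov, 1, 0, RWF_ATOMIC) != 16384)
                    return 1;
            return 0;
    }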

>
> MySQL has what is known as a REDO log - see 
> https://dev.mysql.com/doc/dev/mysql-server/9.0.1/PAGE_INNODB_REDO_LOG.html
>
> Essentially, ahead of writing any data page, we do a buffered 512B log 
> update, with durability provided by a periodic fsync. I think such a 
> scheme is common to many apps.

So it's actually using buffered I/O for that and not direct I/O?
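If it is buffered, the write pattern being described would be roughly
the following sketch (file name and record contents are illustrative;
the point is small page-cache appends with durability deferred to a
group fsync):

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    static void redo_append(int log_fd, off_t off, const char *rec)
    {
            char block[512];

            memset(block, 0, sizeof(block));
            strncpy(block, rec, sizeof(block) - 1);

            /* buffered (page-cache) write of one 512B log block */
            pwrite(log_fd, block, sizeof(block), off);
    }

    int main(void)
    {
            /* note: no O_DIRECT; the log file is assumed to exist */
            int log_fd = open("ib_logfile0", O_WRONLY);
            off_t off = 0;

            for (int i = 0; i < 8; i++, off += 512)
                    redo_append(log_fd, off, "redo record");

            /* group commit: one fsync covers all buffered log blocks */
            fsync(log_fd);
            close(log_fd);
            return 0;
    }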

> When we tried just using a 16K FS block size, we found for low thread 
> count testing that performance was poor - even worse than the baseline of 
> a 4K FS block size plus double-write buffer. We put this down to high 
> write latency for the REDO log. As you can imagine, writing a full 16K 
> for only a 512B update is not efficient in terms of traffic generated and 
> increased latency (versus a 4K FS block size). At higher thread counts, 
> performance was better. We put that down to larger log data portions 
> being written to the REDO log per FS block write.

So if the REDO log uses buffered I/O, I can see how that would bloat
writes. But then again, using buffered I/O for a REDO log seems pretty
silly to start with.
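
As an aside, before building on 16K atomic units, the limits the
filesystem/device actually advertises can be probed via statx() with
STATX_WRITE_ATOMIC (Linux 6.11+ kernel/uapi headers required, and the
path below is a placeholder):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
            struct statx stx;

            /* placeholder path; query the real data file instead */
            if (statx(AT_FDCWD, "datafile", 0, STATX_WRITE_ATOMIC, &stx))
                    return 1;

            /* a 16K atomic write needs max >= 16384 here */
            printf("atomic write unit: min %u max %u\n",
                   stx.stx_atomic_write_unit_min,
                   stx.stx_atomic_write_unit_max);
            return 0;
    }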
