Message-ID: <CAC6JEv-MuU2sWSvftjHhOhZsE2Fz49Ffv6cm1FMnju5egcyvmQ@mail.gmail.com>
Date:	Thu, 17 Mar 2016 23:52:48 -0700
From:	Gregory Farnum <greg@...gs42.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Eric Sandeen <sandeen@...hat.com>, "Theodore Ts'o" <tytso@....edu>,
	Andreas Dilger <adilger@...ger.ca>,
	"Darrick J. Wong" <darrick.wong@...cle.com>,
	Dave Chinner <david@...morbit.com>,
	Ric Wheeler <rwheeler@...hat.com>,
	Andy Lutomirski <luto@...capital.net>,
	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	Martin Petersen <martin.petersen@...cle.com>,
	Christoph Hellwig <hch@...radead.org>,
	Jens Axboe <axboe@...nel.dk>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux API <linux-api@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	shane.seymour@....com, Bruce Fields <bfields@...ldses.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Jeff Layton <jlayton@...chiereds.net>
Subject: Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

On Thu, Mar 17, 2016 at 10:47 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum <greg@...gs42.com> wrote:
>>
>> So we've not asked for NO_HIDE_STALE on the mailing lists, but I think
>> it was one of the problems Sage had using xfs in his BlueStore
>> implementation and was a big part of why it moved to pure userspace.
>> FileStore might use NO_HIDE_STALE in some places but it would be
>> pretty limited. When it came up at Linux FAST we were discussing how
>> it and similar things had been problems for us in the past and it
>> would've been nice if they were upstream.
>
> Hmm.
>
> So to me it really sounds like somebody should cook up a patch, but we
> shouldn't put it in the upstream kernel until we get numbers and
> actual "yes, we'd use this" from outside of google.
>
> I say "outside of google", because inside of google not only do we not
> get numbers, but google can maintain their own patch.
>
> But maybe Ted could at least post the patch google uses, and somebody
> in the Ceph community might want to at least try it out...
>
>> What *is* a big deal for
>> FileStore (and would be easy to take advantage of) is the thematically
>> similar O_NOMTIME flag, which is also about reducing metadata updates
>> and got blocked on similar stupid-user grounds (although not security
>> ones): http://thread.gmane.org/gmane.linux.kernel.api/10727.
>
> Hmm. I don't hate that patch, because the NOATIME thing really does
> wonders on many loads. NOMTIME makes sense.
>
> It's not like you can't do this with utimes() anyway.
>
> That said, I do wonder if people wouldn't just prefer to expand on and
> improve on the lazytime.
>
> Is there some reason you guys didn't use that?

I wasn't really involved in this stuff but I gather from looking at
http://www.spinics.net/lists/xfs/msg36869.html that any durability
command other than fdatasync is going to write out the mtime updates
to the inodes on disk. Given our durability requirements and the
guarantees offered about when things actually hit disk, that doesn't
work for us. We run an fsync on the order of every 30 seconds, and we
do various combinations of fsync, fdatasync, sync_file_range (and,
well, any command we're provided) to try to bound the amount of dirty
data and prevent fs/disk throughput pauses when we do that full sync.
Anybody trying to do anything similar would want a switch that
prevents operations from updating the mtime on disk no matter what.
-Greg
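
For illustration only, the pattern described above (frequent data-only
flushes to bound dirty data, followed by an occasional full sync) might
look roughly like the sketch below. This is not Ceph code: the function
name write_with_bounded_dirty_data and the flush_every threshold are
made up for the example, and it uses only fdatasync and fsync. Whether
fdatasync actually avoids writing the mtime update depends on the
filesystem, which is the point of the thread.

```python
import os
import tempfile


def write_with_bounded_dirty_data(path, chunks, flush_every=4 * 1024 * 1024):
    """Append chunks to path, flushing data once flush_every bytes are dirty.

    Hypothetical helper: returns the number of intermediate data-only
    flushes, not counting the final full fsync.
    """
    # fdatasync is not available on every platform; fall back to fsync there.
    data_sync = getattr(os, "fdatasync", os.fsync)
    flushes = 0
    dirty = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while view:
                # os.write may write fewer bytes than requested.
                written = os.write(fd, view)
                view = view[written:]
                dirty += written
            if dirty >= flush_every:
                # Data-only flush: keeps the backlog small so the later
                # full sync has less to push out at once.
                data_sync(fd)
                flushes += 1
                dirty = 0
        # Full sync at the end: data plus inode metadata.
        os.fsync(fd)
    finally:
        os.close(fd)
    return flushes


if __name__ == "__main__":
    tmpdir = tempfile.mkdtemp()
    target = os.path.join(tmpdir, "object")
    count = write_with_bounded_dirty_data(
        target, [b"x" * 1024] * 10, flush_every=4096
    )
    print(count, os.path.getsize(target))
```

Counting dirty bytes rather than using a timer keeps the sketch
deterministic; a real store would presumably combine both.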
