Message-Id: <20150326230833.4ccfaebb.akpm@linux-foundation.org>
Date:	Thu, 26 Mar 2015 23:08:33 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Volker.Lendecke@...net.de
Cc:	Milosz Tanski <milosz@...in.com>, linux-kernel@...r.kernel.org,
	Christoph Hellwig <hch@...radead.org>,
	linux-fsdevel@...r.kernel.org, linux-aio@...ck.org,
	Mel Gorman <mgorman@...e.de>, Tejun Heo <tj@...nel.org>,
	Jeff Moyer <jmoyer@...hat.com>,
	"Theodore Ts'o" <tytso@....edu>, Al Viro <viro@...iv.linux.org.uk>,
	linux-api@...r.kernel.org,
	Michael Kerrisk <mtk.manpages@...il.com>,
	linux-arch@...r.kernel.org, Dave Chinner <david@...morbit.com>
Subject: Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache
 only)

On Fri, 27 Mar 2015 06:41:25 +0100 Volker Lendecke <Volker.Lendecke@...net.de> wrote:

> On Thu, Mar 26, 2015 at 08:28:24PM -0700, Andrew Morton wrote:
> > A thing which bugs me about pread2() is that it is specifically
> > tailored to applications which are able to use a partial read result. 
> > ie, by sending it over the network.
> 
> Can you explain what you mean by this? Samba gets a pread
> request from a client for some bytes. The client will be
> confused when we send less than requested although the file
> is long enough to satisfy all.

Well, it was my assumption that Samba would be able to do something
useful with a partial read - pread() is allowed to return less than
requested.

If Samba can't use the partial read result, then what does it do?  It
has to save the partial data, then do the additional IO?  That's
pretty clunky compared to

	if (it's all in cache)
		read it all now
	else
		ask a worker thread to read it all
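
The pseudocode above could be sketched roughly as follows in C.  This is
only an illustration under stated assumptions: mincore() over a temporary
mapping stands in for the fincore() being discussed, which costs three
syscalls (mmap, mincore, munmap) rather than one, and the names
range_resident(), read_or_defer(), defer_fn, count_deferred() and
READ_DEFERRED are all hypothetical.  The residency check is advisory:
pages can be reclaimed between the check and the pread(), so the fast
path can still block occasionally.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sentinel: the request was handed to a worker, not read. */
#define READ_DEFERRED ((ssize_t)-2)

/* Hypothetical callback type: queues the request for a worker thread. */
typedef void (*defer_fn)(int fd, void *buf, size_t len, off_t off, void *ctx);

/* Advisory check: does mincore() report every page of [off, off+len)
 * as resident?  Any failure is treated as "not resident". */
static bool range_resident(int fd, off_t off, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	assert(page > 0);
	if (len == 0)
		return true;

	off_t start = off - (off % page);	/* mmap offset must be page-aligned */
	size_t span = (size_t)(off - start) + len;
	size_t npages = (span + (size_t)page - 1) / (size_t)page;

	unsigned char *vec = malloc(npages);	/* one status byte per page */
	if (!vec)
		return false;

	bool resident = false;
	void *map = mmap(NULL, span, PROT_READ, MAP_SHARED, fd, start);
	if (map != MAP_FAILED) {
		if (mincore(map, span, vec) == 0) {
			resident = true;
			for (size_t i = 0; i < npages; i++) {
				if (!(vec[i] & 1)) {	/* low bit set means resident */
					resident = false;
					break;
				}
			}
		}
		munmap(map, span);
	}
	free(vec);
	return resident;
}

/* Fast path: read now if the range looks cached.  Otherwise hand the
 * whole request to `defer` and return READ_DEFERRED. */
static ssize_t read_or_defer(int fd, void *buf, size_t len, off_t off,
			     defer_fn defer, void *ctx)
{
	if (range_resident(fd, off, len))
		return pread(fd, buf, len, off);	/* may still block rarely */
	defer(fd, buf, len, off, ctx);
	return READ_DEFERRED;
}

/* Stand-in for a real worker queue: only counts deferred requests. */
static void count_deferred(int fd, void *buf, size_t len, off_t off, void *ctx)
{
	(void)fd; (void)buf; (void)len; (void)off;
	++*(int *)ctx;
}
```

A caller would pass its own queueing function in place of
count_deferred() and treat READ_DEFERRED as "the reply will be sent
later by the worker".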

> > And of course fincore could be used by Samba etc to avoid blocking on
> > reads.  It wouldn't perform quite as well as pread2(), but I bet it's
> > good enough.
> > 
> > Bottom line: with pread2() there's still a need for fincore(), but with
> > fincore() there probably isn't a need for pread2().
> 
> fincore would be a second syscall per pread, and it is not
> atomic. I've had discussions with MIPS based vendors who
> are worried about every single syscall. This is the #1
> hottest code path in Samba.

Bear in mind that these operations involve physical IO and large
memcpy's.  Yes, a fincore() approach will consume more CPU but the
additional overhead will be relatively small.

Tradeoffs are involved, and it may turn out that choosing a more
flexible and powerful interface which is somewhat more CPU intensive is
a better decision.  It's hard to say until this is quantified (i.e.,
measured).

> > And I'm doubtful about claims that it absolutely has to be non-blocking
> > 100% of the time.  I bet that 99.99% is good enough.  A fincore()
> > option to run mark_page_accessed() against present pages would help
> > with the race-with-reclaim situation.
> 
> If you can make sure that after an fincore the pages remain
> in memory for x milliseconds the atomicity concern might go
> away.

It won't be guaranteed that the fincore()+pread() will be
non-blocking.  But blocking will be very rare.  I don't know whether
the additional expense of activating the pages within fincore() is
justified - needs runtime testing.