Message-ID: <491C436C.6060603@kernel.org>
Date:	Fri, 14 Nov 2008 00:10:36 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Miklos Szeredi <miklos@...redi.hu>
CC:	fuse-devel@...ts.sourceforge.net, greg@...ah.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCHSET] FUSE: extend FUSE to support more operations

Hello,

Miklos Szeredi wrote:
>> I kind of like the original implementation tho.  The f_ops->poll
>> interface is designed to be used like ->poll returning events if
>> available immediately and queue for later notification as necessary.
>> Notification is asynchronous and can be spurious (this actually comes
>> pretty handy for low level implementation).  When notified, upper layer
>> queries the same way using ->poll.  This is quite convenient for low
>> level implementation as the actual logic of poll can live in ->poll
>> proper while notifications can be scattered around places where events
>> can occur.
> 
> Yes, that kind of interface is nice for f_ops->poll, and for libfuse.
> 
> But for the kernel interface it's inefficient.  A wake up event is 3
> context switches instead of one.  And that's inherent in the interface
> itself for no good reason.

The performance problem with event notification is usually in its
scalability, not in the cost of each notification.  It's nice to
optimize the latter too, but I don't think it weighs much, especially
for FUSE.  Doing it the request/reply way could raise scalability
concerns of its own; please see below.

> Also there's again the question of userspace filesystem messing with
> the caller: your original implementation allows the userspace
> filesystem to block f_ops->poll() forever, which really isn't what
> poll/select is about.

That would simply be a broken poll implementation, just as an
O_NONBLOCK read can block in ->read forever if ->read is broken.

> So I'd still argue for the simple POLL-request/POLL-notify protocol on
> the kernel API, and possibly have the async notification similar to
> the kernel interface on the library API.
> 
> Implementation wise I don't care all that much, but I'd actually
> prefer if it was implemented using the traditional request/reply thing
> and optimized (possibly later) to find requests in a more efficient
> way than searching the linear list, which would benefit not just poll
> but all requests.

Given that the number of in-flight requests is not too high, I think
linear search is fine for now, but switching it to a b-tree shouldn't
be difficult.
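
To make the linear search concrete, here is a rough userspace sketch.
It is not the actual FUSE code: struct sim_req, sim_add_req() and
sim_find_req() are names made up for illustration, and the only thing
assumed from FUSE is that a reply is matched to its request by a
unique 64-bit ID.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-flight request record; not the kernel's own struct. */
struct sim_req {
	uint64_t unique;        /* ID that a reply is matched against */
	struct sim_req *next;   /* next in-flight request, or NULL */
};

/* Push a request onto the head of the in-flight list; returns the new head. */
static struct sim_req *sim_add_req(struct sim_req *head, struct sim_req *req)
{
	req->next = head;
	return req;
}

/* Linear search: cost grows with the number of in-flight requests. */
static struct sim_req *sim_find_req(struct sim_req *head, uint64_t unique)
{
	struct sim_req *req;

	for (req = head; req != NULL; req = req->next)
		if (req->unique == unique)
			return req;
	return NULL;
}
```

Replacing the list with a tree keyed on the same ID would change only
sim_add_req() and sim_find_req(); callers would not need to know.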

So, pros for the req/reply approach.

* Fewer context switches per event notification.

* No need for a separate async notification mechanism.

Cons.

* More interface impedance matching needed in libfuse.

* Higher overhead when poll/select finishes.  Either all outstanding
  requests need to be cancelled using INTERRUPT whenever poll/select
  returns, or the kernel needs to keep a persistent list of outstanding
  polls so that a later poll/select can reuse them.  The problem here
  is that the kernel doesn't know when or whether they'll be reused.
  We could put in LRU-based heuristics, but that's getting too complex.
  Note that this is different from the userland server keeping track.
  The same problem exists with userland-based tracking, but for many
  servers it would be just a bit in an existing structure, and we can
  be much more lax in userland.  For example, files backed by actual
  storage usually don't need notification at all as data is always
  available, so the amount of overhead is limited in most cases, but
  we can't assume things like that in the kernel.
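
As a rough illustration of "just a bit in an existing structure", here
is a userspace sketch of how a server might track poll interest.  All
names (struct sim_file, sim_poll(), sim_event(), sim_consume()) are
hypothetical and not part of libfuse or the kernel; POLLIN from
<poll.h> is used only as a convenient event mask.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical per-file state inside a userland server. */
struct sim_file {
	unsigned int ready;   /* events available right now, e.g. POLLIN */
	bool poll_armed;      /* the one tracking bit: a poller wants notification */
};

/* Handle a poll query: arm notification and report what is ready now. */
static unsigned int sim_poll(struct sim_file *f)
{
	f->poll_armed = true;
	return f->ready;
}

/*
 * Record that events became available.  Returns true if a notification
 * should be sent.  The bit is cleared here, so each arming produces at
 * most one notification.  Nothing clears it when a poller goes away, so
 * a notification may turn out to be spurious; that is harmless because
 * the poller always re-queries through sim_poll().
 */
static bool sim_event(struct sim_file *f, unsigned int events)
{
	bool notify = f->poll_armed;

	f->ready |= events;
	f->poll_armed = false;
	return notify;
}

/* Consume data: the given events are no longer ready. */
static void sim_consume(struct sim_file *f, unsigned int events)
{
	f->ready &= ~events;
}
```

There is no cancellation path at all in this sketch, which is the
point: the server never has to learn that poll/select returned.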

Overall, I think being lazy about cancellation and letting userland
notify asynchronously would be better in terms of both performance and
simplicity.  What do you think?

Thanks.

-- 
tejun