lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <877gk8vlnp.fsf@xmission.com>
Date:	Thu, 11 Apr 2013 13:48:26 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Al Viro <viro@...IV.linux.org.uk>
Cc:	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Alexey Dobriyan <adobriyan@...il.com>,
	Takashi Iwai <tiwai@...e.de>
Subject: Re: [RFC] revoke(2) and generic handling of things like remove_proc_entry()

Al Viro <viro@...IV.linux.org.uk> writes:

> On Fri, Apr 05, 2013 at 05:29:32AM +0100, Al Viro wrote:
>> 4) nasty semantics issue - mmap() vs. revoke (of any sort, including
>> remove_proc_entry(), etc.).  Suppose a revokable file had been mmapped;
>> now it's going away.  What should we do to its VMAs?  Right now sysfs
>> and procfs get away with that, but only because there's only one thing
>> that has ->mmap() there - /proc/bus/pci and sysfs equivalents.  I've
>> no idea how does pci_mmap_page_range() interact with PCI hotplug (and
>> I'm not at all sure that whatever it does isn't racy wrt device removal),
>> but I suspect that it strongly depends on lack of ->fault() for those
>> VMAs, which makes killing all PTEs pointing to pages in question enough.
>> How generic do we want to make it?  Anybody wanting to add more files
>> that could be mmapped in procfs/sysfs/debugfs deserves to be hurt, but
>> if we start playing with revoke(2), restriction might become inconvenient.
>> I'm not sure what kind of behaviour do we want there - *BSD at least
>> used to have revoke(2) only for character devices that had no mmap()...
>
> Actually, after looking at what sysfs does...   We might get away with
> the following
> 	* new vma flag - VM_REVOKABLE; set by mmap() if ->f_revoke is
> non-NULL.  We are short on spare bits there, but there still are some...
> 	* start_using_vma(vma) that checks the presence of that flag,
> returns true if it's absent and __start_using(vma->vm_file->f_revoke)
> otherwise; a matching stop_using_vma(vma) as well.
> 	* surround vma method calls with start_using_vma/stop_using_vma,
> similar to file ones.  Do what fs/sysfs/bin.c wrappers do for revoked
> ones - VM_FAULT_SIGBUS for ->fault() and ->page_mkwrite(), -EINVAL for
> ->access() and ->set_policy(), vma->vm_policy for ->get_policy(),
> 0 for ->migrate(), "do nothing" for ->open() (and I'm not at all sure that
> this one is correct), hell knows what for ->close().  Note that the *only*
> instance with ->open and without ->close is sysfs pile of wrappers
> itself...

sysfs used to wrap close but nothing used it, and on review it turned to
be impossible to properly call close on all of the vmas on revoke with
all of the proper locks held without help from the mm subsystem.  So now
sysfs refuses a mapping if ->close is implemented.  In practice that
means having an implementation of ->open in sysfs is likely a waste of
space because open and closed are always paired.

My memory is that the sysfs wrappers work for a pretty narrow range of
cases, and are things only work correctly because memory mapping pci
uses none of the sophisticated facilities.

Last time I was looking at this I was noticing that there is a lock
(mmap_sem?) that is held over every ->vm_op->foo() call.  If that is
true today it should be possible to just grab that lock and change
vm_ops.  That makes for a very cheap and easy implementation, except for
the covolutions needed for taking the lock.

> Hell knows...  We have few enough call sites for ->vm_op->foo() to make
> it feasible and overhead would be trivial.  OTOH, I'm not sure what's the
> right behaviour for mmap of something like drm after revoke(2) - leaving
> writable pages there looks wrong...

I'm not certain of the peculiarities of drm.  In general the correct
semantics of a revoked memory area is that of what we do with a mmap
area on truncate.  Retaining the vma with no pages mapped and generating
SIGBUS on an access.

> BTW, snd_card_disconnect() doesn't do anything to existing mappings; smells
> like a bug, and there we do have ones with non-trivial ->mmap().  Could
> ALSA folks comment?

uio might be worth considering as well.  As some uio devices are
actually hotpluggable..

> One note about the mockup implementation upthread - __release_revoke() should
> suck in a bit more than just ->release() - turning fasync off should also go
> there.

If we can do add useful support at the fs and mm layers without
affecting performance I am all for it.  I remember that tends to make
things easier.  As an alternative let me suggest what I had intended to
do if/when I ever got back to working on revoke.

Make a library like libfs that can be used for files that want to
implement revoke support.

In that library implement what can be implemented reliably and correctly
and error on the sophisticated cases we can't support.

With the semantics and the basic users figured out move what bits we can
into the vfs or the mm subsystem to make things easier.

With a library at the very least we have one implementation that we can
debug and work with instead of a different implementation of revoke for
each different kind of file.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ