linux-kernel - Re: [PATCH] sysfs: move assignment to be under lock in sysfs_remove

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8738ni308z.fsf@xmission.com>
Date:	Wed, 30 Oct 2013 14:41:16 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Tejun Heo <tj@...nel.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Greg KH <gregkh@...uxfoundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sysfs: move assignment to be under lock in sysfs_remove_dir()

Tejun Heo <tj@...nel.org> writes:

> Hello,
>
> On Tue, Oct 29, 2013 at 10:29:43PM -0700, Eric W. Biederman wrote:
>> I never actually looked deeply into it, and I was working from several
>> year old memory and a quick skim of the patch when I asked the question.
>> 
>> The protection we have previous to this patch is that syfs_remove_dir is
>> only sane to call once.
>> 
>> Which makes the code that does:
>> 	if (!dir_sd)
>>         	return;
>> in __sysfs_remove_dir very suspicious.   I expect we want a
>> WARN_ON(!dir_sd);
>
> It was always like that, probably in the same spirit as kfree() taking
> NULL so that it can be easier, for example, in init failure paths.

Which given that we are talking about sysfs either means that no one is
taking advantage of this case or there is some totally broken usage
created by accident and that we aren't warning about.

>> But the entire directory removal process and working on sysfs stopped
>> being fun before I managed to get that cleaned up.  And unless I missed
>> something go by Tejun is going to go generalize this thing before this
>> bit gets cleaned up.  Sigh.
>
> I kept the same behavior for kernfs_remove().  I don't think it's
> something we explicitly want to clean up tho?  It's an acceptable
> behavior.

It has been a long time since I looked but there are or at least were
problematic cases last time I looked.

The pci hotplug layer removes or removed kobjects from sysfs in the
wrong order with parent directories disappearing before their children.

There were leaks in this area.

I fixed the worst of it and everything I had energy to fix when I was
working on sysfs.  But it is a real problem.

One of the particularly problematic things that can happen with sysfs is
that we can get a hotplug event in userspace and then examine sysfs and
not find the attributes of the device because the kernel has not added
them yet.

Which is a particularly good reason to have a campaign against
independent usage of device_create_file and device_remove_file in the
device users.

At which point really the right thing to do when we delete a directory
is to WARN and be very grumpy if there are any attributes in the
directory we were removing.

Being nice here has resulted in buggy semantics exported to userspace,
and buggy kernel code being written.  In a generalized version of sysfs
we don't want to be nice to avoid the existing mess that sysfs sees.

>> On an equally bizarre note.  I don't understand why we have a separate
>> spinlock there.  Looks...  Sigh.  We use a different lock from
>> everything as a premature optimization so that sysfs_remove_dir could be
>> modified to just take a sysfs_dirent, and all of the kobject handling
>> could be removed.
>> 
>> Sigh. It was never in my way and while I was working on the code that
>> there was a good locking reason for doing that silly thing.
>
> Umm... you got it completely wrong.  It's there to address a race
> condition between removal and symlinking and has nothing to do with
> optimization.

No.  It totally is a premature optimization.  There is no reason to add
another lock to sysfs and make understanding of what locks are needed
sto protect data structures something that needs to be thought about.
sysfs_mutex can easily be expanded to cover this role, and in fact all
cases where we take sysfs_assoc_lock today just a few lines later we
take sysfs_mutex.

Furthermore explicitly you state in your original commit when
sysfs_assoc_lock was added that it was to aid in the separation of
kobjects and sysfs dirents.

commit aecdcedaab49ca40620dc7dd70f67ee7269a66c9
Author: Tejun Heo <htejun@...il.com>
Date:   Thu Jun 14 03:45:15 2007 +0900

    sysfs: implement kobj_sysfs_assoc_lock
    
    kobj->dentry can go away anytime unless the user controls when the
    associated sysfs node is deleted.  This patch implements
    kobj_sysfs_assoc_lock which protects kobj->dentry.  This will be used
    to maintain kobj based API when converting sysfs to use sysfs_dirent
    tree instead of dentry/kobject.
    
    Note that this lock belongs to kobject/driver-model not sysfs.  Once
    sysfs is converted to not use kobject in its interface, this can be
    removed from sysfs.
    
    This is in preparation of object reference simplification.
    
    Signed-off-by: Tejun Heo <htejun@...il.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@...e.de>

> The current odd looking locking on removal side serves a purpose in
> making it clear that it isn't synchronizing concurrent removal calls.

At the moment the only reason concurrent calls are not properly
synchronized is because the read is outside of the lock.

If you want to avoid concurrent calls WARN_ON(!kobject->sd) is a much
better approach.

But syfs_mutex a few lines later does synchronize concurrent removal
calls, and sysfs_assoc_lock

Which given how sysfs is used is really insane. If we don't support
something we need to warn about it, not 

> Maybe we should rename the lock to sysfs_symlink_target_lock and add
> fat comments on both sides?  Or we can make it a mutex and exclude the
> entire removal and symlinking, which would probably easier to follow.

Agreed.  A mutex to excluse the entire removal and symlinking does sound
easier to follow.  Why don't we call that mutex sysfs_mutex?

Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/