lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120623205755.GJ14083@ZenIV.linux.org.uk>
Date:	Sat, 23 Jun 2012 21:57:55 +0100
From:	Al Viro <viro@...IV.linux.org.uk>
To:	Mimi Zohar <zohar@...ux.vnet.ibm.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	". James Morris" <jmorris@...ei.org>,
	linux-security-module@...r.kernel.org,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Oleg Nesterov <oleg@...hat.com>,
	David Miller <davem@...emloft.net>
Subject: Re: deferring __fput()

On Sat, Jun 23, 2012 at 08:45:05PM +0100, Al Viro wrote:
> BTW, I really wonder why do we need to have that void *data in task_work; we can
> always embed the sucker into a bigger struct (if nothing else, task_work +
> void *data) and get to it via container_of().  And in quite a few cases we don't
> want that data thing at all.  Moreover, the reasons to use hlist_head instead of
> a single forward pointer are very thin on the ground:
> 	* task_work_add() adds to the beginning of the list
> 	* task_work_cancel() walks the list to find the entry to remove
> 	* trask_work_run() takes the list out, walks it to the end,
> then walks it backwards via pprev.  Could as well reverse the pointers
> while walking forward...

You know what...  Let's dump that "reverse the list" thing completely.  What we ought to
do instead of that is honestly keeping both the head of the (single-linked) list and
pointer to pointer to its last element.  Sure, that'll eat one more word in task_struct.
And it doesn't matter, since we'll be able to kill something else in there - namely,
->scm_work_list.  The whole reason we have it is that we want to prevent recursion
in __scm_destroy(); with __fput() getting done via task_work_add() this recursion
will be gone automatically.

So here's what I want to do:

1) kill task_work->data; the only user that cares will allocate task_work + struct cred * and
use container_of() to get to it.

2) replace current->task_works with struct task_work *task_works, **last_work and replace
hlist_node in task_work with struct task_work *next; task_work_add() would add to the tail,
task_work_cancel() would do the obvious "scan the single-linked list, remove the element
found" wihtout bothering with hlist and task_work_run() would just walk the list in direct
order.

3) at that point task_work is equal in size (and layout, BTW) to rcu_head.  So we can add it
into the same union in struct file where we already have list_head and rcu_head.  No space
eaten up.  fput() would, once the counter reaches 0, remove the file from list (the only
place walking that list skips the ones with zero refcount anyway) and, if we are in a normal
process, use task_work_add() to have __fput() done to it.  If we are in kernel thread or
atomic context, just move the sucker to global list and use schedule_work() to have said
list emptied and everything in it fed to __fput().

4) kill ->scm_work_list, along with all contortions in __scm_destroy(); just have it do all
the fput() (if any) it wants to do for this cookie.

5) in fs/aio.c, replace
        if (unlikely(!fput_atomic(req->ki_filp))) {
                spin_lock(&fput_lock);
                list_add(&req->ki_list, &fput_head);
                spin_unlock(&fput_lock);
                schedule_work(&fput_work);
        } else {
                req->ki_filp = NULL;
                really_put_req(ctx, req);
        }
with
	fput(req->ki_filp);
	req->ki_filp = NULL;
	really_put_req(ctx, req);
Sure, we are not allowed to block there.  And we won't.  fput_head/fput_work/aio_fput_routine()
goes to hell, and so does the whole "called_fput" mess in kiocb_batch_refill() in there - retry
loop and all.

Objections, anyone?  Linus, Oleg, Dave?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ