linux-kernel - Re: [PATCH 1/2] fput: task_work_add() can fail if the caller has passed exit_task

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20130614145831.c65ad42447637e3ad33eb79d@linux-foundation.org>
Date:	Fri, 14 Jun 2013 14:58:31 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Al Viro <viro@...iv.linux.org.uk>,
	Andrey Vagin <avagin@...nvz.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	David Howells <dhowells@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] fput: task_work_add() can fail if the caller has
 passed exit_task_work()

On Fri, 14 Jun 2013 21:09:47 +0200 Oleg Nesterov <oleg@...hat.com> wrote:

> fput() assumes that it can't be called after exit_task_work() but
> this is not true, for example free_ipc_ns()->shm_destroy() can do
> this. In this case fput() silently leaks the file.
> 
> Change it to fallback to delayed_fput_work if task_work_add() fails.
> The patch looks complicated but it is not, it changes the code from
> 
> 	if (PF_KTHREAD) {
> 		schedule_work(...);
> 		return;
> 	}
> 	task_work_add(...)
> 
> to
> 	if (!PF_KTHREAD) {
> 		if (!task_work_add(...))
> 			return;
> 		/* fallback */
> 	}
> 	schedule_work(...);
> 
> As for shm_destroy() in particular, we could make another fix but I
> think this change makes sense anyway. There could be another similar
> user, it is not safe to assume that task_work_add() can't fail.
> 
> ...
>
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -306,17 +306,18 @@ void fput(struct file *file)
>  {
>  	if (atomic_long_dec_and_test(&file->f_count)) {
>  		struct task_struct *task = current;
> +		unsigned long flags;
> +
>  		file_sb_list_del(file);
> -		if (unlikely(in_interrupt() || task->flags & PF_KTHREAD)) {
> -			unsigned long flags;
> -			spin_lock_irqsave(&delayed_fput_lock, flags);
> -			list_add(&file->f_u.fu_list, &delayed_fput_list);
> -			schedule_work(&delayed_fput_work);
> -			spin_unlock_irqrestore(&delayed_fput_lock, flags);
> -			return;
> +		if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
> +			init_task_work(&file->f_u.fu_rcuhead, ____fput);
> +			if (!task_work_add(task, &file->f_u.fu_rcuhead, true))
> +				return;

A comment here would be useful, explaining the circumstances under
which we fall through to the delayed fput.  This is particularly needed
because kernel/task_work.c is such undocumented crap.

This?

--- a/fs/file_table.c~fput-task_work_add-can-fail-if-the-caller-has-passed-exit_task_work-fix
+++ a/fs/file_table.c
@@ -313,6 +313,12 @@ void fput(struct file *file)
 			init_task_work(&file->f_u.fu_rcuhead, ____fput);
 			if (!task_work_add(task, &file->f_u.fu_rcuhead, true))
 				return;
+			/*
+			 * After this task has run exit_task_work(),
+			 * task_work_add() will fail.  free_ipc_ns()->
+			 * shm_destroy() can do this.  Fall through to delayed
+			 * fput to avoid leaking *file.
+			 */
 		}
 		spin_lock_irqsave(&delayed_fput_lock, flags);
 		list_add(&file->f_u.fu_list, &delayed_fput_list);


>  		}
> -		init_task_work(&file->f_u.fu_rcuhead, ____fput);
> -		task_work_add(task, &file->f_u.fu_rcuhead, true);
> +		spin_lock_irqsave(&delayed_fput_lock, flags);
> +		list_add(&file->f_u.fu_list, &delayed_fput_list);
> +		schedule_work(&delayed_fput_work);
> +		spin_unlock_irqrestore(&delayed_fput_lock, flags);

OT: I don't think that schedule_work() needs to be inside the locked
region.  Scalability improvements beckon!

>  	}
>  }

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/