lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87pn9euks9.fsf@x220.int.ebiederm.org>
Date:   Thu, 02 Jul 2020 08:08:22 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        linux-kernel@...r.kernel.org, David Miller <davem@...emloft.net>,
        Greg Kroah-Hartman <greg@...ah.com>,
        Kees Cook <keescook@...omium.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Al Viro <viro@...iv.linux.org.uk>, bpf <bpf@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        Gary Lin <GLin@...e.com>, Bruno Meneguele <bmeneg@...hat.com>,
        LSM List <linux-security-module@...r.kernel.org>,
        Casey Schaufler <casey@...aufler-ca.com>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v2 00/15] Make the user mode driver code a better citizen

Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp> writes:

> On 2020/06/30 21:29, Eric W. Biederman wrote:
>> Hmm.  The wake up happens just of tgid->wait_pidfd happens just before
>> release_task is called so there is a race.  As it is possible to wake
>> up and then go back to sleep before pid_has_task becomes false.
>
> What is the reason we want to wait until pid_has_task() becomes false?
>
> - wait_event(tgid->wait_pidfd, !pid_has_task(tgid, PIDTYPE_TGID));
> + while (!wait_event_timeout(tgid->wait_pidfd, !pid_has_task(tgid, PIDTYPE_TGID), 1));

So that it is safe to call bpfilter_umh_cleanup.  The previous code
performed the wait by having a callback in do_exit.

It might be possible to call bpf_umh_cleanup early but I have not done
that analysis.

To perform the test correctly what I have right now is:

bool thread_group_exited(struct pid *pid)
{
	struct task_struct *tsk;
	bool exited;

	rcu_read_lock();
	tsk = pid_task(pid, PIDTYPE_PID);
	exited = !tsk || (READ_ONCE(tsk->exit_state) && thread_group_empty(tsk));
	rcu_read_unlock();

	return exited;
}

Which is factored out of pidfd_poll.  Which means that this won't be
something that the bpfilter code has to maintain.  That seems to be a
fundamentally good facility to have regardless of bpfilter.

I will post the whole thing in a bit once I have a chance to dot my i's
and cross my t's.

> By the way, commit 4a9d4b024a3102fc ("switch fput to task_work_add") says
> that use of flush_delayed_fput() has to be careful. Al, is it safe to call
> flush_delayed_fput() from blob_to_mnt() from umd_load_blob() (which might be
> called from both kernel thread and from process context (e.g. init_module()
> syscall by /sbin/insmod )) ?

And __fput_sync needs to be even more careful.
umd_load_blob is called in these changes without any locks held.

We fundamentally AKA in any correct version of this code need to flush
the file descriptor before we call exec or exec can not open it a
read-only denying all writes from any other opens.

The use case of flush_delayed_fput is exactly the same as that used
when loading the initramfs.

Eric




Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ