Date:   Wed, 6 Jul 2022 21:37:52 +0000
From:   Song Liu <songliubraving@...com>
To:     Steven Rostedt <rostedt@...dmis.org>
CC:     Song Liu <song@...nel.org>, Networking <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, lkml <linux-kernel@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Kernel Team <Kernel-team@...com>,
        "jolsa@...nel.org" <jolsa@...nel.org>,
        "mhiramat@...nel.org" <mhiramat@...nel.org>
Subject: Re: [PATCH v2 bpf-next 5/5] bpf: trampoline: support
 FTRACE_OPS_FL_SHARE_IPMODIFY



> On Jul 6, 2022, at 12:38 PM, Steven Rostedt <rostedt@...dmis.org> wrote:
> 
> On Thu, 2 Jun 2022 12:37:06 -0700
> Song Liu <song@...nel.org> wrote:
> 
> 
>> --- a/kernel/bpf/trampoline.c
>> +++ b/kernel/bpf/trampoline.c
>> @@ -27,6 +27,44 @@ static struct hlist_head trampoline_table[TRAMPOLINE_TABLE_SIZE];
>> /* serializes access to trampoline_table */
>> static DEFINE_MUTEX(trampoline_mutex);
>> 
>> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mutex);
>> +
>> +static int bpf_tramp_ftrace_ops_func(struct ftrace_ops *ops, enum ftrace_ops_cmd cmd)
>> +{
>> +	struct bpf_trampoline *tr = ops->private;
>> +	int ret;
>> +
>> +	/*
>> +	 * The normal locking order is
>> +	 *    tr->mutex => direct_mutex (ftrace.c) => ftrace_lock (ftrace.c)
>> +	 *
>> +	 * This is called from prepare_direct_functions_for_ipmodify, with
>> +	 * direct_mutex locked. Use mutex_trylock() to avoid dead lock.
>> +	 * Also, bpf_trampoline_update here should not lock direct_mutex.
>> +	 */
>> +	if (!mutex_trylock(&tr->mutex))
> 
> Can you add a comment here noting that returning -EAGAIN will not cause this
> to repeat? That is, that the state will change so the next try does not
> return -EAGAIN?

Hmm.. that is not guaranteed here. The conflict is a real race condition: an 
IPMODIFY user (e.g., livepatch) is being registered at the same time that 
something else, for example bpftrace, is updating the BPF trampoline. 

The -EAGAIN will propagate to the user of the IPMODIFY function (e.g., livepatch),
and we need to retry there. In the case of livepatch, the retry is initiated 
from user space. 
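To illustrate the locking constraint: the ops callback runs with direct_mutex
already held, while the normal order takes tr->mutex before direct_mutex, so
the callback must not block on tr->mutex. A minimal userspace model of that
trylock-and-bail pattern (all names here are illustrative, not the kernel's):

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t tr_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Models bpf_tramp_ftrace_ops_func(): called with direct_mutex held,
 * so taking tr->mutex unconditionally could deadlock against a thread
 * holding tr->mutex and waiting for direct_mutex. Trylock instead and
 * let the caller (e.g. the livepatch user) retry on -EAGAIN. */
static int ops_func_model(void)
{
	if (pthread_mutex_trylock(&tr_mutex) != 0)
		return -EAGAIN;	/* lock held elsewhere: caller retries */
	/* ... update the trampoline here ... */
	pthread_mutex_unlock(&tr_mutex);
	return 0;
}
```

When tr_mutex is free the model succeeds; when it is already held, the model
returns -EAGAIN instead of blocking, which is exactly the behavior the retry
in user space relies on.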

> 
>> +		return -EAGAIN;
>> +
>> +	switch (cmd) {
>> +	case FTRACE_OPS_CMD_ENABLE_SHARE_IPMODIFY:
>> +		tr->indirect_call = true;
>> +		ret = bpf_trampoline_update(tr, false /* lock_direct_mutex */);
>> +		break;
>> +	case FTRACE_OPS_CMD_DISABLE_SHARE_IPMODIFY:
>> +		tr->indirect_call = false;
>> +		tr->fops->flags &= ~FTRACE_OPS_FL_SHARE_IPMODIFY;
>> +		ret = bpf_trampoline_update(tr, false /* lock_direct_mutex */);
>> +		break;
>> +	default:
>> +		ret = -EINVAL;
>> +		break;
>> +	};
>> +	mutex_unlock(&tr->mutex);
>> +	return ret;
>> +}
>> +#endif
>> +
>> 
> 
> 
>> @@ -330,7 +387,7 @@ static struct bpf_tramp_image *bpf_tramp_image_alloc(u64 key, u32 idx)
>> 	return ERR_PTR(err);
>> }
>> 
>> -static int bpf_trampoline_update(struct bpf_trampoline *tr)
>> +static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mutex)
>> {
>> 	struct bpf_tramp_image *im;
>> 	struct bpf_tramp_links *tlinks;
>> @@ -363,20 +420,45 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
>> 	if (ip_arg)
>> 		flags |= BPF_TRAMP_F_IP_ARG;
>> 
>> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +again:
>> +	if (tr->indirect_call)
>> +		flags |= BPF_TRAMP_F_ORIG_STACK;
>> +#endif
>> +
>> 	err = arch_prepare_bpf_trampoline(im, im->image, im->image + PAGE_SIZE,
>> 					  &tr->func.model, flags, tlinks,
>> 					  tr->func.addr);
>> 	if (err < 0)
>> 		goto out;
>> 
>> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +	if (tr->indirect_call)
>> +		tr->fops->flags |= FTRACE_OPS_FL_SHARE_IPMODIFY;
>> +#endif
>> +
>> 	WARN_ON(tr->cur_image && tr->selector == 0);
>> 	WARN_ON(!tr->cur_image && tr->selector);
>> 	if (tr->cur_image)
>> 		/* progs already running at this address */
>> -		err = modify_fentry(tr, tr->cur_image->image, im->image);
>> +		err = modify_fentry(tr, tr->cur_image->image, im->image, lock_direct_mutex);
>> 	else
>> 		/* first time registering */
>> 		err = register_fentry(tr, im->image);
>> +
>> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>> +	if (err == -EAGAIN) {
>> +		if (WARN_ON_ONCE(tr->indirect_call))
>> +			goto out;
>> +		/* should only retry on the first register */
>> +		if (WARN_ON_ONCE(tr->cur_image))
>> +			goto out;
>> +		tr->indirect_call = true;
>> +		tr->fops->func = NULL;
>> +		tr->fops->trampoline = 0;
>> +		goto again;
> 
> I'm assuming that the above will prevent a return of -EAGAIN again, because
> if it can happen again, this could turn into a deadlock.
> 
> Can you please comment that?

This is a different case. This -EAGAIN happens when the livepatch came first 
and we register the BPF trampoline later. Once tr->indirect_call is enabled, 
we should not get -EAGAIN from register_fentry again. 
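A sketch of why that retry terminates: the first register attempt fails only
because an IPMODIFY user already owns the function and the trampoline was not
built for sharing; after flipping indirect_call and rebuilding, that condition
no longer holds, so "goto again" runs at most once. This userspace model uses
illustrative names, not the kernel's:

```c
#include <errno.h>
#include <stdbool.h>

struct tramp_model {
	bool indirect_call;	/* trampoline built with BPF_TRAMP_F_ORIG_STACK */
	bool ipmodify_present;	/* an IPMODIFY user (livepatch) came first */
};

/* Models register_fentry(): sharing a function with an IPMODIFY user
 * only works once the trampoline uses the indirect-call layout. */
static int register_fentry_model(struct tramp_model *tr)
{
	if (tr->ipmodify_present && !tr->indirect_call)
		return -EAGAIN;
	return 0;
}

/* Models the retry in bpf_trampoline_update(): a second -EAGAIN with
 * indirect_call already set is treated as a bug (WARN_ON_ONCE in the
 * real code), so the loop cannot spin. */
static int trampoline_update_model(struct tramp_model *tr)
{
	int err;
again:
	err = register_fentry_model(tr);
	if (err == -EAGAIN) {
		if (tr->indirect_call)
			return err;	/* should never happen twice */
		tr->indirect_call = true;
		goto again;		/* second pass cannot hit -EAGAIN */
	}
	return err;
}
```

With a livepatch already present, the model fails once, enables
indirect_call, and succeeds on the retry; without one, it succeeds on the
first pass and never enables the indirect call.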

I will add more comments for both cases. 

Thanks,
Song
