Message-ID: <954e3b7f-9001-e528-c6ab-f8f69c84cfd8@iogearbox.net>
Date: Tue, 15 May 2018 21:19:15 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: John Fastabend <john.fastabend@...il.com>, ast@...nel.org
Cc: netdev@...r.kernel.org, davem@...emloft.net
Subject: Re: [PATCH bpf-next v6 1/4] bpf: sockmap, refactor sockmap routines to work with hashmap
On 05/14/2018 07:00 PM, John Fastabend wrote:
[...]
> +static int __sock_map_ctx_update_elem(struct bpf_map *map,
> + struct bpf_sock_progs *progs,
> + struct sock *sock,
> + struct sock **map_link,
> + void *key)
> {
[...]
> * sock being added. If the sock is already attached to BPF programs
> * this results in an error.
> */
> - verdict = READ_ONCE(stab->bpf_verdict);
> - parse = READ_ONCE(stab->bpf_parse);
> - tx_msg = READ_ONCE(stab->bpf_tx_msg);
> + verdict = READ_ONCE(progs->bpf_verdict);
> + parse = READ_ONCE(progs->bpf_parse);
> + tx_msg = READ_ONCE(progs->bpf_tx_msg);
>
> if (parse && verdict) {
> /* bpf prog refcnt may be zero if a concurrent attach operation
> @@ -1703,11 +1691,11 @@ static int sock_map_ctx_update_elem(struct bpf_sock_ops_kern *skops,
> * we increment the refcnt. If this is the case abort with an
> * error.
> */
> - verdict = bpf_prog_inc_not_zero(stab->bpf_verdict);
> + verdict = bpf_prog_inc_not_zero(progs->bpf_verdict);
> if (IS_ERR(verdict))
> return PTR_ERR(verdict);
>
> - parse = bpf_prog_inc_not_zero(stab->bpf_parse);
> + parse = bpf_prog_inc_not_zero(progs->bpf_parse);
> if (IS_ERR(parse)) {
> bpf_prog_put(verdict);
> return PTR_ERR(parse);
> @@ -1715,7 +1703,7 @@ static int sock_map_ctx_update_elem(struct bpf_sock_ops_kern *skops,
> }
>
> if (tx_msg) {
> - tx_msg = bpf_prog_inc_not_zero(stab->bpf_tx_msg);
> + tx_msg = bpf_prog_inc_not_zero(progs->bpf_tx_msg);
> if (IS_ERR(tx_msg)) {
> if (verdict)
> bpf_prog_put(verdict);
Not directly related to this patch since it doesn't change the logic, but it
feels like something is not quite right here (unless I'm missing something).
The code is as follows:
[...]
	verdict = READ_ONCE(progs->bpf_verdict);
	parse = READ_ONCE(progs->bpf_parse);
	tx_msg = READ_ONCE(progs->bpf_tx_msg);

	if (parse && verdict) {
		/* bpf prog refcnt may be zero if a concurrent attach operation
		 * removes the program after the above READ_ONCE() but before
		 * we increment the refcnt. If this is the case abort with an
		 * error.
		 */
		verdict = bpf_prog_inc_not_zero(progs->bpf_verdict);
		if (IS_ERR(verdict))
			return PTR_ERR(verdict);

		parse = bpf_prog_inc_not_zero(progs->bpf_parse);
		if (IS_ERR(parse)) {
			bpf_prog_put(verdict);
			return PTR_ERR(parse);
		}
	}

	if (tx_msg) {
		tx_msg = bpf_prog_inc_not_zero(progs->bpf_tx_msg);
		if (IS_ERR(tx_msg)) {
			if (verdict)
				bpf_prog_put(verdict);
			if (parse)
				bpf_prog_put(parse);
			return PTR_ERR(tx_msg);
		}
	}
[...]
As the locking comment above the function notes, __sock_map_ctx_update_elem()
can be called concurrently:

  All operations operate on sock_map using cmpxchg and xchg operations to ensure we
  do not get stale references. Any reads into the map must be done with READ_ONCE()
  because of this.
You initially use READ_ONCE() on verdict/parse/tx_msg, but later, when grabbing
the reference, you access progs->bpf_verdict/bpf_parse/bpf_tx_msg again, which
would refetch them. If updates to the three progs happen concurrently, they
could have become NULL in the meantime, no? bpf_prog_inc_not_zero() would then
crash. Why not take the ref on the pointers you previously fetched via
READ_ONCE()?
The second question I had is that verdict/parse/tx_msg are updated
independently of each other, and each can be NULL or non-NULL. What if, say,
parse is NULL while verdict and tx_msg are non-NULL, and
bpf_prog_inc_not_zero() on the tx_msg prog fails? Doesn't this cause a
use-after-free, since no ref on verdict was taken earlier, but the
bpf_prog_put() in the error path will misbalance the refcount and could free
the prog?
It would probably help to clarify the locking comment a bit more if the above
is indeed okay as is.
Thanks,
Daniel