[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a45b0353-8878-f86f-7d85-2e38c8b8f99e@intel.com>
Date: Mon, 6 May 2019 12:58:49 +0200
From: Björn Töpel <bjorn.topel@...el.com>
To: Daniel Borkmann <daniel@...earbox.net>,
Björn Töpel <bjorn.topel@...il.com>,
ast@...nel.org, netdev@...r.kernel.org
Cc: magnus.karlsson@...el.com, magnus.karlsson@...il.com,
bpf@...r.kernel.org, Bruce Richardson <bruce.richardson@...el.com>
Subject: Re: [PATCH bpf-next 1/2] xsk: remove AF_XDP socket from map when the
socket is released
On 2019-05-06 12:07, Daniel Borkmann wrote:
> On 05/06/2019 12:04 PM, Daniel Borkmann wrote:
>> On 05/04/2019 06:06 PM, Björn Töpel wrote:
>>> From: Björn Töpel <bjorn.topel@...el.com>
>>>
>>> When an AF_XDP socket is released/closed the XSKMAP still holds a
>>> reference to the socket in a "released" state. The socket will still
>>> use the netdev queue resource, and block newly created sockets from
>>> attaching to that queue, but no user application can access the
>>> fill/complete/rx/tx rings. This results in that all applications need
>>> to explicitly clear the map entry from the old "zombie state"
>>> socket. This should be done automatically.
>>>
>>> After this patch, when a socket is released, it will remove itself
>>> from all the XSKMAPs it resides in, allowing the socket application to
>>> remove the code that cleans the XSKMAP entry.
>>>
>>> This behavior is also closer to that of SOCKMAP, making the two socket
>>> maps more consistent.
>>>
>>> Reported-by: Bruce Richardson <bruce.richardson@...el.com>
>>> Signed-off-by: Björn Töpel <bjorn.topel@...el.com>
>> [...]
>>
>>
>>> +static void __xsk_map_delete_elem(struct xsk_map *map,
>>> + struct xdp_sock **map_entry)
>>> +{
>>> + struct xdp_sock *old_xs;
>>> +
>>> + spin_lock_bh(&map->lock);
>>> + old_xs = xchg(map_entry, NULL);
>>> + if (old_xs)
>>> + xsk_map_del_node(old_xs, map_entry);
>>> + spin_unlock_bh(&map->lock);
>>> +
>>> +}
>>> +
>>> static void xsk_map_free(struct bpf_map *map)
>>> {
>>> struct xsk_map *m = container_of(map, struct xsk_map, map);
>>> @@ -78,15 +142,16 @@ static void xsk_map_free(struct bpf_map *map)
>>> bpf_clear_redirect_map(map);
>>> synchronize_net();
>>>
>>> + spin_lock_bh(&m->lock);
>>> for (i = 0; i < map->max_entries; i++) {
>>> + struct xdp_sock **entry = &m->xsk_map[i];
>>> struct xdp_sock *xs;
>>>
>>> - xs = m->xsk_map[i];
>>> - if (!xs)
>>> - continue;
>>> -
>>> - sock_put((struct sock *)xs);
>>> + xs = xchg(entry, NULL);
>>> + if (xs)
>>> + __xsk_map_delete_elem(m, entry);
>>> }
>>> + spin_unlock_bh(&m->lock);
>>>
>>
>> Was this tested? Doesn't the above straight run into a deadlock?
>>
>> From xsk_map_free() you iterate over the map with m->lock held. Once you
>> xchg'ed the entry and call into __xsk_map_delete_elem(), you attempt to
>> call map->lock on the same map once again. What am I missing?
>
> (It also does the xchg() twice so we'd leak the xs since it's NULL in the
> second one.)
>
No, you're not missing anything. Just plain old sloppiness from my side.
Apologies for the wasted time.
Björn
>> Thanks,
>> Daniel
Powered by blists - more mailing lists