netdev - Re: [PATCH net-next v3 07/24] ovpn: introduce the ovpn

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <04558c43-6b7d-4076-a6eb-d60222a292fc@openvpn.net>
Date: Thu, 9 May 2024 16:36:09 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
 Sergey Ryazanov <ryazanov.s.a@...il.com>, Paolo Abeni <pabeni@...hat.com>,
 Eric Dumazet <edumazet@...gle.com>, Andrew Lunn <andrew@...n.ch>,
 Esben Haabendal <esben@...nix.com>
Subject: Re: [PATCH net-next v3 07/24] ovpn: introduce the ovpn_peer object

On 09/05/2024 16:17, Sabrina Dubroca wrote:
> 2024-05-09, 15:44:26 +0200, Antonio Quartulli wrote:
>> On 09/05/2024 15:04, Sabrina Dubroca wrote:
>>>>>> +void ovpn_peer_release(struct ovpn_peer *peer)
>>>>>> +{
>>>>>> +	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
>>>>>> +}
>>>>>> +
>>>>>> +/**
>>>>>> + * ovpn_peer_delete_work - work scheduled to release peer in process context
>>>>>> + * @work: the work object
>>>>>> + */
>>>>>> +static void ovpn_peer_delete_work(struct work_struct *work)
>>>>>> +{
>>>>>> +	struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
>>>>>> +					      delete_work);
>>>>>> +	ovpn_peer_release(peer);
>>>>>
>>>>> Does call_rcu really need to run in process context?
>>>>
>>>> Reason for switching to process context is that we have to invoke
>>>> ovpn_nl_notify_del_peer (that sends a netlink event to userspace) and the
>>>> latter requires a reference to the peer.
>>>
>>> I'm confused. When you say "requires a reference to the peer", do you
>>> mean accessing fields of the peer object? I don't see why this
>>> requires ovpn_nl_notify_del_peer to to run from process context.
>>
>> ovpn_nl_notify_del_peer sends a netlink message to userspace and I was under
>> the impression that it may block/sleep, no?
>> For this reason I assumed it must be executed in process context.
> 
> With s/GFP_KERNEL/GFP_ATOMIC/, it should be ok to run from whatever
> context. Firing up a workqueue just to send a 100B netlink message
> seems a bit overkill.

Oh ok, I thought the send could be a problem too.

Will test with GFP_ATOMIC then. Thanks for the hint.

> 
> 
> 
>> This said, I have a question regarding DEBUG_NET_WARN_ON_ONCE: it prints
>> something only if CONFIG_DEBUG_NET is enabled.
>> Is this the case on standard desktop/server distribution? Otherwise how are
>> we going to get reports from users?
> 
> That's pretty much why I'm suggesting to use it. For those things that
> should really never happen, I think letting developers find them
> during testing (or syzbot when it gets to your driver) is enough. I'm
> not convinced getting a stack trace from a user without any ability to
> reproduce is that useful.
> 
> But if you or someone else really want some WARN_ONs, I can live with
> that.

I would personally prefer to keep the WARN_ON.

Since these bogus conditions may have consequences, users will open 
report in any case.
Having some extra text that they can post for us to contextualize the 
issue may be useful.

> 
>>>>> And if this happens during interface deletion, aren't we leaking the
>>>>> peer memory here?
>>>>
>>>> at interface deletion we call
>>>>
>>>> ovpn_iface_destruct -> ovpn_peer_release_p2p ->
>>>> ovpn_peer_del_p2p(ovpn->peer)
>>>>
>>>> so at the last step we just ask to remove the very same peer that is
>>>> curently stored, which should just never fail.
>>>
>>> But that's not what the test checks for. If ovpn->peer->ovpn != ovpn,
>>> the test in ovpn_peer_del_p2p will fail. That's "objects getting out
>>> of sync" in my previous email. The peer has a bogus back reference to
>>> its ovpn parent, but it's ovpn->peer nevertheless.
>>>
>>
>> Oh thanks for explaining that.
>>
>> Ok, my assumption is that "ovpn->peer->ovpn != ovpn" can never be true.
>>
>> Peers are created within the context of one ovpn object and are never
>> exposed to other ovpns.
>>
>> I hope it makes sense.
> 
> Ok, so this would indicate that something has gone badly wrong. Is it
> really worth checking for that (or maybe just during development)?
> 

A peer is created in ovpn_nl_set_peer_doit(), where the ovpn object is 
used to first assign peer->ovpn and then to store the peer in its own 
members. This all happens in one call and the value of ovpn can't be 
switched.

Anyway, bugs hide where we are most confident that things cannot go 
wrong :) So I'll still add a WARN_ON, just in case.

Thanks

-- 
Antonio Quartulli
OpenVPN Inc.