Message-ID: <a5468b0a-6963-5f04-4827-1f15aae7f3e7@katalix.com>
Date: Thu, 22 Aug 2024 15:41:14 +0100
From: James Chapman <jchapman@...alix.com>
To: Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
dsahern@...nel.org, tparkin@...alix.com, xiyou.wangcong@...il.com
Subject: Re: [PATCH net-next] l2tp: avoid using drain_workqueue in
l2tp_pre_exit_net
On 22/08/2024 11:22, Paolo Abeni wrote:
>
>
> On 8/19/24 16:52, James Chapman wrote:
>> Recent commit c1b2e36b8776 ("l2tp: flush workqueue before draining
>> it") incorrectly uses drain_workqueue.
>
> isn't the relevant commit fc7ec7f554d7d0a27ba339fcf48df11d14413329?
Good spot. Thanks.
>> The use of drain_workqueue in
>> l2tp_pre_exit_net is flawed because the workqueue is shared by all
>> nets and it is therefore possible for new work items to be queued
>> while drain_workqueue runs.
>>
>> Instead of using drain_workqueue, use a loop to delete all tunnels and
>> __flush_workqueue until all tunnel/session lists of the net are
>> empty. Add a per-net flag to ensure that no new tunnel can be created
>> in the net once l2tp_pre_exit_net starts.
>
> We need a fixes tag even for net-next fixes :)
Oh ok. My mistake.
>> Signed-off-by: James Chapman <jchapman@...alix.com>
>> Signed-off-by: Tom Parkin <tparkin@...alix.com>
>> ---
>> net/l2tp/l2tp_core.c | 38 +++++++++++++++++++++++++++++---------
>> net/l2tp/l2tp_core.h | 2 +-
>> net/l2tp/l2tp_netlink.c | 2 +-
>> net/l2tp/l2tp_ppp.c | 3 ++-
>> 4 files changed, 33 insertions(+), 12 deletions(-)
>>
>> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
>> index af87c781d6a6..246b07342b86 100644
>> --- a/net/l2tp/l2tp_core.c
>> +++ b/net/l2tp/l2tp_core.c
>> @@ -107,6 +107,7 @@ static struct workqueue_struct *l2tp_wq;
>> /* per-net private data for this module */
>> static unsigned int l2tp_net_id;
>> struct l2tp_net {
>> + bool net_closing;
>> /* Lock for write access to l2tp_tunnel_idr */
>> spinlock_t l2tp_tunnel_idr_lock;
>> struct idr l2tp_tunnel_idr;
>> @@ -1560,13 +1561,19 @@ static int l2tp_tunnel_sock_create(struct net *net,
>> return err;
>> }
>> -int l2tp_tunnel_create(int fd, int version, u32 tunnel_id, u32 peer_tunnel_id,
>> +int l2tp_tunnel_create(struct net *net, int fd, int version,
>> + u32 tunnel_id, u32 peer_tunnel_id,
>> struct l2tp_tunnel_cfg *cfg, struct l2tp_tunnel **tunnelp)
>> {
>> + struct l2tp_net *pn = l2tp_pernet(net);
>> struct l2tp_tunnel *tunnel = NULL;
>> int err;
>> enum l2tp_encap_type encap = L2TP_ENCAPTYPE_UDP;
>> + /* This pairs with WRITE_ONCE() in l2tp_pre_exit_net(). */
>> + if (READ_ONCE(pn->net_closing))
>> + return -ENETDOWN;
>
> Is this necessary? the netns is going away, no user space process should
> be able to touch it.
I considered this too. I was thinking that a bad process could cause
l2tp_pre_exit_net to loop forever if it keeps creating tunnels. But if
the net isn't usable by userspace when the pre_exit handler starts then
I think we're ok to remove the flag.
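
For context, the write side of that pairing would sit at the top of the pre-exit handler. A minimal sketch (field and helper names follow the patch; this is illustrative, not the submitted implementation):

```c
/* Sketch only: the WRITE_ONCE() side that pairs with the
 * READ_ONCE(pn->net_closing) check in l2tp_tunnel_create().
 */
static __net_exit void l2tp_pre_exit_net(struct net *net)
{
	struct l2tp_net *pn = l2tp_pernet(net);

	/* Refuse new tunnels in this net from now on. Pairs with
	 * READ_ONCE() in l2tp_tunnel_create().
	 */
	WRITE_ONCE(pn->net_closing, true);

	/* ... walk the tunnel IDR, delete tunnels, flush l2tp_wq ... */
}
```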
>
>> +
>> if (cfg)
>> encap = cfg->encap;
>> @@ -1855,16 +1870,21 @@ static __net_exit void l2tp_pre_exit_net(struct net *net)
>> }
>> rcu_read_unlock_bh();
>> - if (l2tp_wq) {
>> - /* ensure that all TUNNEL_DELETE work items are run before
>> - * draining the work queue since TUNNEL_DELETE requests may
>> - * queue SESSION_DELETE work items for each session in the
>> - * tunnel. drain_workqueue may otherwise warn if SESSION_DELETE
>> - * requests are queued while the work queue is being drained.
>> - */
>> + if (l2tp_wq)
>> __flush_workqueue(l2tp_wq);
>> - drain_workqueue(l2tp_wq);
>> +
>> + /* repeat until all of the net's IDR lists are empty, in case tunnels
>> + * or sessions were being created just before l2tp_pre_exit_net was
>> + * called.
>> + */
>> + rcu_read_lock_bh();
>> + if (!idr_is_empty(&pn->l2tp_tunnel_idr) ||
>> + !idr_is_empty(&pn->l2tp_v2_session_idr) ||
>> + !idr_is_empty(&pn->l2tp_v3_session_idr)) {
>> + rcu_read_unlock_bh();
>> + goto again;
>
> This does not look nice; it could keep the kernel spinning for a while.
>
> What about, e.g., queueing a 'dummy' work on l2tp_wq after
> __flush_workqueue() and explicitly waiting for that work to complete?
>
> When such work completes, any other l2tp-related work queued for the
> same netns should also have completed.
The loop is there in case one or more threads were in
l2tp_tunnel_register or l2tp_session_register at the point where
l2tp_pre_exit_net starts. If a tunnel or session is registered after
l2tp_pre_exit_net loops over all tunnels calling l2tp_tunnel_delete,
then it would be left behind.
I'll think more on this.
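
For the record, the dummy-work idea could be sketched roughly as below (hypothetical code, not part of the patch; function names are invented for illustration, and it assumes completion of a sentinel item is a sufficient signal):

```c
/* Hypothetical sketch of the suggested alternative: queue a sentinel
 * work item on l2tp_wq after flushing, then wait for it, instead of
 * polling the per-net IDR lists in a loop.
 */
static void l2tp_exit_barrier_fn(struct work_struct *work)
{
	/* Intentionally empty: completion of this item means the
	 * workqueue has processed everything queued before it.
	 */
}

static void l2tp_wait_for_wq(void)
{
	struct work_struct barrier;

	__flush_workqueue(l2tp_wq);

	INIT_WORK_ONSTACK(&barrier, l2tp_exit_barrier_fn);
	queue_work(l2tp_wq, &barrier);
	flush_work(&barrier);	/* wait for the sentinel to run */
	destroy_work_on_stack(&barrier);
}
```

As the reply above notes, this alone would not cover a tunnel or session registered after the delete pass, which is what the retry loop guards against.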
Thanks