linux-kernel - Re: [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when rpc task is killed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <b8c236d99fd0f4e08dd0ee12a81274bd643a7690.camel@hammerspace.com>
Date:   Wed, 22 Dec 2021 15:02:58 +0000
From:   Trond Myklebust <trondmy@...merspace.com>
To:     "cuibixuan@...ux.alibaba.com" <cuibixuan@...ux.alibaba.com>,
        "linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC:     "pete.wl@...baba-inc.com" <pete.wl@...baba-inc.com>,
        "xiaoh.peixh@...baba-inc.com" <xiaoh.peixh@...baba-inc.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "chuck.lever@...cle.com" <chuck.lever@...cle.com>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "bfields@...ldses.org" <bfields@...ldses.org>,
        "weipu.zy@...baba-inc.com" <weipu.zy@...baba-inc.com>,
        "wenan.mwa@...baba-inc.com" <wenan.mwa@...baba-inc.com>,
        "anna.schumaker@...app.com" <anna.schumaker@...app.com>
Subject: Re: [PATCH -next] SUNRPC: Clean XPRT_CONGESTED of xprt->state when
 rpc task is killed

On Wed, 2021-12-22 at 10:55 +0800, Bixuan Cui wrote:
> 
> 在 2021/12/21 上午2:22, Trond Myklebust 写道:
>  
> > On Mon, 2021-12-20 at 11:39 +0800, Bixuan Cui wrote:
> >  
> > > ping~
> > > 
> > > 在 2021/12/14 下午9:53, Bixuan Cui 写道:
> > >  
> > > > When the values of tcp_max_slot_table_entries and
> > > > sunrpc.tcp_slot_table_entries are lower than the number of rpc
> > > > tasks,
> > > > xprt_dynamic_alloc_slot() in xprt_alloc_slot() will return -
> > > > EAGAIN,
> > > > and
> > > > then set xprt->state to XPRT_CONGESTED:
> > > >    xprt_retry_reserve
> > > >      ->xprt_do_reserve
> > > >        ->xprt_alloc_slot
> > > >          ->xprt_dynamic_alloc_slot // return -EAGAIN and task-
> > > >  
> > > > > tk_rqstp is NULL
> > > >            ->xprt_add_backlog // set_bit(XPRT_CONGESTED, &xprt-
> > > >  
> > > > > state);
> > > > When rpc task is killed, XPRT_CONGESTED bit of xprt->state will
> > > > not
> > > > be
> > > > cleaned up and nfs hangs:
> > > >    rpc_exit_task
> > > >      ->xprt_release // if (req == NULL) is true, then
> > > > XPRT_CONGESTED
> > > >                    // bit not clean
> > > > 
> > > > Add xprt_wake_up_backlog(xprt) to clean XPRT_CONGESTED bit in
> > > > xprt_release().
> > I'm not seeing how this explanation makes sense. If the task
> > doesn't
> > hold a slot, then freeing that task isn't going to clear the
> > congestion
> > caused by all the slots being in use.
> Hi，
> If the rpc task is free, call xprt_release() :
> void xprt_release(struct rpc_task *task)
>  {
>       if (req == NULL) {
>                   if (task->tk_client) {
>                           xprt = task->tk_xprt;
>                           xprt_release_write(xprt, task); // 1.
> release xprt
>                   }
>                   return;
>           }
>       ....
>       if (likely(!bc_prealloc(req)))
>                   xprt->ops->free_slot(xprt, req); // 2. release slot
> and call xprt_wake_up_backlog(xprt, req) to wakeup next task(clear
> XPRT_CONGESTED bit if next is NULL) in xprt_free_slot()
>           else
>                   xprt_free_bc_request(req);
>  }
>  I mean that in step 1, xprt was only released, but
> xprt_wake_up_backlog was not called (I don’t know if it is necessary,
> but xprt->state may still be XPRT_CONGESTED), which causes xprt to
> hold up. I think it happens when the task that does not hold a slot
> is the last released task，xprt_wake_up_backlog(clear XPRT_CONGESTED)
> will not be executed. :-)
> Thanks，
> Bixuan Cui
> 
>  

My point is that in that case 1, there is no slot to free, so there is
no change to the congestion state.

IOW: your patch is incorrect because it is trying to assign a slot in a
case where there is no slot to assign.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@...merspace.com