linux-kernel - Re: [PATCH v3 1/2] RDMA/rxe: Update wqe


Message-ID: <dc2efb0b-68be-102b-a041-47c799361d35@fujitsu.com>
Date:   Mon, 27 Jun 2022 03:41:57 +0000
From:   "lizhijian@...itsu.com" <lizhijian@...itsu.com>
To:     Bob Pearson <rpearsonhpe@...il.com>,
        Yanjun Zhu <yanjun.zhu@...ux.dev>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Haakon Bugge <haakon.bugge@...cle.com>,
        Cheng Xu <chengyou@...ux.alibaba.com>,
        "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 1/2] RDMA/rxe: Update wqe_index for each wqe error
 completion



On 27/06/2022 05:51, Bob Pearson wrote:
> On 5/15/22 20:53, Li Zhijian wrote:
>> Previously, if user space keeps sending abnormal wqe, queue.prod will
>> keep increasing while queue.index doesn't. Once
>> queue.index==queue.prod in next round, req_next_wqe() will treat queue
>> as empty. In such case, no new completion would be generated.
>>
>> Update wqe_index for each wqe completion so that req_next_wqe() can get
>> next wqe properly.
>>
>> Signed-off-by: Li Zhijian <lizhijian@...itsu.com>
>> ---
>>   drivers/infiniband/sw/rxe/rxe_req.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
>> index a0d5e57f73c1..8bdd0b6b578f 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_req.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
>> @@ -773,6 +773,8 @@ int rxe_requester(void *arg)
>>   	if (ah)
>>   		rxe_put(ah);
>>   err:
>> +	/* update wqe_index for each wqe completion */
>> +	qp->req.wqe_index = queue_next_index(qp->sq.queue, qp->req.wqe_index);
>>   	wqe->state = wqe_state_error;
>>   	__rxe_do_task(&qp->comp.task);
>>   
> This change looks plausible, but I am not sure if it will make a difference since the qp
> will get transitioned to the error state very shortly.
>
> In order for it to matter, the requester must be well ahead of the completer in the send queue,
> and someone must be actively posting new wqes, which will reschedule the requester. Currently it
> will fail on the same wqe again unless the error described above occurs, but if we post a new valid
> wqe it will get executed even though we have detected an error that should have stopped the qp.
>
> It looks like the intent was to keep the qp in the non error state until all the old
> wqes get completed before making the transition.
Not really. My original intent was just to let req_next_wqe() return a wqe whenever the queue is
not empty. Currently, if rxe_requester() keeps taking the error path for some reason, req_next_wqe()
will falsely see the queue as empty in a later round, even though the queue is almost full.

BTW, I will review your new private patches.

Thanks
Zhijian

> But we should disable the requester
> from processing new wqes in this case. That seems like a safer solution to the problem.
>
> Bob
>