linux-kernel - Re: [PATCH for-next] RDMA/hns: Support mmapping reset state to userspace


Message-ID: <f046d3f8-a1c8-0174-8db9-24467c038557@hisilicon.com>
Date: Tue, 10 Dec 2024 14:24:16 +0800
From: Junxian Huang <huangjunxian6@...ilicon.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: <leon@...nel.org>, <linux-rdma@...r.kernel.org>, <linuxarm@...wei.com>,
	<linux-kernel@...r.kernel.org>, <tangchengchang@...wei.com>
Subject: Re: [PATCH for-next] RDMA/hns: Support mmapping reset state to
 userspace



On 2024/12/10 3:01, Jason Gunthorpe wrote:
> On Mon, Oct 14, 2024 at 09:07:31PM +0800, Junxian Huang wrote:
>> From: Chengchang Tang <tangchengchang@...wei.com>
>>
>> Mmap reset state to notify userspace about HW reset. The mmaped flag
>> hw_ready will be initiated to a non-zero value. When HW is reset,
>> the mmap page will be zapped and userspace will get a zero value of
>> hw_ready.
> 
> This needs a lot more explanation about *why* userspace needs this
> information and why hns is unique here.
> 

Our HW cannot flush WQEs by itself unless the driver posts a modify-qp-to-err
mailbox. But when the HW is reset, it stops handling mailboxes too, so the HW
can no longer produce CQEs for the outstanding WQEs. This breaks the
expectation of some users that they can always poll as many CQEs as the
number of WQEs they posted.

We want to notify userspace of the reset state so that, while the HW is in
reset, userspace can generate software WCs for the outstanding WQEs in place
of the HW, which is what this rdma-core PR does:

https://github.com/linux-rdma/rdma-core/pull/1504
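
To illustrate the idea, here is a minimal sketch of the userspace side. It is
not the code from the PR above. All sketch_* names are hypothetical, the
16-entry ring stands in for the real send/receive queue bookkeeping, and
reporting a flush-error status for the software WCs is an assumption. The only
thing taken from the patch is that hw_ready reads as non-zero normally and as
zero once the mmapped page has been zapped by a reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the rdma-core or hns API. */
enum sketch_wc_status {
	SKETCH_WC_SUCCESS = 0,
	SKETCH_WC_WR_FLUSH_ERR = 1,	/* assumed status for software WCs */
};

struct sketch_wc {
	uint64_t wr_id;
	enum sketch_wc_status status;
};

#define SKETCH_RING 16

struct sketch_qp {
	/* In a real provider this would point into the mmapped reset page. */
	const volatile uint32_t *hw_ready;
	uint64_t pending_wr_ids[SKETCH_RING];	/* WQEs posted, no CQE yet */
	size_t head;				/* next entry to complete */
	size_t tail;				/* next free slot */
};

/* Record a posted WQE so that a completion can be synthesized later. */
static int sketch_post(struct sketch_qp *qp, uint64_t wr_id)
{
	if (qp->tail - qp->head == SKETCH_RING)
		return -1;	/* ring full */
	qp->pending_wr_ids[qp->tail++ % SKETCH_RING] = wr_id;
	return 0;
}

/*
 * Poll up to max completions. While hw_ready is non-zero the HW is alive
 * and real CQEs would be parsed here (omitted in this sketch, so it
 * returns 0). Once hw_ready reads as zero, the HW will never produce the
 * missing CQEs, so one software WC is generated per outstanding WQE.
 */
static int sketch_poll(struct sketch_qp *qp, int max, struct sketch_wc *wc)
{
	int n = 0;

	assert(qp && wc);

	if (*qp->hw_ready)
		return 0;

	while (n < max && qp->head != qp->tail) {
		wc[n].wr_id = qp->pending_wr_ids[qp->head++ % SKETCH_RING];
		wc[n].status = SKETCH_WC_WR_FLUSH_ERR;
		n++;
	}
	return n;
}
```

With something like this, an application that posted N WQEs before the reset
still gets N completions back from polling, even though the HW itself is no
longer able to write any CQEs.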

Junxian

> Usually when the HW is reset there are enough existing system calls
> that will start failing that a driver should not need to do something
> like this.
> 
> Jason