Message-ID: <ZMpVZwh9Y5W1XCsX@ziepe.ca>
Date: Wed, 2 Aug 2023 10:08:55 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Ajay Sharma <sharmaajay@...rosoft.com>
Cc: Long Li <longli@...rosoft.com>, Wei Hu <weh@...rosoft.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"leon@...nel.org" <leon@...nel.org>,
KY Srinivasan <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>,
Dexuan Cui <decui@...rosoft.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
vkuznets <vkuznets@...hat.com>,
"ssengar@...ux.microsoft.com" <ssengar@...ux.microsoft.com>,
"shradhagupta@...ux.microsoft.com" <shradhagupta@...ux.microsoft.com>
Subject: Re: [EXTERNAL] Re: [PATCH v4 1/1] RDMA/mana_ib: Add EQ interrupt
support to mana ib driver.
On Wed, Aug 02, 2023 at 04:11:18AM +0000, Ajay Sharma wrote:
>
>
> > On Aug 1, 2023, at 6:46 PM, Jason Gunthorpe <jgg@...pe.ca> wrote:
> >
> > On Tue, Aug 01, 2023 at 07:06:57PM +0000, Long Li wrote:
> >
> >> The driver interrupt code limits the CPU processing time of each EQ
> >> by reading a small batch of EQEs in this interrupt. It guarantees
> >> all the EQs are checked on this CPU, and limits the interrupt
> >> processing time for any given EQ. In this way, a bad EQ (which is
> >> stormed by a bad user doing unreasonable re-arming on the CQ) can't
> >> storm other EQs on this CPU.
> >
> > Of course it can, the bad user just creates a million EQs and pushes a
> > bit of work through them constantly. How is that really any different
> > from pushing more EQEs into a single EQ?
> >
> > And how does your EQ multiplexing work anyhow? Do you poll every EQ on
> > every interrupt? That itself is a DOS vector.
>
> User does not create EQs directly. EQ creation is a by-product of
> opening the device, i.e. allocating a context.

Which is done directly by the user.

> I am not sure if the same
> process is allowed to open device multiple times

Of course it can.

> of lock implemented. So a million EQs are probably far-fetched.

Uh, how do you conclude that?

> As for how the EQ servicing is done - only those EQs for which the
> interrupt is raised are checked. And each EQ is tied only once and
> only to a single interrupt.

So you iterate over a list of EQs in every interrupt?

Allowing userspace to increase the number of EQs on an interrupt is a
direct DOS vector, no special fussing required.

If you want this to work properly you need to have your HW arrange
things so there is only ever one EQE in the EQ for a given CQ at any
time. Another EQE cannot be stuffed by the HW until the kernel reads
the first EQE and acks it back.

You have almost got this right, the mistake is that userspace is the
thing that allows the HW to generate a new EQE. If you care about DOS
then this is the wrong design, the kernel and only the kernel must be
able to trigger a new EQE for the CQ.

In effect you need two CQ doorbells, a userspace one that re-arms the
CQ, and a kernel one that allows a CQ that triggered on ARM to
generate an EQE.

Thus the kernel can strictly limit the flow of EQEs through the EQs
such that an EQ can never overflow and a CQ can never consume more
than one EQE.
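
The two-doorbell scheme described above can be sketched as a small
userspace C model. This is only an illustration under stated
assumptions: every name in it (struct cq, struct eq, user_arm_cq,
hw_completion, kernel_poll_eq, and the credit flag standing in for the
kernel doorbell) is hypothetical and is not taken from the mana_ib
driver or from any real hardware interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; not the mana_ib driver or a real HW interface. */
#define NUM_CQS  4
#define EQ_DEPTH NUM_CQS        /* one slot per CQ is enough in this model */

struct cq {
    bool armed;      /* set by the userspace doorbell (re-arm) */
    bool triggered;  /* an armed CQ saw a completion, EQE not yet emitted */
    bool credit;     /* set by the kernel doorbell, permits exactly one EQE */
};

struct eq {
    int entries[EQ_DEPTH];      /* each EQE just records a CQ id */
    size_t head, count;
};

struct device {
    struct cq cqs[NUM_CQS];
    struct eq eq;
};

static void eq_push(struct eq *eq, int cq_id)
{
    assert(eq->count < EQ_DEPTH);   /* the EQ must never overflow */
    eq->entries[(eq->head + eq->count) % EQ_DEPTH] = cq_id;
    eq->count++;
}

/* "HW": emit an EQE only if the CQ triggered AND the kernel allowed it. */
static void hw_try_emit(struct device *dev, int id)
{
    struct cq *cq = &dev->cqs[id];

    if (cq->triggered && cq->credit) {
        cq->triggered = false;
        cq->credit = false;         /* the credit is consumed by this EQE */
        eq_push(&dev->eq, id);
    }
}

void device_init(struct device *dev)
{
    dev->eq.head = 0;
    dev->eq.count = 0;
    for (int i = 0; i < NUM_CQS; i++) {
        dev->cqs[i].armed = false;
        dev->cqs[i].triggered = false;
        dev->cqs[i].credit = true;  /* kernel grants the first EQE */
    }
}

/* Userspace doorbell: re-arms the CQ, but cannot create an EQE by itself. */
void user_arm_cq(struct device *dev, int id)
{
    dev->cqs[id].armed = true;
}

/* "HW": a completion lands on the CQ. */
void hw_completion(struct device *dev, int id)
{
    struct cq *cq = &dev->cqs[id];

    if (cq->armed) {
        cq->armed = false;
        cq->triggered = true;
        hw_try_emit(dev, id);
    }
}

/* Kernel: read one EQE and ack it. Returns the CQ id, or -1 if empty. */
int kernel_poll_eq(struct device *dev)
{
    struct eq *eq = &dev->eq;
    int id;

    if (eq->count == 0)
        return -1;
    id = eq->entries[eq->head];
    eq->head = (eq->head + 1) % EQ_DEPTH;
    eq->count--;

    dev->cqs[id].credit = true;     /* kernel doorbell: allow one more EQE */
    hw_try_emit(dev, id);           /* a CQ that re-triggered meanwhile fires now */
    return id;
}
```

In this model, a user who calls user_arm_cq() in a tight loop while
completions keep arriving still leaves at most one EQE for that CQ in
the EQ, because the credit only comes back when kernel_poll_eq() reads
and acks the previous entry.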

You cannot really fix this hardware problem with a software
solution. You will always have a DOS at some point.

Jason