lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 1 Jun 2022 09:18:08 +0200
From:   Lukas Wunner <lukas@...ner.de>
To:     Ira Weiny <ira.weiny@...el.com>
Cc:     Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Christoph Hellwig <hch@...radead.org>,
        Alison Schofield <alison.schofield@...el.com>,
        Vishal Verma <vishal.l.verma@...el.com>,
        Ben Widawsky <ben.widawsky@...el.com>,
        linux-kernel@...r.kernel.org, linux-cxl@...r.kernel.org,
        linux-pci@...r.kernel.org
Subject: Re: [PATCH V8 03/10] PCI: Create PCI library functions in support of
 DOE mailboxes.

On Tue, May 31, 2022 at 07:59:21PM -0700, Ira Weiny wrote:
> On Tue, May 31, 2022 at 11:33:50AM +0100, Jonathan Cameron wrote:
> > On Mon, 30 May 2022 21:06:57 +0200 Lukas Wunner <lukas@...ner.de> wrote:
> > > On Thu, Apr 14, 2022 at 01:32:30PM -0700, ira.weiny@...el.com wrote:
> > > > +	/* First 2 dwords have already been read */
> > > > +	length -= 2;
> > > > +	/* Read the rest of the response payload */
> > > > +	for (i = 0; i < min(length, task->response_pl_sz / sizeof(u32)); i++) {
> > > > +		pci_read_config_dword(pdev, offset + PCI_DOE_READ,
> > > > +				      &task->response_pl[i]);
> > > > +		pci_write_config_dword(pdev, offset + PCI_DOE_READ, 0);
> > > > +	}  
> > > 
> > > You need to check the Data Object Ready bit.  The device may clear the
> > > bit prematurely (e.g. as a result of a concurrent FLR or Conventional
> > > Reset).  You'll continue reading zero dwords from the mailbox and
> > > pretend success to the caller even though the response is truncated.
> > > 
> > > If you're concerned about performance when checking the bit on every
> > > loop iteration, checking it only on the last but one iteration should
> > > be sufficient to detect truncation.
> > 
> > Good catch - I hate corner cases.  Thankfully this one is trivial to
> > check for.
> 
> Ok looking at the spec:  Strictly speaking this needs to happen multiple
> times both in doe_statemachine_work() and inside pci_doe_recv_resp();
> not just in this loop.  :-(
> 
> This is because, the check in doe_statemachine_work() only covers the
> 1st dword read IIUC.

The spec says "this bit indicates the DOE instance has a *data object*
available to be read by system firmware/software".

So, the entire object is available for reading, not just one dword.

You've already got checks in place for the first two dwords which
cover reading an "all zeroes" response.  No need to amend them.

You only need to re-check the Data Object Ready bit on the last-but-one
dword in case the function was reset concurrently.  Per sec. 6.30.2,
"An FLR to a Function must result in the aborting of any DOE transfer
in progress."


> > > > +static irqreturn_t pci_doe_irq_handler(int irq, void *data)
> > > > +{
> > > > +	struct pci_doe_mb *doe_mb = data;
> > > > +	struct pci_dev *pdev = doe_mb->pdev;
> > > > +	int offset = doe_mb->cap_offset;
> > > > +	u32 val;
> > > > +
> > > > +	pci_read_config_dword(pdev, offset + PCI_DOE_STATUS, &val);
> > > > +
> > > > +	/* Leave the error case to be handled outside IRQ */
> > > > +	if (FIELD_GET(PCI_DOE_STATUS_ERROR, val)) {
> > > > +		mod_delayed_work(system_wq, &doe_mb->statemachine, 0);
> > > > +		return IRQ_HANDLED;
> > > > +	}
> > > > +
> > > > +	if (FIELD_GET(PCI_DOE_STATUS_INT_STATUS, val)) {
> > > > +		pci_write_config_dword(pdev, offset + PCI_DOE_STATUS,
> > > > +					PCI_DOE_STATUS_INT_STATUS);
> > > > +		mod_delayed_work(system_wq, &doe_mb->statemachine, 0);
> > > > +		return IRQ_HANDLED;
> > > > +	}
> > > > +
> > > > +	return IRQ_NONE;
> > > > +}  
> > > 
> > > PCIe 6.0, table 7-316 says that an interrupt is also raised when
> > > "the DOE Busy bit has been Cleared", yet such an interrupt is
> > > not handled here.  It is incorrectly treated as a spurious
> > > interrupt by returning IRQ_NONE.  The right thing to do
> > > is probably to wake the state machine in case it's polling
> > > for the Busy flag to clear.
> > 
> > Ah. I remember testing this via a lot of hacking on the QEMU code
> > to inject the various races that can occur (it was really ugly to do).
> > 
> > Guess we lost the handling at some point.  I think your fix
> > is the right one.
> 
> Perhaps I am missing something but digging into this more.  I disagree
> that the handler fails to handle this case.  If I read the spec correctly
> DOE Interrupt Status must be set when an interrupt is generated.
> The handler wakes the state machine in that case.  The state machine
> then checks for busy if there is work to be done.

Right, I was mistaken, sorry for the noise.


> Normally we would not even need to check for status error.  But that is
> special cased because clearing that status is left to the state machine.

That however looks wrong because the DOE Interrupt Status bit is never
cleared after a DOE Error is signaled.  The state machine performs an
explicit abort upon an error by setting the DOE Abort bit, but that
doesn't seem to clear DOE Interrupt Status:

Per section 6.30.2, "At any time, the system firmware/software is
permitted to set the DOE Abort bit in the DOE Control Register,
and the DOE instance must Clear the Data Object Ready bit,
if not already Clear, and Clear the DOE Error bit, if already Set,
in the DOE Status Register, within 1 second."

No mention of the DOE Interrupt Status bit, so we cannot assume that
it's cleared by a DOE Abort and we must clear it explicitly.

Thanks,

Lukas

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ