lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 28 Jan 2020 11:40:47 -0600
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Muni Sekhar <munisekharrms@...il.com>
Cc:     linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: pcie: xilinx: kernel hang - ISR readl()

On Sat, Jan 18, 2020 at 07:16:14AM +0530, Muni Sekhar wrote:
> On Thu, Jan 9, 2020 at 10:05 AM Bjorn Helgaas <helgaas@...nel.org> wrote:
> >
> > On Thu, Jan 09, 2020 at 08:47:51AM +0530, Muni Sekhar wrote:
> > > On Thu, Jan 9, 2020 at 1:45 AM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > > > On Tue, Jan 07, 2020 at 09:45:13PM +0530, Muni Sekhar wrote:
> > > > > Hi,
> > > > >
> > > > > I have module with Xilinx FPGA. It implements UART(s), SPI(s),
> > > > > parallel I/O and interfaces them to the Host CPU via PCI Express bus.
> > > > > I see that my system freezes without capturing the crash dump for
> > > > > certain tests. I debugged this issue and it was tracked down to the
> > > > > below mentioned interrupt handler code.
> > > > >
> > > > >
> > > > > In ISR, first reads the Interrupt Status register using ‘readl()’ as
> > > > > given below.
> > > > >     status = readl(ctrl->reg + INT_STATUS);
> > > > >
> > > > >
> > > > > And then clears the pending interrupts using ‘writel()’ as given blow.
> > > > >         writel(status, ctrl->reg + INT_STATUS);
> > > > >
> > > > >
> > > > > I've noticed a kernel hang if INT_STATUS register read again after
> > > > > clearing the pending interrupts.
> > > > >
> > > > > Can someone clarify me why the kernel hangs without crash dump incase
> > > > > if I read the INT_STATUS register using readl() after clearing the
> > > > > pending bits?
> > > > >
> > > > > Can readl() block?
> > > >
> > > > readl() should not block in software.  Obviously at the hardware CPU
> > > > instruction level, the read instruction has to wait for the result of
> > > > the read.  Since that data is provided by the device, i.e., your FPGA,
> > > > it's possible there's a problem there.
> > >
> > > Thank you very much for your reply.
> > > Where can I find the details about what is protocol for reading the
> > > ‘memory mapped IO’? Can you point me to any useful links..
> > > I tried locate the exact point of the kernel code where CPU waits for
> > > read instruction as given below.
> > > readl() -> __raw_readl() -> return *(const volatile u32 __force *)add
> > > Do I need to check for the assembly instructions, here?
> >
> > The C pointer dereference, e.g., "*address", will be some sort of a
> > "load" instruction in assembly.  The CPU wait isn't explicit; it's
> > just that when you load a value, the CPU waits for the value.
> >
> > > > Can you tell whether the FPGA has received the Memory Read for
> > > > INT_STATUS and sent the completion?
> I have not seen any ‘missing’ completions on the logic analyser. Is
> there any other ways to debug this one?

If you see the Memory Read and the associated Completion, and you
still see a hang in the kernel, then mostly likely the problem is not
in PCIe.

I would start by trying to prove that the instruction after the
readl() is or is not executed.

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ