Message-ID: <CA+eFSM1MvyC3y7cNC--K_m2CvrtgiUYNm3bgEgmghxiX+ZUzKg@mail.gmail.com>
Date:	Wed, 18 May 2016 21:27:03 +0800
From:	Gavin Guo <gavin.guo@...onical.com>
To:	Vinod Koul <vinod.koul@...el.com>
Cc:	dmaengine@...r.kernel.org,
	linux-kernel <linux-kernel@...r.kernel.org>,
	dan.j.williams@...el.com, dave.jiang@...el.com
Subject: Re: ioatdma(Intel(R) I/OAT DMA Engine init failed)

On Tue, May 17, 2016 at 6:06 PM, Vinod Koul <vinod.koul@...el.com> wrote:
> On Mon, May 16, 2016 at 06:08:20PM +0800, Gavin Guo wrote:
>> The following error messages can be observed on the Intel Haswell-E
>> chipset with the v3.13 kernel. From my analysis, the logic that emits
>> these messages is unchanged in the current upstream kernel. I also
>> searched the git log and could not find any commit that fixes the
>> error (correct me if I am wrong). The details follow, and I would
>> really appreciate any comments. :)
>
> 3.13 is ancient; can you check this on the latest kernel?

Thank you for the comment. The affected machine is a production system,
but I'll try to find out whether it's possible to test the latest kernel
on it.

>
>>
>> ioatdma 0000:00:04.0: channel error register unreachable
>> ioatdma 0000:00:04.0: channel enumeration error
>> ioatdma 0000:00:04.0: Intel(R) I/OAT DMA Engine init failed
>> ioatdma 0000:00:04.1: channel error register unreachable
>> ioatdma 0000:00:04.1: channel enumeration error
>> ioatdma 0000:00:04.1: Intel(R) I/OAT DMA Engine init failed
>> ...
>> ioatdma 0000:00:04.7: channel error register unreachable
>> ioatdma 0000:00:04.7: channel enumeration error
>> ioatdma 0000:00:04.7: Intel(R) I/OAT DMA Engine init failed
>> mei_me 0000:00:16.0: initialization failed.
>>
>> There are 8 I/OAT DMA controllers on the Haswell-E chipset:
>> 8086:2f20 ~ 8086:2f27
>> 80:04.0 System peripheral: Intel Corporation Haswell-E DMA Channel 0 (rev 02)
>> 80:04.1 System peripheral: Intel Corporation Haswell-E DMA Channel 1 (rev 02)
>> 80:04.2 System peripheral: Intel Corporation Haswell-E DMA Channel 2 (rev 02)
>> 80:04.3 System peripheral: Intel Corporation Haswell-E DMA Channel 3 (rev 02)
>> 80:04.4 System peripheral: Intel Corporation Haswell-E DMA Channel 4 (rev 02)
>> 80:04.5 System peripheral: Intel Corporation Haswell-E DMA Channel 5 (rev 02)
>> 80:04.6 System peripheral: Intel Corporation Haswell-E DMA Channel 6 (rev 02)
>> 80:04.7 System peripheral: Intel Corporation Haswell-E DMA Channel 7 (rev 02)
>>
>> Analysis:
>> The bug happens while the driver is resetting the DMA controller. The
>> sequence is as follows: ioat_pci_probe is called when the PCI bus
>> detects the DMA controller, and the call chain is then
>> ioat3_dma_probe -> ioat_probe -> ioat2_enumerate_channels ->
>> ioat3_reset_hw. The following code can be found in ioat3_reset_hw:
>>
>> drivers/dma/ioat/dma_v3.c:
>>         chanerr = readl(chan->reg_base + IOAT_CHANERR_OFFSET);
>>         writel(chanerr, chan->reg_base + IOAT_CHANERR_OFFSET);
>> ...
>>         err = pci_read_config_dword(pdev, IOAT_PCI_CHANERR_INT_OFFSET,
>>                                     &chanerr);
>>         if (err) {
>>                 dev_err(&pdev->dev,
>>                         "channel error register unreachable\n");
>>                 return err;
>>         }
>>
>> Obviously, something goes wrong while the channel error register is
>> being reset. The error then propagates all the way back to
>> ioat_probe(). Because of the error, dma->chancnt is set to 0:
>>
>> drivers/dma/ioat/dma.c:
>>         if (!dma->chancnt) {
>>                 dev_err(dev, "channel enumeration error\n");
>>                 goto err_setup_interrupts;
>>         }
>>
>> Finally back to ioat_pci_probe:
>>
>> drivers/dma/ioat/pci.c:
>>                 err = ioat3_dma_probe(device, ioat_dca_enabled);
>>         else
>>                 return -ENODEV;
>>
>>         if (err) {
>>                 dev_err(dev, "Intel(R) I/OAT DMA Engine init failed\n");
>>                 return -ENODEV;
>
> --
> ~Vinod
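
To make the failure path in the quoted analysis easier to follow, here is
a minimal userspace sketch of how one failed config-space read could
produce the three log lines in order. This is not the driver code: the
model_* functions, the config_read_fails flag, the -EIO error code and the
single-channel count are all assumptions made for illustration. Only the
message strings and the -ENODEV return come from the snippets above.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for ioat3_reset_hw(). The flag models
 * pci_read_config_dword() failing on IOAT_PCI_CHANERR_INT_OFFSET.
 * The -EIO value is an assumption; the real error code may differ.
 */
int model_reset_hw(int config_read_fails)
{
	if (config_read_fails) {
		fprintf(stderr, "channel error register unreachable\n");
		return -EIO;
	}
	return 0;
}

/*
 * Hypothetical stand-in for ioat2_enumerate_channels(): returns the
 * number of usable channels, or 0 when the reset step fails.
 */
int model_enumerate_channels(int config_read_fails)
{
	if (model_reset_hw(config_read_fails))
		return 0;
	return 1; /* assumption: one channel per PCI function */
}

/* Hypothetical stand-in for ioat_probe(): a zero count is an error. */
int model_probe(int config_read_fails)
{
	int chancnt = model_enumerate_channels(config_read_fails);

	if (!chancnt) {
		fprintf(stderr, "channel enumeration error\n");
		return -ENODEV;
	}
	return 0;
}

/* Hypothetical stand-in for ioat_pci_probe(): any error becomes -ENODEV. */
int model_pci_probe(int config_read_fails)
{
	int err = model_probe(config_read_fails);

	if (err) {
		fprintf(stderr, "Intel(R) I/OAT DMA Engine init failed\n");
		return -ENODEV;
	}
	return 0;
}
```

Calling model_pci_probe(1) prints the three messages in the same order as
the log for each PCI function and returns -ENODEV, while model_pci_probe(0)
prints nothing and returns 0. If the model matches the real flow, the
config-space read in the reset path is the only point that needs to fail
to explain the whole log.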
