Date:   Tue, 5 Dec 2023 14:04:19 -0600
From:   Mario Limonciello <mario.limonciello@....com>
To:     Takashi Sakamoto <o-takashi@...amocchi.jp>,
        a.mark.broadworth@...il.com, matthias.schrumpf@...enet.de,
        LKML <linux-kernel@...r.kernel.org>, aros@....com,
        bagasdotme@...il.com,
        "open list:PCI SUBSYSTEM" <linux-pci@...r.kernel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: Regression from dcadfd7f7c74ef9ee415e072a19bdf6c085159eb

On 12/3/2023 06:29, Takashi Sakamoto wrote:
> Hi Mario,
> 
> Thanks for the advice.
> 
> Note that in my experiments I use Ubuntu 23.04 amd64 (v6.2 kernel) with a
> backported FireWire stack[1]. Except for the stack, the kernel and software
> packages come from the Ubuntu project's repositories.
> 
> On Tue, Nov 28, 2023 at 12:09:41AM -0600, Mario Limonciello wrote:
>> On 11/27/2023 23:24, Takashi Sakamoto wrote:
>>> Hi Mario
>>>
>>> Following up on our last conversation, I purchased some hardware to
>>> try to capture output from a serial port. In the end, I bought another
>>> motherboard on the used market that provides a serial port from its
>>> Super I/O chip (ASUS TUF Gaming X570-Plus). However, I have not captured
>>> any helpful output yet when the system reboots.
>>
>> Did you up the loglevel to 8 to make sure you'll get all kernel output on
>> the serial port, not just errors?
> 
> Even when passing the 'debug' cmdline option or raising the console
> loglevel via sysctl, I receive no useful output from the console when
> loading the module at or after boot.
> 
> ```
> $ sysctl kernel.printk
> kernel.printk = 7	7	1	7
> ```
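For reference, the four values printed by `sysctl kernel.printk` are, in order, console_loglevel, default_message_loglevel, minimum_console_loglevel and default_console_loglevel, and a message reaches the console only when its level is numerically lower than console_loglevel. The sketch below decodes such a line; the helper names `parse_printk` and `reaches_console` are hypothetical and only illustrate why a console_loglevel of 7 still filters out KERN_DEBUG (level 7) messages, while 8 would let them through.

```python
# Field order of /proc/sys/kernel/printk (same values that sysctl prints).
PRINTK_FIELDS = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Parse 'kernel.printk = 7 7 1 7' or the raw '7 7 1 7' file content."""
    values = [int(v) for v in text.split("=")[-1].split()]
    return dict(zip(PRINTK_FIELDS, values))


def reaches_console(message_level, console_loglevel):
    """A message is printed on the console only if its level is lower."""
    return message_level < console_loglevel


levels = parse_printk("kernel.printk = 7\t7\t1\t7")
# KERN_DEBUG is level 7, so it is filtered at console_loglevel 7.
debug_visible = reaches_console(7, levels["console_loglevel"])
```

Raising the first value to 8 (for example with `sysctl -w kernel.printk="8 7 1 7"` or `loglevel=8` on the kernel command line) is what would make debug-level messages appear on the serial console.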
> 
> I tried several different cases: enabling/disabling IOMMU and
> enabling/disabling SVM at the motherboard level. Nothing helped.
> 
>>> As you suggested, I checked whether PCIe AER is enabled in the
>>> running kernel (Ubuntu 23.04 linux-image-6.2.0-37-generic). It is
>>> certainly enabled; however, I see nothing in the output, as I noted.
>>>
>>> I ran into additional problems related to AMD Ryzen machines and the
>>> PCIe device in question:
>>>
>>> * ASRock X570 Phantom Gaming 4 with AMD Ryzen 5 3600X does not detect
>>>     the card. There is no corresponding entry in lspci.
>>> * After binding the card to vfio-pci, the lspci command can reboot the
>>>     system even if the firewire-ohci driver is not loaded. I can reproduce
>>>     it on both Gigabyte AX370-Gaming 5 and ASUS TUF Gaming X570-Plus with
>>>     AMD Ryzen 2400G.
>>
>> Rather than lspci, is it specifically config space access from sysfs? Does
>> the output from the serial port change with IOMMU enabled vs disabled?
> 
> In the lspci case, I used a debugger and found that 'pread(2)' on the
> file descriptor for the 'config' node in sysfs causes the unexpected system
> reboot. Additionally, I can reproduce it by running hexdump(1) on the node:

OK - is this by chance related to access to PCI extended config space 
failing for this device then?  If you read just the first 256 bytes it's 
ok, but beyond that it fails?

If so, can you please try to reproduce using this series from Bjorn applied:
https://lore.kernel.org/r/20231121183643.249006-1-helgaas@kernel.org

And then add this to kernel command line:
efi=debug "dyndbg=file arch/x86/pci/* +p"

Capture the dmesg and share it.
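
To narrow down the "first 256 bytes versus beyond" question above, the two regions can be read separately. This is only a minimal sketch: the function names are hypothetical, the path is the device from the report, and it assumes it is run as root (an unprivileged read of the sysfs `config` node returns only the first 64 bytes, as the hexdump output shows).

```python
import os

PCI_CFG_SPACE_SIZE = 256        # conventional PCI config space
PCI_CFG_SPACE_EXP_SIZE = 4096   # PCIe extended config space


def read_config_range(path, offset, length):
    """Read up to `length` bytes at `offset` with a single pread(2)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


def probe_config(path):
    """Read the conventional region first, then the extended region."""
    base = read_config_range(path, 0, PCI_CFG_SPACE_SIZE)
    ext = read_config_range(
        path, PCI_CFG_SPACE_SIZE, PCI_CFG_SPACE_EXP_SIZE - PCI_CFG_SPACE_SIZE
    )
    return base, ext


# Example call (not executed here; on the affected system it may reboot):
# base, ext = probe_config("/sys/bus/pci/devices/0000:05:00.0/config")
```

If the first read returns and the machine resets during the second, that points at extended config space access. An empty `ext` simply means the kernel exposes only 256 bytes for that device.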

> 
> ```
> $ lspci
> ...
> 04:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge [1b21:1080] (rev 03)
> 05:00.0 FireWire (IEEE 1394): VIA Technologies, Inc. VT6306/7/8 [Fire II(M)] IEEE 1394 OHCI Controller [1106:3044] (rev 80)
> ...
> $ hexdump -C /sys/bus/pci/devices/0000\:05\:00.0/config
> 00000000  06 11 44 30 80 00 10 02  80 10 00 0c 10 20 00 00  |..D0......... ..|
> 00000010  00 00 90 fc 01 d0 00 00  00 00 00 00 00 00 00 00  |................|
> 00000020  00 00 00 00 00 00 00 00  00 00 00 00 06 11 44 30  |..............D0|
> 00000030  00 00 00 00 50 00 00 00  00 00 00 00 ff 01 00 20  |....P.......... |
> 00000040
> 
> $ lsmod | grep firewire
> (no output)
> 
> $ sudo -i
> # modprobe vfio-pci
> # echo 1106 3044 > /sys/bus/pci/drivers/vfio-pci/new_id
> # exit
> 
> $ hexdump -C /sys/bus/pci/devices/0000\:05\:00.0/config
> (reboot)
> ```

Can you access config space for other PCIe devices successfully on this 
system?
Specifically extended config space?

> 
> I can suppress it by disabling IOMMU in the motherboard settings. In this
> respect, the lspci issue is a bit different from the driver issue.
> 
>>> I'd be pleased to hear any further ideas you have for getting helpful
>>> output from the system. But I guess that I should start looking for a
>>> workaround that avoids the problematic register access instead of
>>> investigating the reboot mechanism, sigh...
>>>
>>> Anyway, thanks for your help.
>>
>> Can you check FCH::PM::S5_RESET_STATUS on next boot after failure has
>> occurred?  It is available at MMIO FED80300 or through indirect IO access at
>> 0xC0.
>>
>> If MMIO doesn't work, double check FCH::PM_ISACONTROL bit 1 (described on
>> page 296) to confirm if your system enables it.
>>
>> The meanings of the different bits can be found in a recent PPR:
>> https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/55901_B1_pub_053.zip
>>
>> Indirect IO is described on PDF page 294.
>>
>> This will at least give us a hint what's going on in this case.
> 
> I'll try the above this week. Thanks.
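
The MMIO read of FCH::PM::S5_RESET_STATUS suggested above could be done roughly as follows. This is a sketch under assumptions: it maps `/dev/mem`, which needs root and is often blocked by kernel lockdown or strict `/dev/mem` settings; the helper name `read_u32` is hypothetical; and the slice copy may not be issued as a single 32-bit access. The bit meanings are not decoded here and should be taken from the PPR.

```python
import mmap
import os

S5_RESET_STATUS_ADDR = 0xFED80300  # MMIO address given in the thread


def read_u32(path, phys_addr):
    """Map the page containing `phys_addr` and return a little-endian u32."""
    page = mmap.ALLOCATIONGRANULARITY
    base = phys_addr & ~(page - 1)   # mmap offsets must be page-aligned
    off = phys_addr - base
    fd = os.open(path, os.O_RDONLY | os.O_SYNC)
    try:
        with mmap.mmap(fd, page, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ, offset=base) as m:
            return int.from_bytes(m[off:off + 4], "little")
    finally:
        os.close(fd)


# Example call (as root, on the first boot after the unexpected reset):
# print(hex(read_u32("/dev/mem", S5_RESET_STATUS_ADDR)))
```

If the mapping is refused or reads back as all ones, the indirect IO route mentioned in the thread is the alternative.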
> 
> 
> [1] https://github.com/takaswie/linux-firewire-dkms/
> 
> Takashi Sakamoto
