Message-ID: <2a19254d-1b5d-4a52-bd54-9ef3eb3f8ebf@molgen.mpg.de>
Date: Fri, 19 Nov 2021 15:43:58 +0100
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Krzysztof Wilczyński <kw@...ux.com>
Cc: Jörg Rödel <joro@...tes.org>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
iommu@...ts.linux-foundation.org,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, LKML <linux-kernel@...r.kernel.org>,
linux-pci@...r.kernel.org
Subject: Re: How to reduce PCI initialization from 5 s (1.5 s adding them to
IOMMU groups)
Dear Krzysztof,
Am 10.11.21 um 00:10 schrieb Krzysztof Wilczyński:
> [...]
>>> I am curious - why is this a problem? Are you power-cycling your servers
>>> so often to the point where the cumulative time spent in enumerating PCI
>>> devices and adding them later to IOMMU groups is a problem?
>>>
>>> I am simply wondering why you decided to single out the PCI enumeration
>>> as slow in particular, especially given that large server hardware tends
>>> to have (most of the time, as per my experience) a rather long
>>> initialisation time, either from being powered off or after being power
>>> cycled. It can take a while before the actual operating system itself
>>> will start.
>>
>> It’s not a problem per se, more a pet peeve of mine. Systems get faster
>> and faster, and boot time slower and slower. On desktop systems it
>> matters even more, with firmware like coreboot taking less than one
>> second to initialize the hardware and pass control to the
>> payload/operating system. If we are lucky, we are going to have servers
>> with FLOSS firmware, too.
>>
>> But already now, using kexec to reboot a system avoids the problems you
>> pointed out on servers, and being able to reboot a system as quickly as
>> possible lowers the bar for people to reboot systems more often so that,
>> for example, updates take effect.
>
> A very good point about the kexec usage.
>
> This is definitely often invaluable to get security updates out the door
> quickly, update the kernel version, or switch operating systems quickly
> (a trick that companies like Equinix Metal use when offering their bare
> metal as a service).
>
>>> We talked about this briefly with Bjorn, and there might be an option
>>> to perhaps add some caching, as we suspect that the culprit here is
>>> doing a PCI configuration space read for each device, which can be slow
>>> on some platforms.
>>>
>>> However, we would need to profile this to get some quantitative data to see
>>> whether doing anything would even be worthwhile. It would definitely help
>>> us understand better where the bottlenecks really are and of what magnitude.
>>>
>>> I personally don't have access to such large hardware as the one you
>>> have access to, thus I was wondering whether you would have some time,
>>> and be willing, to profile this for us on the hardware you have.
>>>
>>> Let me know what you think.
>>
>> Sounds good. I’d be willing to help. Note that I won’t have time before
>> Wednesday next week, though.
>
> Not a problem! I am very grateful you are willing to devote some of your
> time to help with this.
>
> I only have access to a few systems: some commodity hardware like a
> desktop PC and notebooks, and some assorted SoCs. These are sadly not
> even close to proper server platforms, and trying to measure anything on
> them does not really yield any useful data, as the delays related to PCI
> enumeration on startup are quite insignificant in comparison - there is
> just not enough hardware there, so to speak.
>
> I am really looking forward to the data you can gather for us and what
> insight it might provide us with.
So, kexec seems to work, apart from some DMAR-IR warnings [1].
`initcall_debug` increases the Linux boot time by over 50 % from 7.7 s
to 12 s, which I didn’t expect.
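For anyone wanting to reproduce the numbers, they were gathered by booting
with the `initcall_debug` kernel parameter. A minimal sketch of enabling it
persistently, assuming a GRUB-based setup (file locations vary by
distribution):

```shell
# /etc/default/grub (assumed location; adjust for your distribution)
GRUB_CMDLINE_LINUX="initcall_debug"

# Afterwards, regenerate the GRUB configuration, for example:
#   grub-mkconfig -o /boot/grub/grub.cfg
```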
Here are the functions taking more than 200 ms:
initcall pci_apply_final_quirks+0x0/0x132 returned 0 after 228433 usecs
initcall raid6_select_algo+0x0/0x2d6 returned 0 after 383789 usecs
initcall pcibios_assign_resources+0x0/0xc0 returned 0 after 610757 usecs
initcall _mpt3sas_init+0x0/0x1c0 returned 0 after 721257 usecs
initcall ahci_pci_driver_init+0x0/0x1a returned 0 after 945094 usecs
initcall pci_iommu_init+0x0/0x3f returned 0 after 1487134 usecs
initcall acpi_init+0x0/0x349 returned 0 after 7291015 usecs
Some of them are run later though, but `acpi_init` sticks out with 7.3 s.
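Such a list can be pulled out of a saved dmesg log with a one-liner. A
rough sketch, using two of the lines above as stand-in input (the
`dummy_fast_init` line is made up, just to show the filter working):

```shell
# Keep initcalls that took more than 200 ms and sort them by duration.
# Normally the input would be a saved dmesg log from a boot with
# initcall_debug; the here-document below stands in for it.
awk '$1 == "initcall" && $NF == "usecs" {
    usecs = $(NF - 1)              # duration in microseconds
    if (usecs + 0 > 200000)
        print usecs, $2            # duration and initcall name
}' <<'EOF' | sort -n
initcall ahci_pci_driver_init+0x0/0x1a returned 0 after 945094 usecs
initcall pci_iommu_init+0x0/0x3f returned 0 after 1487134 usecs
initcall dummy_fast_init+0x0/0x10 returned 0 after 1234 usecs
EOF
```

Note that real dmesg lines carry a timestamp prefix, which shifts the awk
fields; the sketch assumes the bare format as pasted above.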
Kind regards,
Paul
[1]:
https://lore.kernel.org/linux-iommu/40a7581d-985b-f12b-0bb2-99c586a9f968@molgen.mpg.de/T/#u
Attachment: furoncles-linux-5.10.70-dmesg-initcall_debug-kexec-2.txt (text/plain, 272428 bytes)