Message-ID: <0d4f2978-22f4-4e8d-a6b8-e6b90888dc25@intel.com>
Date: Wed, 20 Nov 2024 12:49:10 -0800
From: Sohil Mehta <sohil.mehta@...el.com>
To: Borislav Petkov <bp@...nel.org>, <linux-doc@...r.kernel.org>
CC: X86 ML <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>, "Borislav
Petkov (AMD)" <bp@...en8.de>
Subject: Re: [PATCH 1/2] Documentation: Merge x86-specific boot options doc
into kernel-parameters.txt
On 11/20/2024 8:30 AM, Borislav Petkov wrote:
> @@ -3254,9 +3306,68 @@
> devices can be requested on-demand with the
> /dev/loop-control interface.
>
> - mce [X86-32] Machine Check Exception
> + mce= [X86-{32,64}]
> +
> + Please see Documentation/arch/x86/x86_64/machinecheck.rst for sysfs runtime tunables.
> +
> + off: disable machine check
> +
> + no_cmci: disable CMCI(Corrected Machine Check
> + Interrupt) that Intel processor supports. Usually
> + this disablement is not recommended, but it might be
> + handy if your hardware is misbehaving.
> +
> + Note that you'll get more problems without CMCI than
> + with due to the shared banks, i.e. you might get
> + duplicated error logs.
> +
> + dont_log_ce: don't make logs for corrected errors.
> + All events reported as corrected are silently cleared
> + by OS. This option will be useful if you have no
> + interest in any of corrected errors.
> +
> + ignore_ce: disable features for corrected errors, e.g.
> + polling timer and CMCI. All events reported as
> + corrected are not cleared by OS and remained in its
> + error banks.
> +
> + Usually this disablement is not recommended, however
> + if there is an agent checking/clearing corrected
> + errors (e.g. BIOS or hardware monitoring
> + applications), conflicting with OS's error handling,
> + and you cannot deactivate the agent, then this option
> + will be a help.
> +
> + no_lmce: do not opt-in to Local MCE delivery. Use
> + legacy method to broadcast MCEs.
> +
> + bootlog: enable logging of machine checks left over
> + from booting. Disabled by default on AMD Fam10h and
> + older because some BIOS leave bogus ones.
> +
> + If your BIOS doesn't do that it's a good idea to
> + enable though to make sure you log even machine check
> + events that result in a reboot. On Intel systems it is
> + enabled by default.
> +
> + nobootlog: disable boot machine check logging.
> +
> + monarchtimeout (number): sets the time in us to wait
> + for other CPUs on machine checks. 0 to disable.
> +
> + bios_cmci_threshold: don't overwrite the bios-set CMCI
> + threshold. This boot option prevents Linux from
> + overwriting the CMCI threshold set by the bios.
> + Without this option, Linux always sets the CMCI
> + threshold to 1. Enabling this may make memory
> + predictive failure analysis less effective if the bios
> + sets thresholds for memory errors since we will not
> + see details for all errors.
> +
> + recovery: force-enable recoverable machine check code paths
> +
> + Everything else is in sysfs now.
>
Instead of double tabs and <option>: <description>, would this be more
readable if the options and their descriptions were separated? Something
like the below wouldn't exceed the line width either.
mce= [X86-{32,64}]
Please see Documentation/arch/x86/x86_64/machinecheck.rst for sysfs
runtime tunables.
off disable machine check
no_cmci
disable CMCI(Corrected Machine Check
Interrupt) that Intel processor supports. Usually
this disablement is not recommended, but it might be
handy if your hardware is misbehaving.
Note that you'll get more problems without CMCI than
with due to the shared banks, i.e. you might get
duplicated error logs.
dont_log_ce
don't make logs for corrected errors.
All events reported as corrected are silently cleared
by OS. This option will be useful if you have no
interest in any of corrected errors.
ignore_ce
disable features for corrected errors, e.g.
polling timer and CMCI. All events reported as
corrected are not cleared by OS and remained in its
error banks.
Usually this disablement is not recommended, however
if there is an agent checking/clearing corrected
errors (e.g. BIOS or hardware monitoring
applications), conflicting with OS's error handling,
and you cannot deactivate the agent, then this option
will be a help.
no_lmce
do not opt-in to Local MCE delivery. Use
legacy method to broadcast MCEs.
bootlog
enable logging of machine checks left over
from booting. Disabled by default on AMD Fam10h and
older because some BIOS leave bogus ones.
If your BIOS doesn't do that it's a good idea to
enable though to make sure you log even machine check
events that result in a reboot. On Intel systems it is
enabled by default.
nobootlog
disable boot machine check logging.
monarchtimeout (number)
sets the time in us to wait
for other CPUs on machine checks. 0 to disable.
bios_cmci_threshold
don't overwrite the bios-set CMCI
threshold. This boot option prevents Linux from
overwriting the CMCI threshold set by the bios.
Without this option, Linux always sets the CMCI
threshold to 1. Enabling this may make memory
predictive failure analysis less effective if the bios
sets thresholds for memory errors since we will not
see details for all errors.
recovery
force-enable recoverable machine check code paths
Everything else is in sysfs now.
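If it helps, the conversion is mechanical enough to script. Below is a minimal sketch in Python, assuming each entry starts with a lowercase option name (optionally followed by "(number)") and a colon. The `separate` helper and the `ENTRY` pattern are hypothetical names for illustration and are not part of the patch. Unlike the hand-written layout above, it always puts the option name on its own line, even for short entries such as "off".

```python
import re

# Hypothetical helper, not part of the patch. It assumes an entry looks like
# "<option>: <description>" where <option> is lowercase letters/underscores,
# optionally followed by " (number)".
ENTRY = re.compile(r"^(?P<name>[a-z_]+(?: \(number\))?):\s+(?P<desc>.*)$")


def separate(lines, indent="\t"):
    """Put each option name on its own line, with its description indented below."""
    out = []
    for line in lines:
        text = line.strip()
        match = ENTRY.match(text)
        if match:
            # Start of a new entry: name first, then the first description line.
            out.append(match.group("name"))
            out.append(indent + match.group("desc"))
        elif text:
            # Continuation of the current description.
            out.append(indent + text)
        else:
            # Preserve blank separator lines.
            out.append("")
    return out


if __name__ == "__main__":
    sample = [
        "no_lmce: do not opt-in to Local MCE delivery. Use",
        "legacy method to broadcast MCEs.",
    ]
    print("\n".join(separate(sample)))
```

The pattern is deliberately strict so that ordinary description lines containing a colon further in are treated as continuations rather than new entries.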
> - mce=option [X86-64] See Documentation/arch/x86/x86_64/boot-options.rst
>
> @@ -5701,6 +5825,47 @@
> reboot_cpu is s[mp]#### with #### being the processor
> to be used for rebooting.
>
> + acpi: Use the ACPI RESET_REG in the FADT. If ACPI is
> + not configured or the ACPI reset does not work, the
> + reboot path attempts the reset using the keyboard
> + controller.
> +
> + bios: Use the CPU reboot vector for warm reset
> +
> + cold: Set the cold reboot flag
> +
> + default: There are some built-in platform specific
> + "quirks" - you may see: "reboot: <name> series board
> + detected. Selecting <type> for reboots." In the case
> + where you think the quirk is in error (e.g. you have
> + newer BIOS, or newer board) using this option will
> + ignore the built-in quirk table, and use the generic
> + default reboot actions.
> +
> + efi: Use efi reset_system runtime service. If EFI is
> + not configured or the EFI reset does not work, the
> + reboot path attempts the reset using the keyboard
> + controller.
> +
> + force: Don't stop other CPUs on reboot. This can make
> + reboot more reliable in some cases.
> +
> + kbd: Use the keyboard controller. cold reset (default)
> +
> + pci: Use a write to the PCI config space register
> + 0xcf9 to trigger reboot.
> +
> + triple: Force a triple fault (init)
> +
> + warm: Don't set the cold reboot flag
> +
> + Using warm reset will be much faster especially on big
> + memory systems because the BIOS will not go through
> + the memory check. Disadvantage is that not all
> + hardware will be completely reinitialized on reboot so
> + there may be boot problems on some systems.
> +
> +
Same suggestion here.