[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BYAPR21MB16880251FC59B60542D2D996D75A9@BYAPR21MB1688.namprd21.prod.outlook.com>
Date: Tue, 4 Oct 2022 16:24:54 +0000
From: "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
To: "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"bhe@...hat.com" <bhe@...hat.com>,
"pmladek@...e.com" <pmladek@...e.com>,
"kexec@...ts.infradead.org" <kexec@...ts.infradead.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"kernel-dev@...lia.com" <kernel-dev@...lia.com>,
"kernel@...ccoli.net" <kernel@...ccoli.net>,
"halves@...onical.com" <halves@...onical.com>,
"fabiomirmar@...il.com" <fabiomirmar@...il.com>,
"alejandro.j.jimenez@...cle.com" <alejandro.j.jimenez@...cle.com>,
"andriy.shevchenko@...ux.intel.com"
<andriy.shevchenko@...ux.intel.com>,
"arnd@...db.de" <arnd@...db.de>, "bp@...en8.de" <bp@...en8.de>,
"corbet@....net" <corbet@....net>,
"d.hatayama@...fujitsu.com" <d.hatayama@...fujitsu.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"dyoung@...hat.com" <dyoung@...hat.com>,
"feng.tang@...el.com" <feng.tang@...el.com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"hidehiro.kawai.ez@...achi.com" <hidehiro.kawai.ez@...achi.com>,
"jgross@...e.com" <jgross@...e.com>,
"john.ogness@...utronix.de" <john.ogness@...utronix.de>,
"keescook@...omium.org" <keescook@...omium.org>,
"luto@...nel.org" <luto@...nel.org>,
"mhiramat@...nel.org" <mhiramat@...nel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"paulmck@...nel.org" <paulmck@...nel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"rostedt@...dmis.org" <rostedt@...dmis.org>,
"senozhatsky@...omium.org" <senozhatsky@...omium.org>,
"stern@...land.harvard.edu" <stern@...land.harvard.edu>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"vgoyal@...hat.com" <vgoyal@...hat.com>,
vkuznets <vkuznets@...hat.com>,
"will@...nel.org" <will@...nel.org>,
"xuqiang36@...wei.com" <xuqiang36@...wei.com>,
Andrea Parri <parri.andrea@...il.com>,
Dexuan Cui <decui@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
KY Srinivasan <kys@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Tianyu Lan <Tianyu.Lan@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>
Subject: RE: [PATCH V3 10/11] drivers/hv/vmbus, video/hyperv_fb: Untangle and
refactor Hyper-V panic notifiers
From: Guilherme G. Piccoli <gpiccoli@...lia.com> Sent: Friday, August 19, 2022 3:18 PM
>
> Currently Hyper-V guests are among the most relevant users of the panic
> infrastructure, like panic notifiers, kmsg dumpers, etc. The reasons rely
> both in cleaning-up procedures (closing hypervisor <-> guest connection,
> disabling some paravirtualized timer) as well as to data collection
> (sending panic information to the hypervisor) and framebuffer management.
>
> The thing is: some notifiers are related to others, ordering matters, some
> functionalities are duplicated and there are lots of conditionals behind
> sending panic information to the hypervisor. As part of an effort to
> clean-up the panic notifiers mechanism and better document things, we
> hereby address some of the issues/complexities of Hyper-V panic handling
> through the following changes:
>
> (a) We have die and panic notifiers on vmbus_drv.c and both have goals of
> sending panic information to the hypervisor, though the panic notifier is
> also responsible for a cleaning-up procedure.
>
> This commit clears the code by splitting the panic notifier in two, one
> for closing the vmbus connection whereas the other is only for sending
> panic info to hypervisor. With that, it was possible to merge the die and
> panic notifiers in a single/well-documented function, and clear some
> conditional complexities on sending such information to the hypervisor.
>
> (b) There is a Hyper-V framebuffer panic notifier, which relies in doing
> a vmbus operation that demands a valid connection. So, we must order this
> notifier with the panic notifier from vmbus_drv.c, to guarantee that the
> framebuffer code executes before the vmbus connection is unloaded.
>
> Also, this commit removes a useless header.
>
> Although there is code rework and re-ordering, we expect that this change
> has no functional regressions but instead optimize the path and increase
> panic reliability on Hyper-V. This was tested on Hyper-V with success.
>
> Cc: Andrea Parri (Microsoft) <parri.andrea@...il.com>
> Cc: Dexuan Cui <decui@...rosoft.com>
> Cc: Haiyang Zhang <haiyangz@...rosoft.com>
> Cc: "K. Y. Srinivasan" <kys@...rosoft.com>
> Cc: Petr Mladek <pmladek@...e.com>
> Cc: Stephen Hemminger <sthemmin@...rosoft.com>
> Cc: Tianyu Lan <Tianyu.Lan@...rosoft.com>
> Cc: Wei Liu <wei.liu@...nel.org>
> Reviewed-by: Michael Kelley <mikelley@...rosoft.com>
> Tested-by: Fabio A M Martins <fabiomirmar@...il.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@...lia.com>
>
> ---
>
> V3:
> - Added Michael's review tag - thanks!
>
> V2:
> - Unfortunately we cannot rely in the crash shutdown (custom) handler
> to perform the vmbus unload - arm64 architecture doesn't have this
> "feature" [0]. So, in V2 we kept the notifier behavior and always
> unload the vmbus connection, no matter what - thanks Michael for
> pointing that;
>
> - Removed the Fixes tags as per Michael suggestion;
>
> - As per Petr suggestion, we abandoned the idea of distinguish among
> notifiers using an id - so, in V2 we rely in the old and good address
> comparison for that. Thanks Petr for the enriching discussion!
>
> [0] https://lore.kernel.org/lkml/427a8277-49f0-4317-d6c3-4a15d7070e55@igalia.com/
>
>
> drivers/hv/vmbus_drv.c | 109 +++++++++++++++++++-------------
> drivers/video/fbdev/hyperv_fb.c | 8 +++
> 2 files changed, 74 insertions(+), 43 deletions(-)
>
Tested this patch in combination with Patch 9 in this series. Verified
that both the panic and die paths work correctly with notification to
Hyper-V via hyperv_report_panic() or via hv_kmsg_dump(). Hyper-V
framebuffer is updated as expected, though I did not reproduce
a case where the ring buffer lock is held. vmbus_initiate_unload() runs
as expected.
Tested-by: Michael Kelley <mikelley@...rosoft.com>
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 23c680d1a0f5..0a77e2bb0b70 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -25,7 +25,6 @@
> #include <linux/sched/task_stack.h>
>
> #include <linux/delay.h>
> -#include <linux/notifier.h>
> #include <linux/panic_notifier.h>
> #include <linux/ptrace.h>
> #include <linux/screen_info.h>
> @@ -69,53 +68,74 @@ static int hyperv_report_reg(void)
> return !sysctl_record_panic_msg || !hv_panic_page;
> }
>
> -static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
> +/*
> + * The panic notifier below is responsible solely for unloading the
> + * vmbus connection, which is necessary in a panic event.
> + *
> + * Notice an intrincate relation of this notifier with Hyper-V
> + * framebuffer panic notifier exists - we need vmbus connection alive
> + * there in order to succeed, so we need to order both with each other
> + * [see hvfb_on_panic()] - this is done using notifiers' priorities.
> + */
> +static int hv_panic_vmbus_unload(struct notifier_block *nb, unsigned long val,
> void *args)
> +{
> + vmbus_initiate_unload(true);
> + return NOTIFY_DONE;
> +}
> +static struct notifier_block hyperv_panic_vmbus_unload_block = {
> + .notifier_call = hv_panic_vmbus_unload,
> + .priority = INT_MIN + 1, /* almost the latest one to execute */
> +};
> +
> +static int hv_die_panic_notify_crash(struct notifier_block *self,
> + unsigned long val, void *args);
> +
> +static struct notifier_block hyperv_die_report_block = {
> + .notifier_call = hv_die_panic_notify_crash,
> +};
> +static struct notifier_block hyperv_panic_report_block = {
> + .notifier_call = hv_die_panic_notify_crash,
> +};
> +
> +/*
> + * The following callback works both as die and panic notifier; its
> + * goal is to provide panic information to the hypervisor unless the
> + * kmsg dumper is used [see hv_kmsg_dump()], which provides more
> + * information but isn't always available.
> + *
> + * Notice that both the panic/die report notifiers are registered only
> + * if we have the capability HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE set.
> + */
> +static int hv_die_panic_notify_crash(struct notifier_block *self,
> + unsigned long val, void *args)
> {
> struct pt_regs *regs;
> + bool is_die;
>
> - vmbus_initiate_unload(true);
> -
> - /*
> - * Hyper-V should be notified only once about a panic. If we will be
> - * doing hv_kmsg_dump() with kmsg data later, don't do the notification
> - * here.
> - */
> - if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE
> - && hyperv_report_reg()) {
> + /* Don't notify Hyper-V unless we have a die oops event or panic. */
> + if (self == &hyperv_panic_report_block) {
> + is_die = false;
> regs = current_pt_regs();
> - hyperv_report_panic(regs, val, false);
> + } else { /* die event */
> + if (val != DIE_OOPS)
> + return NOTIFY_DONE;
> +
> + is_die = true;
> + regs = ((struct die_args *)args)->regs;
> }
> - return NOTIFY_DONE;
> -}
> -
> -static int hyperv_die_event(struct notifier_block *nb, unsigned long val,
> - void *args)
> -{
> - struct die_args *die = args;
> - struct pt_regs *regs = die->regs;
> -
> - /* Don't notify Hyper-V if the die event is other than oops */
> - if (val != DIE_OOPS)
> - return NOTIFY_DONE;
>
> /*
> - * Hyper-V should be notified only once about a panic. If we will be
> - * doing hv_kmsg_dump() with kmsg data later, don't do the notification
> - * here.
> + * Hyper-V should be notified only once about a panic/die. If we will
> + * be calling hv_kmsg_dump() later with kmsg data, don't do the
> + * notification here.
> */
> if (hyperv_report_reg())
> - hyperv_report_panic(regs, val, true);
> + hyperv_report_panic(regs, val, is_die);
> +
> return NOTIFY_DONE;
> }
>
> -static struct notifier_block hyperv_die_block = {
> - .notifier_call = hyperv_die_event,
> -};
> -static struct notifier_block hyperv_panic_block = {
> - .notifier_call = hyperv_panic_event,
> -};
> -
> static const char *fb_mmio_name = "fb_range";
> static struct resource *fb_mmio;
> static struct resource *hyperv_mmio;
> @@ -1538,16 +1558,17 @@ static int vmbus_bus_init(void)
> if (hyperv_crash_ctl & HV_CRASH_CTL_CRASH_NOTIFY_MSG)
> hv_kmsg_dump_register();
>
> - register_die_notifier(&hyperv_die_block);
> + register_die_notifier(&hyperv_die_report_block);
> + atomic_notifier_chain_register(&panic_notifier_list,
> + &hyperv_panic_report_block);
> }
>
> /*
> - * Always register the panic notifier because we need to unload
> - * the VMbus channel connection to prevent any VMbus
> - * activity after the VM panics.
> + * Always register the vmbus unload panic notifier because we
> + * need to shut the VMbus channel connection on panic.
> */
> atomic_notifier_chain_register(&panic_notifier_list,
> - &hyperv_panic_block);
> + &hyperv_panic_vmbus_unload_block);
>
> vmbus_request_offers();
>
> @@ -2776,15 +2797,17 @@ static void __exit vmbus_exit(void)
>
> if (ms_hyperv.misc_features &
> HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
> kmsg_dump_unregister(&hv_kmsg_dumper);
> - unregister_die_notifier(&hyperv_die_block);
> + unregister_die_notifier(&hyperv_die_report_block);
> + atomic_notifier_chain_unregister(&panic_notifier_list,
> + &hyperv_panic_report_block);
> }
>
> /*
> - * The panic notifier is always registered, hence we should
> + * The vmbus panic notifier is always registered, hence we should
> * also unconditionally unregister it here as well.
> */
> atomic_notifier_chain_unregister(&panic_notifier_list,
> - &hyperv_panic_block);
> + &hyperv_panic_vmbus_unload_block);
>
> free_page((unsigned long)hv_panic_page);
> unregister_sysctl_table(hv_ctl_table_hdr);
> diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c
> index e1b65a01fb96..9234d31d4305 100644
> --- a/drivers/video/fbdev/hyperv_fb.c
> +++ b/drivers/video/fbdev/hyperv_fb.c
> @@ -1216,7 +1216,15 @@ static int hvfb_probe(struct hv_device *hdev,
> par->fb_ready = true;
>
> par->synchronous_fb = false;
> +
> + /*
> + * We need to be sure this panic notifier runs _before_ the
> + * vmbus disconnect, so order it by priority. It must execute
> + * before the function hv_panic_vmbus_unload() [drivers/hv/vmbus_drv.c],
> + * which is almost at the end of list, with priority = INT_MIN + 1.
> + */
> par->hvfb_panic_nb.notifier_call = hvfb_on_panic;
> + par->hvfb_panic_nb.priority = INT_MIN + 10,
> atomic_notifier_chain_register(&panic_notifier_list,
> &par->hvfb_panic_nb);
>
> --
> 2.37.2
Powered by blists - more mailing lists