Message-ID:
<SN6PR02MB41573CB5E4B2F3E35501934BD432A@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Thu, 21 Aug 2025 15:45:10 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Vitaly Kuznetsov <vkuznets@...hat.com>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>
CC: "K. Y. Srinivasan" <kys@...rosoft.com>, Haiyang Zhang
<haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, Dexuan Cui
<decui@...rosoft.com>, "x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Nuno Das Neves
<nunodasneves@...ux.microsoft.com>, Tianyu Lan <tiala@...rosoft.com>, Li Tian
<litian@...hat.com>, Philipp Rudo <prudo@...hat.com>
Subject: RE: [PATCH v2] x86/hyperv: Fix kdump on Azure CVMs
From: Vitaly Kuznetsov <vkuznets@...hat.com> Sent: Thursday, August 21, 2025 2:37 AM
>
> Michael Kelley <mhklinux@...look.com> writes:
>
> > From: Vitaly Kuznetsov <vkuznets@...hat.com> Sent: Monday, August 18, 2025 2:54 AM
[snip]
> >>
> >> +/*
> >> + * Keep track of the PFN regions which were shared with the host. The access
> >> + * must be revoked upon kexec/kdump (see hv_ivm_clear_host_access()).
> >> + */
> >> +struct hv_enc_pfn_region {
> >> + struct list_head list;
> >> + u64 pfn;
> >> + int count;
> >> +};
> >
> > I'm wondering if there's an existing kernel data structure that would handle
> > the requirements here. Did you look at using xarray()? It's probably not as
> > memory efficient since it presumably needs a separate entry for each PFN,
> > whereas your code below uses a single entry for a range of PFNs. But
> > maybe that's a worthwhile tradeoff to simplify the code and avoid some
> > of the messy issues I point out below. Just a thought ....
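> >
> > Roughly the shape I was imagining (untested sketch, made-up helper name),
> > just to be concrete:
> >
> > 	#include <linux/xarray.h>
> >
> > 	/* Hypothetical: one xarray entry per PFN shared with the host */
> > 	static DEFINE_XARRAY(hv_shared_pfns);
> >
> > 	static int hv_track_shared_pfns(u64 pfn, int count)
> > 	{
> > 		int i, ret;
> >
> > 		for (i = 0; i < count; i++) {
> > 			ret = xa_err(xa_store(&hv_shared_pfns, pfn + i,
> > 					      xa_mk_value(1), GFP_KERNEL));
> > 			if (ret)
> > 				return ret;
> > 		}
> > 		return 0;
> > 	}
> >
> > Revoking access at kexec/kdump time would then just be an xa_for_each()
> > walk over that array.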
>
> I thought about it before I looked at what these regions really look
> like. Here's what I see on a DC2ads instance upon kdump (with a debug
> printk added):
>
> [ 37.255921] hv_ivm_clear_host_access: PFN_START: 102548 COUNT:8
> [ 37.256833] hv_ivm_clear_host_access: PFN_START: 10bc60 COUNT:16
> [ 37.257743] hv_ivm_clear_host_access: PFN_START: 10bd00 COUNT:256
> [ 37.259177] hv_ivm_clear_host_access: PFN_START: 10ada0 COUNT:255
> [ 37.260639] hv_ivm_clear_host_access: PFN_START: 1097e8 COUNT:24
> [ 37.261630] hv_ivm_clear_host_access: PFN_START: 103ce3 COUNT:45
> [ 37.262741] hv_ivm_clear_host_access: PFN_START: 103ce1 COUNT:1
>
> ... 57 more items with 1-4 PFNs ...
>
> [ 37.320659] hv_ivm_clear_host_access: PFN_START: 103c98 COUNT:1
> [ 37.321611] hv_ivm_clear_host_access: PFN_START: 109d00 COUNT:4199
> [ 37.331656] hv_ivm_clear_host_access: PFN_START: 10957f COUNT:129
> [ 37.332902] hv_ivm_clear_host_access: PFN_START: 103c9b COUNT:2
> [ 37.333811] hv_ivm_clear_host_access: PFN_START: 1000 COUNT:256
> [ 37.335066] hv_ivm_clear_host_access: PFN_START: 100 COUNT:256
> [ 37.336340] hv_ivm_clear_host_access: PFN_START: 100e00 COUNT:256
> [ 37.337626] hv_ivm_clear_host_access: PFN_START: 7b000 COUNT:131072
>
> Overall, the linked list contains 72 items of 32 bytes each, so we're
> consuming about 2k of extra memory. Handling such a short list should
> also be pretty fast.
>
> If we switch to handling each PFN separately, that would be 136862
> items. I'm not exactly sure about xarray's memory consumption, but I'm
> afraid we'd be talking megabytes in this case. That's a price every
> CVM user is going to pay. Also, the chance of running into (transient)
> memory allocation failures is going to be much higher.
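>
> (Back-of-the-envelope, assuming I'm reading the xarray internals right:
> each 64-slot xa_node is on the order of 576 bytes, so ~137k individual
> entries need at least ~2100 leaf nodes, i.e. over a megabyte even before
> counting interior nodes and the overhead of the sparse ranges.)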
>
OK, makes sense. The entry above with 4199 PFNs is probably the netvsc
receive buffer array. And the big entry with 131072 PFNs is almost certainly
the swiotlb, which I had forgotten about. That one really blows up the count,
and would be 262144 PFNs on bigger CVMs with 16 Gbytes or more of memory.
So, yes, having an xarray entry per PFN almost certainly gets too big.
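
(For the record: 131072 4K pages is 512 Mbytes and 262144 pages is 1 Gbyte,
which lines up with the CVM swiotlb being sized as a small percentage of
memory and capped at 1 Gbyte, if I'm remembering that sizing correctly.)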
Michael