Message-ID: <20230926073857.GB1344149@ik1-406-35019.vs.sakura.ne.jp>
Date:   Tue, 26 Sep 2023 16:38:57 +0900
From:   Naoya Horiguchi <naoya.horiguchi@...ux.dev>
To:     ankita@...dia.com
Cc:     jgg@...dia.com, alex.williamson@...hat.com,
        akpm@...ux-foundation.org, tony.luck@...el.com, bp@...en8.de,
        naoya.horiguchi@....com, linmiaohe@...wei.com, aniketa@...dia.com,
        cjia@...dia.com, kwankhede@...dia.com, targupta@...dia.com,
        vsethi@...dia.com, acurrid@...dia.com, anuaggarwal@...dia.com,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-edac@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH v1 4/4] vfio/nvgpu: register device memory for poison
 handling

On Wed, Sep 20, 2023 at 07:32:10PM +0530, ankita@...dia.com wrote:
> From: Ankit Agrawal <ankita@...dia.com>
> 
> The nvgrace-gpu-vfio-pci module [1] maps the device memory to the user VA
> (Qemu) using remap_pfn_range() without adding the memory to the kernel.
> The device memory pages are not backed by struct page. Patches 1-3
> implements the mechanism to handle ECC/poison on memory page without
> struct page and expose a registration function. This new mechanism is
> leveraged here.
>  
> The module registers its memory region with the kernel MM for ECC handling
> using the register_pfn_address_space() registration API exposed by the
> kernel. It also defines a failure callback function pfn_memory_failure()
> to get the poisoned PFN from the MM.
>  
> The module track poisoned PFN as a bitmap with a bit per PFN. The PFN is
> communicated by the kernel MM to the module through the failure function,
> which sets the appropriate bit in the bitmap.
>  
> The module also defines a VMA fault ops for the module. It returns
> VM_FAULT_HWPOISON in case the bit for the PFN is set in the bitmap.
> 
> [1] https://lore.kernel.org/all/20230915025415.6762-1-ankita@nvidia.com/
> 
> Signed-off-by: Ankit Agrawal <ankita@...dia.com>
> ---

...

> @@ -406,6 +494,19 @@ nvgrace_gpu_vfio_pci_fetch_memory_property(struct pci_dev *pdev,
>  
>  	nvdev->memlength = memlength;
>  
> +#ifdef CONFIG_MEMORY_FAILURE
> +	/*
> +	 * A bitmap is maintained to track the pages that are poisoned. Each
> +	 * page is represented by a bit. Allocation size in bytes is
> +	 * determined by shifting the device memory size by PAGE_SHIFT to
> +	 * determine the number of pages; and further shifted by 3 as each
> +	 * byte could track 8 pages.
> +	 */
> +	nvdev->pfn_bitmap
> +		= vzalloc((nvdev->memlength >> PAGE_SHIFT)/BITS_PER_TYPE(char));
> +	if (!nvdev->pfn_bitmap)
> +		ret = -ENOMEM;
> +#endif
>  	return ret;
>  }
>  

I assume that memory failure is a relatively rare event (otherwise the device
is simply broken and it's better to stop using it), so the bitmap will be
mostly zeros.
If the device memory is on the order of 100GB, the bitmap is about 3.2MB
(assuming 4KB pages). That might not be too large on modern systems, but
wouldn't another data structure with a smaller memory footprint, such as a
hash table, be more beneficial?
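
To make the comparison concrete, here is a rough userspace sketch, not kernel
code and not taken from the patch. It computes the dense bitmap size for a
given memory size and page shift, and shows a tiny open-addressing hash set
that stores only the poisoned PFNs. All names here (bitmap_bytes, struct
poison_set, poison_set_*) are hypothetical, the capacity is fixed at init
time, and locking is ignored; an in-kernel version would presumably build on
existing kernel data structures instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bytes needed for a dense bitmap with one bit per page, rounded up. */
static uint64_t bitmap_bytes(uint64_t mem_bytes, unsigned int page_shift)
{
	uint64_t pages = mem_bytes >> page_shift;

	return (pages + 7) / 8;
}

/* Marker for an unused slot; assumed never to be a valid PFN. */
#define EMPTY_SLOT UINT64_MAX

/* Hypothetical sparse set of poisoned PFNs (linear probing). */
struct poison_set {
	uint64_t *slots;
	size_t capacity;	/* must be a power of two */
	size_t count;
};

static int poison_set_init(struct poison_set *s, size_t capacity)
{
	size_t i;

	assert(capacity && (capacity & (capacity - 1)) == 0);
	s->slots = malloc(capacity * sizeof(*s->slots));
	if (!s->slots)
		return -1;
	for (i = 0; i < capacity; i++)
		s->slots[i] = EMPTY_SLOT;
	s->capacity = capacity;
	s->count = 0;
	return 0;
}

static void poison_set_free(struct poison_set *s)
{
	free(s->slots);
	s->slots = NULL;
	s->capacity = 0;
	s->count = 0;
}

/* Multiplicative hash of the PFN, reduced to a slot index. */
static size_t slot_for(const struct poison_set *s, uint64_t pfn)
{
	uint64_t h = pfn * 0x9E3779B97F4A7C15ULL;

	return (size_t)(h >> 32) & (s->capacity - 1);
}

/* Returns 0 if pfn is now in the set, -1 if the table is full. */
static int poison_set_add(struct poison_set *s, uint64_t pfn)
{
	size_t i = slot_for(s, pfn);

	while (s->slots[i] != EMPTY_SLOT) {
		if (s->slots[i] == pfn)
			return 0;	/* already recorded */
		i = (i + 1) & (s->capacity - 1);
	}
	/* Keep one slot empty so that probing always terminates. */
	if (s->count + 1 >= s->capacity)
		return -1;
	s->slots[i] = pfn;
	s->count++;
	return 0;
}

static bool poison_set_contains(const struct poison_set *s, uint64_t pfn)
{
	size_t i = slot_for(s, pfn);

	while (s->slots[i] != EMPTY_SLOT) {
		if (s->slots[i] == pfn)
			return true;
		i = (i + 1) & (s->capacity - 1);
	}
	return false;
}
```

With 100GiB of device memory and 4KB pages, bitmap_bytes() gives 3276800
bytes, which is where the "about 3.2MB" above comes from. A 64-slot set of
this kind takes 512 bytes, at the cost of a probe on each fault and a hard
limit on the number of poisoned pages it can record.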

Thanks,
Naoya Horiguchi
