lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4e53d612-534c-46b5-9746-a4a9814d41c3@samsung.com>
Date: Wed, 11 Jun 2025 10:42:16 +0200
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: David Hildenbrand <david@...hat.com>, Alistair Popple
	<apopple@...dia.com>
Cc: linux-mm@...ck.org, akpm@...ux-foundation.org,
	"linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
	Christoph Hellwig <hch@....de>, Jason Gunthorpe <jgg@...dia.com>,
	gerald.schaefer@...ux.ibm.com, dan.j.williams@...el.com, jgg@...pe.ca,
	willy@...radead.org, linux-kernel@...r.kernel.org, nvdimm@...ts.linux.dev,
	jhubbard@...dia.com, zhang.lyra@...il.com, debug@...osinc.com,
	bjorn@...nel.org, balbirs@...dia.com, lorenzo.stoakes@...cle.com,
	John@...ves.net
Subject: Re: [PATCH] mm: Remove PFN_MAP, PFN_SPECIAL, PFN_SG_CHAIN and
 PFN_SG_LAST

On 11.06.2025 10:23, David Hildenbrand wrote:
> On 11.06.25 10:03, Marek Szyprowski wrote:
>> On 11.06.2025 04:38, Alistair Popple wrote:
>>> On Tue, Jun 10, 2025 at 06:18:09PM +0200, Marek Szyprowski wrote:
>>>> On 04.06.2025 05:21, Alistair Popple wrote:
>>>>> The PFN_MAP flag is no longer used for anything, so remove it.
>>>>> The PFN_SG_CHAIN and PFN_SG_LAST flags never appear to have been
>>>>> used so also remove them. The last user of PFN_SPECIAL was removed
>>>>> by 653d7825c149 ("dcssblk: mark DAX broken, remove FS_DAX_LIMITED
>>>>> support").
>>>>>
>>>>> Signed-off-by: Alistair Popple <apopple@...dia.com>
>>>>> Acked-by: David Hildenbrand <david@...hat.com>
>>>>> Reviewed-by: Christoph Hellwig <hch@....de>
>>>>> Reviewed-by: Jason Gunthorpe <jgg@...dia.com>
>>>>> Cc: gerald.schaefer@...ux.ibm.com
>>>>> Cc: dan.j.williams@...el.com
>>>>> Cc: jgg@...pe.ca
>>>>> Cc: willy@...radead.org
>>>>> Cc: david@...hat.com
>>>>> Cc: linux-kernel@...r.kernel.org
>>>>> Cc: nvdimm@...ts.linux.dev
>>>>> Cc: jhubbard@...dia.com
>>>>> Cc: hch@....de
>>>>> Cc: zhang.lyra@...il.com
>>>>> Cc: debug@...osinc.com
>>>>> Cc: bjorn@...nel.org
>>>>> Cc: balbirs@...dia.com
>>>>> Cc: lorenzo.stoakes@...cle.com
>>>>> Cc: John@...ves.net
>>>>>
>>>>> ---
>>>>>
>>>>> Splitting this off from the rest of my series[1] as a separate 
>>>>> clean-up
>>>>> for consideration for the v6.16 merge window as suggested by 
>>>>> Christoph.
>>>>>
>>>>> [1] - 
>>>>> https://lore.kernel.org/linux-mm/cover.541c2702181b7461b84f1a6967a3f0e823023fcc.1748500293.git-series.apopple@nvidia.com/
>>>>> ---
>>>>>     include/linux/pfn_t.h             | 31 
>>>>> +++----------------------------
>>>>>     mm/memory.c                       |  2 --
>>>>>     tools/testing/nvdimm/test/iomap.c |  4 ----
>>>>>     3 files changed, 3 insertions(+), 34 deletions(-)
>>>> This patch landed in today's linux-next as commit 28be5676b4a3 ("mm:
>>>> remove PFN_MAP, PFN_SPECIAL, PFN_SG_CHAIN and PFN_SG_LAST"). In my 
>>>> tests
>>>> I've noticed that it breaks operation of all RISC-V 64bit boards on my
>>>> test farm (VisionFive2, BananaPiF3 as well as QEMU's Virt machine). 
>>>> I've
>>>> isolated the changes responsible for this issue, see the inline 
>>>> comments
>>>> in the patch below. Here is an example of the issues observed in the
>>>> logs from those machines:
>>> Thanks for the report. I'm really confused by this because this 
>>> change should
>>> just be removal of dead code - nothing sets any of the removed PFN_* 
>>> flags
>>> AFAICT.
>>>
>>> I don't have access to any RISC-V hardwdare but you say this 
>>> reproduces under
>>> qemu - what do you run on the system to cause the error? Is it just 
>>> a simple
>>> boot and load a module or are you running selftests or something else?
>>
>> It fails a simple boot test. Here is a detailed instruction how to
>> reproduce this issue with the random Debian rootfs image found on the
>> internet (tested on Ubuntu 22.04, with next-20250610
>> kernel source):
>
> riscv is one of the archs where pte_mkdevmap() will *not* set the pte 
> as special. (I
> raised this recently in the original series, it's all a big mess)
>
> So, before this change here, pfn_t_devmap() would have returned 
> "false" if only
> PFN_DEV was set, now it would return "true" if only PFN_DEV is set.
>
> Consequently, in insert_pfn() we would have done a pte_mkspecial(), 
> now we do a
> pte_mkdevmap() -- again, which does not imply "special" on riscv.
>
> riscv selects CONFIG_ARCH_HAS_PTE_SPECIAL, so if !pte_special(), it's 
> considered as
> normal.
>
> Would the following fix your issue?
>
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 8eba595056fe3..0e972c3493692 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -589,6 +589,10 @@ struct page *vm_normal_page(struct vm_area_struct 
> *vma, unsigned long addr,
>  {
>         unsigned long pfn = pte_pfn(pte);
>
> +       /* TODO: remove this crap and set pte_special() instead. */
> +       if (pte_devmap(pte))
> +               return NULL;
> +
>         if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
>                 if (likely(!pte_special(pte)))
>                         goto check_pfn;
> @@ -598,16 +602,6 @@ struct page *vm_normal_page(struct vm_area_struct 
> *vma, unsigned long addr,
>                         return NULL;
>                 if (is_zero_pfn(pfn))
>                         return NULL;
> -               if (pte_devmap(pte))
> -               /*
> -                * NOTE: New users of ZONE_DEVICE will not set 
> pte_devmap()
> -                * and will have refcounts incremented on their struct 
> pages
> -                * when they are inserted into PTEs, thus they are 
> safe to
> -                * return here. Legacy ZONE_DEVICE pages that set 
> pte_devmap()
> -                * do not have refcounts. Example of legacy 
> ZONE_DEVICE is
> -                * MEMORY_DEVICE_FS_DAX type in pmem or virtio_fs 
> drivers.
> -                */
> -                       return NULL;
>
>                 print_bad_pte(vma, addr, pte, NULL);
>                 return NULL;
>
>
> But, I would have thought the later patches in Alistairs series would 
> sort that out
> (where we remove pte_devmap() ... )
>
The above change fixes the issues observed on RISCV boards.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ