lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 22 Nov 2022 14:12:16 +0000
From:   Biju Das <biju.das.jz@...renesas.com>
To:     Anshuman Khandual <anshuman.khandual@....com>,
        Robin Murphy <robin.murphy@....com>,
        Mark Rutland <mark.rutland@....com>
CC:     Nathan Chancellor <nathan@...nel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "llvm@...ts.linux.dev" <llvm@...ts.linux.dev>
Subject: RE: [PATCH] arm64/mm: Drop redundant BUG_ON(!pgtable_alloc)

Hi All,

> Subject: Re: [PATCH] arm64/mm: Drop redundant BUG_ON(!pgtable_alloc)
> 
> 
> 
> On 11/21/22 19:26, Robin Murphy wrote:
> > On 2022-11-21 12:27, Mark Rutland wrote:
> >> On Mon, Nov 21, 2022 at 11:00:42AM +0530, Anshuman Khandual wrote:
> >>> Hello Nathan,
> >>>
> >>> Thanks for the report.
> >>>
> >>> On 11/20/22 21:46, Nathan Chancellor wrote:
> >>>> Hi Anshuman,
> >>
> >>>> I just bisected a boot failure in our QEMU-based continuous

This patch created boot failure on RZ/G2L platforms as well.
Reverting the patch fixed the boot failure.

Cheers,
Biju

> >>>> integration setup to this change as commit 9ed2b4616d4e ("arm64/mm:
> >>>> Drop redundant
> >>>> BUG_ON(!pgtable_alloc)") in the arm64 tree. There is no output so
> >>>> the panic clearly happens early at boot. If I move back to the
> >>>> previous commit and add a WARN_ON() like so:
> >>>>
> >>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index
> >>>> d386033a074c..9280a92ff920 100644
> >>>> --- a/arch/arm64/mm/mmu.c
> >>>> +++ b/arch/arm64/mm/mmu.c
> >>>> @@ -383,6 +383,7 @@ static void __create_pgd_mapping_locked(pgd_t
> >>>> *pgdir, phys_addr_t phys,
> >>>>       phys &= PAGE_MASK;
> >>>>       addr = virt & PAGE_MASK;
> >>>>       end = PAGE_ALIGN(virt + size);
> >>>> +    WARN_ON(!pgtable_alloc);
> >>>>         do {
> >>>>           next = pgd_addr_end(addr, end);
> >>>>
> >>>> I do see some stacktraces. I have attached the boot log from QEMU.
> >>>>
> >>>> If there is any additional information I can provide or patches I
> >>>> can test, I am more than happy to do so.
> >>>
> >>> There are couple of instances, where __create_pgd_mapping() function
> >>> gets called without a valid pgtable alloc function (NULL is passed
> >>> on instead), as it is not expected to allocate page table pages,
> >>> during the mapping process. The following change after this patch
> should solve the reported problem.
> >>>
> >>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index
> >>> 9ea8e9039992..a00563122fcb 100644
> >>> --- a/arch/arm64/mm/mmu.c
> >>> +++ b/arch/arm64/mm/mmu.c
> >>> @@ -42,6 +42,7 @@
> >>>   #define NO_BLOCK_MAPPINGS      BIT(0)
> >>>   #define NO_CONT_MAPPINGS       BIT(1)
> >>>   #define NO_EXEC_MAPPINGS       BIT(2)  /* assumes FEAT_HPDS is not
> >>> used */
> >>> +#define NO_ALLOC_MAPPINGS      BIT(3)  /* does not allocate page
> >>> +table pages */
> >>>     int idmap_t0sz __ro_after_init;
> >>>   @@ -380,7 +381,7 @@ static void __create_pgd_mapping_locked(pgd_t
> >>> *pgdir, phys_addr_t phys,
> >>>          phys &= PAGE_MASK;
> >>>          addr = virt & PAGE_MASK;
> >>>          end = PAGE_ALIGN(virt + size);
> >>> -       BUG_ON(!pgtable_alloc);
> >>> +       BUG_ON(!(flags & NO_ALLOC_MAPPINGS) && !pgtable_alloc);
> >>>            do {
> >>>                  next = pgd_addr_end(addr, end); @@ -453,7 +454,7 @@
> >>> static void __init create_mapping_noalloc(phys_addr_t phys, unsigned
> >>> long virt,
> >>>                  return;
> >>>          }
> >>>          __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
> >>> NULL,
> >>> -                            NO_CONT_MAPPINGS);
> >>> +                            NO_CONT_MAPPINGS | NO_ALLOC_MAPPINGS);
> >>>   }
> >>>     void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t
> >>> phys, @@ -481,7 +482,7 @@ static void
> >>> update_mapping_prot(phys_addr_t phys, unsigned long virt,
> >>>          }
> >>>            __create_pgd_mapping(init_mm.pgd, phys, virt, size, prot,
> >>> NULL,
> >>> -                            NO_CONT_MAPPINGS);
> >>> +                            NO_CONT_MAPPINGS | NO_ALLOC_MAPPINGS);
> >>>            /* flush the TLBs after updating live kernel mappings */
> >>>          flush_tlb_kernel_range(virt, virt + size);
> >>
> >> This is now more complicated than what we had originally, and it
> >> doesn't catch the case where the caller sets NO_ALLOC_MAPPINGS but
> >> the callee ends up needing to perform an allocation, which the old
> code would have caught.
> >
> > Well, it's still "caught" as such - all that BUG_ON(!pgtable_alloc)
> does in these cases is encode the source location in the backtrace, vs.
> having to decode it (if necessary) from the LR in a backtrace from
> immediately dereferencing pgtable_alloc(). If that happens before the
> user has a console up then the difference is moot anyway.
> 
> Agreed, also the fact that certain callers are sure about no page table
> page allocation would be required (hence they just pass "NULL" for
> pgtable_alloc function), needs to be notified appropriately when page
> page table creation stumbles upon a pxd_none().
> 
> if (pxd_none(pmd)) {
> 	BUG_ON(flags & NO_ALLOC_MAPPINGS);
> }
> 
> Although there are might be an argument that all erstwhile
> BUG_ON(!pgtable_alloc) has replacements with BUG_ON(flags &
> NO_ALLOC_MAPPINGS), but there is a subtle difference.
> It captures the fact (in a rather structured manner), that caller made a
> choice with NO_ALLOC_MAPPINGS but called __create_pgd_mapping() on a
> page table, where intermediate page table page alloc would still be
> required. IMHO adding that explicit structure here via the new flag
> NO_ALLOC_MAPPINGS makes sense.
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@...ts.infradead.org
> https://jpn01.safelinks.protection.outlook.com/?url=http%3A%2F%2Flists.i
> nfradead.org%2Fmailman%2Flistinfo%2Flinux-arm-
> kernel&amp;data=05%7C01%7Cbiju.das.jz%40bp.renesas.com%7Cbcd488ff5f5c44a
> aaf3708dacc387479%7C53d82571da1947e49cb4625a166a4a2a%7C0%7C0%7C638046840
> 104902697%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLC
> JBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=F%2B3%2F2E583sYkS
> Z2bYIUN6HmHO7aGD5KT2EuGCE8MY2w%3D&amp;reserved=0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ