lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190906210346.5ecbff01@thinkpad>
Date:   Fri, 6 Sep 2019 21:03:46 +0200
From:   Gerald Schaefer <gerald.schaefer@...ibm.com>
To:     Anshuman Khandual <anshuman.khandual@....com>
Cc:     linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Dan Williams <dan.j.williams@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Michal Hocko <mhocko@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Mark Brown <broonie@...nel.org>,
        Steven Price <Steven.Price@....com>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        Kees Cook <keescook@...omium.org>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Matthew Wilcox <willy@...radead.org>,
        Sri Krishna chowdary <schowdary@...dia.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Russell King - ARM Linux <linux@...linux.org.uk>,
        Michael Ellerman <mpe@...erman.id.au>,
        Paul Mackerras <paulus@...ba.org>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        Heiko Carstens <heiko.carstens@...ibm.com>,
        "David S. Miller" <davem@...emloft.net>,
        Vineet Gupta <vgupta@...opsys.com>,
        James Hogan <jhogan@...nel.org>,
        Paul Burton <paul.burton@...s.com>,
        Ralf Baechle <ralf@...ux-mips.org>,
        linux-snps-arc@...ts.infradead.org, linux-mips@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, linux-ia64@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
        linux-sh@...r.kernel.org, sparclinux@...r.kernel.org,
        x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture
 page table helpers

On Fri, 6 Sep 2019 11:58:59 +0530
Anshuman Khandual <anshuman.khandual@....com> wrote:

> On 09/05/2019 10:36 PM, Gerald Schaefer wrote:
> > On Thu, 5 Sep 2019 14:48:14 +0530
> > Anshuman Khandual <anshuman.khandual@....com> wrote:
> >   
> >>> [...]    
> >>>> +
> >>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
> >>>> +static void pud_clear_tests(pud_t *pudp)
> >>>> +{
> >>>> +	memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
> >>>> +	pud_clear(pudp);
> >>>> +	WARN_ON(!pud_none(READ_ONCE(*pudp)));
> >>>> +}    
> >>>
> >>> For pgd/p4d/pud_clear(), we only clear if the page table level is present
> >>> and not folded. The memset() here overwrites the table type bits, so
> >>> pud_clear() will not clear anything on s390 and the pud_none() check will
> >>> fail.
> >>> Would it be possible to OR a (larger) random value into the table, so that
> >>> the lower 12 bits would be preserved?    
> >>
> >> So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE,
> >> it should OR a large random value preserving lower 12 bits. Hmm, this should
> >> still do the trick for other platforms, they just need non zero value. So on
> >> s390, the lower 12 bits on the page table entry already has valid value while
> >> entering this function which would make sure that pud_clear() really does
> >> clear the entry ?  
> > 
> > Yes, in theory the table entry on s390 would have the type set in the last
> > 4 bits, so preserving those would be enough. If it does not conflict with
> > others, I would still suggest preserving all 12 bits since those would contain
> > arch-specific flags in general, just to be sure. For s390, the pte/pmd tests
> > would also work with the memset, but for consistency I think the same logic
> > should be used in all pxd_clear_tests.  
> 
> Makes sense but..
> 
> There is a small challenge with this. Modifying individual bits on a given
> page table entry from generic code like this test case is bit tricky. That
> is because there are not enough helpers to create entries with an absolute
> value. This would have been easier if all the platforms provided functions
> like __pxx() which is not the case now. Otherwise something like this should
> have worked.
> 
> 
> pud_t pud = READ_ONCE(*pudp);
> pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0))
> WRITE_ONCE(*pudp, pud);
> 
> But __pud() will fail to build in many platforms.

Hmm, I simply used this on my system to make pud_clear_tests() work, not
sure if it works on all archs:

pud_val(*pudp) |= RANDOM_NZVALUE;

> 
> The other alternative will be to make sure memset() happens on all other
> bits except the lower 12 bits which will depend on endianness. If s390
> has a fixed endianness, we can still use either of them which will hold
> good for others as well.
> 
> memset(pudp, RANDOM_NZVALUE, sizeof(pud_t) - 3);
> 
> OR
> 
> memset(pudp + 3, RANDOM_NZVALUE, sizeof(pud_t) - 3);
> 
> > 
> > However, there is another issue on s390 which will make this only work
> > for pud_clear_tests(), and not for the p4d/pgd_tests. The problem is that
> > mm_alloc() will only give you a 3-level page table initially on s390.
> > This means that pudp == p4dp == pgdp, and so the p4d/pgd_tests will
> > both see the pud level (of course this also affects other tests).  
> 
> Got it.
> 
> > 
> > Not sure yet how to fix this, i.e. how to initialize/update the page table
> > to 5 levels. We can handle 5 level page tables, and it would be good if
> > all levels could be tested, but using mm_alloc() to establish the page
> > tables might not work on s390. One option could be to provide an arch-hook
> > or weak function to allocate/initialize the mm.  
> 
> Sure, got it. Though I plan to do add some arch specific tests or init sequence
> like the above later on but for now the idea is to get the smallest possible set
> of test cases which builds and runs on all platforms without requiring any arch
> specific hooks or special casing (#ifdef) to be agreed upon broadly and accepted.
> 
> Do you think this is absolutely necessary on s390 for the very first set of test
> cases or we can add this later on as an improvement ?

It can be added later, no problem. I did not expect this to work flawlessly
on s390 right from the start anyway, with all our peculiarities, so don't
let this hinder you. I might come up with an add-on patch later.

Actually, using get_unmapped_area() as suggested by Kirill could also
solve this issue. We do create a new mm with 3-level page tables on s390,
and the dynamic upgrade to 4 or 5 levels is then triggered exactly by
arch_get_unmapped_area(), depending on the addr. But I currently don't
see how / where arch_get_unmapped_area() is set up for such a dummy mm
created by mm_alloc().

Regards,
Gerald

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ