[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190906210346.5ecbff01@thinkpad>
Date: Fri, 6 Sep 2019 21:03:46 +0200
From: Gerald Schaefer <gerald.schaefer@...ibm.com>
To: Anshuman Khandual <anshuman.khandual@....com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Mike Rapoport <rppt@...ux.vnet.ibm.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Dan Williams <dan.j.williams@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Michal Hocko <mhocko@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Mark Brown <broonie@...nel.org>,
Steven Price <Steven.Price@....com>,
Ard Biesheuvel <ard.biesheuvel@...aro.org>,
Masahiro Yamada <yamada.masahiro@...ionext.com>,
Kees Cook <keescook@...omium.org>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
Matthew Wilcox <willy@...radead.org>,
Sri Krishna chowdary <schowdary@...dia.com>,
Dave Hansen <dave.hansen@...el.com>,
Russell King - ARM Linux <linux@...linux.org.uk>,
Michael Ellerman <mpe@...erman.id.au>,
Paul Mackerras <paulus@...ba.org>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
"David S. Miller" <davem@...emloft.net>,
Vineet Gupta <vgupta@...opsys.com>,
James Hogan <jhogan@...nel.org>,
Paul Burton <paul.burton@...s.com>,
Ralf Baechle <ralf@...ux-mips.org>,
linux-snps-arc@...ts.infradead.org, linux-mips@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-ia64@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
linux-sh@...r.kernel.org, sparclinux@...r.kernel.org,
x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] mm/pgtable/debug: Add test validating architecture
page table helpers
On Fri, 6 Sep 2019 11:58:59 +0530
Anshuman Khandual <anshuman.khandual@....com> wrote:
> On 09/05/2019 10:36 PM, Gerald Schaefer wrote:
> > On Thu, 5 Sep 2019 14:48:14 +0530
> > Anshuman Khandual <anshuman.khandual@....com> wrote:
> >
> >>> [...]
> >>>> +
> >>>> +#if !defined(__PAGETABLE_PMD_FOLDED) && !defined(__ARCH_HAS_4LEVEL_HACK)
> >>>> +static void pud_clear_tests(pud_t *pudp)
> >>>> +{
> >>>> + memset(pudp, RANDOM_NZVALUE, sizeof(pud_t));
> >>>> + pud_clear(pudp);
> >>>> + WARN_ON(!pud_none(READ_ONCE(*pudp)));
> >>>> +}
> >>>
> >>> For pgd/p4d/pud_clear(), we only clear if the page table level is present
> >>> and not folded. The memset() here overwrites the table type bits, so
> >>> pud_clear() will not clear anything on s390 and the pud_none() check will
> >>> fail.
> >>> Would it be possible to OR a (larger) random value into the table, so that
> >>> the lower 12 bits would be preserved?
> >>
> >> So the suggestion is instead of doing memset() on entry with RANDOM_NZVALUE,
> >> it should OR a large random value preserving lower 12 bits. Hmm, this should
> >> still do the trick for other platforms, they just need non zero value. So on
> >> s390, the lower 12 bits on the page table entry already has valid value while
> >> entering this function which would make sure that pud_clear() really does
> >> clear the entry ?
> >
> > Yes, in theory the table entry on s390 would have the type set in the last
> > 4 bits, so preserving those would be enough. If it does not conflict with
> > others, I would still suggest preserving all 12 bits since those would contain
> > arch-specific flags in general, just to be sure. For s390, the pte/pmd tests
> > would also work with the memset, but for consistency I think the same logic
> > should be used in all pxd_clear_tests.
>
> Makes sense but..
>
> There is a small challenge with this. Modifying individual bits on a given
> page table entry from generic code like this test case is bit tricky. That
> is because there are not enough helpers to create entries with an absolute
> value. This would have been easier if all the platforms provided functions
> like __pxx() which is not the case now. Otherwise something like this should
> have worked.
>
>
> pud_t pud = READ_ONCE(*pudp);
> pud = __pud(pud_val(pud) | RANDOM_VALUE (keeping lower 12 bits 0))
> WRITE_ONCE(*pudp, pud);
>
> But __pud() will fail to build in many platforms.
Hmm, I simply used this on my system to make pud_clear_tests() work, not
sure if it works on all archs:
pud_val(*pudp) |= RANDOM_NZVALUE;
>
> The other alternative will be to make sure memset() happens on all other
> bits except the lower 12 bits which will depend on endianness. If s390
> has a fixed endianness, we can still use either of them which will hold
> good for others as well.
>
> memset(pudp, RANDOM_NZVALUE, sizeof(pud_t) - 3);
>
> OR
>
> memset(pudp + 3, RANDOM_NZVALUE, sizeof(pud_t) - 3);
>
> >
> > However, there is another issue on s390 which will make this only work
> > for pud_clear_tests(), and not for the p4d/pgd_tests. The problem is that
> > mm_alloc() will only give you a 3-level page table initially on s390.
> > This means that pudp == p4dp == pgdp, and so the p4d/pgd_tests will
> > both see the pud level (of course this also affects other tests).
>
> Got it.
>
> >
> > Not sure yet how to fix this, i.e. how to initialize/update the page table
> > to 5 levels. We can handle 5 level page tables, and it would be good if
> > all levels could be tested, but using mm_alloc() to establish the page
> > tables might not work on s390. One option could be to provide an arch-hook
> > or weak function to allocate/initialize the mm.
>
> Sure, got it. Though I plan to do add some arch specific tests or init sequence
> like the above later on but for now the idea is to get the smallest possible set
> of test cases which builds and runs on all platforms without requiring any arch
> specific hooks or special casing (#ifdef) to be agreed upon broadly and accepted.
>
> Do you think this is absolutely necessary on s390 for the very first set of test
> cases or we can add this later on as an improvement ?
It can be added later, no problem. I did not expect this to work flawlessly
on s390 right from the start anyway, with all our peculiarities, so don't
let this hinder you. I might come up with an add-on patch later.
Actually, using get_unmapped_area() as suggested by Kirill could also
solve this issue. We do create a new mm with 3-level page tables on s390,
and the dynamic upgrade to 4 or 5 levels is then triggered exactly by
arch_get_unmapped_area(), depending on the addr. But I currently don't
see how / where arch_get_unmapped_area() is set up for such a dummy mm
created by mm_alloc().
Regards,
Gerald
Powered by blists - more mailing lists