Date:	Wed, 06 May 2015 14:18:17 +0530
From:	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
To:	"Kirill A. Shutemov" <kirill@...temov.name>
Cc:	akpm@...ux-foundation.org, mpe@...erman.id.au, paulus@...ba.org,
	benh@...nel.crashing.org, kirill.shutemov@...ux.intel.com,
	aarcange@...hat.com, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [RFC PATCH] mm/thp: Use new function to clear pmd before THP splitting

"Kirill A. Shutemov" <kirill@...temov.name> writes:

> On Mon, May 04, 2015 at 10:59:16PM +0530, Aneesh Kumar K.V wrote:
>> Archs like ppc64 require pte_t * to remain stable in some code paths.
>> They use local_irq_disable to prevent a parallel split. Generic code
>> clears the pmd instead of marking it _PAGE_SPLITTING in code paths
>> where we can afford to mark the pmd none before splitting. Use a
>> variant of pmdp_splitting_clear_notify that an arch can override.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
>
> Sorry, I am still trying to wrap my head around this problem.
>
> So, Power has __find_linux_pte_or_hugepte(), which does a lock-less lookup
> in page tables with local interrupts disabled. For huge pages it casts pmd_t
> to pte_t. Since the format of pte_t is different from pmd_t, we want to
> prevent a transition from a pmd pointing to a page table to a pmd pointing
> to a huge page (and back) while interrupts are disabled.
>
> The complication for Power is that it doesn't do implicit IPI on tlb
> flush.
>

s/doesn't do/doesn't need to do/


> Is it correct?

That is correct. I will add more detail to the commit message of the patch
I end up sending.

>
> For THP, split_huge_page() and collapse sides are covered. This patch
> should address two cases of splitting PMD, but not compound page in
> current upstream.
>
> But I think there's still *big* problem for Power -- zap_huge_pmd().
>
> For instance: other CPU can shoot out a THP PMD with MADV_DONTNEED and
> fault in small pages instead. IIUC, for __find_linux_pte_or_hugepte(),
> it's equivalent of splitting.
>
> I don't see how this can be fixed without kick_all_cpus_sync() in all
> pmdp_clear_flush() on Power.
>


Yes, we could run into an issue with that. Thanks for catching this. Now I
am not sure whether we want to do the kick_all_cpus_sync in
pmdp_get_and_clear. We also use that function while updating a huge pte.
The one I am looking at is change_huge_pmd. We don't need an IPI there,
and we would really like to avoid it. Any idea why we follow the sequence
of pmd_clear and set_pmd there, instead of pmd_update?

I looked at the code paths where we clear the pmd but would not
require an IPI. They are:

move_huge_pmd
do_huge_pmd_wp_page
migrate_misplaced_transhuge_page
change_huge_pmd

Of these, IIUC change_huge_pmd may be called more frequently, and hence we
may want to avoid doing kick_all_cpus_sync there?

One way to fix that would be to switch change_huge_pmd to pmd_update;
then we could do a kick_all_cpus_sync in pmdp_get_and_clear.
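
To make that suggestion concrete, here is a rough userspace sketch in C11.
It is not kernel code. Everything prefixed model_ is a hypothetical
stand-in invented for illustration, the bit layout is made up, and the
kick_all_cpus_sync stand-in only counts calls. It assumes pmd_update can
change bits in place without the entry ever reading as none, which is why
the protection-change path would not need the IPI while the clear path
pays for one.

```c
/* Userspace model only: none of this is the kernel implementation. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef _Atomic uint64_t model_pmd_t;   /* hypothetical stand-in for pmd_t */

#define MODEL_PMD_PRESENT 0x1ULL        /* made-up bit layout */
#define MODEL_PMD_WRITE   0x2ULL

static unsigned long model_ipi_count;   /* number of simulated IPI broadcasts */

/* Stand-in for kick_all_cpus_sync(): here it only counts calls. */
static void model_kick_all_cpus_sync(void)
{
    model_ipi_count++;
}

/*
 * Modeled on the idea of pmd_update: clear and set bits in one atomic
 * step, so a lock-less walker never sees the entry as none.
 */
static uint64_t model_pmd_update(model_pmd_t *pmdp, uint64_t clr, uint64_t set)
{
    uint64_t old = atomic_load(pmdp);
    uint64_t val;

    do {
        val = (old & ~clr) | set;
    } while (!atomic_compare_exchange_weak(pmdp, &old, val));

    return old;
}

/* Clear path: the entry becomes none, so synchronize with all CPUs. */
static uint64_t model_pmdp_get_and_clear(model_pmd_t *pmdp)
{
    uint64_t old = atomic_exchange(pmdp, 0);

    model_kick_all_cpus_sync();
    return old;
}

/* Protection change done in place: drops the write bit, no IPI. */
static void model_change_huge_pmd(model_pmd_t *pmdp)
{
    model_pmd_update(pmdp, MODEL_PMD_WRITE, 0);
}
```

With this split, the frequent protection-change path updates the entry
without an IPI, and only the paths that really clear the pmd pay for the
synchronization. Whether that holds for the real ppc64 code is the open
question above.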

-aneesh
