From: Steven Rostedt

Impact: fix to set_memory_rw

I was hitting a hard lockup when I set a page range to read-write and
then wrote to it. The lockup happened because the PTE was set to RW but
its PMD was not. The write took a page fault, but the page fault handler
mistook it for a spurious fault caused by lazy TLB transactions. It made
that mistake because it only checked the permission bits of the PTE,
which were correct; the PMD's were not. The fault handler would return,
only to take the same page fault again:

  fault -> PTE OK, must be spurious -> return -> fault -> etc.

What caused this anomaly was this:

 1) The kernel pages were set to read-only at the end of boot up.

 2) Since that change could keep the large 2M pages, it only changed
    the RW bit in the PTE for the 2M section.

 3) The 2M section then needed to be split up so that the NX bit could
    be set.

 4) The split turned the original PTE into a PMD and moved the
    protection bits to the smaller 4K PTEs. The PMD kept its RW bit
    off.

 5) Finally, the range of pages was set to RW. Only the PTEs were
    modified (the section was already split up), not the PMD that
    contained them.

After that, we were in a state where the PTEs allowed the write but the
PMD did not.

Signed-off-by: Steven Rostedt
---
 arch/x86/mm/pageattr.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 84ba748..79c700d 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -513,11 +513,13 @@ static int split_large_page(pte_t *kpte, unsigned long address)
 	 * On Intel the NX bit of all levels must be cleared to make a
 	 * page executable. See section 4.13.2 of Intel 64 and IA-32
 	 * Architectures Software Developer's Manual).
+	 * The same is true for RW. Let the PTE determine the
+	 * RW protection, and keep the PMD RW set.
 	 *
 	 * Mark the entry present. The current mapping might be
 	 * set to not present, which we preserved above.
 	 */
-	ref_prot = pte_pgprot(pte_mkexec(pte_clrhuge(*kpte)));
+	ref_prot = pte_pgprot(pte_mkwrite(pte_mkexec(pte_clrhuge(*kpte))));
 	pgprot_val(ref_prot) |= _PAGE_PRESENT;
 	__set_pmd_pte(kpte, address, mk_pte(base, ref_prot));
 	base = NULL;
-- 
1.5.6.5
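
For readers who want to see the failure mode in isolation, here is a
minimal userspace sketch of the sequence described in the changelog. It
is a toy model and not kernel code: struct toy_pmd and all toy_*
helpers are hypothetical names made up for illustration, and each
level is reduced to a single RW flag. It assumes 512 4K PTEs per 2M
section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PTES_PER_PMD 512	/* assumed: one 2M section / 4K pages */

/* Hypothetical toy model: one RW flag per paging level. */
struct toy_pmd {
	bool rw;				/* RW bit of the PMD entry */
	bool pte_rw[TOY_PTES_PER_PMD];		/* RW bit of each 4K PTE */
};

/* Hardware view: a write is allowed only if every level allows it. */
static bool toy_write_allowed(const struct toy_pmd *pmd, size_t idx)
{
	assert(idx < TOY_PTES_PER_PMD);
	return pmd->rw && pmd->pte_rw[idx];
}

/* The check described in the changelog: only the PTE is inspected. */
static bool toy_fault_looks_spurious(const struct toy_pmd *pmd, size_t idx)
{
	assert(idx < TOY_PTES_PER_PMD);
	return pmd->pte_rw[idx];
}

/*
 * Split a 2M mapping whose RW state is large_rw. With keep_pmd_rw false
 * the PMD inherits the old RW bit (behaviour before the patch); with
 * keep_pmd_rw true the PMD is always writable and the PTEs alone decide
 * (behaviour after the patch).
 */
static void toy_split(struct toy_pmd *pmd, bool large_rw, bool keep_pmd_rw)
{
	size_t i;

	for (i = 0; i < TOY_PTES_PER_PMD; i++)
		pmd->pte_rw[i] = large_rw;
	pmd->rw = keep_pmd_rw ? true : large_rw;
}

/* Set a range of 4K pages RW: only the PTEs are touched, never the PMD. */
static void toy_set_rw(struct toy_pmd *pmd, size_t first, size_t count)
{
	size_t i;

	assert(first <= TOY_PTES_PER_PMD && count <= TOY_PTES_PER_PMD - first);
	for (i = first; i < first + count; i++)
		pmd->pte_rw[i] = true;
}
```

Splitting a read-only section with keep_pmd_rw false and then calling
toy_set_rw() leaves toy_write_allowed() false while
toy_fault_looks_spurious() is true, which is the endless fault loop
above. With keep_pmd_rw true, the same sequence makes the write succeed,
and pages outside the range stay read-only through their PTEs.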