Message-ID: <5011AFEC.2040609@redhat.com>
Date:	Thu, 26 Jul 2012 17:00:28 -0400
From:	Rik van Riel <riel@...hat.com>
To:	Mel Gorman <mgorman@...e.de>
CC:	Linux-MM <linux-mm@...ck.org>, Michal Hocko <mhocko@...e.cz>,
	Hugh Dickins <hughd@...gle.com>,
	David Gibson <david@...son.dropbear.id.au>,
	Ken Chen <kenchen@...gle.com>,
	Cong Wang <xiyou.wangcong@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Larry Woodman <lwoodman@...hat.com>
Subject: Re: [PATCH] mm: hugetlbfs: Close race during teardown of hugetlbfs
 shared page tables v2

On 07/20/2012 09:49 AM, Mel Gorman wrote:
> This V2 is still the mmap_sem approach that fixes a potential deadlock
> problem pointed out by Michal.

Larry and I were looking around the hugetlb code some
more, and found what looks like yet another race.

In hugetlb_no_page, we have the following code:


         spin_lock(&mm->page_table_lock);
         size = i_size_read(mapping->host) >> huge_page_shift(h);
         if (idx >= size)
                 goto backout;

         ret = 0;
         if (!huge_pte_none(huge_ptep_get(ptep)))
                 goto backout;

         if (anon_rmap)
                 hugepage_add_new_anon_rmap(page, vma, address);
         else
                 page_dup_rmap(page);
         new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
                                 && (vma->vm_flags & VM_SHARED)));
         set_huge_pte_at(mm, address, ptep, new_pte);
         ...
         spin_unlock(&mm->page_table_lock);

Notice how we check !huge_pte_none with our own
mm->page_table_lock held.

This offers no protection at all against other processes,
since each of them holds its own mm's page_table_lock.
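
To make that concrete, here is a stripped-down user-space sketch of
the locking pattern (not kernel code; the thread, lock and slot names
are all made up for illustration).  Two threads each take a
*different* mutex, standing in for two processes each taking their
own mm->page_table_lock, and then run the same "entry is empty, so
install it" sequence against one shared slot:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void *shared_slot;		/* stands in for the shared huge pte */
static void *installed[2];		/* what each thread thinks it installed */
static pthread_mutex_t locks[2] = {
	PTHREAD_MUTEX_INITIALIZER,	/* "process A's mm->page_table_lock" */
	PTHREAD_MUTEX_INITIALIZER,	/* "process B's mm->page_table_lock" */
};

static void *faulter(void *arg)
{
	long id = (long)arg;
	void *page = calloc(1, 4096);	/* "freshly zeroed huge page" */

	pthread_mutex_lock(&locks[id]);
	if (shared_slot == NULL) {	/* the huge_pte_none() check */
		usleep(1000);		/* widen the race window */
		shared_slot = page;	/* the set_huge_pte_at() */
		installed[id] = page;
	} else {
		free(page);		/* saw the other entry, backed out */
	}
	pthread_mutex_unlock(&locks[id]);
	return NULL;
}

int main(void)
{
	pthread_t t[2];

	pthread_create(&t[0], NULL, faulter, (void *)0);
	pthread_create(&t[1], NULL, faulter, (void *)1);
	pthread_join(t[0], NULL);
	pthread_join(t[1], NULL);

	if (installed[0] && installed[1])
		printf("both threads installed a page; the overwritten one leaked\n");
	else
		printf("no overlap this run\n");
	return 0;
}

Because the two locks are different, both emptiness checks can pass,
the second store silently replaces the first, and the overwritten
"page" becomes unreachable.  A lock tied to the shared page table
itself, rather than to each mm, would make the loser of the race see
the populated entry and take the backout path, which is what the
code above seems to assume.  (Compile the sketch with -pthread.)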

In short, it looks like it is possible for multiple
processes to go through the above code simultaneously,
potentially resulting in:

1) one process overwriting the pte just created by
    another process

2) data corruption, as one partially written page
    gets superseded by a newly zeroed page, but no
    TLB invalidates get sent to other CPUs

3) a memory leak of a huge page

Is there anything that would make this race impossible,
or is this a real bug?

If so, are there more like it in the hugetlbfs code?
