linux-kernel - Re: [PATCH v2 2/4] mm/hugetlb: Setup hugetlb

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 20 Oct 2015 18:02:38 -0700
From:	Mike Kravetz <mike.kravetz@...cle.com>
To:	Dave Hansen <dave.hansen@...el.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Cc:	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Hugh Dickins <hughd@...gle.com>,
	Davidlohr Bueso <dave@...olabs.net>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 2/4] mm/hugetlb: Setup hugetlb_falloc during fallocate
 hole punch

On 10/20/2015 05:11 PM, Dave Hansen wrote:
> On 10/20/2015 04:52 PM, Mike Kravetz wrote:
>>  	if (hole_end > hole_start) {
>>  		struct address_space *mapping = inode->i_mapping;
>> +		DECLARE_WAIT_QUEUE_HEAD_ONSTACK(hugetlb_falloc_waitq);
>> +		/*
>> +		 * Page faults on the area to be hole punched must be stopped
>> +		 * during the operation.  Initialize struct and have
>> +		 * inode->i_private point to it.
>> +		 */
>> +		struct hugetlb_falloc hugetlb_falloc = {
>> +			.waitq = &hugetlb_falloc_waitq,
>> +			.start = hole_start >> hpage_shift,
>> +			.end = hole_end >> hpage_shift
>> +		};
> ...
>> @@ -527,6 +550,12 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
>>  						hole_end  >> PAGE_SHIFT);
>>  		i_mmap_unlock_write(mapping);
>>  		remove_inode_hugepages(inode, hole_start, hole_end);
>> +
>> +		spin_lock(&inode->i_lock);
>> +		inode->i_private = NULL;
>> +		wake_up_all(&hugetlb_falloc_waitq);
>> +		spin_unlock(&inode->i_lock);
> 
> I see the shmem code doing something similar.  But, in the end, we're
> passing the stack-allocated 'hugetlb_falloc_waitq' over to the page
> faulting thread.  Is there something subtle that keeps
> 'hugetlb_falloc_waitq' from becoming invalid while the other task is
> sleeping?
> 
> That wake_up_all() obviously can't sleep, but it seems like the faulting
> thread's finish_wait() *HAS* to run before wake_up_all() can return.
> 

The 'trick' is noted in the comment in the shmem_fault code:

                        /*
                         * shmem_falloc_waitq points into the
shmem_fallocate()
                         * stack of the hole-punching task:
shmem_falloc_waitq
                         * is usually invalid by the time we reach here, but
                         * finish_wait() does not dereference it in that
case;
                         * though i_lock needed lest racing with
wake_up_all().
                         */

The faulting thread is removed from the waitq when awakened with
wake_up_all().  See the DEFINE_WAIT() and supporting code in the
faulting thread.  Because of this, when the faulting thread calls
finish_wait() it does not access the waitq that was/is on the stack.

At least I've convinced myself it works this way. :)

-- 
Mike Kravetz
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/