linux-kernel - Re: [PATCH 01/23] mm: Introduce revoke_file

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <m1vdnfze70.fsf@fess.ebiederm.org>
Date:	Mon, 01 Jun 2009 17:12:19 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	viro@...IV.linux.org.uk, linux-kernel@...r.kernel.org,
	linux-pci@...r.kernel.org, linux-mm@...ck.org,
	linux-fsdevel@...r.kernel.org, tj@...nel.org,
	hugh.dickins@...cali.co.uk, adobriyan@...il.com,
	torvalds@...ux-foundation.org, alan@...rguk.ukuu.org.uk,
	gregkh@...e.de, npiggin@...e.de, hch@...radead.org,
	ebiederm@...stanetworks.com
Subject: Re: [PATCH 01/23] mm: Introduce revoke_file_mappings.

Andrew Morton <akpm@...ux-foundation.org> writes:

> On Mon,  1 Jun 2009 14:50:26 -0700
> "Eric W. Biederman" <ebiederm@...ssion.com> wrote:
>
>> +static void revoke_vma(struct vm_area_struct *vma)
>
> This looks odd.
>
>> +{
>> +	struct file *file = vma->vm_file;
>> +	struct address_space *mapping = file->f_mapping;
>> +	unsigned long start_addr, end_addr, size;
>> +	struct mm_struct *mm;
>> +
>> +	start_addr = vma->vm_start;
>> +	end_addr = vma->vm_end;
>
> We take a copy of start_addr/end_addr (and this end_addr value is never used)
A foolish consistency.

>> +	/* Switch out the locks so I can maninuplate this under the mm sem.
>> +	 * Needed so I can call vm_ops->close.
>> +	 */
>> +	mm = vma->vm_mm;
>> +	atomic_inc(&mm->mm_users);
>> +	spin_unlock(&mapping->i_mmap_lock);
>> +
>> +	/* Block page faults and other code modifying the mm. */
>> +	down_write(&mm->mmap_sem);
>> +
>> +	/* Lookup a vma for my file address */
>> +	vma = find_vma(mm, start_addr);
>
> Then we look up a vma.  Is there reason to believe that this will
> differ from the incoming arg which we just overwrote?  Maybe the code
> is attempting to handle racing concurrent mmap/munmap activity?  If so,
> what are the implications of this?

Yes it is.  The file based index is only safe while we hold the i_mmap_lock.
The manipulation that needs to happen under the mmap_sem.

So I drop all of the locks and restart.  And use the time honored
kernel practice of relooking up the thing I am going to manipulate.
As long as it is for the same file I don't care.

> I _think_ that what the function is attempting to do is "unmap the vma
> which covers the address at vma->start_addr".  If so, why not just pass
> it that virtual address?

Actually it is  unmapping a vma for the file I am revoking.  I hand it
one and then it does an address space jig.

> Anyway, it's all a bit obscure and I do think that the semantics and
> behaviour should be carefully explained in a comment, no?
>
>> +	if (vma->vm_file != file)
>> +		goto out;
>
> This strengthens the theory that some sort of race-management is
> happening here.

Totally.  I dropped all of my locks so I am having to restart in
a different locking context.

>> +	start_addr = vma->vm_start;
>> +	end_addr   = vma->vm_end;
>> +	size	   = end_addr - start_addr;
>> +
>> +	/* Unlock the pages */
>> +	if (mm->locked_vm && (vma->vm_flags & VM_LOCKED)) {
>> +		mm->locked_vm -= vma_pages(vma);
>> +		vma->vm_flags &= ~VM_LOCKED;
>> +	}
>> +
>> +	/* Unmap the vma */
>> +	zap_page_range(vma, start_addr, size, NULL);
>> +
>> +	/* Unlink the vma from the file */
>> +	unlink_file_vma(vma);
>> +
>> +	/* Close the vma */
>> +	if (vma->vm_ops && vma->vm_ops->close)
>> +		vma->vm_ops->close(vma);
>> +	fput(vma->vm_file);
>> +	vma->vm_file = NULL;
>> +	if (vma->vm_flags & VM_EXECUTABLE)
>> +		removed_exe_file_vma(vma->vm_mm);
>> +
>> +	/* Repurpose the vma  */
>> +	vma->vm_private_data = NULL;
>> +	vma->vm_ops = &revoked_vm_ops;
>> +	vma->vm_flags &= ~(VM_NONLINEAR | VM_CAN_NONLINEAR);
>> +out:
>> +	up_write(&mm->mmap_sem);
>> +	spin_lock(&mapping->i_mmap_lock);
>> +}
>
> Also, I'm not a bit fan of the practice of overwriting the value of a
> formal argument, especially in a function which is this large and
> complex.  It makes the code harder to follow, because the one variable
> holds two conceptually different things within the span of the same
> function.  And it adds risk that someone will will later access a field
> of *vma and it will be the wrong vma.  Worse, the bug is only exposed
> under exeedingly rare conditions.
>
> So..  Use a new local, please.

We can never legitimately have more than one vma manipulated in this function.
As for the rest.  I guess I just assumed that the reader of the code would
have a basic understanding of the locking rules for those data structures.

Certainly the worst thing I suffer from is being close to the code,
and not realizing which pieces are not obvious to a naive observer.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/