lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 09 Jul 2007 20:50:23 -0500
From:	William Tambe <tambewilliam@...il.com>
To:	Hugh Dickins <hugh@...itas.com>
CC:	Stas Sergeev <stsp@...et.ru>,
	"Rohland, Hans-Christoph" <hans-christoph.rohland@....com>,
	linux-kernel@...r.kernel.org
Subject: Re: Concerning a post that you made about expandable anonymous shared
 mappings


Hugh Dickins wrote:
> On Mon, 2 Jul 2007, Stas Sergeev wrote:
>> Hugh Dickins wrote:
>>> You've answered your own question: we did not make the change Stas
>>> suggested, IIRC because I remained a little uneasy with that change
>>> in behaviour, and nobody else spoke up for it.
>> IIRC your argument, that made sense to me,
>> was that with such an approach, you can only
>> expand the backing-store, but never shrink.
>> I agree this is a problem from some point of
>> view. I have no idea whether it is important
>> or not, but it definitely _looks_ not very good.
> 
> You were gracious enough to accept my arguments back then, but after
> mulling this over overnight, I've come to think I was just too timid
> back then, and gave too much weight to the issue of there being no
> shrink, and to the issue that child might expand the object without
> parent knowing (that surplus remaining allocated until both exited).
> 
> I've come right around to your original view, Stas, and William's:
> if that mmap creates such an object, then the expanding mremap really
> ought to be useful, and allow the underlying object to be expanded.
> The shared anonymous object is already anomalous: expanding it on
> fault makes it more consistent with its own nature, not less.
> 
>>>>         //why does this failed. I am well in the interval [4096, 8192]
>>>>         *(unsigned int *)(ptr + 4096 + 8)= 10;
>>>> }
>> Well, this generates a bus error, while, for
>> example, doing the same trick with having a
>> /dev/mem as a backing-store, simply maps the
>> "enlarged" space with the anonymous memory,
>> and so does not generate a SIGBUS (not producing
>> a desired effect either, of course).
>> Why do we have it both ways? Shouldn't they
>> (i.e. /dev/zero and /dev/mem mapping) behave
>> the same after expanding with mremap?
> 
> They've very different: mapping /dev/mem maps an existing object;
> whereas mapping /dev/zero is a convention by which a new object
> is created - and we can't afford the memory to make it of infinite
> extent, so have limited it to the mapped size.
> 
>>> I haven't given it any thought since then:
>> I was thinking about it a bit, and I imagined
>> we need something like
>>    int mopen(void *addr, size_t len, unsigned int flags);
>> which will give you an fd for the memory area,
>> which you can then ftruncate() and mmap() (and
>> mremap).
> 
> Ah, if we added an mopen(), there's no end to the discussions
> we could have about what it can do.  It may be a great idea:
> but it's really not needed to solve this particular little
> problem.  As last time around, you were suggesting .mremap
> callouts; but I much prefer your original shmem_zero_nopage,
> which is a solution of the scale appropriate to the problem.
> 
>> Such a thing could solve that mremap() problem
>> that me and William Tambe have.
> 
> If you're thinking of going that way, for this problem it
> would be better just to stick with POSIX shm_open, as Christoph
> suggested before, and you suggest to William in other mail.
> 
> But I now accept your view, that the shared anonymous object
> is not behaving consistently in response to mremap, and would
> like to put in a patch for that.  This isn't a particularly good
> time for such a patch - it's unclear right now whether 2.6.23 will
> have shmem_nopage like 2.6.22 or shmem_fault like 2.6.22-rc-mm;
> but we can easily adjust to whichever.
> 
> Here's a patch against 2.6.22-rc7: would you, Stas, put your
> Signed-off-by into this, and accept authorship - although I'm
> sending this back to you, it's very much your idea, and only
> trivially modified from your three-year-old patch by me.  If
> you're agreeable, I can then forward it or its shmem_zero_fault
> equivalent to Andrew when we see which way 2.6.23 is going.
> 
> [I'll fill in the patch comment later]
> 
> Signed-off-by: Hugh Dickins <hugh@...itas.com>
> ----
> 
>  mm/shmem.c |   41 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 40 insertions(+), 1 deletion(-)
> 
> --- 2.6.22-rc7/mm/shmem.c	2007-06-17 10:51:02.000000000 +0100
> +++ linux/mm/shmem.c	2007-07-03 15:35:32.000000000 +0100
> @@ -1320,6 +1320,36 @@ static struct page *shmem_nopage(struct 
>  	return page;
>  }
>  
> +struct page *shmem_zero_nopage(struct vm_area_struct *vma,
> +				unsigned long address, int *type)
> +{
> +	struct page *page;
> +
> +	page = shmem_nopage(vma, address, type);
> +	if (unlikely(page == NOPAGE_SIGBUS)) {
> +		struct inode *inode = vma->vm_file->f_dentry->d_inode;
> +		struct shmem_inode_info *info = SHMEM_I(inode);
> +		loff_t vm_size, i_size;
> +
> +		/*
> +		 * If mremap has been used to extend the vma,
> +		 * now extend the underlying object to include this page.
> +		 */
> +		vm_size = (PAGE_ALIGN(address) - vma->vm_start) +
> +				((loff_t)vma->vm_pgoff << PAGE_SHIFT);
> +		spin_lock(&info->lock);
> +		i_size = i_size_read(inode);
> +		if (i_size < vm_size && vm_size <= SHMEM_MAX_BYTES &&
> +		    !shmem_acct_size(info->flags, vm_size - i_size)) {
> +			i_size_write(inode, vm_size);
> +			inode->i_ctime = inode->i_mtime = CURRENT_TIME;
> +		}
> +		spin_unlock(&info->lock);
> +		page = shmem_nopage(vma, address, type);
> +	}
> +	return page;
> +}
> +
>  static int shmem_populate(struct vm_area_struct *vma,
>  	unsigned long addr, unsigned long len,
>  	pgprot_t prot, unsigned long pgoff, int nonblock)
> @@ -2471,6 +2501,14 @@ static struct vm_operations_struct shmem
>  #endif
>  };
>  
> +static struct vm_operations_struct shmem_zero_vm_ops = {
> +	.nopage		= shmem_zero_nopage,
> +	.populate	= shmem_populate,
> +#ifdef CONFIG_NUMA
> +	.set_policy     = shmem_set_policy,
> +	.get_policy     = shmem_get_policy,
> +#endif
> +};
>  
>  static int shmem_get_sb(struct file_system_type *fs_type,
>  	int flags, const char *dev_name, void *data, struct vfsmount *mnt)
> @@ -2599,6 +2637,7 @@ int shmem_zero_setup(struct vm_area_stru
>  	if (vma->vm_file)
>  		fput(vma->vm_file);
>  	vma->vm_file = file;
> -	vma->vm_ops = &shmem_vm_ops;
> +	vma->vm_ops = &shmem_zero_vm_ops;
> +	vma->vm_pgoff = 0;
>  	return 0;
>  }
> 

Will this patch be added to stable versions of the linux kernel?
Please let me know.

Sincerely,
William Tambe
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ