Date:	Mon, 25 Jul 2011 16:41:48 +0930
From:	Christopher Yeoh <cyeoh@....ibm.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	linux-man@...r.kernel.org, linux-arch@...r.kernel.org
Subject: Re: Cross Memory Attach v3

On Thu, 21 Jul 2011 15:09:08 -0700
Andrew Morton <akpm@...ux-foundation.org> wrote:

> On Tue, 19 Jul 2011 00:35:37 +0930
> Christopher Yeoh <cyeoh@....ibm.com> wrote:
> 
> > +/*
> > + * process_vm_rw_pages - read/write pages from task specified
> > + * @task: task to read/write from
> > + * @mm: mm for task
> > + * @process_pages: struct pages area that can store at least
> > + *  nr_pages_to_copy struct page pointers
> > + * @pa: address of page in task to start copying from/to
> > + * @start_offset: offset in page to start copying from/to
> > + * @len: number of bytes to copy
> > + * @lvec: iovec array specifying where to copy to/from
> > + * @lvec_cnt: number of elements in iovec array
> > + * @lvec_current: index in iovec array we are up to
> > + * @lvec_offset: offset in bytes from current iovec iov_base we are up to
> > + * @vm_write: 0 means copy from, 1 means copy to
> > + * @nr_pages_to_copy: number of pages to copy
> > + */
> > +static ssize_t process_vm_rw_pages(struct task_struct *task,
> > +				   struct mm_struct *mm,
> > +				      - *lvec_offset);
> > .....
> > 
> > +
> > +		target_kaddr = kmap(process_pages[pgs_copied]) + start_offset;
> > +
> > +		if (vm_write)
> > +			ret = copy_from_user(target_kaddr,
> > +					     lvec[*lvec_current].iov_base
> > +					     + *lvec_offset,
> > +					     bytes_to_copy);
> > +		else
> > +			ret = copy_to_user(lvec[*lvec_current].iov_base
> > +					   + *lvec_offset,
> > +					   target_kaddr,
> > +					   bytes_to_copy);
> > +		kunmap(process_pages[pgs_copied]);
> > +		if (ret) {
> > +			pgs_copied++;
> > +			goto end;
> 
> afaict this will always return -EFAULT, even after copying some
> memory.
> 
> Is this a misdesign?  Would it not be better to return the number of
> bytes actually copied, or -EFAULT if we weren't able to copy any?
> Like read().
> 
> That way, userspace can at least process the data which _was_
> transferred, before retrying and then handling the fault.  With the
> proposed interface this is not possible, and the data is lost.

Perhaps because of how I'm using the interface for MPI, I was thinking
that a fault at this point is an application error and the partial
read/write wouldn't be used anyway. But I see your point that other
users of the interface may want the data, so the next version will
return partial read/write information all the way back to userspace.
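
From the caller's side that makes the interface behave like read(); a
userspace sketch of what I'd expect the handling to look like is below.
This is illustrative only: process_vm_readv() stands in for the
syscall's eventual userspace wrapper, read_remote() is a made-up
helper, and the partial-return behaviour is what the next version will
add, not what v3 does.

#define _GNU_SOURCE
#include <sys/uio.h>
#include <sys/types.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: copy len bytes from remote_addr in the target
 * process into dst, tolerating a short transfer the way read() callers
 * do.  The wrapper name/signature is assumed, not part of the v3 patch. */
static ssize_t read_remote(pid_t pid, void *dst, void *remote_addr, size_t len)
{
	struct iovec local  = { .iov_base = dst,         .iov_len = len };
	struct iovec remote = { .iov_base = remote_addr, .iov_len = len };
	ssize_t n = process_vm_readv(pid, &local, 1, &remote, 1, 0);

	if (n < 0)
		return -errno;	/* nothing was transferred */
	if ((size_t)n < len)
		fprintf(stderr, "short read: %zd of %zu bytes\n", n, len);
	return n;		/* caller can still consume these n bytes */
}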

> And note that the function's kerneldoc doesn't document the return
> value at all!  kerneldoc sucks that way - there's no formal place in
> the template, so people often ignore this important part of the
> function interface.  Similarly, there's no kerneldoc template for
> preconditions such as irq on/off, locks which must be held, etc.
> So they don't get documented.  But they're part of the interface.
> 

OK. Will be fixed in the next version.
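
Roughly what I have in mind is to spell the return value (and caller
expectations) out explicitly in the kerneldoc, along these lines -- the
wording below is only an illustration, not the final text:

/**
 * process_vm_rw_pages - read/write pages from task specified
 * @task: task to read/write from
 * ...(parameters as above)...
 *
 * Returns the number of bytes copied on success, or a negative error
 * code if nothing could be copied.  Preconditions (e.g. the caller
 * holding its reference on @mm) would be spelled out here as well.
 */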

> > +					 nr_pages_to_copy);
> > +		start_offset = 0;
> > +
> > +		if (rc < 0)
> > +			return rc;
> 
> It's propagated here.
> 
> (CodingStyleNit: it's conventional to put {} around the single-line
> block in this case, btw).
> 

Fixed now.

> > +
> > +	if (nr_pages == 0)
> > +		return 0;
> > +
> > +	/* For reliability don't try to kmalloc more than 2 pages worth */
> > +	process_pages = kmalloc(min_t(size_t, PVM_MAX_KMALLOC_PAGES,
> > +				      sizeof(struct pages *)*nr_pages),
> > +				GFP_KERNEL);
> 
> You might get some speed benefit by optimising for the small copies
> here.  Define a local on-stack array of N page*'s and point
> process_pages at that if the number of pages is <= N.  Saves a
> malloc/free and is more cache-friendly.  But only if the result is
> measurable!

OK, will do some benchmarking to see if it's worth it.
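
If it does measure up, I'd expect the fast path to look roughly like
the sketch below -- PVM_NR_STACK_PAGES and pp_stack are invented names
here; only PVM_MAX_KMALLOC_PAGES and the bounded kmalloc() come from
the existing code:

	/* Sketch of the suggested fast path: small copies use an
	 * on-stack array, larger ones fall back to the existing
	 * bounded kmalloc(). */
	struct page *pp_stack[PVM_NR_STACK_PAGES];
	struct page **process_pages = pp_stack;

	if (nr_pages > PVM_NR_STACK_PAGES) {
		process_pages = kmalloc(min_t(size_t, PVM_MAX_KMALLOC_PAGES,
					      sizeof(struct page *) * nr_pages),
					GFP_KERNEL);
		if (!process_pages)
			return -ENOMEM;
	}

	/* ... copy loop as before ... */

	if (process_pages != pp_stack)
		kfree(process_pages);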

> > +
> > +static ssize_t process_vm_rw_check_iovecs(pid_t pid,
> > +					  const struct iovec __user *lvec,
> > +					  unsigned long liovcnt,
> > +					  const struct iovec __user *rvec,
> > +					  unsigned long riovcnt,
> > +					  unsigned long flags, int vm_write)
> 
> I'm allergic to functions with "check" in their name.  Check for what?
> 
> And one would expect a check_foo() to return a bool or an errno.  This
> one returns a ssize_t!  Weird!  Interface documentation, please. 
> Including return value semantics ;)

That part was split out from the main function because the iovec
checks are different for 32-bit compatibility. Have renamed:

process_vm_rw -> process_vm_rw_core 
process_vm_rw_check_iovecs -> process_vm_rw
compat_process_vm_rw_check_iovecs -> compat_process_vm_rw

...and added more doco - with return value semantics :-)
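
So the resulting structure is roughly (summary only, not code from the
patch):

/*
 *   process_vm_rw()         - native entry point, checks native iovecs
 *   compat_process_vm_rw()  - compat entry point, checks 32-bit iovecs
 *   process_vm_rw_core()    - shared worker that both of the above call
 *                             once the iovecs have been validated
 */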

Chris
-- 
cyeoh@...ibm.com