Message-ID: <87k3rcy2y2.fsf@rustcorp.com.au>
Date:	Thu, 17 Jan 2013 12:40:29 +1030
From:	Rusty Russell <rusty@...tcorp.com.au>
To:	"Michael S. Tsirkin" <mst@...hat.com>
Cc:	Sjur Brændeland <sjurbren@...il.com>,
	Linus Walleij <linus.walleij@...aro.org>,
	virtualization@...ts.linux-foundation.org,
	LKML <linux-kernel@...r.kernel.org>,
	Sjur Brændeland <sjur.brandeland@...ricsson.com>,
	Ohad Ben-Cohen <ohad@...ery.com>
Subject: Re: [RFCv2 00/12] Introduce host-side virtio queue and CAIF Virtio.

"Michael S. Tsirkin" <mst@...hat.com> writes:
> On Wed, Jan 16, 2013 at 01:43:32PM +1030, Rusty Russell wrote:
>> "Michael S. Tsirkin" <mst@...hat.com> writes:
>> >> +static int resize_iovec(struct vringh_iov *iov, gfp_t gfp)
>> >> +{
>> >> +	struct iovec *new;
>> >> +	unsigned int new_num = iov->max * 2;
>> >
>> > We must limit this I think, this is coming
>> > from userspace. How about UIO_MAXIOV?
>> 
>> We limit it to the ring size already;
>
> 1. do we limit it in case there's a loop in the descriptor ring?

Yes, we catch loops as per normal (simple counter):

		if (count++ == vrh->vring.num) {
			vringh_bad("Descriptor loop in %p", descs);
			err = -ELOOP;
			goto fail;
		}

> 2. do we limit it in case there are indirect descriptors?
> I guess I missed where we do this; could you point it out to me?

Well, the total is limited above, indirect descriptors or no (since we
handle them inline).  Because each indirect descriptor must contain one
descriptor (we always grab descriptor 0), the loop must terminate.
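
To illustrate the counter argument above, here is a minimal userspace
sketch, not the vringh code itself.  struct toy_desc, DESC_F_NEXT and
walk_chain() are hypothetical names invented for this example; only the
"count++ == num" test mirrors the snippet quoted earlier.

```c
#include <errno.h>
#include <stdint.h>

#define DESC_F_NEXT 1u	/* hypothetical flag: another descriptor follows */

/* Simplified stand-in for a ring descriptor (not the real vring_desc). */
struct toy_desc {
	uint16_t flags;
	uint16_t next;
};

/*
 * Walk the chain starting at 'head' in a ring of 'num' descriptors.
 * Returns the chain length, -EINVAL for an out-of-range index, or
 * -ELOOP once more than 'num' descriptors have been visited, which
 * can only happen if the chain loops back on itself.
 */
static int walk_chain(const struct toy_desc *descs, unsigned int num,
		      unsigned int head)
{
	unsigned int count = 0, i = head;

	for (;;) {
		if (i >= num)
			return -EINVAL;
		if (count++ == num)
			return -ELOOP;
		if (!(descs[i].flags & DESC_F_NEXT))
			return (int)count;
		i = descs[i].next;
	}
}
```

Every descriptor visited bumps the counter by one, so the walk ends
after at most num + 1 iterations regardless of how the chain is linked.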

>> UIO_MAXIOV is a weird choice here.
>
> It's kind of forced by the need to pass the iov on to the linux kernel,
> so we know that any guest using more is broken on existing hypervisors.
>
> Ring size is somewhat arbitrary too, isn't it?  A huge ring where we
> post lots of short descriptors (e.g. RX buffers) seems like a valid thing to do.

Sure, but the ring size is a documented limit (even if indirect
descriptors are used).  I hadn't realized we have an
implementation-specific limit of 1024 descriptors: I shall add this.
While no one reasonable will exceed that, we should document it somewhere
in the spec.
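
As a rough userspace sketch of what a capped doubling could look like
(an illustration only, not the actual resize_iovec() change): struct
toy_iov, toy_resize(), TOY_MAX_IOV, the initial size of 8 and the
-E2BIG return are all assumptions made for this example.  The cap of
1024 is the limit mentioned above.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/uio.h>

#define TOY_MAX_IOV 1024u	/* assumed cap, the 1024 limit discussed above */

/* Hypothetical growable iovec array; 'max' is the allocated entry count. */
struct toy_iov {
	struct iovec *iov;
	unsigned int max;
};

/*
 * Double the array, never growing beyond TOY_MAX_IOV entries.
 * Returns 0 on success, -E2BIG if already at the cap, or -ENOMEM
 * if the allocation fails (the old array is left untouched).
 */
static int toy_resize(struct toy_iov *v)
{
	unsigned int new_num = v->max ? v->max * 2 : 8;
	struct iovec *n;

	if (v->max >= TOY_MAX_IOV)
		return -E2BIG;
	if (new_num > TOY_MAX_IOV)
		new_num = TOY_MAX_IOV;

	n = realloc(v->iov, new_num * sizeof(*n));
	if (!n)
		return -ENOMEM;

	v->iov = n;
	v->max = new_num;
	return 0;
}
```

Checking the cap before allocating means a hostile descriptor chain can
cost at most TOY_MAX_IOV entries of memory per array.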

>> > I really dislike raw pointers that we must never dereference.
>> > Since we are forcing everything to __user anyway, why don't we
>> > tag all addresses as __user? The kernel users of this API
>> > can cast that away, this will keep the casts to minimum.
>> >
>> > Failing that, we can add our own class
>> > # define __virtio         __attribute__((noderef, address_space(2)))
>> 
>> In this case, perhaps we should leave addr as a u64?
>
> Point being? All users will cast to a pointer.
> It seems at first passing in raw pointers is cleaner,
> but it turns out in the API we are passing iovs around,
> and they are __user anyway.
> So using raw pointers here does not buy us anything,
> so let's use __user and gain extra static checks at no cost.

I resist sprinkling __user everywhere because it's *not* always user
addresses, and it's deeply misleading to anyone reading it.  I'd rather
have it in one place with a big comment.

I can try using a union of kvec and iovec, since they are the same
layout in practice AFAICT.
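
A minimal userspace sketch of the union idea, with the layout
assumption checked at compile time.  struct toy_kvec is a hypothetical
stand-in for the kernel's struct kvec (which is not visible from
userspace), and union toy_vec and toy_total_len() are names made up for
this example.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Stand-in for the kernel's struct kvec: same two fields, kernel pointer. */
struct toy_kvec {
	void *iov_base;
	size_t iov_len;
};

/* One array element, viewed either as a user iovec or a kernel kvec. */
union toy_vec {
	struct iovec user;
	struct toy_kvec kern;
};

/* Fail the build if the "same layout" assumption does not hold. */
_Static_assert(sizeof(struct iovec) == sizeof(struct toy_kvec),
	       "iovec and kvec differ in size");
_Static_assert(offsetof(struct iovec, iov_base) ==
	       offsetof(struct toy_kvec, iov_base),
	       "iov_base offsets differ");
_Static_assert(offsetof(struct iovec, iov_len) ==
	       offsetof(struct toy_kvec, iov_len),
	       "iov_len offsets differ");

/* Sum the lengths through the kvec view, whichever view filled them in. */
static size_t toy_total_len(const union toy_vec *v, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++)
		total += v[i].kern.iov_len;
	return total;
}
```

The union keeps the user/kernel distinction in one declaration, which
is where a single explanatory comment could live.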

>> >> +		iov->iov[iov->i].iov_base = (__force __user void *)addr;
>> >> +		iov->iov[iov->i].iov_len = desc.len;
>> >> +		iov->i++;
>> >
>> >
>> > This looks like it won't do the right thing if desc.len spans multiple
>> > ranges. I don't know if this happens in practice but this is something
>> > vhost supports ATM.
>> 
>> Well, kind of.  I assumed that the bool (*getrange)(u64, struct
>> vringh_range *) callback would meld any adjacent ranges if it needs to.
>
> Confused. If addresses 0 to 0x1000 map to virtual addresses 0 to 0x1000
> and 0x1000 to 0x2000 map to virtual addresses 0x2000 to 0x3000, then
> a single descriptor covering 0 to 0x2000 in guest needs two
> iov entries. What can getrange do about it?

getrange doesn't map virtual to physical, it maps virtual to user.

Cheers,
Rusty.