[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20091109115548.GA2368@redhat.com>
Date: Mon, 9 Nov 2009 13:55:48 +0200
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Rusty Russell <rusty@...tcorp.com.au>
Cc: netdev@...r.kernel.org, virtualization@...ts.linux-foundation.org,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org, mingo@...e.hu,
linux-mm@...ck.org, akpm@...ux-foundation.org, hpa@...or.com,
gregory.haskins@...il.com, s.hetze@...ux-ag.com,
Daniel Walker <dwalker@...o99.com>,
Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: [PATCHv8 3/3] vhost_net: a kernel-level virtio server
On Mon, Nov 09, 2009 at 04:47:29PM +1030, Rusty Russell wrote:
> Actually, this looks wrong to me:
>
> + case VHOST_SET_VRING_BASE:
> ...
> + vq->avail_idx = vq->last_avail_idx = s.num;
>
> The last_avail_idx is part of the state of the driver. It needs to be saved
> and restored over susp/resume. The only reason it's not in the ring itself
> is because I figured the other side doesn't need to see it (which is true, but
> missed debugging opportunities as well as man-in-the-middle issues like this
> one). I had a patch which put this field at the end of the ring, I might
> resurrect it to avoid this problem. This is backwards compatible with all
> implementations. See patch at end.
>
> I would drop avail_idx altogether: get_user is basically free, and simplifies
> a lot. As most state is in the ring, all you need is an ioctl to save/restore
> the last_avail_idx.
I remembered another reason for caching head in avail_idx. Basically,
avail index could change between when I poll for descriptors and when I
want to notify guest.
So we could have:
- poll descriptors until empty
- notify
detects not empty so does not notify
And the way to solve it would be to return flag from
notify telling us to restart the polling loop.
But, this will be more code, on data path, than
what happens today where I simply keep state
from descriptor polling and use that to notify.
I also suspect that somehow this race in practice can not create
deadlocks ... but I prefer to avoid it, these things are very tricky: if
I see an empty ring, and stop processing descriptors, I want to trigger
notify on empty.
So if we want to avoid keeping "empty" state, IMO the best way would be
to pass a flag to vhost_signal that tells it that ring is empty.
Makes sense?
--
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists