[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090622172720.GC15228@redhat.com>
Date: Mon, 22 Jun 2009 20:27:21 +0300
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Gregory Haskins <ghaskins@...ell.com>
Cc: avi@...hat.com, kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
mtosatti@...hat.com, paulmck@...ux.vnet.ibm.com, markmc@...hat.com
Subject: Re: [PATCH RFC] pass write value to in_range pointers
On Mon, Jun 22, 2009 at 12:29:10PM -0400, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > On Mon, Jun 22, 2009 at 11:45:00AM -0400, Gregory Haskins wrote:
> >
> >> Michael S. Tsirkin wrote:
> >>
> >>> It seems that a lot of complexity and trickiness with iosignalfd is
> >>> handling the group/item relationship, which comes about because kvm does
> >>> not currently let a device on the bus claim a write transaction based on the
> >>> value written. This could be greatly simplified if the value written
> >>> was passed to the in_range check for write operation. We could then
> >>> simply make each kvm_iosignalfd a device on the bus.
> >>>
> >>> What does everyone think of the following lightly tested patch?
> >>>
> >>>
> >> Hi Michael,
> >> Its interesting, but I am not convinced its necessary. We created the
> >> group/item layout because iosignalfds are unique in that they are
> >> probably the only IO device that wants to do some kind of address
> >> aliasing.
> >>
> >
> > We actually already have aliasing: is_write flag is used for this
> > purpose.
>
> Yes, but read/write address aliasing is not the same thing is
> multi-match data aliasing.
What's the big difference?
> Besides, your proposal also breaks
s/break/removes limitation/ :)
> some of
> the natural relationship models
> (e.g. all the aliased iosignal_items
> always belong to the same underlying device. io_bus entries have an
> arbitrary topology).
iosignal_item is an artifact, they are not seen by user -
they are just a work around an API limitation.
And they are only grouped if the same PIO offset is used for all accesses.
Why is not always the case. If a device uses several PIO offsets
(as virtio does), you create separate devices for a single guest device too.
>
> > Actually, it's possible to remove is_write by passing
> > a null pointer in write_val for reads. I like this a bit less as
> > the code generated is less compact ... Avi, what do you think?
> >
> >
> >> With what you are proposing here, you are adding aliasing
> >> support to the general infrastructure which I am not (yet) convinced is
> >> necessary.
> >>
> >
> > Infrastructure is a big name for something that adds a total of 10 lines to kvm.
> > And it should at least halve the size of your 450-line patch.
> >
>
> Your patch isn't complete until some critical missing features are added
> to io_bus, though, so its not really just 10 lines.
> For one, it will
> need to support much more than 6 devices.
Isn't this like a #define change? With the item patch we are still
limited in the number of groups we can create.
What we gain is a simple array/list instead of a tree of
linked lists that makes cheshire cheese out of CPU data cache.
> It will also need to support
> multiple matches.
What, signal many fds on the same address/value pair?
I see this as a bug. Why is this a good thing to support?
Just increases the chance of leaking this fd.
> Also you are proposing an general interface change
> that doesn't make sense to all but one device type. So now every
> io-device developer that comes along will scratch their head at what to
> do with that field.
What do they do with is_write now? Ignore it. It's used in a whole
of 1 place.
>
> None of these are insurmountable hurdles, but my point is that today the
> complexity is encapsulated in the proper place IMO.
It's better to get rid of complexity than encapsulate it.
> E.g. The one and
> only device that cares to do this "weird" thing handles it behind an
> interface that makes sense to all parties involved.
> >
> >> If there isn't a use case for other devices to have
> >> aliasing, I would think the logic is best contained in iosignalfd. Do
> >> you have something in mind?
> >>
> >
> > One is enough :)
> >
>
> I am not convinced yet. ;) It appears to me that we are leaking
> iosignalfd-isms into the general code. If there is another device that
> wants to do something similar, ok. But I can't think of any.
You never know. is_write was used by a whole of 1 user: coalesced_mmio,
then your patch comes along ...
> > Seriously, do you see that this saves you all of RCU, linked lists and
> > counters?
>
> Well, also keep in mind we will probably be converting io_bus to RCU
> very soon, so we are going the opposite direction ;)
>
> Kind Regards,
> -Greg
>
Same direction. Let's put RCU in iobus, we don't need another one on
top of it. That's encapsulating complexity.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists