Message-ID: <20100602052907.GT8301@sequoia.sous-sol.org>
Date: Tue, 1 Jun 2010 22:29:07 -0700
From: Chris Wright <chrisw@...s-sol.org>
To: Avi Kivity <avi@...hat.com>
Cc: Tom Lyon <pugs@...n-about.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
chrisw@...s-sol.org, joro@...tes.org, hjk@...utronix.de,
gregkh@...e.de, aafabbri@...co.com, scofeldm@...co.com,
alex.williamson@...hat.com
Subject: Re: [PATCH] VFIO driver: Non-privileged user level PCI drivers
* Avi Kivity (avi@...hat.com) wrote:
> On 06/02/2010 12:26 AM, Tom Lyon wrote:
> >
> >I'm not really opposed to multiple devices per domain, but let me point out how I
> >ended up here. First, the driver has two ways of mapping pages, one based on the
> >iommu api and one based on the dma_map_sg api. With the latter, the system
> >already allocates a domain per device and there's no way to control it. This was
> >presumably done to help isolation between drivers. If there are multiple drivers
> >in the user level, do we not want the same isolation to apply to them?
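For reference, the two mapping paths look roughly like this (a minimal
sketch only -- the iommu_* argument lists have changed across kernel
versions, so check include/linux/iommu.h for the exact signatures):

#include <linux/iommu.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Path 1: explicit IOMMU API -- the caller owns the domain and the IOVA
 * layout, and could attach more than one device to the same domain. */
static int map_with_iommu_api(struct device *dev, unsigned long iova,
			      phys_addr_t paddr, size_t size)
{
	struct iommu_domain *dom = iommu_domain_alloc();
	int ret;

	if (!dom)
		return -ENOMEM;

	ret = iommu_attach_device(dom, dev);
	if (ret) {
		iommu_domain_free(dom);
		return ret;
	}

	return iommu_map(dom, iova, paddr, size, IOMMU_READ | IOMMU_WRITE);
}

/* Path 2: dma_map_sg -- the DMA layer picks the domain (one per device)
 * and the IOVAs; the caller reads them back via sg_dma_address(). */
static int map_with_dma_api(struct device *dev, struct scatterlist *sg,
			    int nents)
{
	return dma_map_sg(dev, sg, nents, DMA_BIDIRECTIONAL) ? 0 : -EIO;
}
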
>
> In the case of kvm, we don't want isolation between devices, because
> that doesn't happen on real hardware.
Sure it does. That's exactly what happens when there's an iommu
involved on bare metal.
> So if the guest programs
> devices to dma to each other, we want that to succeed.
And it will as long as ATS is enabled (this is a basic requirement
for PCIe peer-to-peer traffic to succeed with an iommu involved on
bare metal).
That's how things currently are, i.e. we put all devices belonging to a
single guest in the same domain. However, it can be useful to put each
device belonging to a guest in its own domain, especially as qemu
grows support for iommu emulation and guest OSes begin to understand
how to use a hw iommu.
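Concretely, the difference is just which domain the attach loop targets
(a sketch against the kernel IOMMU API; exact signatures vary by kernel
version). In the shared case every assigned device attaches to a single
domain, so each guest-memory mapping is installed once:

static struct iommu_domain *attach_all(struct device **devs, int ndev)
{
	struct iommu_domain *dom = iommu_domain_alloc();
	int i;

	if (!dom)
		return NULL;

	for (i = 0; i < ndev; i++) {
		if (iommu_attach_device(dom, devs[i])) {
			while (--i >= 0)
				iommu_detach_device(dom, devs[i]);
			iommu_domain_free(dom);
			return NULL;
		}
	}
	return dom;		/* one set of page tables for all devices */
}

With one domain per device the same loop would allocate ndev domains
instead, and every mapping would have to be installed ndev times, which
is where the per-domain page table overhead Avi mentions multiplies.
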
> >Also, domains are not a very scarce resource - my little core i5 has 256,
> >and the intel architecture goes to 64K.
>
> But there is a 0.2% of mapped memory per domain cost for the page
> tables. For the kvm use case, that could be significant since a
> guest may have large amounts of memory and large numbers of assigned
> devices.
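(For scale: with 4KB pages and 8-byte page table entries that's roughly
8/4096, i.e. ~0.2% of mapped memory per domain, so a 64GB guest with four
assigned devices in separate domains would spend on the order of
4 x 128MB = 512MB just on iommu page tables.)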
>
> >And then there's the fact that it is possible to have multiple disjoint iommus on a system,
> >so it may not even be possible to bring 2 devices under one domain.
>
> That's indeed a deficiency.
Not sure it's a deficiency. Typically, to share page table mappings
across multiple iommus you just have to issue the update/invalidate to each
hw iommu that is sharing the mapping. Alternatively, you can use more
memory and build/maintain identical mappings (as Tom alludes to below).
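The "identical mappings" option really is just replaying the same
iova -> phys mapping into each per-iommu domain, something like this
(sketch only; iommu_map()/iommu_unmap() argument lists differ by kernel
version):

static int map_everywhere(struct iommu_domain **doms, int ndom,
			  unsigned long iova, phys_addr_t paddr, size_t size)
{
	int i, ret;

	for (i = 0; i < ndom; i++) {
		ret = iommu_map(doms[i], iova, paddr, size,
				IOMMU_READ | IOMMU_WRITE);
		if (ret) {
			/* roll back so no domain is left half-populated */
			while (--i >= 0)
				iommu_unmap(doms[i], iova, size);
			return ret;
		}
	}
	return 0;	/* unmap/invalidate walks the same list in reverse */
}
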
> >Given all that, I am inclined to leave it alone until someone has a real problem.
> >Note that not sharing iommu domains doesn't mean you can't share device memory,
> >just that you have to do multiple mappings.
>
> I think we do have a real problem (though a mild one).
>
> The only issue I see with deferring the solution is that the API
> becomes gnarly; both the kernel and userspace will have to support
> both APIs forever. Perhaps we can implement the new API but defer
> the actual sharing until later; I don't know how much work this saves.
> Or Alex/Chris can pitch in and help.
It really shouldn't be that complicated to create an API that allows for
flexible device <-> domain mappings, so I agree, it makes sense to do it
right up front.
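To make the shape concrete, it could be as small as a domain object that
devices attach to; every name below is made up purely for illustration
and is not a proposal for the actual uapi:

#include <linux/types.h>

/* Hypothetical uapi sketch -- struct and ioctl names invented here. */
struct vfio_dma_map_sketch {
	__u64 vaddr;	/* process virtual address        */
	__u64 iova;	/* address the device will see    */
	__u64 size;	/* length of the mapping in bytes */
	__u32 flags;	/* read/write permission bits     */
};

/*
 * One fd per iommu domain; userspace decides which devices share one:
 *
 *	dom_fd = open("/dev/vfio-domain", O_RDWR);	 hypothetical node
 *	ioctl(dev_fd, VFIO_SET_DOMAIN_SKETCH, &dom_fd);	 bind device to domain
 *	ioctl(dom_fd, VFIO_MAP_DMA_SKETCH, &map);	 mappings live per domain
 *
 * Whether devices share a domain is then just a question of how many of
 * them userspace binds to the same dom_fd.
 */
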
thanks,
-chris