Date:   Mon, 24 Oct 2022 17:22:24 +0200
From:   Niklas Schnelle <schnelle@...ux.ibm.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     Matthew Rosato <mjrosato@...ux.ibm.com>, iommu@...ts.linux.dev,
        Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
        Robin Murphy <robin.murphy@....com>,
        Gerd Bayer <gbayer@...ux.ibm.com>,
        Pierre Morel <pmorel@...ux.ibm.com>,
        linux-s390@...r.kernel.org, borntraeger@...ux.ibm.com,
        hca@...ux.ibm.com, gor@...ux.ibm.com,
        gerald.schaefer@...ux.ibm.com, agordeev@...ux.ibm.com,
        svens@...ux.ibm.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/5] iommu/s390: Use RCU to allow concurrent domain_list
 iteration

On Fri, 2022-10-21 at 12:04 -0300, Jason Gunthorpe wrote:
> On Fri, Oct 21, 2022 at 05:01:32PM +0200, Niklas Schnelle wrote:
> > On Fri, 2022-10-21 at 10:36 -0300, Jason Gunthorpe wrote:
> > > On Fri, Oct 21, 2022 at 02:08:02PM +0200, Niklas Schnelle wrote:
> > > > On Thu, 2022-10-20 at 08:05 -0300, Jason Gunthorpe wrote:
> > > > > On Thu, Oct 20, 2022 at 10:51:10AM +0200, Niklas Schnelle wrote:
> > > > > 
> > > > > > Ok that makes sense thanks for the explanation. So yes my assessment is
> > > > > > still that in this situation the IOTLB flush is architected to return
> > > > > > an error that we can ignore. Not the most elegant I admit but at least
> > > > > > it's simple. Alternatively I guess we could use call_rcu() to do the
> > > > > > zpci_unregister_ioat() but I'm not sure how to then make sure that a
> > > > > > subsequent zpci_register_ioat() only happens after that without adding
> > > > > > too much more logic.
> > > > > 
> > > > > This won't work either as the domain could have been freed before the
> > > > > call_rcu() happens, the domain needs to be detached synchronously
> > > > > 
> > > > > Jason
> > > > 
> > > > Yeah right, that is basically the same issue I was thinking of for a
> > > > subsequent zpci_register_ioat(). What about the obvious one. Just call
> > > > synchronize_rcu() before zpci_unregister_ioat()?
> > > 
> > > Ah, it can be done, but be prepared to wait >> 1s for synchronize_rcu
> > > to complete in some cases.
> > > 
> > > What you have seems like it could be OK, just deal with the ugly racy
> > > failure
> > > 
> > > Jason
> > 
> > I'd tend to go with synchronize_rcu(). It won't leave us with spurious
> > error logs for the failed IOTLB flushes and as you said one expects
> > detach to be synchronous. I don't think waiting in it will be a
> > problem. But this is definitely something you're more of an expert on
> > so I'll trust your judgement. Looking at other callers of
> > synchronize_rcu() quite a few of them look to be in similar
> > detach/release kind of situations though not sure how frequent and
> > performance critical IOMMU domain detaching is in comparison.
> 
> I would not do it on domain detaching, that is something triggered by
> userspace through VFIO and it could theoretically happen a lot, e.g. in
> vIOMMU scenarios.
> 
> Jason

Thanks for the explanation, I'd still like to grok this a bit more if
you don't mind. If I read things correctly, synchronize_rcu() should
run in the context of the VFIO ioctl in this case and shouldn't block
anything else in the kernel, correct? At least that's how I understand
the synchronize_rcu() comments and the fact that e.g.
net/vmw_vsock/virtio_transport.c:virtio_vsock_remove() also does a
synchronize_rcu() and can be triggered from user-space too.

So we're more worried about user-space getting slowed down rather than
a denial-of-service against other kernel tasks.
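
Just to make this concrete, below is roughly what I have in mind for the
detach path, i.e. synchronize_rcu() between removing the device from the
domain's RCU list and zpci_unregister_ioat(). This is only a sketch to
illustrate the ordering; apart from synchronize_rcu() and
zpci_unregister_ioat() the struct/field/helper names are made up for the
example and are not the actual s390-iommu.c code.

/* Sketch only, illustrative names, not the real driver code. */
static void s390_iommu_detach_device_sketch(struct iommu_domain *domain,
                                            struct device *dev)
{
        struct s390_domain_sketch *s390_domain = to_s390_domain_sketch(domain);
        struct zpci_dev *zdev = to_zpci_dev(dev);
        unsigned long flags;

        /* Unlink the device under the list lock, readers use RCU. */
        spin_lock_irqsave(&s390_domain->list_lock, flags);
        list_del_rcu(&zdev->iommu_list);
        spin_unlock_irqrestore(&s390_domain->list_lock, flags);

        /*
         * Wait for concurrent RCU readers iterating the domain's device
         * list (e.g. the IOTLB flush paths) to finish before tearing down
         * the translation. This sleeps only the calling task, which here
         * is the VFIO ioctl context.
         */
        synchronize_rcu();

        zpci_unregister_ioat(zdev, 0);
}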
