lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 20 Dec 2016 09:42:46 -0500
From:   Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To:     Juergen Gross <jgross@...e.com>
Cc:     Chris Wilson <chris@...is-wilson.co.uk>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        dri-devel@...ts.freedesktop.org,
        intel-gfx <intel-gfx@...ts.freedesktop.org>, airlied@...ux.ie,
        jani.nikula@...ux.intel.com, daniel.vetter@...el.com,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>
Subject: Re: i915 regression in kernel 4.10

On Mon, Dec 19, 2016 at 03:16:44PM +0100, Juergen Gross wrote:
> On 19/12/16 13:29, Chris Wilson wrote:
> > On Mon, Dec 19, 2016 at 12:39:16PM +0100, Juergen Gross wrote:
> >> With recent 4.10 kernel the graphics isn't coming up under Xen. First
> >> failure message is:
> >>
> >> [   46.656649] i915 0000:00:02.0: swiotlb buffer is full (sz: 1630208 bytes)
> > 
> > Do we get a silent failure? i915_gem_gtt_prepare_pages() is where we
> > call dma_map_sg() and pass the sg to swiotlb (in this case) for
> > remapping, and we do check for an error value of 0. After that error,
> > SWIOTLB_MAP_ERROR is propagated back and converted to 0 for
> > dma_map_sg(). That looks valid, and we should report ENOMEM back to the
> > caller.
> > 
> >> Later I see splats like:
> >>
> >> [   49.393583] general protection fault: 0000 [#1] SMP
> > 
> > What was the faulting address? RAX is particularly non-pointer-like so I
> > wonder if we walked onto an uninitialised portion of the sgtable. We may
> > have tripped over a bug in our sg_page iterator.
> 
> During the bisect process there have been either GP or NULL pointer
> dereferences or other page faults. Typical addresses where:
> 
> xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0000000000000018
> xen_swiotlb_unmap_sg_attrs+0x1f/0x50: access to 0000000003020118
> 
> > 
> > The attached patch should prevent an early ENOMEM following the swiotlb
> > allocation failure. But I suspect that we will still be tripping up the
> > failure in the sg walker when binding to the GPU.
> > -Chris
> > 
> 
> The patch is working not too bad. :-)
> 
> Still several "swiotlb buffer is full" messages (some with sz:, most
> without), but no faults any more (neither GP nor NULL pointer
> dereference). Graphical login is working now.


I think I know why. The optimization that was added assumes that
bus addresses is the same as physical address. Hence it packs all
of the virtual addresses in the sg, and hands it off to SWIOTLB
which walks each one and realizes that it has to use the bounce
buffer.

I am wondering if would make sense to pull 'swiotlb_max_size' inside
of SWIOTLB and make it an library-ish - so Xen-SWIOTLB can register
as well and report say that it can only provide one page
(unless it is running under baremtal).

Or make the usage of 'max_segement' and 'page_to_pfn(page) != last_pfn + 1'
in i915_gem_object_Get_pages_gtt use something similar to xen_biovec_phys_mergeable?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ