Date:   Mon, 1 Jan 2018 13:08:54 -0500
From:   Ilia Mirkin <imirkin@...m.mit.edu>
To:     Mike Galbraith <efault@....de>
Cc:     Christian König <christian.koenig@....com>,
        Michel Dänzer <michel@...nzer.net>,
        Tobias Klausmann <tobias.johannes.klausmann@....thm.de>,
        nouveau <nouveau@...ts.freedesktop.org>,
        LKML <linux-kernel@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        Ben Skeggs <bskeggs@...hat.com>
Subject: Re: nouveau. swiotlb: coherent allocation failed for device
 0000:01:00.0 size=2097152

On Sun, Dec 31, 2017 at 3:53 PM, Mike Galbraith <efault@....de> wrote:
> On Sun, 2017-12-31 at 13:27 -0500, Ilia Mirkin wrote:
>> On Tue, Dec 19, 2017 at 8:45 AM, Christian König
>> <ckoenig.leichtzumerken@...il.com> wrote:
>> > Am 19.12.2017 um 11:39 schrieb Michel Dänzer:
>> >>
>> >> On 2017-12-19 11:37 AM, Michel Dänzer wrote:
>> >>>
>> >>> On 2017-12-18 08:01 PM, Tobias Klausmann wrote:
>> >>>>
>> >>>> On 12/18/17 7:06 PM, Mike Galbraith wrote:
>> >>>>>
>> >>>>> Greetings,
>> >>>>>
>> >>>>> Kernel bound workloads seem to trigger the below for whatever reason.
>> >>>>> I only see this when beating up NFS.  There was a kworker wakeup
>> >>>>> latency issue, but with a bandaid applied to fix that up, I can still
>> >>>>> trigger this.
>> >>>>
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I have seen this one as well on my system, but I could not find an
>> >>>> easy way to trigger it for bisecting purposes. If you can trigger it
>> >>>> conveniently, a bisect would be nice!
>> >>>
>> >>> I'm seeing this (with the amdgpu and radeon drivers) when restic takes a
>> >>> backup, creating memory pressure. I happen to have just finished
>> >>> bisecting, the result is:
>> >>>
>> >>> 648bc3574716400acc06f99915815f80d9563783 is the first bad commit
>> >>> commit 648bc3574716400acc06f99915815f80d9563783
>> >>> Author: Christian König <christian.koenig@....com>
>> >>> Date:   Thu Jul 6 09:59:43 2017 +0200
>> >>>
>> >>>      drm/ttm: add transparent huge page support for DMA allocations v2
>> >>>
>> >>>      Try to allocate huge pages when it makes sense.
>> >>>
>> >>>      v2: fix comment and use ifdef
>> >>>
>> >>>
>> >> BTW, I haven't noticed any bad effects other than the dmesg splats, so
>> >> maybe it's just noise about transient failures for which there is a
>> >> proper fallback in place.
>> >
>> >
>> > Yeah, I think that is exactly what happens here.
>> >
>> > We try to allocate a huge page, but fail and so fall back to using multiple
>> > 4k pages instead.
>> >
>> > Going to send out a patch to suppress the warning.
>>
>> Hi Christian,
>>
>> Did you ever send out such a patch? I didn't see one on the list, but
>> perhaps I missed it. One definitely hasn't made it upstream yet. (I
>> just hit the issue myself with Linus's tree from last night.)
>
> Actually, that wants a bit more methinks, because while the stack dump
> goes away, you still get spammed, it just comes in smaller chunks.

OK, well this has to either be fixed or reverted. Right now it's
complaining all the time for me after like a day of uptime.

  -ilia
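
For readers following the thread, the fallback Christian describes above (try one huge allocation, and on failure use many 4k pages instead) can be sketched roughly as below. This is a userspace illustration only, not the TTM or swiotlb code: `struct buffer`, `try_alloc`, `buffer_alloc`, `buffer_free` and the `huge_fails` switch are hypothetical names made up for the example, and `malloc` stands in for the kernel allocators. The 2097152 figure is the size from the subject line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SMALL_PAGE_SIZE ((size_t)4096)
#define HUGE_PAGE_SIZE  ((size_t)2097152)   /* size=2097152 from the subject line */

/* Hypothetical bookkeeping for one logical 2 MiB buffer. */
struct buffer {
    void **chunks;      /* one entry per backing allocation */
    size_t nchunks;     /* number of valid entries in chunks */
    size_t chunk_size;  /* size of each backing allocation */
};

/* Hypothetical allocator stand-in; 'fail' simulates memory pressure. */
void *try_alloc(size_t size, int fail)
{
    return fail ? NULL : malloc(size);
}

void buffer_free(struct buffer *b)
{
    assert(b != NULL);
    for (size_t i = 0; i < b->nchunks; i++)
        free(b->chunks[i]);
    free(b->chunks);
    b->chunks = NULL;
    b->nchunks = 0;
}

/*
 * Try a single huge allocation first. If that fails, fall back to
 * 4 KiB chunks without reporting anything: the failure is expected
 * and handled, so it is not treated as an error.
 * Returns 0 on success, -1 if even the fallback fails.
 */
int buffer_alloc(struct buffer *b, int huge_fails)
{
    assert(b != NULL);
    b->nchunks = 0;

    void *huge = try_alloc(HUGE_PAGE_SIZE, huge_fails);
    if (huge) {
        b->chunks = malloc(sizeof(*b->chunks));
        if (!b->chunks) {
            free(huge);
            return -1;
        }
        b->chunks[0] = huge;
        b->nchunks = 1;
        b->chunk_size = HUGE_PAGE_SIZE;
        return 0;
    }

    /* Fallback path: 2 MiB / 4 KiB = 512 small chunks. */
    size_t n = HUGE_PAGE_SIZE / SMALL_PAGE_SIZE;
    b->chunks = calloc(n, sizeof(*b->chunks));
    if (!b->chunks)
        return -1;
    b->chunk_size = SMALL_PAGE_SIZE;
    for (size_t i = 0; i < n; i++) {
        b->chunks[i] = malloc(SMALL_PAGE_SIZE);
        if (!b->chunks[i]) {
            buffer_free(b);
            return -1;
        }
        b->nchunks++;
    }
    return 0;
}
```

The point of the sketch is that the caller gets a usable buffer on either path, which is why a warning on every failed huge attempt is noise rather than a real error report.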
