lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKb7Uvj6FFRK6dtvR7xpfZDzs0o3QKHq7ESueBBmr_VnQt6XDQ@mail.gmail.com>
Date:   Sat, 28 Apr 2018 12:30:37 -0400
From:   Ilia Mirkin <imirkin@...m.mit.edu>
To:     Michel Dänzer <michel@...nzer.net>
Cc:     Christian König <christian.koenig@....com>,
        amd-gfx mailing list <amd-gfx@...ts.freedesktop.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer <michel@...nzer.net> wrote:
> From: Michel Dänzer <michel.daenzer@....com>
>
> Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
> try to allocate huge pages. However, not all drivers can take advantage
> of huge pages, but they would incur the overhead for allocating and
> freeing them anyway.
>
> Now, drivers which can take advantage of huge pages need to set the new
> flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
> no longer incur any overhead for allocating or freeing huge pages.
>
> v2:
> * Also guard swapping of consecutive pages in ttm_get_pages
> * Reword commit log, hopefully clearer now
>
> Cc: stable@...r.kernel.org
> Signed-off-by: Michel Dänzer <michel.daenzer@....com>

Both I and lots of other people, based on reports, are still seeing
plenty of issues with this as late as 4.16.4. Admittedly I'm on
nouveau, but others have reported issues with radeon/amdgpu as well.
It's been going on since the feature was merged in v4.15, with what
seems like little investigation from the authors introducing the
feature.

We now have *two* broken releases, v4.15 and v4.16 (anything that
spews error messages and stack traces ad-infinitum in dmesg is, by
definition, broken). You're putting this behind a flag now (finally),
but should it be enabled anywhere? Why is it being flipped on for
amdgpu by default, despite the still-existing problems?

Reverting this feature without just resetting back to the code in
v4.14 is painful, but why make Joe User suffer by enabling it while
you're still working out the kinks?

  -ilia

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ