linux-kernel - Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YrYdVRMjK4YS33hO@phenom.ffwll.local>
Date:   Fri, 24 Jun 2022 22:23:49 +0200
From:   Daniel Vetter <daniel@...ll.ch>
To:     Rob Clark <robdclark@...il.com>
Cc:     Dmitry Osipenko <dmitry.osipenko@...labora.com>,
        David Airlie <airlied@...ux.ie>,
        Gerd Hoffmann <kraxel@...hat.com>,
        Gurchetan Singh <gurchetansingh@...omium.org>,
        Chia-I Wu <olvaffe@...il.com>, Daniel Vetter <daniel@...ll.ch>,
        Daniel Almeida <daniel.almeida@...labora.com>,
        Gert Wollny <gert.wollny@...labora.com>,
        Gustavo Padovan <gustavo.padovan@...labora.com>,
        Daniel Stone <daniel@...ishbar.org>,
        Tomeu Vizoso <tomeu.vizoso@...labora.com>,
        Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
        Maxime Ripard <mripard@...nel.org>,
        Thomas Zimmermann <tzimmermann@...e.de>,
        Rob Herring <robh@...nel.org>,
        Steven Price <steven.price@....com>,
        Alyssa Rosenzweig <alyssa.rosenzweig@...labora.com>,
        Emil Velikov <emil.l.velikov@...il.com>,
        Robin Murphy <robin.murphy@....com>,
        Qiang Yu <yuq825@...il.com>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Christian König <christian.koenig@....com>,
        "Pan, Xinhui" <Xinhui.Pan@....com>,
        Thierry Reding <thierry.reding@...il.com>,
        Tomasz Figa <tfiga@...omium.org>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Alex Deucher <alexander.deucher@....com>,
        Jani Nikula <jani.nikula@...ux.intel.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org,
        Dmitry Osipenko <digetx@...il.com>,
        linux-tegra@...r.kernel.org, linux-media@...r.kernel.org,
        linaro-mm-sig@...ts.linaro.org, amd-gfx@...ts.freedesktop.org,
        intel-gfx@...ts.freedesktop.org, kernel@...labora.com
Subject: Re: [PATCH v6 17/22] drm/shmem-helper: Add generic memory shrinker

On Mon, Jun 20, 2022 at 08:18:04AM -0700, Rob Clark wrote:
> On Mon, Jun 20, 2022 at 7:09 AM Dmitry Osipenko
> <dmitry.osipenko@...labora.com> wrote:
> >
> > On 6/19/22 20:53, Rob Clark wrote:
> > ...
> > >> +static unsigned long
> > >> +drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker,
> > >> +                                    struct shrink_control *sc)
> > >> +{
> > >> +       struct drm_gem_shmem_shrinker *gem_shrinker = to_drm_shrinker(shrinker);
> > >> +       struct drm_gem_shmem_object *shmem;
> > >> +       unsigned long count = 0;
> > >> +
> > >> +       if (!mutex_trylock(&gem_shrinker->lock))
> > >> +               return 0;
> > >> +
> > >> +       list_for_each_entry(shmem, &gem_shrinker->lru_evictable, madv_list) {
> > >> +               count += shmem->base.size;
> > >> +
> > >> +               if (count >= SHRINK_EMPTY)
> > >> +                       break;
> > >> +       }
> > >> +
> > >> +       mutex_unlock(&gem_shrinker->lock);
> > >
> > > As I mentioned on other thread, count_objects, being approximate but
> > > lockless and fast is the important thing.  Otherwise when you start
> > > hitting the shrinker on many threads, you end up serializing them all,
> > > even if you have no pages to return to the system at that point.
> >
> > Daniel's point for dropping the lockless variant was that we're already
> > in trouble if we're hitting shrinker too often and extra optimizations
> > won't bring much benefits to us.
> 
> At least with zram swap (which I highly recommend using even if you
> are not using a physical swap file/partition), swapin/out is actually
> quite fast.  And if you are leaning on zram swap to fit 8GB of chrome
> browser on a 4GB device, the shrinker gets hit quite a lot.  Lower
> spec (4GB RAM) chromebooks can be under constant memory pressure and
> can quite easily get into a situation where you are hitting the
> shrinker on many threads simultaneously.  So it is pretty important
> for all shrinkers in the system (not just drm driver) to be as
> concurrent as possible.  As long as you avoid serializing reclaim on
> all the threads, performance can still be quite good, but if you don't
> performance will fall off a cliff.
> 
> jfwiw, we are seeing pretty good results (iirc 40-70% increase in open
> tab counts) with the combination of eviction + multigen LRU[1] +
> sizing zram swap to be 2x physical RAM
> 
> [1] https://lwn.net/Articles/856931/
> 
> > Alright, I'll add back the lockless variant (or will use yours
> > drm_gem_lru) in the next revision. The code difference is very small
> > after all.
> >
> > ...
> > >> +               /* prevent racing with the dma-buf importing/exporting */
> > >> +               if (!mutex_trylock(&gem_shrinker->dev->object_name_lock)) {
> > >> +                       *lock_contention |= true;
> > >> +                       goto resv_unlock;
> > >> +               }
> > >
> > > I'm not sure this is a good idea to serialize on object_name_lock.
> > > Purgeable buffers should never be shared (imported or exported).  So
> > > at best you are avoiding evicting and immediately swapping back in, in
> > > a rare case, at the cost of serializing multiple threads trying to
> > > reclaim pages in parallel.
> >
> > The object_name_lock shouldn't cause contention in practice. But objects
> > are also pinned on attachment, hence maybe this lock is indeed
> > unnecessary.. I'll re-check it.
> 
> I'm not worried about contention with export/import/etc, but
> contention between multiple threads hitting the shrinker in parallel.
> I guess since you are using trylock, it won't *block* the other
> threads hitting shrinker, but they'll just end up looping in
> do_shrink_slab() because they are hitting contention.
> 
> I'd have to do some experiments to see how it works out in practice,
> but my gut feel is that it isn't a good idea

Yeah trylock on anything else than the object lock is No Good in the
shrinker. And it really shouldn't be needed, since import/export should
pin stuff as needed. Which should be protected by the dma_resv object
lock. If not, we need to fix that.

Picking a random drm-internal lock like this is definitely no good design.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch