lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YmlbXoeaVds/bjYn@phenom.ffwll.local>
Date:   Wed, 27 Apr 2022 17:03:58 +0200
From:   Daniel Vetter <daniel@...ll.ch>
To:     Dmitry Osipenko <dmitry.osipenko@...labora.com>
Cc:     Thomas Zimmermann <tzimmermann@...e.de>,
        David Airlie <airlied@...ux.ie>,
        Gerd Hoffmann <kraxel@...hat.com>,
        Gurchetan Singh <gurchetansingh@...omium.org>,
        Chia-I Wu <olvaffe@...il.com>, Daniel Vetter <daniel@...ll.ch>,
        Daniel Almeida <daniel.almeida@...labora.com>,
        Gert Wollny <gert.wollny@...labora.com>,
        Gustavo Padovan <gustavo.padovan@...labora.com>,
        Daniel Stone <daniel@...ishbar.org>,
        Tomeu Vizoso <tomeu.vizoso@...labora.com>,
        Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
        Maxime Ripard <mripard@...nel.org>,
        Rob Herring <robh@...nel.org>,
        Steven Price <steven.price@....com>,
        Alyssa Rosenzweig <alyssa.rosenzweig@...labora.com>,
        Rob Clark <robdclark@...il.com>,
        Emil Velikov <emil.l.velikov@...il.com>,
        Robin Murphy <robin.murphy@....com>,
        Dmitry Osipenko <digetx@...il.com>,
        linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH v4 11/15] drm/shmem-helper: Add generic memory shrinker

On Tue, Apr 19, 2022 at 11:40:41PM +0300, Dmitry Osipenko wrote:
> On 4/19/22 10:22, Thomas Zimmermann wrote:
> > Hi
> > 
> > Am 18.04.22 um 00:37 schrieb Dmitry Osipenko:
> >> Introduce a common DRM SHMEM shrinker. It allows to reduce code
> >> duplication among DRM drivers that implement theirs own shrinkers.
> >> This is initial version of the shrinker that covers basic needs of
> >> GPU drivers, both purging and eviction of shmem objects are supported.
> >>
> >> This patch is based on a couple ideas borrowed from Rob's Clark MSM
> >> shrinker and Thomas' Zimmermann variant of SHMEM shrinker.
> >>
> >> In order to start using DRM SHMEM shrinker drivers should:
> >>
> >> 1. Implement new purge(), evict() + swap_in() GEM callbacks.
> >> 2. Register shrinker using drm_gem_shmem_shrinker_register(drm_device).
> >> 3. Use drm_gem_shmem_set_purgeable_and_evictable(shmem) and alike API
> >>     functions to activate shrinking of GEMs.
> >>
> >> Signed-off-by: Daniel Almeida <daniel.almeida@...labora.com>
> >> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@...labora.com>
> >> ---
> >>   drivers/gpu/drm/drm_gem_shmem_helper.c | 765 ++++++++++++++++++++++++-
> >>   include/drm/drm_device.h               |   4 +
> >>   include/drm/drm_gem.h                  |  35 ++
> >>   include/drm/drm_gem_shmem_helper.h     | 105 +++-
> >>   4 files changed, 877 insertions(+), 32 deletions(-)
> ...
> >> @@ -172,6 +172,41 @@ struct drm_gem_object_funcs {
> >>        * This is optional but necessary for mmap support.
> >>        */
> >>       const struct vm_operations_struct *vm_ops;
> >> +
> >> +    /**
> >> +     * @purge:
> >> +     *
> >> +     * Releases the GEM object's allocated backing storage to the
> >> system.
> >> +     *
> >> +     * Returns the number of pages that have been freed by purging
> >> the GEM object.
> >> +     *
> >> +     * This callback is used by the GEM shrinker.
> >> +     */
> >> +    unsigned long (*purge)(struct drm_gem_object *obj);

Hm I feel like drivers shouldn't need to know the difference here?

Like shmem helpers can track what's purgeable, and for eviction/purging
the driver callback should do the same?

The only difference is when we try to re-reserve the backing storage. When
the object has been evicted that should suceed, but when the object is
purged that will fail.

That's the difference between evict and purge for drivers?

> >> +
> >> +    /**
> >> +     * @evict:
> >> +     *
> >> +     * Unpins the GEM object's allocated backing storage, allowing
> >> shmem pages
> >> +     * to be swapped out.
> > 
> > What's the difference to the existing unpin() callback?
> 
> Drivers need to do more than just unpinning pages when GEMs are evicted.
> Unpinning is only a part of the eviction process. I'll improve the
> doc-comment in v5.
> 
> For example, for VirtIO-GPU driver we need to to detach host from the
> guest's memory before pages are evicted [1].
> 
> [1]
> https://gitlab.collabora.com/dmitry.osipenko/linux-kernel-rd/-/blob/932eb03198bce3a21353b09ab71e95f1c19b84c2/drivers/gpu/drm/virtio/virtgpu_object.c#L145
> 
> In case of Panfrost driver, we will need to remove mappings before pages
> are evicted.

It might be good to align this with ttm, otoh that all works quite a bit
differently for ttm since ttm supports buffer moves and a lot more fancy
stuff.

I'm bringing this up since I have this fancy idea that eventually we could
glue shmem helpers into ttm in some cases for managing buffers when they
sit in system memory (as opposed to vram).

> >> +     *
> >> +     * Returns the number of pages that have been unpinned.
> >> +     *
> >> +     * This callback is used by the GEM shrinker.
> >> +     */
> >> +    unsigned long (*evict)(struct drm_gem_object *obj);
> >> +
> >> +    /**
> >> +     * @swap_in:
> >> +     *
> >> +     * Pins GEM object's allocated backing storage if it was
> >> previously evicted,
> >> +     * moving swapped out pages back to memory.
> >> +     *
> >> +     * Returns 0 on success, or -errno on error.
> >> +     *
> >> +     * This callback is used by the GEM shrinker.
> >> +     */
> >> +    int (*swap_in)(struct drm_gem_object *obj);
> > 
> > Why do you need swap_in()? This can be done on-demand as part of a pin
> > or vmap operation.
> 
> Similarly to the unpinning, the pining of pages is only a part of what
> needs to be done for GPU drivers. Besides of returning pages back to
> memory, we also need to make them accessible to GPU and this is a
> driver-specific process. This why we need the additional callbacks.

This is a bit much midlayer. The way this works in ttm is you reserve all
the objects you need (which makes sure they're physically available
again), and then the driver goes through and makes sure the page tables
are all set up again.

Once you get towards gpu vm that's really the only approach, since your
swap_in has no idea for which vm it needs to restore pagetables (and
restoring it for all is a bit meh).

If drivers want to optimize this they can adjust/set any tracking
information from their evict callback as needed.

> 
> >>   };
> >>     /**
> >> diff --git a/include/drm/drm_gem_shmem_helper.h
> >> b/include/drm/drm_gem_shmem_helper.h
> >> index 70889533962a..a65557b446e6 100644
> >> --- a/include/drm/drm_gem_shmem_helper.h
> >> +++ b/include/drm/drm_gem_shmem_helper.h
> >> @@ -6,6 +6,7 @@
> >>   #include <linux/fs.h>
> >>   #include <linux/mm.h>
> >>   #include <linux/mutex.h>
> >> +#include <linux/shrinker.h>
> >>     #include <drm/drm_file.h>
> >>   #include <drm/drm_gem.h>
> >> @@ -15,8 +16,18 @@
> >>   struct dma_buf_attachment;
> >>   struct drm_mode_create_dumb;
> >>   struct drm_printer;
> >> +struct drm_device;
> >>   struct sg_table;
> >>   +enum drm_gem_shmem_pages_state {
> >> +    DRM_GEM_SHMEM_PAGES_STATE_PURGED = -2,
> >> +    DRM_GEM_SHMEM_PAGES_STATE_EVICTED = -1,
> >> +    DRM_GEM_SHMEM_PAGES_STATE_UNPINNED = 0,
> >> +    DRM_GEM_SHMEM_PAGES_STATE_PINNED = 1,
> >> +    DRM_GEM_SHMEM_PAGES_STATE_EVICTABLE = 2,
> >> +    DRM_GEM_SHMEM_PAGES_STATE_PURGEABLE = 3,
> >> +};
> > 
> > These states can be detected by looking at the vmap and pin refcounts.
> > No need to store them explicitly.
> 
> I'll try to revisit this, but I was finding that it's much more
> difficult to follow and debug code without the explicit states.

purgeable/purged needs some state, but pinned shouldn't be duplicated, so
I concur here a bit.

> > In your patch, they also come with a
> > big zoo of trivial helpers. None of that seems necessary AFAICT.
> 
> There are couple functions which could be squashed, although this may
> hurt readability of the code a tad. I'll try to take another look at
> this for v5.
> 
> > What's the difference between purge and evict BTW?
> 
> The evicted pages are moved out from memory to a SWAP partition or file.
> 
> The purged pages are destroyed permanently.
> 
> >> +
> >>   /**
> >>    * struct drm_gem_shmem_object - GEM object backed by shmem
> >>    */
> >> @@ -43,8 +54,8 @@ struct drm_gem_shmem_object {
> >>        * @madv: State for madvise
> >>        *
> >>        * 0 is active/inuse.
> >> +     * 1 is not-needed/can-be-purged
> >>        * A negative value is the object is purged.
> >> -     * Positive values are driver specific and not used by the helpers.
> >>        */
> >>       int madv;
> >>   @@ -91,6 +102,40 @@ struct drm_gem_shmem_object {
> >>        * @map_wc: map object write-combined (instead of using shmem
> >> defaults).
> >>        */
> >>       bool map_wc;
> >> +
> >> +    /**
> >> +     * @eviction_disable_count:
> >> +     *
> >> +     * The shmem pages are disallowed to be evicted by the memory
> >> shrinker
> >> +     * while count is non-zero. Used internally by memory shrinker.
> >> +     */
> >> +    unsigned int eviction_disable_count;
> >> +
> >> +    /**
> >> +     * @purging_disable_count:
> >> +     *
> >> +     * The shmem pages are disallowed to be purged by the memory
> >> shrinker
> >> +     * while count is non-zero. Used internally by memory shrinker.
> >> +     */
> >> +    unsigned int purging_disable_count;

What are these disable counts for?

The way purgeable works in other drivers is that userspace sets purgeable
or not, and it's up to userspace to not make a mess of this.

There's also some interactions, and I guess a bunch of drivers get this
wrong in funny ways. Not sure how to best clean this up.

- Once you have a shrinker/dynamic memory management you should _not_ pin
  pages, except when it's truly permanent like for scanout. Instead
  drivers should attach dma_fence to the dma_resv to denote in-flight
  access.

- A pinned buffer object is not allowed to be put into purgeable state,
  and a bo in purgeable state should not be allowed to be pinned.

- Drivers need to hold dma_resv_lock for long enough in their command
  submission, i.e. from the point where the reserve the buffers and make
  sure that mappings exists, to the point where the request is submitted
  to hw or drm/sched and fences are installed.

But I think a lot of current shmem users just pin as part of execbuf, so
this won't work quite so well right out of the box.

Anyway with that design I don't think there should ever be a need to
disable shrinking.

> >> +
> >> +    /**
> >> +     * @pages_state: Current state of shmem pages. Used internally by
> >> +     * memory shrinker.
> >> +     */
> >> +    enum drm_gem_shmem_pages_state pages_state;
> >> +
> >> +    /**
> >> +     * @evicted: True if shmem pages were evicted by the memory
> >> shrinker.
> >> +     * Used internally by memory shrinker.
> >> +     */
> >> +    bool evicted;
> >> +
> >> +    /**
> >> +     * @pages_shrinkable: True if shmem pages can be evicted or purged
> >> +     * by the memory shrinker. Used internally by memory shrinker.
> >> +     */
> >> +    bool pages_shrinkable;
> > 
> > As commented before, this state can be foundby looking at existing
> > fields. No need to store it separately.
> 
> When we're transitioning from "evictable" to a "purgeable" state, we
> must not add pages twice to the "shrinkable_count" variable. Hence this
> is not a state, but a variable which prevents the double accounting of
> the pages. Please see drm_gem_shmem_add_pages_to_shrinker() in this patch.
> 
> Perhaps something like "pages_accounted_by_shrinker" could be a better
> name for the variable. I'll revisit this for v5.

Hm not sure we need to account this? Usually the shrinker just counts when
it's asked to do so, not practively maintain that count. Once you start
shrinking burning cpu time is generally not too terrible.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ