linux-kernel - Re: [PATCH v10 01/11] drm/msm/gem: Prevent blocking within shrinker loop

Message-ID: <4d6e096b-4f04-5e17-ff23-4842b69fdc95@collabora.com>
Date:   Mon, 27 Feb 2023 07:27:51 +0300
From:   Dmitry Osipenko <dmitry.osipenko@...labora.com>
To:     Thomas Zimmermann <tzimmermann@...e.de>,
        David Airlie <airlied@...il.com>,
        Gerd Hoffmann <kraxel@...hat.com>,
        Gurchetan Singh <gurchetansingh@...omium.org>,
        Chia-I Wu <olvaffe@...il.com>, Daniel Vetter <daniel@...ll.ch>,
        Daniel Almeida <daniel.almeida@...labora.com>,
        Gustavo Padovan <gustavo.padovan@...labora.com>,
        Daniel Stone <daniel@...ishbar.org>,
        Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
        Maxime Ripard <mripard@...nel.org>,
        Rob Clark <robdclark@...il.com>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Christian König <christian.koenig@....com>,
        Qiang Yu <yuq825@...il.com>,
        Steven Price <steven.price@....com>,
        Alyssa Rosenzweig <alyssa.rosenzweig@...labora.com>,
        Rob Herring <robh@...nel.org>, Sean Paul <sean@...rly.run>,
        Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
        Abhinav Kumar <quic_abhinavk@...cinc.com>
Cc:     dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        kernel@...labora.com, virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH v10 01/11] drm/msm/gem: Prevent blocking within shrinker loop

On 2/17/23 15:02, Thomas Zimmermann wrote:
> Hi
> 
> On 08.01.23 at 22:04, Dmitry Osipenko wrote:
>> Consider this scenario:
>>
>> 1. APP1 continuously creates lots of small GEMs
>> 2. APP2 triggers `drop_caches`
>> 3. Shrinker starts to evict APP1 GEMs, while APP1 produces new purgeable
>>     GEMs
>> 4. msm_gem_shrinker_scan() returns non-zero number of freed pages
>>     and causes shrinker to try shrink more
>> 5. msm_gem_shrinker_scan() returns non-zero number of freed pages again,
>>     goto 4
>> 6. The APP2 is blocked in `drop_caches` until APP1 stops producing
>>     purgeable GEMs
>>
>> To prevent this blocking scenario, check number of remaining pages
>> that GPU shrinker couldn't release due to a GEM locking contention
>> or shrinking rejection. If there are no remaining pages left to shrink,
>> then there is no need to free up more pages and shrinker may break out
>> from the loop.
>>
>> This problem was found during shrinker/madvise IOCTL testing of
>> virtio-gpu driver. The MSM driver is affected in the same way.
>>
>> Reviewed-by: Rob Clark <robdclark@...il.com>
>> Fixes: b352ba54a820 ("drm/msm/gem: Convert to using drm_gem_lru")
>> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@...labora.com>
>> ---
>>   drivers/gpu/drm/drm_gem.c              | 9 +++++++--
>>   drivers/gpu/drm/msm/msm_gem_shrinker.c | 8 ++++++--
>>   include/drm/drm_gem.h                  | 4 +++-
>>   3 files changed, 16 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
>> index 59a0bb5ebd85..c6bca5ac6e0f 100644
>> --- a/drivers/gpu/drm/drm_gem.c
>> +++ b/drivers/gpu/drm/drm_gem.c
>> @@ -1388,10 +1388,13 @@ EXPORT_SYMBOL(drm_gem_lru_move_tail);
>>    *
>>    * @lru: The LRU to scan
>>    * @nr_to_scan: The number of pages to try to reclaim
>> + * @remaining: The number of pages left to reclaim
>>    * @shrink: Callback to try to shrink/reclaim the object.
>>    */
>>   unsigned long
>> -drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
>> +drm_gem_lru_scan(struct drm_gem_lru *lru,
>> +         unsigned int nr_to_scan,
>> +         unsigned long *remaining,
>>            bool (*shrink)(struct drm_gem_object *obj))
>>   {
>>       struct drm_gem_lru still_in_lru;
>> @@ -1430,8 +1433,10 @@ drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
>>            * hit shrinker in response to trying to get backing pages
>>            * for this obj (ie. while it's lock is already held)
>>            */
>> -        if (!dma_resv_trylock(obj->resv))
>> +        if (!dma_resv_trylock(obj->resv)) {
>> +            *remaining += obj->size >> PAGE_SHIFT;
>>               goto tail;
>> +        }
>>             if (shrink(obj)) {
>>               freed += obj->size >> PAGE_SHIFT;
>> diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
>> index 051bdbc093cf..b7c1242014ec 100644
>> --- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
>> +++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
>> @@ -116,12 +116,14 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
>>       };
>>       long nr = sc->nr_to_scan;
>>       unsigned long freed = 0;
>> +    unsigned long remaining = 0;
>>         for (unsigned i = 0; (nr > 0) && (i < ARRAY_SIZE(stages)); i++) {
>>           if (!stages[i].cond)
>>               continue;
>>           stages[i].freed =
>> -            drm_gem_lru_scan(stages[i].lru, nr, stages[i].shrink);
>> +            drm_gem_lru_scan(stages[i].lru, nr, &remaining,
> 
> This function relies on remaining being pre-initialized. That's not
> obvious and is error-prone. At least, pass in something like
> &stages[i].remaining that is then initialized internally by
> drm_gem_lru_scan() to zero. And similar to freed, sum up the individual
> stages' remaining here.
> 
> TBH I somehow don't like the overall design of how all these functions
> interact with each other. But I also can't really point to the actual
> problem. So it's best to take what you have here; maybe with the change
> I proposed.
> 
> Reviewed-by: Thomas Zimmermann <tzimmermann@...e.de>

I had to keep remaining pre-initialized by the caller, because moving the
initialization into drm_gem_lru_scan() was hurting the rest of the code.
I did, however, update the MSM patch to use &stages[i].remaining.
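
For illustration only, here is a minimal userspace sketch of that
arrangement: the caller zeroes each stage's remaining counter, the scan
helper only adds to it, and the per-stage values are summed the same way
as freed. Every fake_* name is hypothetical and none of this is the
actual driver code. The final return policy (report the freed pages only
while some pages are still left to reclaim, otherwise stop) is an
assumption based on the commit message, and FAKE_SHRINK_STOP is an
assumed stand-in for the kernel's SHRINK_STOP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; every fake_* name below is hypothetical. */

/* Assumed stand-in for the kernel's SHRINK_STOP return value. */
#define FAKE_SHRINK_STOP (~0UL)

struct fake_obj {
	unsigned long pages;	/* object size in pages, 0 once reclaimed */
	bool locked;		/* models a failed dma_resv_trylock() */
};

struct fake_stage {
	struct fake_obj *objs;
	size_t count;
	unsigned long freed;
	unsigned long remaining;
};

/*
 * Models the LRU scan: pages of objects that cannot be locked are added
 * to *remaining, which the caller must have initialized beforehand.
 */
static unsigned long fake_lru_scan(struct fake_obj *objs, size_t count,
				   unsigned long nr_to_scan,
				   unsigned long *remaining)
{
	unsigned long freed = 0;

	assert(remaining != NULL);

	for (size_t i = 0; i < count && freed < nr_to_scan; i++) {
		if (objs[i].pages == 0)
			continue;	/* already reclaimed */
		if (objs[i].locked) {
			/* Lock contention: count the pages as left over. */
			*remaining += objs[i].pages;
			continue;
		}
		freed += objs[i].pages;
		objs[i].pages = 0;
	}

	return freed;
}

/* Models the shrinker callback that walks the stages in order. */
static unsigned long fake_shrinker_scan(struct fake_stage *stages,
					size_t nr_stages, long nr)
{
	unsigned long freed = 0;
	unsigned long remaining = 0;

	for (size_t i = 0; nr > 0 && i < nr_stages; i++) {
		/* The caller, not the scan helper, zeroes the counter. */
		stages[i].remaining = 0;
		stages[i].freed = fake_lru_scan(stages[i].objs,
						stages[i].count,
						(unsigned long)nr,
						&stages[i].remaining);
		nr -= (long)stages[i].freed;
		freed += stages[i].freed;
		remaining += stages[i].remaining;
	}

	/* Assumed policy: keep going only if pages were left behind. */
	return (freed > 0 && remaining > 0) ? freed : FAKE_SHRINK_STOP;
}
```

In this model, a pass that reclaims everything it sees returns
FAKE_SHRINK_STOP even though it freed pages, which is what lets the
caller break out of the loop described in the commit message.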

-- 
Best regards,
Dmitry