Message-ID: <20230914102737.08e61498@collabora.com>
Date:   Thu, 14 Sep 2023 10:27:37 +0200
From:   Boris Brezillon <boris.brezillon@...labora.com>
To:     Dmitry Osipenko <dmitry.osipenko@...labora.com>
Cc:     David Airlie <airlied@...il.com>,
        Gerd Hoffmann <kraxel@...hat.com>,
        Gurchetan Singh <gurchetansingh@...omium.org>,
        Chia-I Wu <olvaffe@...il.com>, Daniel Vetter <daniel@...ll.ch>,
        Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
        Maxime Ripard <mripard@...nel.org>,
        Thomas Zimmermann <tzimmermann@...e.de>,
        Christian König <christian.koenig@....com>,
        Qiang Yu <yuq825@...il.com>,
        Steven Price <steven.price@....com>,
        Emma Anholt <emma@...olt.net>, Melissa Wen <mwen@...lia.com>,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        kernel@...labora.com, virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH v16 15/20] drm/shmem-helper: Add memory shrinker

On Thu, 14 Sep 2023 10:50:32 +0300
Dmitry Osipenko <dmitry.osipenko@...labora.com> wrote:

> On 9/14/23 10:36, Boris Brezillon wrote:
> > On Thu, 14 Sep 2023 07:02:52 +0300
> > Dmitry Osipenko <dmitry.osipenko@...labora.com> wrote:
> >   
> >> On 9/13/23 10:48, Boris Brezillon wrote:  
> >>> On Wed, 13 Sep 2023 03:56:14 +0300
> >>> Dmitry Osipenko <dmitry.osipenko@...labora.com> wrote:
> >>>     
> >>>> On 9/5/23 11:03, Boris Brezillon wrote:    
> >>>>>>                * But
> >>>>>> +		 * acquiring the obj lock in drm_gem_shmem_release_pages_locked() can
> >>>>>> +		 * cause a locking order inversion between reservation_ww_class_mutex
> >>>>>> +		 * and fs_reclaim.
> >>>>>> +		 *
> >>>>>> +		 * This deadlock is not actually possible, because no one should
> >>>>>> +		 * be already holding the lock when drm_gem_shmem_free() is called.
> >>>>>> +		 * Unfortunately lockdep is not aware of this detail.  So when the
> >>>>>> +		 * refcount drops to zero, don't touch the reservation lock.
> >>>>>> +		 */
> >>>>>> +		if (shmem->got_pages_sgt &&
> >>>>>> +		    refcount_dec_and_test(&shmem->pages_use_count)) {
> >>>>>> +			drm_gem_shmem_do_release_pages_locked(shmem);
> >>>>>> +			shmem->got_pages_sgt = false;
> >>>>>>  		}      
> >>>>> Leaking memory is the right thing to do if pages_use_count > 1 (it's
> >>>>> better to leak than having someone access memory it no longer owns), but
> >>>>> I think it's worth mentioning in the above comment.      
> >>>>
> >>>> It's unlikely that it will be only a leak without a following up
> >>>> use-after-free. Neither is acceptable.    
> >>>
> >>> Not necessarily, if you have a page leak, it could be that the GPU has
> >>> access to those pages, but doesn't need the GEM object anymore
> >>> (pages are mapped by the iommu, which doesn't need shmem->sgt or
> >>> shmem->pages after the mapping is created). Without a WARN_ON(), this
> >>> can go unnoticed and lead to memory corruptions/information leaks.
> >>>     
> >>>>
> >>>> The drm_gem_shmem_free() could be changed such that kernel won't blow up
> >>>> on a refcnt bug, but that's not worthwhile doing because drivers
> >>>> shouldn't have silly bugs.    
> >>>
> >>> We definitely don't want to fix that, but we want to complain loudly
> >>> (WARN_ON()), and make sure the risk is limited (preventing memory from
> >>> being re-assigned to someone else by not freeing it).    
> >>
> >> That's what the code did and continues to do here. Not exactly sure what
> >> you're trying to say. I'm going to relocate the comment in v17 to
> >> put_pages(), we can continue discussing it there if I'm missing your point.
> >>  
> > 
> > I'm just saying it would be worth mentioning that we're intentionally
> > leaking memory if shmem->pages_use_count > 1. Something like:
> > 
> > 	/**
> > 	 * shmem->pages_use_count should be 1 when ->sgt != NULL and
> > 	 * zero otherwise. If some users still hold a pages reference
> > 	 * that's a bug, and we intentionally leak the pages so they
> > 	 * can't be re-allocated to someone else while the GPU/CPU
> > 	 * still have access to it.
> > 	 */
> > 	drm_WARN_ON(drm,
> > 		    refcount_read(&shmem->pages_use_count) == (shmem->sgt ? 1 : 0));
> > 	if (shmem->sgt && refcount_dec_and_test(&shmem->pages_use_count))
> > 		drm_gem_shmem_free_pages(shmem);  
> 
> That may be acceptable, but only once there is a driver using this
> feature.

Which feature? That's not related to a specific feature, that's just
how drm_gem_shmem_get_pages_sgt() works: it takes a pages ref that can
only be released in drm_gem_shmem_free(), because sgt users are not
refcounted and the sgt stays around until the GEM object is freed or
its pages are evicted. The only valid cases we have at the moment are:

- pages_use_count == 1 && sgt != NULL
- pages_use_count == 0

any other combination is a bug.
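
For illustration only, here's a rough driver-side sketch of that
lifecycle (a hypothetical example, not code from the patch; the dev and
size variables and the error handling are assumptions, only the shmem
helpers are real): the pages ref backing the sgt is taken at setup and
only dropped when the object is freed or its pages are evicted.

	struct drm_gem_shmem_object *shmem;
	struct sg_table *sgt;

	/* Create the shmem-backed GEM object (dev/size come from the driver). */
	shmem = drm_gem_shmem_create(dev, size);
	if (IS_ERR(shmem))
		return PTR_ERR(shmem);

	/* Takes a pages ref: pages_use_count goes from 0 to 1. */
	sgt = drm_gem_shmem_get_pages_sgt(shmem);
	if (IS_ERR(sgt)) {
		drm_gem_object_put(&shmem->base);
		return PTR_ERR(sgt);
	}

	/* ... map the sgt through the IOMMU and use the BO ... */

	/*
	 * Note: no drm_gem_shmem_put_pages() here. The ref backing the
	 * sgt is only released when the object is freed (or its pages
	 * are evicted), hence the two valid states listed above.
	 */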
