Message-ID: <20171008154746.bgtkxir2pytftef3@esperanza>
Date: Sun, 8 Oct 2017 18:47:46 +0300
From: Vladimir Davydov <vdavydov.dev@...il.com>
To: Al Viro <viro@...IV.linux.org.uk>
Cc: Michal Hocko <mhocko@...nel.org>,
Jia-Ju Bai <baijiaju1990@....com>, torbjorn.lindh@...ta.se,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [BUG] fs/super: a possible sleep-in-atomic bug in put_super
On Sun, Oct 08, 2017 at 03:03:32AM +0100, Al Viro wrote:
> On Sun, Oct 08, 2017 at 01:56:08AM +0100, Al Viro wrote:
>
> > What's more, we need to be careful about resize vs. drain. Right now it's
> > on list_lrus_mutex, but if we drop that around actual resize of an individual
> > list_lru, we'll need something else. Would there be any problem if we
> > took memcg_cache_ids_sem shared in memcg_offline_kmem()?
> >
> > The first problem is not fatal - we can e.g. use the sign of the field used
> > to store the number of ->memcg_lrus elements (i.e. stashed value of
> > memcg_nr_cache_ids at allocation or last resize) to indicate that actual
> > freeing is left for resizer...
>
> Ugh. That spinlock would have to be held over too much work, or bounced back
> and forth a lot on memcg shutdowns ;-/ Gets especially nasty if we want
> list_lru_destroy() callable from rcu callbacks. Oh, well...
>
> I still suspect that locking there is too heavy, but it looks like I don't have
> a better replacement.
>
> What are the realistic numbers of memcg on a big system?
Several thousand. I guess we could turn list_lrus_mutex into a spin lock
by making resize/drain procedures handle list_lru destruction as you
suggested above, but list_lru_destroy() would still have to iterate over
all elements of list_lru_node->memcg_lrus array to free per-memcg
objects, which is too heavy to be performed under sb_lock IMHO.
Thanks,
Vladimir