Message-ID: <700b63d1-9b77-83be-6dab-cce9438f8ae2@virtuozzo.com>
Date:   Thu, 9 Aug 2018 13:22:12 +0300
From:   Kirill Tkhai <ktkhai@...tuozzo.com>
To:     Dave Chinner <david@...morbit.com>,
        Michal Hocko <mhocko@...nel.org>
Cc:     akpm@...ux-foundation.org, gregkh@...uxfoundation.org,
        rafael@...nel.org, viro@...iv.linux.org.uk,
        darrick.wong@...cle.com, paulmck@...ux.vnet.ibm.com,
        josh@...htriplett.org, rostedt@...dmis.org,
        mathieu.desnoyers@...icios.com, jiangshanlai@...il.com,
        hughd@...gle.com, shuah@...nel.org, robh@...nel.org,
        ulf.hansson@...aro.org, aspriel@...il.com,
        vivek.gautam@...eaurora.org, robin.murphy@....com, joe@...ches.com,
        heikki.krogerus@...ux.intel.com, sfr@...b.auug.org.au,
        vdavydov.dev@...il.com, chris@...is-wilson.co.uk,
        penguin-kernel@...ove.SAKURA.ne.jp, aryabinin@...tuozzo.com,
        willy@...radead.org, ying.huang@...el.com, shakeelb@...gle.com,
        jbacik@...com, mingo@...nel.org, mhiramat@...nel.org,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH RFC 01/10] rcu: Make CONFIG_SRCU unconditionally enabled

On 09.08.2018 00:31, Dave Chinner wrote:
> On Wed, Aug 08, 2018 at 12:27:34PM +0200, Michal Hocko wrote:
>> [CC Josh - the whole series is
>> http://lkml.kernel.org/r/153365347929.19074.12509495712735843805.stgit@localhost.localdomain]
>>
>> On Wed 08-08-18 13:17:44, Kirill Tkhai wrote:
>>> On 08.08.2018 10:20, Michal Hocko wrote:
>>>> On Tue 07-08-18 18:37:36, Kirill Tkhai wrote:
>>>>> This patch kills all CONFIG_SRCU defines and
>>>>> the code under !CONFIG_SRCU.
>>>>
>>>> The last time somebody tried to do this there was a pushback due to
>>>> kernel tinyfication. So this should really give some numbers about the
>>>> code size increase. Also why can't we make this depend on MMU. Is
>>>> anybody else than the reclaim asking for unconditional SRCU usage?
>>>
>>> I don't know one. The size numbers (sparc64) are:
>>>
>>> $ size image.srcu.disabled 
>>>    text	   data	    bss	    dec	    hex	filename
>>> 5117546	8030506	1968104	15116156	 e6a77c	image.srcu.disabled
>>> $ size image.srcu.enabled
>>>    text	   data	    bss	    dec	    hex	filename
>>> 5126175	8064346	1968104	15158625	 e74d61	image.srcu.enabled
>>> The difference is: 15158625-15116156 = 42469 ~41Kb
>>>
>>> Please, see the measurement details to my answer to Stephen.
>>>
>>>> Btw. I totally agree with Steven. This is a very poor changelog. It is
>>>> trivial to see what the patch does but it is far from clear why it is
>>>> doing that and why we cannot go other ways.
>>> We possibly can go another way, and there is a comment on [2/10] about this.
>>> A percpu rwsem may be used instead; its only drawback is that it will
>>> make shrink_slab() wait for unregistering shrinkers, while the srcu-based
>>> implementation does not require this.
>>
>> Well, if unregistering doesn't do anything subtle - e.g. an allocation
>> or take locks which depend on allocation - and we can guarantee that
>> then blocking shrink_slab shouldn't be a big deal.
> 
> unregister_shrinker() already blocks shrink_slab - taking a rwsem in
> write mode blocks all readers - so using a per-cpu rwsem doesn't
> introduce anything new or unexpected. I'd like to see numbers of the
> different methods before anything else.

The difference is that percpu_rw_semaphore makes readers wait until an
RCU grace period has finished. On big machines with many CPUs this can
take an unpredictable amount of time, which is not good.
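
To illustrate the SRCU side of this comparison, here is a toy,
single-threaded userspace model. It is only a sketch and not kernel code:
the toy_* names are hypothetical, there is no real concurrency or memory
ordering, and it only models the property discussed here, namely that a
reader arriving while an unregister is in progress does not wait, and the
unregister only waits for readers that started before it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of an SRCU-like domain (hypothetical, NOT the kernel's
 * struct srcu_struct). Readers are counted per index; the updater
 * flips the index and waits only for readers of the old index.
 */
struct toy_srcu {
	int idx;        /* index that new readers will use */
	int readers[2]; /* number of active readers per index */
};

/* Reader entry: never waits, just records itself under the current index. */
static int toy_read_lock(struct toy_srcu *s)
{
	int idx = s->idx;

	s->readers[idx]++;
	return idx;
}

/* Reader exit: drops the count taken by toy_read_lock(). */
static void toy_read_unlock(struct toy_srcu *s, int idx)
{
	assert(s->readers[idx] > 0);
	s->readers[idx]--;
}

/* Updater, step 1: redirect new readers and return the index to drain. */
static int toy_start_sync(struct toy_srcu *s)
{
	int old = s->idx;

	s->idx = 1 - old;
	return old;
}

/* Updater, step 2: true once all pre-existing readers have left. */
static bool toy_sync_done(const struct toy_srcu *s, int old)
{
	return s->readers[old] == 0;
}
```

In this model, a reader that calls toy_read_lock() after
toy_start_sync() lands on the new index and is not held up, while
toy_sync_done() becomes true as soon as the earlier readers call
toy_read_unlock().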
 
> IMO, the big deal is that the split unregister mechanism seems to
> imply superblock shrinkers can be called during sb teardown or
> /after/ the filesystem has been torn down in memory (i.e. after
> ->put_super() is called). That's a change of behaviour, but it's
> left to the filesystem to detect and handle that condition. That's
> exceedingly subtle and looks like a recipe for disaster to me. I
> note that XFS hasn't been updated to detect and avoid this landmine.
> 
> And, FWIW, filesystems with multiple shrinkers (e.g. XFS has 3 per
> mount) will take the SRCU penalty multiple times during unmount, and
> potentially be exposed to multiple different "use during/after
> teardown" race conditions.
> 
>> It is subtle though.
>> Maybe subtle enough to make unconditional SRCU worth it. This all should
>> be in the changelog though.
> 
> IMO, we've had enough recent bugs to deal with from shrinkers being
> called before the filesystem is set up and from trying to handle
> allocation errors during setup. Do we really want to make shrinker
> shutdown just as prone to mismanagement and subtle, hard to hit
> bugs? I don't think we do - unmount is simply not a critical
> performance path.

Different situations are possible; people use Linux however they want.
Imagine you need to reboot an NFS server, but first you want to log in to
the clients over ssh and unmount them, and time is critical. Something
like this. I believe there are many such examples where people need this.
