Date:   Mon, 11 Sep 2023 10:21:24 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Wei Xu <weixugc@...gle.com>
Cc:     Michal Hocko <mhocko@...e.com>,
        Yosry Ahmed <yosryahmed@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <muchun.song@...ux.dev>,
        Ivan Babrou <ivan@...udflare.com>,
        Michal Koutný <mkoutny@...e.com>,
        Waiman Long <longman@...hat.com>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
        Greg Thelen <gthelen@...gle.com>
Subject: Re: [PATCH v4 4/4] mm: memcg: use non-unified stats flushing for
 userspace reads

Hello,

On Mon, Sep 11, 2023 at 01:01:25PM -0700, Wei Xu wrote:
> Yes, it is the same test (10K contending readers). The kernel change
> is to remove stats_user_flush_mutex from mem_cgroup_user_flush_stats()
> so that the concurrent mem_cgroup_user_flush_stats() requests directly
> contend on cgroup_rstat_lock in cgroup_rstat_flush().

I don't think it'd be a good idea to twist rstat and other kernel internal
code to accommodate 10k parallel readers. If we want to support that, let's
explicitly support that by implementing better batching in the read path.
The only guarantee you need is that there has been at least one flush since
the read attempt started, so we can do something like the following in the
read path (a rough C sketch follows the steps):

1. Grab a waiter lock. Remember the current timestamp.

2. Try to lock the flush mutex. If obtained, drop the waiter lock, flush.
   Regrab the waiter lock, update the latest flush time to my start time,
   and wake up waiters on the waitqueue (maybe do custom wakeups based on
   start time?).

3. Release the waiter lock and sleep on the waitqueue.

4. When woken up, regrab the waiter lock and compare whether the latest
   flush timestamp is later than my start time. If so, return the latest
   result. If not, go back to #2.
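
In kernel-style C, a minimal sketch of the above might look like the
following. All the names here are made up (stats_waiter_lock,
stats_flush_mutex, stats_flush_waitq, stats_last_flush, stats_read_flush()
and do_flush() don't exist anywhere); do_flush() just stands in for
whatever does the actual work, e.g. cgroup_rstat_flush():

#include <linux/types.h>
#include <linux/spinlock.h>
#include <linux/mutex.h>
#include <linux/wait.h>
#include <linux/ktime.h>

static DEFINE_SPINLOCK(stats_waiter_lock);
static DEFINE_MUTEX(stats_flush_mutex);
static DECLARE_WAIT_QUEUE_HEAD(stats_flush_waitq);
static u64 stats_last_flush;    /* start time of the last completed flush */

/* Hypothetical; stands in for e.g. cgroup_rstat_flush(). */
static void do_flush(void);

/* Guarantee at least one full flush since this function was entered. */
static void stats_read_flush(void)
{
        u64 start;

        /* 1. Grab the waiter lock and remember when we started. */
        spin_lock(&stats_waiter_lock);
        start = ktime_get_ns();

        while (READ_ONCE(stats_last_flush) < start) {
                /* 2. Try to become the flusher. */
                if (mutex_trylock(&stats_flush_mutex)) {
                        spin_unlock(&stats_waiter_lock);

                        do_flush();

                        /* Everything that started before @start is covered. */
                        spin_lock(&stats_waiter_lock);
                        WRITE_ONCE(stats_last_flush, start);
                        spin_unlock(&stats_waiter_lock);

                        mutex_unlock(&stats_flush_mutex);
                        wake_up_all(&stats_flush_waitq);
                        return;
                }

                /* 3. Someone else is flushing; sleep on the waitqueue. */
                spin_unlock(&stats_waiter_lock);
                wait_event(stats_flush_waitq,
                           READ_ONCE(stats_last_flush) >= start ||
                           !mutex_is_locked(&stats_flush_mutex));
                /*
                 * 4. Woken up: either the latest flush covers our start
                 * time (the loop condition ends the wait and the caller
                 * can use the cached result), or the flusher is gone and
                 * we go back to #2.
                 */
                spin_lock(&stats_waiter_lock);
        }

        spin_unlock(&stats_waiter_lock);
}

The wake_up_all() is the dumb version of the wakeup; the "custom wakeups
based on start time" mentioned in step #2 would instead wake only the
waiters whose start time the just-finished flush actually covers.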

Maybe the above isn't the best way to do it, but you get the general idea.
When you have that many concurrent readers, most of them won't need to
actually flush.

Thanks.

-- 
tejun
