linux-kernel - Re: [PATCH v1 1/3] mm: zswap: fix global shrinker memcg iteration

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAJD7tkZTSGz1bpo-pMNP_=11O-7RrhubWonqhUJwrt+TB=Ougg@mail.gmail.com>
Date: Wed, 12 Jun 2024 19:18:05 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Takero Funaki <flintglass@...il.com>
Cc: Nhat Pham <nphamcs@...il.com>, Johannes Weiner <hannes@...xchg.org>, 
	Chengming Zhou <chengming.zhou@...ux.dev>, Jonathan Corbet <corbet@....net>, 
	Andrew Morton <akpm@...ux-foundation.org>, 
	Domenico Cerasuolo <cerasuolodomenico@...il.com>, linux-mm@...ck.org, linux-doc@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Shakeel Butt <shakeel.butt@...ux.dev>
Subject: Re: [PATCH v1 1/3] mm: zswap: fix global shrinker memcg iteration

On Wed, Jun 12, 2024 at 7:13 PM Takero Funaki <flintglass@...il.com> wrote:
>
> 2024年6月13日(木) 3:28 Yosry Ahmed <yosryahmed@...gle.com>:
> >
> > On Wed, Jun 12, 2024 at 11:16 AM Takero Funaki <flintglass@...il.com> wrote:
> > >
> > > 2024年6月12日(水) 3:26 Nhat Pham <nphamcs@...il.com>:
> > >
> > > >
> > > > As I have noted in v0, I think this is unnecessary and makes it more confusing.
> > > >
> > >
> > > Does spin_lock() ensure that compiler optimizations do not remove
> > > memory access to an external variable? I think we need to use
> > > READ_ONCE/WRITE_ONCE for shared variable access even under a spinlock.
> > > For example,
> > > https://elixir.bootlin.com/linux/latest/source/mm/mmu_notifier.c#L234
> >
> > In this example, it seems like mmu_interval_set_seq() updates
> > interval_sub->invalidate_seq locklessly using WRITE_ONCE(). I think
> > this is why READ_ONCE() is required in that particular case.
> >
> > >
> > > isn't this a common use case of READ_ONCE?
> > > ```c
> > > bool shared_flag = false;
> > > spinlock_t flag_lock;
> > >
> > > void somefunc(void) {
> > >     for (;;) {
> > >         spin_lock(&flag_lock);
> > >         /* check external updates */
> > >         if (READ_ONCE(shared_flag))
> > >             break;
> > >         /* do something */
> > >         spin_unlock(&flag_lock);
> > >     }
> > >     spin_unlock(&flag_lock);
> > > }
> > > ```
> > > Without READ_ONCE, the check can be extracted from the loop by optimization.
> >
> > According to Documentation/memory-barriers.txt, lock acquiring
> > functions are implicit memory barriers. Otherwise, the compiler would
> > be able to pull any memory access outside of the lock critical section
> > and locking wouldn't be reliable.
>
> Ah, I understand now. The implicit barrier is sufficient as long as
> all memory access occurs within the lock. It’s a fundamental rule of
> locking—facepalm.
>
> I misread a module code, like in the link, where a lockless write
> could invade a critical section. My example was in the opposite
> direction, just wrong. Thank you so much for clarifying my
> misunderstanding.
>
> For now checking the patch, I suppose the locking mechanism itself is
> not affected by my misunderstanding of READ_ONCE.
>
> The corrected version of the cleaner should be:
> ```c
> void zswap_memcg_offline_cleanup(struct mem_cgroup *memcg)
> {
>         /* lock out zswap shrinker walking memcg tree */
>         spin_lock(&zswap_shrink_lock);
>         if (zswap_next_shrink == memcg) {
>                 do {
>                         zswap_next_shrink = mem_cgroup_iter(NULL,
>                                         zswap_next_shrink, NULL);
>                         spin_unlock(&zswap_shrink_lock);
>                         spin_lock(&zswap_shrink_lock);
>                         if (!zswap_next_shrink)
>                                 break;
>                 } while (!mem_cgroup_online(zswap_next_shrink));
>         }
>         spin_unlock(&zswap_shrink_lock);
> }
> ```

Is the idea here to avoid moving the iterator to another offline memcg
that zswap_memcg_offline_cleanup() was already called for, to avoid
holding a ref on that memcg until the next run of zswap shrinking?

If yes, I think it's probably worth doing. But why do we need to
release and reacquire the lock in the loop above?