lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANOLnONQaHXOp1z1rNum74N2b=Ub7t5NsGHqPdHUQL4+4YYEQg@mail.gmail.com>
Date:   Wed, 17 Aug 2022 23:12:33 +0300
From:   GraÅžvydas Ignotas <notasas@...il.com>
To:     Wei Wang <weiwan@...gle.com>
Cc:     Shakeel Butt <shakeelb@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Eric Dumazet <edumazet@...gle.com>,
        netdev <netdev@...r.kernel.org>, Michal Hocko <mhocko@...e.com>,
        Roman Gushchin <guro@...com>, Linux MM <linux-mm@...ck.org>,
        Cgroups <cgroups@...r.kernel.org>
Subject: Re: UDP rx packet loss in a cgroup with a memory limit

On Wed, Aug 17, 2022 at 9:16 PM Wei Wang <weiwan@...gle.com> wrote:
>
> On Wed, Aug 17, 2022 at 10:37 AM Shakeel Butt <shakeelb@...gle.com> wrote:
> >
> > + Eric and netdev
> >
> > On Wed, Aug 17, 2022 at 10:13 AM Johannes Weiner <hannes@...xchg.org> wrote:
> > >
> > > This is most likely a regression caused by this patch:
> > >
> > > commit 4b1327be9fe57443295ae86fe0fcf24a18469e9f
> > > Author: Wei Wang <weiwan@...gle.com>
> > > Date:   Tue Aug 17 12:40:03 2021 -0700
> > >
> > >     net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()
> > >
> > >     Add gfp_t mask as an input parameter to mem_cgroup_charge_skmem(),
> > >     to give more control to the networking stack and enable it to change
> > >     memcg charging behavior. In the future, the networking stack may decide
> > >     to avoid oom-kills when fallbacks are more appropriate.
> > >
> > >     One behavior change in mem_cgroup_charge_skmem() by this patch is to
> > >     avoid force charging by default and let the caller decide when and if
> > >     force charging is needed through the presence or absence of
> > >     __GFP_NOFAIL.
> > >
> > >     Signed-off-by: Wei Wang <weiwan@...gle.com>
> > >     Reviewed-by: Shakeel Butt <shakeelb@...gle.com>
> > >     Signed-off-by: David S. Miller <davem@...emloft.net>
> > >
> > > We never used to fail these allocations. Cgroups don't have a
> > > kswapd-style watermark reclaimer, so the network relied on
> > > force-charging and leaving reclaim to allocations that can block.
> > > Now it seems network packets could just fail indefinitely.
> > >
> > > The changelog is a bit terse given how drastic the behavior change
> > > is. Wei, Shakeel, can you fill in why this was changed? Can we revert
> > > this for the time being?
> >
> > Does reverting the patch fix the issue? However I don't think it will.
> >
> > Please note that we still have the force charging as before this
> > patch. Previously when mem_cgroup_charge_skmem() force charges, it
> > returns false and __sk_mem_raise_allocated takes suppress_allocation
> > code path. Based on some heuristics, it may allow it or it may
> > uncharge and return failure.
>
> The force charging logic in __sk_mem_raise_allocated only gets
> considered on tx path for STREAM socket. So it probably does not take
> effect on UDP path. And, that logic is NOT being altered in the above
> patch.
> So specifically for UDP receive path, what happens in
> __sk_mem_raise_allocated() BEFORE the above patch is:
> - mem_cgroup_charge_skmem() gets called:
>     - try_charge() with GFP_NOWAIT gets called and  failed
>     - try_charge() with __GFP_NOFAIL
>     - return false
> - goto suppress_allocation:
>     - mem_cgroup_uncharge_skmem() gets called
> - return 0 (which means failure)
>
> AFTER the above patch, what happens in __sk_mem_raise_allocated() is:
> - mem_cgroup_charge_skmem() gets called:
>     - try_charge() with GFP_NOWAIT gets called and failed
>     - return false
> - goto suppress_allocation:
>     - We no longer calls mem_cgroup_uncharge_skmem()
> - return 0
>
> So I agree with Shakeel, that this change shouldn't alter the behavior
> of the above call path in such a situation.
> But do let us know if reverting this change has any effect on your test.

The problem is still there (the kernel wasn't compiling after revert,
had to adjust another seemingly unrelated callsite). It's hard to tell
if it's better or worse since it happens so randomly.

>
> >
> > The given patch has not changed any heuristic. It has only changed
> > when forced charging happens. After the path the initial call
> > mem_cgroup_charge_skmem() can fail and we take suppress_allocation
> > code path and if heuristics allow, we force charge with __GFP_NOFAIL.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ