Message-ID: <bffca2e0-d36d-0b56-bb1b-a7e96d9493aa@proxmox.com>
Date:   Thu, 2 Jul 2020 10:12:19 +0200
From:   Thomas Lamprecht <t.lamprecht@...xmox.com>
To:     Cong Wang <xiyou.wangcong@...il.com>, Roman Gushchin <guro@...com>
Cc:     Cameron Berkenpas <cam@...-zeon.de>, Zefan Li <lizefan@...wei.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Peter Geis <pgwipeout@...il.com>,
        Lu Fengqi <lufq.fnst@...fujitsu.com>,
        Daniël Sonck <dsonck92@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Tejun Heo <tj@...nel.org>
Subject: Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

On 02.07.20 06:48, Cong Wang wrote:
> On Tue, Jun 30, 2020 at 3:48 PM Roman Gushchin <guro@...com> wrote:
>>
>> Btw if we want to backport the problem but can't blame a specific commit,
>> we can always use something like "Cc: <stable@...r.kernel.org>    [3.1+]".
> 
> Sure, but if we don't know which is the right commit to blame, then how
> do we know which stable version should the patch target? :)

We ran into a similar issue here after updating from the 5.4.41 to the 5.4.44
stable kernel. This patch addresses the issue: with it applied we have been
running stable for >17 hours of uptime, whereas without it we normally ran into
issues at <6 hours of uptime.

That update newly included commit 090e28b229af92dc5b ("netprio_cgroup: Fix
unlimited memory leak of v2 cgroups"), which this patch originally mentions as
"Fixes", whereas the other mentioned possible culprit 4bfc0bb2c60e2f4c ("bpf:
decouple the lifetime of cgroup_bpf from cgroup itself") was already included
with 5.2 here and did *not* cause problems.

So, while the real culprit may be something else, a mix of them, or even more
complex, the race is at least triggered far more frequently with
090e28b229af92dc5b ("netprio_cgroup: Fix unlimited memory leak of v2 cgroups")
or, for the sake of completeness, possibly with something else from the
v5.4.41..v5.4.44 commit range. I have not looked into that in detail yet.

> 
> I am open to all options here, including not backporting to stable at all.

As said, the stable-5.4.y tree benefits from having this patch, so there's
that.

Also, FWIW:
Tested-by: Thomas Lamprecht <t.lamprecht@...xmox.com>
