Message-ID: <e0387ee2-56b8-c780-8d33-c477a875e2df@roeck-us.net>
Date: Thu, 9 Jul 2020 12:13:06 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Cameron Berkenpas <cam@...-zeon.de>,
Peter Geis <pgwipeout@...il.com>,
Lu Fengqi <lufq.fnst@...fujitsu.com>,
Daniël Sonck <dsonck92@...il.com>,
Zhang Qiang <qiang.zhang@...driver.com>,
Thomas Lamprecht <t.lamprecht@...xmox.com>,
Daniel Borkmann <daniel@...earbox.net>,
Zefan Li <lizefan@...wei.com>, Tejun Heo <tj@...nel.org>,
Roman Gushchin <guro@...com>
Subject: Re: [Patch net v2] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()
On 7/9/20 11:51 AM, Cong Wang wrote:
> On Thu, Jul 9, 2020 at 10:10 AM Guenter Roeck <linux@...ck-us.net> wrote:
>>
>> Something seems fishy with the use of skcd->val on big endian systems.
>>
>> Some debug output:
>>
>> [ 22.643703] sock: ##### sk_alloc(sk=000000001be28100): Calling cgroup_sk_alloc(000000001be28550)
>> [ 22.643807] cgroup: ##### cgroup_sk_alloc(skcd=000000001be28550): cgroup_sk_alloc_disabled=0, in_interrupt: 0
>> [ 22.643886] cgroup: #### cgroup_sk_alloc(skcd=000000001be28550): cset->dfl_cgrp=0000000001224040, skcd->val=0x1224040
>> [ 22.643957] cgroup: ###### cgroup_bpf_get(cgrp=0000000001224040)
>> [ 22.646451] sock: ##### sk_prot_free(sk=000000001be28100): Calling cgroup_sk_free(000000001be28550)
>> [ 22.646607] cgroup: #### sock_cgroup_ptr(skcd=000000001be28550) -> 0000000000014040 [v=14040, skcd->val=14040]
>> [ 22.646632] cgroup: ####### cgroup_sk_free(): skcd=000000001be28550, cgrp=0000000000014040
>> [ 22.646739] cgroup: ####### cgroup_sk_free(): skcd->no_refcnt=0
>> [ 22.646814] cgroup: ####### cgroup_sk_free(): Calling cgroup_bpf_put(cgrp=0000000000014040)
>> [ 22.646886] cgroup: ###### cgroup_bpf_put(cgrp=0000000000014040)
>
> Excellent debugging! I thought it was a double put, but it seems to
> be an endian issue. I didn't realize the big endian machine actually
> packs bitfields in a big endian way too...
>
> Does the attached patch address this?
>
Partially. I don't see the crash anymore, but something is still odd - some of my
tests require a retry with this patch applied, which previously never happened.
I don't know if this is another problem with this patch, or a different problem.
Unfortunately, I'll be unable to debug this further until next Tuesday.
Guenter
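
For illustration only, here is a minimal sketch of the bitfield-ordering
effect discussed above. C leaves the allocation order of bitfields within a
storage unit implementation-defined: common little-endian ABIs allocate from
the least significant bit, common big-endian ABIs from the most significant.
Everything named demo_* is hypothetical; the union is a simplified stand-in
and not the kernel's struct sock_cgroup_data, and the explicit-mask variant
is one possible way to sidestep the problem, not the patch under discussion.
The value 0x1224040 is taken from the debug output above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a pointer-sized word whose low bits carry flags.
 * This is NOT the kernel's struct sock_cgroup_data. */
union demo_word {
	struct {
		uint8_t is_data   : 1;
		uint8_t no_refcnt : 1;
		uint8_t unused    : 6;
		uint8_t pad[7];
	} f;
	uint64_t val;
};

/* Report which bit of val is touched by setting f.is_data.
 * The answer depends on the ABI's byte order and bitfield allocation order. */
static uint64_t demo_bitfield_mask(void)
{
	union demo_word w = { .val = 0 };

	w.f.is_data = 1;
	return w.val;
}

/* ABI-independent alternative: name the flag bits explicitly on val. */
#define DEMO_IS_DATA   ((uint64_t)1 << 0)
#define DEMO_NO_REFCNT ((uint64_t)1 << 1)
#define DEMO_FLAGS     (DEMO_IS_DATA | DEMO_NO_REFCNT)

/* Combine pointer bits with flags; the pointer must leave the flag bits free. */
static uint64_t demo_pack(uint64_t ptr_bits, uint64_t flags)
{
	assert((ptr_bits & DEMO_FLAGS) == 0);
	assert((flags & ~DEMO_FLAGS) == 0);
	return ptr_bits | flags;
}

/* Recover the pointer bits by masking off the flags. */
static uint64_t demo_ptr(uint64_t val)
{
	return val & ~DEMO_FLAGS;
}
```

On a typical little-endian build, demo_bitfield_mask() would be expected to
return 0x1, while a typical big-endian build would be expected to return a
value with only the top bit set. demo_pack() and demo_ptr() behave the same
on both, so demo_ptr(demo_pack(0x1224040, DEMO_NO_REFCNT)) gives back
0x1224040 regardless of endianness.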