Message-Id: <20220524121143.635372-1-liuyacan@corp.netease.com>
Date:   Tue, 24 May 2022 20:11:43 +0800
From:   liuyacan@...p.netease.com
To:     tonylu@...ux.alibaba.com
Cc:     davem@...emloft.net, edumazet@...gle.com, kgraul@...ux.ibm.com,
        kuba@...nel.org, linux-kernel@...r.kernel.org,
        linux-s390@...r.kernel.org, liuyacan@...p.netease.com,
        netdev@...r.kernel.org, pabeni@...hat.com, ubraun@...ux.ibm.com
Subject: Re: [PATCH v2 net] net/smc: postpone sk_refcnt increment in connect()

> > > >> This is a rather unusual problem that can come up when fallback=true BEFORE smc_connect()
> > > >> is called. But nevertheless, it is a problem.
> > > >>
> > > >> Right now I am not sure if it is okay when we NOT hold a ref to smc->sk during all fallback
> > > >> processing. This change also conflicts with a patch that is already on net-next (3aba1030).
> > > > 
> > > > Do you mean put the ref to smc->sk during all fallback processing unconditionally and remove 
> > > > the fallback branch sock_put() in __smc_release()?
> > > 
> > > What I had in mind was to eventually call sock_put() in __smc_release() even if sk->sk_state == SMC_INIT
> > > (currently the extra check in the if() for sk->sk_state != SMC_INIT prevents the sock_put()), but only
> > > when it is sure that we actually reached the sock_hold() in smc_connect() before.
> > > 
> > > But maybe we find out that the sock_hold() is not needed for fallback sockets, I don't know...
> > 
> > I do think the sock_hold()/sock_put() for smc->sk is a bit complicated, Emm, I'm not sure if it 
> > can be simplified..
> > 
> > In fact, I'm sure there must be another ref count issue in my environment,but I haven't caught it yet.
> 
> I am wondering the issue of this ref count. If it is convenient, would
> you like to provide some more details?
> 
> syzkaller has reported some issues about ref count, but syzkaller and
> others' bot don't have RDMA devices, they cannot cover most of the code
> routines in SMC. We are working on it to provide SMC fuzz test with RDMA
> environment. So it's very nice to have real world issues.
> 
> Thanks,
> Tony Lu

I have encountered two types of problems, but I cannot reproduce either of them reliably.

case 1. Long after closing the app (well past TIME_WAIT), 'lsmod' shows that the smc module ref count is still greater than 0.
case 2 [rare]. 'lsmod' shows that the smc module ref count is less than 0.
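
A minimal sketch for watching for both cases, assuming the use count is read from /proc/modules (the file 'lsmod' formats) and that `baseline` is whatever count you see with no SMC sockets open (it may be non-zero if other modules depend on smc). The function names and the `baseline` parameter are hypothetical and are not part of any SMC tooling:

```python
def module_refcount(modules_text, name="smc"):
    """Return the use count of `name` from /proc/modules-style text, or None."""
    for line in modules_text.splitlines():
        fields = line.split()
        # Fields per line: name, size, use count, users, state, address.
        if len(fields) >= 3 and fields[0] == name:
            # The count is "-" on kernels built without module unloading.
            return None if fields[2] == "-" else int(fields[2])
    return None


def classify(count, baseline=0):
    """Map a use count sampled after all SMC sockets are closed to a case."""
    if count is None:
        return "module not loaded or no count available"
    if count < 0:
        return "case 2: ref count below 0"
    if count > baseline:
        return "case 1: ref count still held"
    return "ok"


if __name__ == "__main__":
    # Sample once; run this well after the app has exited.
    with open("/proc/modules") as f:
        print(classify(module_refcount(f.read())))
```

Sampling this periodically after each test run only narrows down when the count drifts. It does not identify which sock_hold()/sock_put() pair is unbalanced.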

Some clues for case 2 are as follows:

  kernel: [67166.688386] ------------[ cut here ]------------
  kernel: [67166.693658] cache_from_obj: Wrong slab cache. SMC but object is from SMC
  kernel: [67166.701136] WARNING: CPU: 47 PID: 176961 at mm/slab.h:469 kmem_cache_free+0x329/0x410
  ......
  kernel: [67166.846819] CPU: 47 PID: 176961 Comm: redis-server Kdump: loaded Tainted: G  R B      OE     5.10.0-0.bpo.9-amd64 #1 Debian 5.10.70-1~bpo10+1
  kernel: [67166.860915] Hardware name: Inspur SA5280M6/SA5280M6, BIOS 06.00.01 10/09/2021
  kernel: [67166.868747] RIP: 0010:kmem_cache_free+0x329/0x410
  kernel: [67166.874168] Code: ff 0f 0b 48 8d b8 f0 9d 02 00 e9 e4 fe ff ff 48 8b 57 60 49 8b 4f 60 48 c7 c6 30 86 63 a4 48 c7 c7 f8 e6 8f a4 e8 89 63 5c 00 <0f> 0b 48 89 de 4c 89 ff e8 1a ad ff ff 48 8b 0d 63 34 ef 00 e9 49
  kernel: [67166.894360] RSP: 0018:ffffbd450f527e18 EFLAGS: 00010286
  kernel: [67166.900306] RAX: 0000000000000000 RBX: ffffa00fa4548d00 RCX: 0000000000000000
  kernel: [67166.908169] RDX: ffffa04c7f7e8760 RSI: ffffa04c7f7d8a00 RDI: ffffa04c7f7d8a00
  kernel: [67166.916027] RBP: ffffa01024548d00 R08: 0000000000000000 R09: c0000000ffffbfff
  kernel: [67166.923860] R10: 0000000000000001 R11: ffffbd450f527c20 R12: 0000000000000000
  kernel: [67166.931713] R13: 0000000000000000 R14: ffffa00fa4548f28 R15: ffffa02d3366bf00
  kernel: [67166.939564] FS:  00007fe131c80f40(0000) GS:ffffa04c7f7c0000(0000) knlGS:0000000000000000
  kernel: [67166.948361] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  kernel: [67166.954817] CR2: 00007fe12f477000 CR3: 00000004874be003 CR4: 0000000000770ee0
  kernel: [67166.962662] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  kernel: [67166.970498] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  kernel: [67166.978306] PKRU: 55555554
  kernel: [67166.981695] Call Trace:
  kernel: [67166.985017]  __sk_destruct+0x12c/0x1e0
  kernel: [67166.989449]  smc_release+0x19a/0x230 [smc]
  kernel: [67166.994325]  __sock_release+0x3d/0xa0
  kernel: [67166.998656]  sock_close+0x11/0x20
  kernel: [67167.002637]  __fput+0x93/0x240
  kernel: [67167.006347]  task_work_run+0x76/0xb0
  kernel: [67167.010569]  exit_to_user_mode_prepare+0x129/0x130
  kernel: [67167.016000]  syscall_exit_to_user_mode+0x28/0x140
  kernel: [67167.021339]  entry_SYSCALL_64_after_hwframe+0x44/0xa9