Message-ID: <CAL87dS1XL832QXZsQWnA0i5H-_WKibwHJwHjh5W02i8U4Ndotw@mail.gmail.com>
Date: Mon, 17 Jul 2023 10:35:26 +0800
From: mingkun bian <bianmingkun@...il.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: [ISSUE] suspicious sock leak
Hi,
I now use the cookie as the key instead of sock*, together with an LRU map, and it works OK.
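A minimal sketch of why the key choice matters, as a user-space Python simulation rather than the actual BPF program. It assumes that a freed sock address can be handed to a new socket while a cookie value is not reused; the `LruMap` class, the address `0xffff0001`, and the capacity of 2 are hypothetical and only for illustration.

```python
from collections import OrderedDict


class LruMap:
    """Tiny stand-in for an LRU hash map: the oldest entry is evicted when full."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def update(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict least recently used

    def lookup(self, key):
        value = self.entries.get(key)
        if value is not None:
            self.entries.move_to_end(key)
        return value

    def delete(self, key):
        return self.entries.pop(key, None)


def simulate(missed_free):
    """Return what each map yields for a new socket that reuses an old address."""
    by_ptr = {}            # keyed by sock address (hypothetical value below)
    by_cookie = LruMap(2)  # keyed by socket cookie

    # External connection: inserted into both maps.
    ptr, cookie = 0xFFFF0001, 173585924
    info = ("192.168.10.x", 80, cookie)
    by_ptr[ptr] = info
    by_cookie.update(cookie, info)

    # The probe on the free path either runs or is missed.
    if not missed_free:
        by_ptr.pop(ptr, None)
        by_cookie.delete(cookie)

    # A loopback socket now reuses the same address but has a new cookie.
    # Loopback sockets are never inserted, so any hit here is stale data.
    new_cookie = cookie + 1
    return by_ptr.get(ptr), by_cookie.lookup(new_cookie)
```

With `missed_free=True`, the address-keyed map returns the old "192.168.10.x 80" entry for the new socket, while the cookie-keyed lookup returns nothing; the leftover cookie entry is eventually evicted by the LRU policy.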
I have now basically confirmed the cause. In another experiment with the
TCP timer, I found a very small probability that a kprobe hit never
reaches the kprobe handler.
ftrace tracepoints used:
trace_timer_expire_entry and trace_timer_expire_exit, in call_timer_fn
trace_timer_start, in debug_activate
trace_timer_cancel, in debug_deactivate
Then I put a kprobe on "call_timer_fn". With a small probability, the
timer function has been executed but the kprobe handler has not.
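A toy model of how a probe hit can be dropped while the probed function still runs. This message does not state the mechanism; the sketch assumes a per-CPU "handler already running" guard that refuses to nest one probe handler inside another. The `Cpu` class and its fields are hypothetical names for illustration only.

```python
class Cpu:
    """Toy model of one CPU with a 'probe handler already running' flag."""

    def __init__(self):
        self.handler_running = False
        self.hits = []      # probes whose handler actually ran
        self.missed = []    # probes dropped by the guard
        self.executed = []  # probed functions that ran

    def call_probed(self, name, interrupt=None):
        """Fire the probe on `name`, then run the function itself."""
        if self.handler_running:
            self.missed.append(name)    # nested hit is dropped (assumed guard)
        else:
            self.handler_running = True
            self.hits.append(name)
            if interrupt is not None:
                interrupt()             # interrupt lands inside the handler
            self.handler_running = False
        self.executed.append(name)      # the function runs either way


def scenario():
    cpu = Cpu()
    # A timer interrupt runs call_timer_fn while the tcp_send_fin
    # probe handler is still active on the same CPU.
    cpu.call_probed("tcp_send_fin",
                    interrupt=lambda: cpu.call_probed("call_timer_fn"))
    # A later expiry outside any handler is observed normally.
    cpu.call_probed("call_timer_fn")
    return cpu
```

In this model `call_timer_fn` executes twice but its probe handler runs only once, which matches the symptom of a timer that fired without a corresponding kprobe hit.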
Most likely this happens when the kprobe handler
(kprobe_ftrace_handler) is interrupted by a hard interrupt (irq_exit),
as in the stack below:
nginx-10064 [000] d.s. 154816.275735: timer_cancel: timer=00000000fb259de9
nginx-10064 [000] d.s. 154816.275743: <stack trace>
=> run_timer_softirq
=> __do_softirq
=> irq_exit
=> smp_apic_timer_interrupt
=> apic_timer_interrupt
=> trace_call_bpf
=> kprobe_perf_func
=> kprobe_ftrace_handler
=> ftrace_ops_assist_func
=> 0xffffffffc078a0bf
=> tcp_send_fin
=> tcp_close
=> inet_release
=> __sock_release
=> sock_close
=> __fput
=> task_work_run
=> exit_to_usermode_loop
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
nginx-10064 [000] ..s. 154816.275744: timer_expire_entry: timer=00000000fb259de9 function=tcp_write_timer now=4449486051
nginx-10064 [000] ..s. 154816.275747: <stack trace>
=> call_timer_fn
=> run_timer_softirq
=> __do_softirq
=> irq_exit
=> smp_apic_timer_interrupt
=> apic_timer_interrupt
=> trace_call_bpf
=> kprobe_perf_func
=> kprobe_ftrace_handler
=> ftrace_ops_assist_func
=> 0xffffffffc078a0bf
=> tcp_send_fin
=> tcp_close
=> inet_release
=> __sock_release
=> sock_close
=> __fput
=> task_work_run
=> exit_to_usermode_loop
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
nginx-10064 [000] ..s. 154816.275748: timer_expire_exit: timer=00000000fb259de9
nginx-10064 [000] ..s. 154816.275750: <stack trace>
=> call_timer_fn
=> run_timer_softirq
=> __do_softirq
=> irq_exit
=> smp_apic_timer_interrupt
=> apic_timer_interrupt
=> trace_call_bpf
=> kprobe_perf_func
=> kprobe_ftrace_handler
=> ftrace_ops_assist_func
=> 0xffffffffc078a0bf
=> tcp_send_fin
=> tcp_close
=> inet_release
=> __sock_release
=> sock_close
=> __fput
=> task_work_run
=> exit_to_usermode_loop
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
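Traces like the ones above can be scanned automatically for events that fired while a kprobe handler was on the stack. The following is a rough sketch, assuming the text layout shown in this message; the function name `nested_events`, the regular expression, and the shortened `SAMPLE` excerpt are illustrative and not part of any existing tool.

```python
import re

# Matches an event line such as:
#   nginx-10064 [000] ..s. 154816.275744: timer_expire_entry: timer=...
# Lines ending in "<stack trace>" do not match because "<" is not a word character.
EVENT_RE = re.compile(r"^\S+\s+\[\d+\]\s+\S+\s+[\d.]+:\s+(\w+):")

# Shortened excerpt of the trace above (most stack frames omitted).
SAMPLE = """\
nginx-10064 [000] ..s. 154816.275744: timer_expire_entry: timer=00000000fb259de9 function=tcp_write_timer now=4449486051
nginx-10064 [000] ..s. 154816.275747: <stack trace>
=> call_timer_fn
=> run_timer_softirq
=> kprobe_ftrace_handler
=> tcp_send_fin
"""


def nested_events(trace_text, marker="kprobe_ftrace_handler"):
    """Return names of events whose stack trace contains the frame `marker`."""
    flagged = []
    current = None  # most recent event name seen
    for line in trace_text.splitlines():
        line = line.strip()
        match = EVENT_RE.match(line)
        if match:
            current = match.group(1)
        elif line.startswith("=>") and line[2:].strip() == marker and current:
            flagged.append(current)
            current = None  # report each event at most once
    return flagged


if __name__ == "__main__":
    print(nested_events(SAMPLE))
```

Each flagged event ran inside an interrupted kprobe handler, so a kprobe on the same code path may not have been delivered for it.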
Thanks.
On Mon, 14 Nov 2022 at 13:39, mingkun bian <bianmingkun@...il.com> wrote:
>
> Cong Wang <xiyou.wangcong@...il.com> wrote on Mon, 14 Nov 2022 at 09:25:
> >
> > On Sun, Nov 13, 2022 at 06:22:22PM +0800, mingkun bian wrote:
> > > Hi,
> > >
> > > bpf map1:
> > > key: cookie
> > > value: addr daddr sport dport cookie sock*
> > >
> > > bpf map2:
> > > key: sock*
> > > value: addr daddr sport dport cookie sock*
> >
> > So none of them is sockmap? Why not use sockmap which takes care
> > of sock refcnt for you?
> >
> > >
> > > 1. Recv a "HTTP GET" request in user application
> > > map1.insert(cookie, value)
> > > map2.insert(sock*, value)
> > >
> > > 1. kprobe inet_csk_destroy_sock:
> > > sk->sk_wmem_queued is 0
> > > sk->sk_wmem_alloc is 4201
> > > sk->sk_refcnt is 2
> > > sk->__sk_common.skc_cookie is 173585924
> > > saddr daddr sport dport is 192.168.10.x 80
> > >
> > > 2. kprobe __sk_free
> > > cannot find the "saddr daddr sport dport 192.168.10.x 80" in kprobe __sk_free
> > >
> > > 3. kprobe __sk_free
> > > after a while, "kprobe __sk_free" finds the "saddr daddr sport dport
> > > 127.0.0.1 xx" info
> > > value = map2.find(sock*)
> > > value1 = map1.find(sock->cookie)
> > > if (value) {
> > > map2.delete(sock) //print value info, find "saddr daddr sport
> > > dport" is "192.168.10.x 80", and value->cookie is 173585924, which is
> > > the same as the cookie of "192.168.10.x 80".
> > > }
> > >
> > > if (value1) {
> > > map1.delete(sock->cookie)
> > > }
> > >
> > > Here is my test flow; the commented line shows that the sock of "saddr
> > > daddr sport dport 192.168.10.x 80" never reaches __sk_free, yet it
> > > is reused by "saddr daddr sport dport 127.0.0.1 xx".
> >
> > I don't see this as a problem yet; the struct sock may still be referenced
> > by the kernel even after you close its corresponding struct socket from
> > user-space. And TCP sockets have timewait too, so...
> >
> > I suggest you try sockmap to store sockets instead.
> >
> > Thanks.
>
> Hi,
>
> I do not use sockmap in this scenario.
>
> The traffic model is about 20Gbps of external traffic and 80Gbps of lo
> traffic; only external traffic inserts into the bpf maps.
> An old sock should be reused only after it has gone through
> "__sock_free", whether or not the kernel still references it, but my
> test result shows otherwise.
> Also, the TIME_WAIT state still releases the sock immediately; function
> 'tcp_time_wait' creates a tcp_timewait_sock in place of the sock.
>
> My kernel is 4.18.0.
>
> Thanks.