Message-ID: <20250519200003.46elezpkkfx5grl4@gmail.com>
Date: Mon, 19 May 2025 13:00:03 -0700
From: John Fastabend <john.fastabend@...il.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Jiayuan Chen <jiayuan.chen@...ux.dev>, bpf@...r.kernel.org,
	Jakub Sitnicki <jakub@...udflare.com>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>,
	Cong Wang <cong.wang@...edance.com>,
	Alexei Starovoitov <ast@...nel.org>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v1] bpf, sockmap: Fix concurrency issues between
 memory charge and uncharge

On 2025-05-18 11:48:31, Cong Wang wrote:
> On Thu, May 08, 2025 at 02:24:22PM +0800, Jiayuan Chen wrote:
> > The WARN_ON_ONCE(sk->sk_forward_alloc) can be triggered by running the
> > following command and pressing Ctrl-C after 2 seconds:
> > ./bench sockmap -c 2 -p 1 -a --rx-verdict-ingress
> > '''
> > ------------[ cut here ]------------
> > WARNING: CPU: 2 PID: 40 at net/ipv4/af_inet.c inet_sock_destruct
> > 
> > Call Trace:
> > <TASK>
> > __sk_destruct+0x46/0x222
> > sk_psock_destroy+0x22f/0x242
> > process_one_work+0x504/0x8a8
> > ? process_one_work+0x39d/0x8a8
> > ? __pfx_process_one_work+0x10/0x10
> > ? worker_thread+0x44/0x2ae
> > ? __list_add_valid_or_report+0x83/0xea
> > ? srso_return_thunk+0x5/0x5f
> > ? __list_add+0x45/0x52
> > process_scheduled_works+0x73/0x82
> > worker_thread+0x1ce/0x2ae
> > '''
> > 
> > Reason:
> > When we are in the backlog process, we allocate sk_msg and then perform
> > the charge process. Meanwhile, in the user process context, the recvmsg()
> > operation performs the uncharge process, leading to concurrency issues
> > between them.
> > 
> > The charge process (2 functions):
> > 1. sk_rmem_schedule(size) -> sk_forward_alloc increases by PAGE_SIZE
> >                              multiples
> > 2. sk_mem_charge(size)    -> sk_forward_alloc -= size
> > 
> > The uncharge process (sk_mem_uncharge()):
> > 3. sk_forward_alloc += size
> > 4. check if sk_forward_alloc > PAGE_SIZE
> > 5. reclaim    -> sk_forward_alloc decreases, possibly becoming 0
> > 
> > Because the sk performing the charge and uncharge is not locked
> > (mainly because the backlog process does not lock the socket),
> > steps 1 to 5 can execute concurrently as follows:
> > 
> > cpu0                                cpu1
> > 1
> >                                     3
> >                                     4   --> sk_forward_alloc >= PAGE_SIZE
> >                                     5   --> reclaim sk_forward_alloc
> > 2 --> sk_forward_alloc may
> >       become negative
> > 
> > Solution:
> > 1. Add locking to the kfree_sk_msg() process, which is only called in the
> >    user process context.
> > 2. Integrate the charge process into sk_psock_create_ingress_msg() in the
> >    backlog process and add locking.
> > 3. Reuse the existing psock->ingress_lock.
> 
> Reusing the psock->ingress_lock looks weird to me, as it is intended for
> locking ingress queue, at least at the time it was introduced.
> 
> And technically speaking, it is the sock lock which is supposed to serialize
> socket charging.
> 
> So is there any better solution here?

Agree, I would be more apt to add the sock lock back to the backlog than
to punish the fast path this way.

Holding the refcnt on the psock now blocks sk_psock_destroy() from the
backlog, so is this still an issue?

Thanks,
John
