Message-ID: <Z1ZIn7+mBZdM5VNt@perf>
Date: Mon, 9 Dec 2024 10:32:15 +0900
From: Youngmin Nam <youngmin.nam@...sung.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Youngmin Nam <youngmin.nam@...sung.com>, Jakub Kicinski
	<kuba@...nel.org>, Neal Cardwell <ncardwell@...gle.com>,
	davem@...emloft.net, dsahern@...nel.org, pabeni@...hat.com,
	horms@...nel.org, dujeong.lee@...sung.com, guo88.liu@...sung.com,
	yiwang.cai@...sung.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, joonki.min@...sung.com,
	hajun.sung@...sung.com, d7271.choe@...sung.com, sw.ju@...sung.com
Subject: Re: [PATCH] tcp: check socket state before calling WARN_ON

On Fri, Dec 06, 2024 at 10:08:17AM +0100, Eric Dumazet wrote:
> On Fri, Dec 6, 2024 at 9:58 AM Youngmin Nam <youngmin.nam@...sung.com> wrote:
> >
> > On Fri, Dec 06, 2024 at 09:35:32AM +0100, Eric Dumazet wrote:
> > > On Fri, Dec 6, 2024 at 6:50 AM Youngmin Nam <youngmin.nam@...sung.com> wrote:
> > > >
> > > > On Wed, Dec 04, 2024 at 08:13:33AM +0100, Eric Dumazet wrote:
> > > > > On Wed, Dec 4, 2024 at 4:35 AM Youngmin Nam <youngmin.nam@...sung.com> wrote:
> > > > > >
> > > > > > On Tue, Dec 03, 2024 at 06:18:39PM -0800, Jakub Kicinski wrote:
> > > > > > > On Tue, 3 Dec 2024 10:34:46 -0500 Neal Cardwell wrote:
> > > > > > > > > I have not seen these warnings firing. Neal, have you seen this in the past ?
> > > > > > > >
> > > > > > > > I can't recall seeing these warnings over the past 5 years or so, and
> > > > > > > > (from checking our monitoring) they don't seem to be firing in our
> > > > > > > > fleet recently.
> > > > > > >
> > > > > > > FWIW I see this at Meta on 5.12 kernels, but nothing since.
> > > > > > > Could be that one of our workloads is pinned to 5.12.
> > > > > > > Youngmin, what's the newest kernel you can repro this on?
> > > > > > >
> > > > > > Hi Jakub.
> > > > > > Thank you for taking an interest in this issue.
> > > > > >
> > > > > > We've seen this issue since 5.15 kernel.
> > > > > > Now, we can see this on 6.6 kernel which is the newest kernel we are running.
> > > > >
> > > > > The fact that we are processing ACK packets after the write queue has
> > > > > been purged would be a serious bug.
> > > > >
> > > > > Thus the WARN() makes sense to us.
> > > > >
> > > > > It would be easy to build a packetdrill test. Please do so, then we
> > > > > can fix the root cause.
> > > > >
> > > > > Thank you !
> > > > >
> > > >
> > > > Hi Eric.
> > > >
> > > > Unfortunately, we are not familiar with the Packetdrill test.
> > > > Referring to the official website on GitHub, I tried to install it on my device.
> > > >
> > > > Here is what I did on my local machine.
> > > >
> > > > $ mkdir packetdrill
> > > > $ cd packetdrill
> > > > $ git clone https://github.com/google/packetdrill.git .
> > > > $ cd gtests/net/packetdrill/
> > > > $ ./configure
> > > > $ make CC=/home/youngmin/Downloads/arm-gnu-toolchain-13.3.rel1-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-gcc
> > > >
> > > > $ adb root
> > > > $ adb push packetdrill /data/
> > > > $ adb shell
> > > >
> > > > And here is what I did on my device
> > > >
> > > > erd9955:/data/packetdrill/gtests/net # ./packetdrill/run_all.py -S -v -L -l tcp/
> > > > /system/bin/sh: ./packetdrill/run_all.py: No such file or directory
> > > >
> > > > I'm not sure if this procedure is correct.
> > > > Could you help us run the Packetdrill on an Android device ?
> > >
> > > packetdrill can run anywhere, for instance on your laptop, no need to
> > > compile / install it on Android
> > >
> > > Then you can run single test like
> > >
> > > # packetdrill gtests/net/tcp/sack/sack-route-refresh-ip-tos.pkt
> > >
> >
> > You mean.. To test an Android device, we need to run packetdrill on laptop, right ?
> >
> > Laptop(run packetdrill script) <--------------------------> Android device
> >
> > By the way, how can we test the Android device (DUT) from packetdrill which is running on Laptop?
> > I hope you understand that I am asking this question because we are not familiar with packetdrill.
> > Thanks.
> 
> packetdrill does not need to run on a physical DUT, it uses a software
> stack : TCP and tun device.
> 
> You have a kernel tree, compile it and run a VM, like virtme-ng
> 
> vng -bv
> 
> We use this to run kernel selftests in which we started adding
> packetdrill tests (in recent kernel tree)
> 
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-4pkt.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_client.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_batch.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-after-win-update.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-fq-ack-per-2pkt.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_maxfrags.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_inq_server.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_exclusive.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_basic.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_small.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-app-limited-9-packets-out.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_oneshot.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_fastopen-server.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_inq_client.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_epoll_edge.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-app-limited.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_fastopen-client.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_zerocopy_closed.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-1pkt.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-after-idle.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt-send-5pkt.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_slow_start_slow-start-ack-per-2pkt-send-6pkt.pkt
> ./tools/testing/selftests/net/packetdrill/tcp_md5_md5-only-on-client-ack.pkt
> ./tools/testing/selftests/net/netfilter/packetdrill/conntrack_synack_old.pkt
> ./tools/testing/selftests/net/netfilter/packetdrill/conntrack_syn_challenge_ack.pkt
> ./tools/testing/selftests/net/netfilter/packetdrill/conntrack_inexact_rst.pkt
> ./tools/testing/selftests/net/netfilter/packetdrill/conntrack_synack_reuse.pkt
> ./tools/testing/selftests/net/netfilter/packetdrill/conntrack_rst_invalid.pkt
> ./tools/testing/selftests/net/netfilter/packetdrill/conntrack_ack_loss_stall.pkt
> 

You mean we should run our kernel in a virtual machine environment instead of on a real device?
Actually, we don't have a virtual environment for the Android kernel.
Additionally, I'm not sure whether a virtual environment for an Android device is available.
In any case, we are going to reproduce this issue using our stability stress test.

Thanks.
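
For reference, the .pkt files quoted above live under tools/testing/selftests/net in a recent kernel tree. A minimal sketch for listing them from a local checkout follows; the helper name list_pkt_tests and its argument are hypothetical, chosen only for illustration, and the path layout is assumed to match the one quoted in this thread.

```shell
#!/bin/sh
# Sketch: list the packetdrill scripts (*.pkt) shipped with a kernel tree's
# networking selftests. list_pkt_tests is a hypothetical helper name, not
# part of the kernel tree or of packetdrill.
list_pkt_tests() {
    # $1: path to the kernel source tree (assumption: defaults to ".")
    ksrc="${1:-.}"
    # Print nothing if the selftests directory is missing.
    find "$ksrc/tools/testing/selftests/net" -type f -name '*.pkt' 2>/dev/null | sort
}
```

Calling list_pkt_tests with the path of a kernel checkout prints one script path per line. Inside a VM started with `vng -bv`, and assuming a packetdrill binary is on the PATH there, a single listed file could then be run the way shown earlier in the thread, by passing it to packetdrill as its argument.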

