Message-ID: <20211018105831.77cde2ad@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Mon, 18 Oct 2021 10:58:31 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Toke Høiland-Jørgensen <toke@...e.dk>
Cc: Vlad Buslov <vladbu@...dia.com>, Paolo Abeni <pabeni@...hat.com>,
Daniel Borkmann <daniel@...earbox.net>,
syzbot <syzbot+62e474dd92a35e3060d8@...kaller.appspotmail.com>,
andrii@...nel.org, ast@...nel.org, bpf@...r.kernel.org,
davem@...emloft.net, hawk@...nel.org, john.fastabend@...il.com,
kafai@...com, kpsingh@...nel.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, songliubraving@...com,
syzkaller-bugs@...glegroups.com, yhs@...com, joamaki@...il.com,
Saeed Mahameed <saeedm@...dia.com>,
Maxim Mikityanskiy <maximmi@...dia.com>
Subject: Re: [syzbot] BUG: corrupted list in netif_napi_add
On Mon, 18 Oct 2021 19:40:40 +0200 Toke Høiland-Jørgensen wrote:
> Jakub Kicinski <kuba@...nel.org> writes:
>
> > On Mon, 18 Oct 2021 17:04:19 +0300 Vlad Buslov wrote:
> >> We got a use-after-free with very similar trace [0] during nightly
> >> regression. The issue happens when ip link up/down state is flipped
> >> several times in loop and doesn't reproduce for me manually. The fact
> >> that it didn't reproduce for me after running test ten times suggests
> >> that it is either very hard to reproduce or that it is a result of some
> >> interaction between several tests in our suite.
> >>
> >> [0]:
> >>
> >> [ 3187.779569] mlx5_core 0000:08:00.0 enp8s0f0: Link up
> >> [ 3187.890694] ==================================================================
> >> [ 3187.892518] BUG: KASAN: use-after-free in __list_add_valid+0xc3/0xf0
> >> [ 3187.894132] Read of size 8 at addr ffff8881150b3fb8 by task ip/119618
> >
> > Hm, not sure how similar it is. This one looks like channel was freed
> > without deleting NAPI. Do you have list debug enabled?
>
> Well, the other report[0] also kinda looks like the NAPI thread keeps
> running after it should have been disabled, so maybe they are in fact
> related?
>
> [0] https://lore.kernel.org/r/000000000000c1524005cdeacc5f@google.com
Could be. If napi->state gets corrupted, it may lose NAPI_STATE_LISTED.
719c57197010 ("net: make napi_disable() symmetric with enable")
3765996e4f0b ("napi: fix race inside napi_enable")
are the only changes that come to mind, but they look fine to me.
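
To make the NAPI_STATE_LISTED concern concrete, here is a minimal user-space
sketch, not kernel code. It assumes the delete path only unlinks an instance
whose "listed" bit is still set, so a plain overwrite of the state word
leaves a stale entry on the device list. If the owner then frees that
instance, a later list add would touch freed memory. Every name below
(sketch_napi, SKETCH_STATE_LISTED, sketch_stale_entries, and so on) is
hypothetical and stands in for the real structures only loosely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for NAPI state bits; not the kernel's values. */
enum {
	SKETCH_STATE_SCHED  = 1u << 0,
	SKETCH_STATE_LISTED = 1u << 1,
};

struct sketch_node {
	struct sketch_node *prev, *next;
};

struct sketch_napi {
	unsigned long state;
	struct sketch_node dev_list;
};

static void sketch_list_init(struct sketch_node *head)
{
	head->prev = head->next = head;
}

static void sketch_list_add(struct sketch_node *n, struct sketch_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void sketch_list_del(struct sketch_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

static size_t sketch_list_len(const struct sketch_node *head)
{
	size_t len = 0;

	for (const struct sketch_node *p = head->next; p != head; p = p->next)
		len++;
	return len;
}

/* Link the instance into the device list, then mark it as listed. */
static void sketch_napi_add(struct sketch_node *dev_head, struct sketch_napi *n)
{
	n->state = 0;
	sketch_list_add(&n->dev_list, dev_head);
	n->state |= SKETCH_STATE_LISTED;
}

/* Assumed behaviour: skip the unlink when the listed bit is not set. */
static bool sketch_napi_del(struct sketch_napi *n)
{
	if (!(n->state & SKETCH_STATE_LISTED))
		return false;	/* treated as never added; list is untouched */
	n->state &= ~(unsigned long)SKETCH_STATE_LISTED;
	sketch_list_del(&n->dev_list);
	return true;
}

/* Entries left on the device list after add + (optional corruption) + del. */
size_t sketch_stale_entries(bool corrupt_state)
{
	struct sketch_node dev_head;
	struct sketch_napi napi;

	sketch_list_init(&dev_head);
	sketch_napi_add(&dev_head, &napi);
	if (corrupt_state)
		napi.state = SKETCH_STATE_SCHED;	/* plain store drops LISTED */
	sketch_napi_del(&napi);
	return sketch_list_len(&dev_head);
}
```

Calling sketch_stale_entries(false) leaves the list empty, while
sketch_stale_entries(true) leaves one entry behind. That is the kind of
leftover that would fit a channel being freed without its NAPI instance
having been unlinked.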