Message-ID: <CAD=hENdbCdjPCEnfz0-to81qGGAN4ONkHdrhQEPc1bC+-peYMQ@mail.gmail.com>
Date: Tue, 5 Oct 2021 21:11:12 +0800
From: Zhu Yanjun <zyjzyj2000@...il.com>
To: Haakon Bugge <haakon.bugge@...cle.com>
Cc: Dmitry Vyukov <dvyukov@...gle.com>, Jason Gunthorpe <jgg@...pe.ca>,
Doug Ledford <dledford@...hat.com>,
syzbot <syzbot+3a992c9e4fd9f0e6fd0e@...kaller.appspotmail.com>,
Leon Romanovsky <leon@...nel.org>,
OFED mailing list <linux-rdma@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
"syzkaller-bugs@...glegroups.com" <syzkaller-bugs@...glegroups.com>
Subject: Re: [syzbot] BUG: RESTRACK detected leak of resources
On Tue, Oct 5, 2021 at 1:56 AM Haakon Bugge <haakon.bugge@...cle.com> wrote:
>
>
>
> > On 4 Oct 2021, at 15:22, Dmitry Vyukov <dvyukov@...gle.com> wrote:
> >
> > On Mon, 4 Oct 2021 at 15:15, Jason Gunthorpe <jgg@...pe.ca> wrote:
> >>
> >> On Mon, Oct 04, 2021 at 02:42:11PM +0200, Dmitry Vyukov wrote:
> >>> On Mon, 4 Oct 2021 at 12:45, syzbot
> >>> <syzbot+3a992c9e4fd9f0e6fd0e@...kaller.appspotmail.com> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> syzbot found the following issue on:
> >>>>
> >>>> HEAD commit: c7b4d0e56a1d Add linux-next specific files for 20210930
> >>>> git tree: linux-next
> >>>> console output: https://syzkaller.appspot.com/x/log.txt?x=104be6cb300000
> >>>> kernel config: https://syzkaller.appspot.com/x/.config?x=c9a1f6685aeb48bd
> >>>> dashboard link: https://syzkaller.appspot.com/bug?extid=3a992c9e4fd9f0e6fd0e
> >>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> >>>>
> >>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>
> >>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>> Reported-by: syzbot+3a992c9e4fd9f0e6fd0e@...kaller.appspotmail.com
> >>>
> >>> +RESTRACK maintainers
> >>>
> >>> (it would also be good if RESTRACK would print a more standard oops
> >>> with stack/filenames, so that testing systems can attribute issues to
> >>> files/maintainers).
> >>
> >> restrack certainly should trigger a WARN_ON to stop the kernel.. But I
> >> don't know what stack trace would be useful here. The culprit is
> >> always the underlying driver, not the core code..
> >
> > There seems to be a significant overlap between
> > drivers/infiniband/core/restrack.c and drivers/infiniband/sw/rxe/rxe.c
> > maintainers, so perhaps restrack.c is a good enough approximation to
> > extract relevant people (definitely better than no CC at all :))
>
> Looks to me like this is rxe:
>
> [ 1892.778632][ T8958] BUG: KASAN: use-after-free in __rxe_drop_index_locked+0xb5/0x100
> [snip]
> [ 1892.822375][ T8958] Call Trace:
> [ 1892.825655][ T8958] <TASK>
> [ 1892.828594][ T8958] dump_stack_lvl+0xcd/0x134
> [ 1892.833273][ T8958] print_address_description.constprop.0.cold+0x6c/0x30c
> [ 1892.840316][ T8958] ? __rxe_drop_index_locked+0xb5/0x100
> [ 1892.845864][ T8958] ? __rxe_drop_index_locked+0xb5/0x100
> [ 1892.851424][ T8958] kasan_report.cold+0x83/0xdf
> [ 1892.856200][ T8958] ? __rxe_drop_index_locked+0xb5/0x100
> [ 1892.861761][ T8958] kasan_check_range+0x13d/0x180
> [ 1892.866780][ T8958] __rxe_drop_index_locked+0xb5/0x100
> [ 1892.872164][ T8958] __rxe_drop_index+0x3f/0x60
> [ 1892.876850][ T8958] rxe_dereg_mr+0x14b/0x240
> [ 1892.881381][ T8958] ib_dealloc_pd_user+0x96/0x230
> [ 1892.886566][ T8958] rds_ib_dev_free+0xd4/0x3a0
>
> So, RDS de-allocs its PD, ib core must first de-register the PD's local MR, calls rxe_dereg_mr(), ...
int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
{
	struct rxe_mr *mr = to_rmr(ibmr);

	if (atomic_read(&mr->num_mw) > 0) {
		pr_warn("%s: Attempt to deregister an MR while bound to MWs\n",
			__func__);
		return -EINVAL;
	}

	mr->state = RXE_MR_STATE_ZOMBIE;
	rxe_drop_ref(mr_pd(mr));
	rxe_drop_index(mr);    <------- The call trace starts here.
	rxe_drop_ref(mr);

	return 0;
}

struct rxe_mr {
	struct rxe_pool_entry pelem;    <------- This struct contains a ref_cnt.
	struct ib_mr ibmr;
	struct ib_umem *umem;

struct rxe_pool_entry {
	struct rxe_pool *pool;
	struct kref ref_cnt;    <------- This ref_cnt may help.
	struct list_head list;
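
To make the ref_cnt idea concrete, here is a minimal userspace sketch of the lifetime rule it implies: touch the entry (for example, to drop its index) only while a reference is held, and make the final put the last access. This is not a diagnosis of the report above and not a proposed patch. All toy_* names are hypothetical stand-ins for illustration; they are not the kernel kref API or the rxe pool code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct kref. */
struct toy_kref {
	int count;
};

/* Hypothetical stand-in for a pool entry with an embedded refcount. */
struct toy_entry {
	struct toy_kref ref_cnt;
	int index;      /* models the pool index removed during teardown */
	int *released;  /* set to 1 by the final put, for observation only */
};

static struct toy_entry *toy_alloc(int index, int *released)
{
	struct toy_entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->ref_cnt.count = 1;   /* the creator holds the first reference */
	e->index = index;
	e->released = released;
	*released = 0;
	return e;
}

static void toy_get(struct toy_entry *e)
{
	e->ref_cnt.count++;
}

/* Returns 1 when this call freed the entry, 0 otherwise. */
static int toy_put(struct toy_entry *e)
{
	if (--e->ref_cnt.count > 0)
		return 0;
	*e->released = 1;
	free(e);
	return 1;
}

/* Teardown touches the entry only before the final put. */
static int toy_teardown(struct toy_entry *e)
{
	e->index = -1;          /* "drop index" while a reference is still held */
	return toy_put(e);      /* last access; the entry may be freed here */
}

/* Walks the lifetime once; returns 0 if the sequence went as expected. */
static int toy_demo(void)
{
	int released;
	struct toy_entry *e = toy_alloc(7, &released);

	if (!e)
		return -1;
	toy_get(e);                       /* a second user pins the entry */
	if (toy_put(e) != 0 || released)  /* second user is done: still alive */
		return -2;
	if (toy_teardown(e) != 1 || !released)
		return -3;
	return 0;
}
```

Calling toy_demo() from a main() walks the sequence once. The point of the sketch is only the ordering: any access placed after the put that drops the count to zero would be a use-after-free of the kind KASAN reports.
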
Zhu Yanjun
>
>
> Thxs, Håkon
>
>
> >
> >> Anyhow, this report is either rxe or rds by the look of it.
> >>
> >> Jason
>