Message-ID: <OS3PR01MB9865474CFAC55AEB91D5A502E58CA@OS3PR01MB9865.jpnprd01.prod.outlook.com>
Date: Thu, 14 Dec 2023 05:55:52 +0000
From: "Daisuke Matsuda (Fujitsu)" <matsuda-daisuke@...itsu.com>
To: 'Zhu Yanjun' <yanjun.zhu@...ux.dev>,
Jason Gunthorpe <jgg@...dia.com>
CC: "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"leon@...nel.org" <leon@...nel.org>,
"zyjzyj2000@...il.com" <zyjzyj2000@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"rpearsonhpe@...il.com" <rpearsonhpe@...il.com>,
"Xiao Yang (Fujitsu)" <yangx.jy@...itsu.com>,
"Zhijian Li (Fujitsu)" <lizhijian@...itsu.com>,
"Yasunori Gotou (Fujitsu)" <y-goto@...itsu.com>
Subject: RE: [PATCH for-next v7 0/7] On-Demand Paging on SoftRoCE
On Wed, Dec 13, 2023 3:08 AM Zhu Yanjun wrote:
> On 2023/12/7 14:37, Daisuke Matsuda (Fujitsu) wrote:
> > On Tue, Dec 5, 2023 10:51 AM Zhu Yanjun wrote:
> >>
> >> On 2023/12/5 8:11, Jason Gunthorpe wrote:
> >>> On Thu, Nov 09, 2023 at 02:44:45PM +0900, Daisuke Matsuda wrote:
> >>>>
> >>>> Daisuke Matsuda (7):
> >>>> RDMA/rxe: Always defer tasks on responder and completer to workqueue
> >>>> RDMA/rxe: Make MR functions accessible from other rxe source code
> >>>> RDMA/rxe: Move resp_states definition to rxe_verbs.h
> >>>> RDMA/rxe: Add page invalidation support
> >>>> RDMA/rxe: Allow registering MRs for On-Demand Paging
> >>>> RDMA/rxe: Add support for Send/Recv/Write/Read with ODP
> >>>> RDMA/rxe: Add support for the traditional Atomic operations with ODP
> >>>
> >>> What is the current situation with rxe? I don't recall seeing the bugs
> >>> that were reported get fixed?
> >
> > Well, I suppose Jason is referring to the "blktests srp/002 hang".
> > cf. https://lore.kernel.org/linux-rdma/dsg6rd66tyiei32zaxs6ddv5ebefr5vtxjwz6d2ewqrcwisogl@ge7jzan7dg5u/T/
> >
> > It is likely to be a timing issue. Bob reported that "siw hangs with the debug kernel",
> > so the hang does not appear to be specific to rxe.
> > cf. https://lore.kernel.org/all/53ede78a-f73d-44cd-a555-f8ff36bd9c55@acm.org/T/
> > I think we need to decide whether to keep blocking patches to rxe, since nobody
> > has successfully fixed the issue.
> >
> >
> > There is another issue that causes a kernel panic.
> > [bug report][bisected] rdma_rxe: blktests srp lead kernel panic with 64k page size
> > cf. https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@mail.gmail.com/T/
> >
> > https://patchwork.kernel.org/project/linux-rdma/list/?series=798592&state=*
> > Zhijian has submitted patches to fix this, and he received some comments.
> > It looks like he is heavily involved in the CXL driver these days.
> > I guess he is still working on it.
> >
> >>
> >> Exactly. A problem is reported at the following link:
> >> https://www.spinics.net/lists/linux-rdma/msg120947.html
> >>
> >> It seems that the variable 'entry' is set but not used
> >> [-Wunused-but-set-variable]
> >
> > Yeah, I can revise the patch anytime.
> >
> >>
> >> And ODP is an important feature. Should we suggest adding a test case
> >> to rdma-core to verify this ODP feature?
> >
> > Rxe can share the same tests with mlx5.
> > I added test cases for Write, Read and Atomic operations with ODP,
> > and we can add more tests if there are any suggestions.
> > Cf. https://github.com/linux-rdma/rdma-core/blob/master/tests/test_odp.py
>
> Thanks a lot.
> Have you run blktests with your patches applied on top of the
> latest kernel?
I have not done that yet, but I agree I should do it.
I will try to make time to run the tests before submitting v8.
Thanks,
Daisuke Matsuda
>
> Zhu Yanjun
>
> >
> > Thanks,
> > Daisuke Matsuda
> >
> >>
> >> Zhu Yanjun
> >>
> >>>
> >>> I'm reluctant to dig a deeper hole until it is done?
> >>>
> >>> Thanks,
> >>> Jason
> >
>