[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAFqH_51Ajj=Jv+6ySdQrVCWAKkVFmEbQCRadEm0OeQdaV=6xLQ@mail.gmail.com>
Date: Wed, 6 Oct 2021 13:04:47 +0200
From: Enric Balletbo Serra <eballetbo@...il.com>
To: Chun-Kuang Hu <chunkuang.hu@...nel.org>
Cc: "jason-jh.lin" <jason-jh.lin@...iatek.com>,
Philipp Zabel <p.zabel@...gutronix.de>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
Matthias Brugger <matthias.bgg@...il.com>,
Yongqiang Niu <yongqiang.niu@...iatek.com>,
dri-devel <dri-devel@...ts.freedesktop.org>,
"moderated list:ARM/Mediatek SoC support"
<linux-mediatek@...ts.infradead.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Hsin-Yi Wang <hsinyi@...omium.org>, fshao@...omium.org,
"Nancy.Lin" <nancy.lin@...iatek.com>, singo.chang@...iatek.com
Subject: Re: [v2 PATCH 1/3] drm/mediatek: Fix crash at using pkt->cl->chan in cmdq_pkt_finalize
Hi Chun-Kuang,
Missatge de Chun-Kuang Hu <chunkuang.hu@...nel.org> del dia dv., 1
d’oct. 2021 a les 17:52:
>
> Hi, Enric:
>
> Enric Balletbo Serra <eballetbo@...il.com> 於 2021年9月30日 週四 下午9:48寫道:
> >
> > Hi Chun-Kuang,
> >
> > Missatge de Chun-Kuang Hu <chunkuang.hu@...nel.org> del dia dj., 30 de
> > set. 2021 a les 15:11:
> > >
> > > Hi, Enric:
> > >
> > > Enric Balletbo Serra <eballetbo@...il.com> 於 2021年9月30日 週四 下午3:12寫道:
> > > >
> > > > Hi Jason,
> > > >
> > > >
> > > > Missatge de jason-jh.lin <jason-jh.lin@...iatek.com> del dia dj., 30
> > > > de set. 2021 a les 4:47:
> > > > >
> > > > > Because mtk_drm_crtc_create_pkt didn't assign pkt->cl, it will
> > > > > crash at using pkt->cl->chan in cmdq_pkt_finalize.
> > > > >
> > > > > So add struct cmdq_client and let mtk_drm_crtc instance define
> > > > > cmdq_client as:
> > > > >
> > > > > struct mtk_drm_crtc {
> > > > > /* client instance data */
> > > > > struct cmdq_client cmdq_client;
> > > > > };
> > > > >
> > > > > and in rx_callback function can use pkt->cl to get
> > > > > struct cmdq_client.
> > > > >
> > > > > Fixes: f4be17cd5b14 ("drm/mediatek: Remove struct cmdq_client")
> > > >
> > > > Looking at this patchset looks like you're fixing the above commit by
> > > > reintroducing the 'struct cmdq_client' again, which makes the above
> > > > commit as a non-sense commit. That's confusing and not clear. I'm
> > > > wondering if it wouldn't be more clear if you can just revert that
> > > > patch. Then if there are more changes that need to be done do it with
> > > > a follow up patch and really explain why these changes are needed.
> > >
> > > The patch f4be17cd5b14 ("drm/mediatek: Remove struct cmdq_client")
> > > does two things. One is to remove struct cmdq_client, another one is
> > > to embed cmdq_cl
> >
> > Then it should have been two patches, one thing for patch really
> > helps, specially when something breaks and you try to bisect it.
> >
> > > in mtk_drm_crtc (This means the pointer of cmdq_cl could be used to
> > > find the pointer of mtk_drm_crtc). The correct way to fix that patch
> > > is to remove the access to cmdq_client in cmdq_pkt_finalize(), but
> > > that would be a long term process. The simple way is to revert that
> > > patch, but the other patches depend on embedding cmdq_cl in
> > > mtk_drm_crtc. So this patch just revert the removing of struct
> > > cmdq_client but keep embedding cmdq_cl in mtk_drm_crtc.
> > >
> >
> > Yes, I know and I suffered that when bisecting and I ended to revert
> > the full series in my local tree, although I figured out that the
> > problem was this specific patch.
> >
> > The following series landed during -rc1 cycle and break the Acer Chromebook R13
> >
> > 9efb16c2fdd6 ("drm/mediatek: Clear pending flag when cmdq packet is done")
> > bc9241be73d9 ("drm/mediatek: Add cmdq_handle in mtk_crtc")
> > 8cdcb3653424 ("drm/mediatek: Detect CMDQ execution timeout")
> > f4be17cd5b14 ("drm/mediatek: Remove struct cmdq_client")
> > c1ec54b7b5af ("drm/mediatek: Use mailbox rx_callback instead of cmdq_task_cb")
> >
> > Apart from that it was a pain bisecting and introduced different
> > behaviours between patches, all the above commits have a follow-up
> > patch (see [1] and [2]) as a fix for the landed series. That makes me
> > think that were no stable enough. As we're in the rc, and as you said
> > this is not the correct way to fix it, and the landed patches seems
> > more a cleanup that really solving a real problem I'd consider to just
> > revert the full series and resubmit again for next release with these
> > fixes squashed. IMO that will also help to no miss anything when
> > someone would backport all this to the stable versions and understand
> > better the history.
> >
> > Just my 5 cents. In any case, I can confirm that applying the full
> > series solves the current problems that I have with my Acer Chromebook
> > R13.
>
> OK, that series depend on an WARN_ON fixes in mailbox driver, and need
> a better solution in cmdq helper, so let's revert that series first.
> Would you like to send the revert patches? Or I send the revert
> patches and let you test?
>
I'll let you to send the revert patches :-)
Thanks,
Enric
> Regards,
> Chun-Kuang.
>
> >
> > Thanks,
> > Enric
> >
> > [1] https://patchwork.kernel.org/project/linux-mediatek/list/?series=555383
> > [2] https://patchwork.kernel.org/project/linux-mediatek/list/?series=554767
> >
> >
> >
> > > Regards,
> > > Chun-Kuang.
> > >
> > > >
> > > > Thanks,
> > > > Enric
> > > >
> > > >
> > > > > Signed-off-by: jason-jh.lin <jason-jh.lin@...iatek.com>
> > > > > ---
> > > > > drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 73 +++++++++++++------------
> > > > > 1 file changed, 38 insertions(+), 35 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > > > > index 5f81489fc60c..411d99fcbb8f 100644
> > > > > --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > > > > +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > > > > @@ -52,8 +52,7 @@ struct mtk_drm_crtc {
> > > > > bool pending_async_planes;
> > > > >
> > > > > #if IS_REACHABLE(CONFIG_MTK_CMDQ)
> > > > > - struct mbox_client cmdq_cl;
> > > > > - struct mbox_chan *cmdq_chan;
> > > > > + struct cmdq_client cmdq_client;
> > > > > struct cmdq_pkt cmdq_handle;
> > > > > u32 cmdq_event;
> > > > > u32 cmdq_vblank_cnt;
> > > > > @@ -227,8 +226,8 @@ struct mtk_ddp_comp *mtk_drm_ddp_comp_for_plane(struct drm_crtc *crtc,
> > > > > }
> > > > >
> > > > > #if IS_REACHABLE(CONFIG_MTK_CMDQ)
> > > > > -static int mtk_drm_cmdq_pkt_create(struct mbox_chan *chan, struct cmdq_pkt *pkt,
> > > > > - size_t size)
> > > > > +static int mtk_drm_cmdq_pkt_create(struct cmdq_client *client, struct cmdq_pkt *pkt,
> > > > > + size_t size)
> > > > > {
> > > > > struct device *dev;
> > > > > dma_addr_t dma_addr;
> > > > > @@ -239,8 +238,9 @@ static int mtk_drm_cmdq_pkt_create(struct mbox_chan *chan, struct cmdq_pkt *pkt,
> > > > > return -ENOMEM;
> > > > > }
> > > > > pkt->buf_size = size;
> > > > > + pkt->cl = (void *)client;
> > > > >
> > > > > - dev = chan->mbox->dev;
> > > > > + dev = client->chan->mbox->dev;
> > > > > dma_addr = dma_map_single(dev, pkt->va_base, pkt->buf_size,
> > > > > DMA_TO_DEVICE);
> > > > > if (dma_mapping_error(dev, dma_addr)) {
> > > > > @@ -255,9 +255,11 @@ static int mtk_drm_cmdq_pkt_create(struct mbox_chan *chan, struct cmdq_pkt *pkt,
> > > > > return 0;
> > > > > }
> > > > >
> > > > > -static void mtk_drm_cmdq_pkt_destroy(struct mbox_chan *chan, struct cmdq_pkt *pkt)
> > > > > +static void mtk_drm_cmdq_pkt_destroy(struct cmdq_pkt *pkt)
> > > > > {
> > > > > - dma_unmap_single(chan->mbox->dev, pkt->pa_base, pkt->buf_size,
> > > > > + struct cmdq_client *client = (struct cmdq_client *)pkt->cl;
> > > > > +
> > > > > + dma_unmap_single(client->chan->mbox->dev, pkt->pa_base, pkt->buf_size,
> > > > > DMA_TO_DEVICE);
> > > > > kfree(pkt->va_base);
> > > > > kfree(pkt);
> > > > > @@ -265,8 +267,9 @@ static void mtk_drm_cmdq_pkt_destroy(struct mbox_chan *chan, struct cmdq_pkt *pk
> > > > >
> > > > > static void ddp_cmdq_cb(struct mbox_client *cl, void *mssg)
> > > > > {
> > > > > - struct mtk_drm_crtc *mtk_crtc = container_of(cl, struct mtk_drm_crtc, cmdq_cl);
> > > > > struct cmdq_cb_data *data = mssg;
> > > > > + struct cmdq_client *cmdq_cl = container_of(cl, struct cmdq_client, client);
> > > > > + struct mtk_drm_crtc *mtk_crtc = container_of(cmdq_cl, struct mtk_drm_crtc, cmdq_client);
> > > > > struct mtk_crtc_state *state;
> > > > > unsigned int i;
> > > > >
> > > > > @@ -299,7 +302,7 @@ static void ddp_cmdq_cb(struct mbox_client *cl, void *mssg)
> > > > > }
> > > > >
> > > > > mtk_crtc->cmdq_vblank_cnt = 0;
> > > > > - mtk_drm_cmdq_pkt_destroy(mtk_crtc->cmdq_chan, data->pkt);
> > > > > + mtk_drm_cmdq_pkt_destroy(data->pkt);
> > > > > }
> > > > > #endif
> > > > >
> > > > > @@ -550,24 +553,24 @@ static void mtk_drm_crtc_update_config(struct mtk_drm_crtc *mtk_crtc,
> > > > > mtk_mutex_release(mtk_crtc->mutex);
> > > > > }
> > > > > #if IS_REACHABLE(CONFIG_MTK_CMDQ)
> > > > > - if (mtk_crtc->cmdq_chan) {
> > > > > - mbox_flush(mtk_crtc->cmdq_chan, 2000);
> > > > > + if (mtk_crtc->cmdq_client.chan) {
> > > > > + mbox_flush(mtk_crtc->cmdq_client.chan, 2000);
> > > > > cmdq_handle->cmd_buf_size = 0;
> > > > > cmdq_pkt_clear_event(cmdq_handle, mtk_crtc->cmdq_event);
> > > > > cmdq_pkt_wfe(cmdq_handle, mtk_crtc->cmdq_event, false);
> > > > > mtk_crtc_ddp_config(crtc, cmdq_handle);
> > > > > cmdq_pkt_finalize(cmdq_handle);
> > > > > - dma_sync_single_for_device(mtk_crtc->cmdq_chan->mbox->dev,
> > > > > - cmdq_handle->pa_base,
> > > > > - cmdq_handle->cmd_buf_size,
> > > > > - DMA_TO_DEVICE);
> > > > > + dma_sync_single_for_device(mtk_crtc->cmdq_client.chan->mbox->dev,
> > > > > + cmdq_handle->pa_base,
> > > > > + cmdq_handle->cmd_buf_size,
> > > > > + DMA_TO_DEVICE);
> > > > > /*
> > > > > * CMDQ command should execute in next vblank,
> > > > > * If it fail to execute in next 2 vblank, timeout happen.
> > > > > */
> > > > > mtk_crtc->cmdq_vblank_cnt = 2;
> > > > > - mbox_send_message(mtk_crtc->cmdq_chan, cmdq_handle);
> > > > > - mbox_client_txdone(mtk_crtc->cmdq_chan, 0);
> > > > > + mbox_send_message(mtk_crtc->cmdq_client.chan, cmdq_handle);
> > > > > + mbox_client_txdone(mtk_crtc->cmdq_client.chan, 0);
> > > > > }
> > > > > #endif
> > > > > mtk_crtc->config_updating = false;
> > > > > @@ -581,7 +584,7 @@ static void mtk_crtc_ddp_irq(void *data)
> > > > > struct mtk_drm_private *priv = crtc->dev->dev_private;
> > > > >
> > > > > #if IS_REACHABLE(CONFIG_MTK_CMDQ)
> > > > > - if (!priv->data->shadow_register && !mtk_crtc->cmdq_chan)
> > > > > + if (!priv->data->shadow_register && !mtk_crtc->cmdq_client.chan)
> > > > > mtk_crtc_ddp_config(crtc, NULL);
> > > > > else if (mtk_crtc->cmdq_vblank_cnt > 0 && --mtk_crtc->cmdq_vblank_cnt == 0)
> > > > > DRM_ERROR("mtk_crtc %d CMDQ execute command timeout!\n",
> > > > > @@ -924,20 +927,20 @@ int mtk_drm_crtc_create(struct drm_device *drm_dev,
> > > > > mutex_init(&mtk_crtc->hw_lock);
> > > > >
> > > > > #if IS_REACHABLE(CONFIG_MTK_CMDQ)
> > > > > - mtk_crtc->cmdq_cl.dev = mtk_crtc->mmsys_dev;
> > > > > - mtk_crtc->cmdq_cl.tx_block = false;
> > > > > - mtk_crtc->cmdq_cl.knows_txdone = true;
> > > > > - mtk_crtc->cmdq_cl.rx_callback = ddp_cmdq_cb;
> > > > > - mtk_crtc->cmdq_chan =
> > > > > - mbox_request_channel(&mtk_crtc->cmdq_cl,
> > > > > - drm_crtc_index(&mtk_crtc->base));
> > > > > - if (IS_ERR(mtk_crtc->cmdq_chan)) {
> > > > > + mtk_crtc->cmdq_client.client.dev = mtk_crtc->mmsys_dev;
> > > > > + mtk_crtc->cmdq_client.client.tx_block = false;
> > > > > + mtk_crtc->cmdq_client.client.knows_txdone = true;
> > > > > + mtk_crtc->cmdq_client.client.rx_callback = ddp_cmdq_cb;
> > > > > + mtk_crtc->cmdq_client.chan =
> > > > > + mbox_request_channel(&mtk_crtc->cmdq_client.client,
> > > > > + drm_crtc_index(&mtk_crtc->base));
> > > > > + if (IS_ERR(mtk_crtc->cmdq_client.chan)) {
> > > > > dev_dbg(dev, "mtk_crtc %d failed to create mailbox client, writing register by CPU now\n",
> > > > > drm_crtc_index(&mtk_crtc->base));
> > > > > - mtk_crtc->cmdq_chan = NULL;
> > > > > + mtk_crtc->cmdq_client.chan = NULL;
> > > > > }
> > > > >
> > > > > - if (mtk_crtc->cmdq_chan) {
> > > > > + if (mtk_crtc->cmdq_client.chan) {
> > > > > ret = of_property_read_u32_index(priv->mutex_node,
> > > > > "mediatek,gce-events",
> > > > > drm_crtc_index(&mtk_crtc->base),
> > > > > @@ -945,17 +948,17 @@ int mtk_drm_crtc_create(struct drm_device *drm_dev,
> > > > > if (ret) {
> > > > > dev_dbg(dev, "mtk_crtc %d failed to get mediatek,gce-events property\n",
> > > > > drm_crtc_index(&mtk_crtc->base));
> > > > > - mbox_free_channel(mtk_crtc->cmdq_chan);
> > > > > - mtk_crtc->cmdq_chan = NULL;
> > > > > + mbox_free_channel(mtk_crtc->cmdq_client.chan);
> > > > > + mtk_crtc->cmdq_client.chan = NULL;
> > > > > } else {
> > > > > - ret = mtk_drm_cmdq_pkt_create(mtk_crtc->cmdq_chan,
> > > > > - &mtk_crtc->cmdq_handle,
> > > > > - PAGE_SIZE);
> > > > > + ret = mtk_drm_cmdq_pkt_create(&mtk_crtc->cmdq_client,
> > > > > + &mtk_crtc->cmdq_handle,
> > > > > + PAGE_SIZE);
> > > > > if (ret) {
> > > > > dev_dbg(dev, "mtk_crtc %d failed to create cmdq packet\n",
> > > > > drm_crtc_index(&mtk_crtc->base));
> > > > > - mbox_free_channel(mtk_crtc->cmdq_chan);
> > > > > - mtk_crtc->cmdq_chan = NULL;
> > > > > + mbox_free_channel(mtk_crtc->cmdq_client.chan);
> > > > > + mtk_crtc->cmdq_client.chan = NULL;
> > > > > }
> > > > > }
> > > > > }
> > > > > --
> > > > > 2.18.0
> > > > >
Powered by blists - more mailing lists