[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAA8EJpp1tcxKzftYot9td7_dBQwCLai_zWCB_AWYbZreAJbM6Q@mail.gmail.com>
Date: Sat, 31 Aug 2024 01:57:32 +0300
From: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To: Jessica Zhang <quic_jesszhan@...cinc.com>
Cc: Rob Clark <robdclark@...il.com>, quic_abhinavk@...cinc.com,
Sean Paul <sean@...rly.run>, Marijn Suijten <marijn.suijten@...ainline.org>,
David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>, quic_ebharadw@...cinc.com,
linux-arm-msm@...r.kernel.org, dri-devel@...ts.freedesktop.org,
freedreno@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
Rob Clark <robdclark@...omium.org>
Subject: Re: [PATCH 15/21] drm/msm/dpu: Configure CWB in writeback encoder
On Fri, 30 Aug 2024 at 23:28, Jessica Zhang <quic_jesszhan@...cinc.com> wrote:
>
>
>
> On 8/30/2024 10:47 AM, Dmitry Baryshkov wrote:
> > On Thu, Aug 29, 2024 at 01:48:36PM GMT, Jessica Zhang wrote:
> >> Cache the CWB block mask in the DPU virtual encoder and configure CWB
> >> according to the CWB block mask within the writeback phys encoder
> >>
> >> Signed-off-by: Jessica Zhang <quic_jesszhan@...cinc.com>
> >> ---
> >> drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 29 +++++++++++++++
> >> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 43 ++++++++++++++++++++++
> >> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h | 9 ++++-
> >> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 18 ++++++++-
> >> .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 43 +++++++++++++++++++++-
> >> 5 files changed, 139 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> >> index 99eaaca405a4..c8ef59af444c 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> >> @@ -1236,6 +1236,33 @@ static bool dpu_crtc_has_valid_clones(struct drm_crtc *crtc,
> >> return true;
> >> }
> >>
> >> +static void dpu_crtc_set_encoder_cwb_mask(struct drm_crtc *crtc,
> >> + struct drm_crtc_state *crtc_state)
> >> +{
> >> + struct drm_encoder *drm_enc;
> >> + struct dpu_crtc_state *cstate = to_dpu_crtc_state(crtc_state);
> >> + int cwb_idx = 0;
> >> + u32 cwb_mask = 0;
> >> +
> >> + /*
> >> + * Since there can only be one CWB session at a time, if the CRTC LM
> >> + * starts with an even index we start with CWB_0. If the LM index is
> >> + * odd, we start with CWB_1
> >
> > I'd prefer to get indices from RM. They can be set during mode_set /
> > atomic_check calls.
>
> Hi Dmitry,
>
> Do you mean having CWB being managed by RM or using rm's
> get_assigned_blks() to grab the LM indices?
I was thinking of using either get_assigned_blks() or a new
CWB-specific API. This way such odd-even requirements get abstracted
in the RM.
> >> + */
> >> + if (cstate->mixers[0].hw_lm)
> >> + cwb_idx = (cstate->mixers[0].hw_lm->idx - LM_0) % 2;
> >> +
> >> + if (drm_crtc_in_clone_mode(crtc_state)) {
> >> + for (int i = 0; i < cstate->num_mixers; i++) {
> >> + cwb_mask |= (1 << cwb_idx);
> >> + cwb_idx++;
> >> + }
> >> + }
> >> +
> >> + drm_for_each_encoder_mask(drm_enc, crtc->dev, crtc_state->encoder_mask)
> >> + dpu_encoder_set_cwb_mask(drm_enc, cwb_mask);
> >
> > Ugh, no. This function writes to dpu_enc, however there is being called
> > from atomic_check(). You can not write non-state variables here as the
> > commit can get completely dropped.
>
> Ack -- will change this API to `dpu_crtc_get_cwb_mask()` and call it
> from dpu_encoder_virt_mode_set() instead.
LGTM.
>
> >
> >> +}
> >> +
> >> static int dpu_crtc_assign_resources(struct drm_crtc *crtc, struct drm_crtc_state *crtc_state)
> >> {
> >> struct dpu_hw_blk *hw_ctl[MAX_CHANNELS_PER_CRTC];
> >> @@ -1329,6 +1356,8 @@ static int dpu_crtc_atomic_check(struct drm_crtc *crtc,
> >> !dpu_crtc_has_valid_clones(crtc, crtc_state))
> >> return -EINVAL;
> >>
> >> + dpu_crtc_set_encoder_cwb_mask(crtc, crtc_state);
> >> +
> >> if (!crtc_state->enable || !drm_atomic_crtc_effectively_active(crtc_state)) {
> >> DRM_DEBUG_ATOMIC("crtc%d -> enable %d, active %d, skip atomic_check\n",
> >> crtc->base.id, crtc_state->enable,
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> index f1bd14d1f89e..0f8f6c0182d5 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> @@ -162,6 +162,7 @@ enum dpu_enc_rc_states {
> >> * clks and resources after IDLE_TIMEOUT time.
> >> * @topology: topology of the display
> >> * @idle_timeout: idle timeout duration in milliseconds
> >> + * @cwb_mask: current encoder is in clone mode
> >> * @wide_bus_en: wide bus is enabled on this interface
> >> * @dsc: drm_dsc_config pointer, for DSC-enabled encoders
> >> */
> >> @@ -202,6 +203,7 @@ struct dpu_encoder_virt {
> >> struct msm_display_topology topology;
> >>
> >> u32 idle_timeout;
> >> + u32 cwb_mask;
> >>
> >> bool wide_bus_en;
> >>
> >> @@ -638,6 +640,33 @@ bool dpu_encoder_needs_modeset(struct drm_encoder *drm_enc, struct drm_atomic_st
> >> return false;
> >> }
> >>
> >> +void dpu_encoder_set_cwb_mask(struct drm_encoder *enc, u32 cwb_mask)
> >> +{
> >> + struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(enc);
> >> +
> >> + dpu_enc->cwb_mask = cwb_mask;
> >> +}
> >> +
> >> +u32 dpu_encoder_get_cwb_mask(struct drm_encoder *enc)
> >> +{
> >> + struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(enc);
> >> +
> >> + if (!dpu_enc)
> >> + return 0;
> >> +
> >> + return dpu_enc->cwb_mask;
> >> +}
> >> +
> >> +bool dpu_encoder_in_clone_mode(struct drm_encoder *enc)
> >> +{
> >> + struct dpu_encoder_virt *dpu_enc = to_dpu_encoder_virt(enc);
> >> +
> >> + if (!dpu_enc)
> >> + return 0;
> >> +
> >> + return dpu_enc->cwb_mask != 0;
> >> +}
> >> +
> >> struct drm_dsc_config *dpu_encoder_get_dsc_config(struct drm_encoder *drm_enc)
> >> {
> >> struct msm_drm_private *priv = drm_enc->dev->dev_private;
> >> @@ -2019,6 +2048,7 @@ void dpu_encoder_helper_phys_cleanup(struct dpu_encoder_phys *phys_enc)
> >> struct dpu_hw_ctl *ctl = phys_enc->hw_ctl;
> >> struct dpu_hw_intf_cfg intf_cfg = { 0 };
> >> int i;
> >> + enum dpu_cwb cwb_idx;
> >> struct dpu_encoder_virt *dpu_enc;
> >>
> >> dpu_enc = to_dpu_encoder_virt(phys_enc->parent);
> >> @@ -2040,6 +2070,19 @@ void dpu_encoder_helper_phys_cleanup(struct dpu_encoder_phys *phys_enc)
> >> /* mark WB flush as pending */
> >> if (phys_enc->hw_ctl->ops.update_pending_flush_wb)
> >> phys_enc->hw_ctl->ops.update_pending_flush_wb(ctl, phys_enc->hw_wb->idx);
> >> +
> >> + if (dpu_enc->cwb_mask) {
> >> + for (i = 0; i < hweight32(dpu_enc->cwb_mask); i++) {
> >
> > This will break if cwb_mask starts from bit 1: hweight will count bits,
> > so the loop will not cover the highest bit set.
>
> Ah, good point. I should be doing `hweight() + 1` here instead.
Not quite. I think it might be easier to loop up from CWB_0 to
CWB_MAX. The optimization gain from using hweight32 should be
negligible and using CWB constants will improve readability.
>
> >
> >> + if (!(dpu_enc->cwb_mask & (1 << i)))
> >> + continue;
> >> +
> >> + cwb_idx = i + CWB_0;
> >> +
> >> + if (phys_enc->hw_wb->ops.setup_input_ctrl)
> >> + phys_enc->hw_wb->ops.setup_input_ctrl(phys_enc->hw_wb,
> >> + cwb_idx, PINGPONG_NONE);
> >> + }
> >> + }
> >> } else {
> >> for (i = 0; i < dpu_enc->num_phys_encs; i++) {
> >> if (dpu_enc->phys_encs[i] && phys_enc->hw_intf->ops.bind_pingpong_blk)
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> >> index 0d27e50384f0..131bb8b2c0ee 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> >> @@ -1,6 +1,6 @@
> >> /* SPDX-License-Identifier: GPL-2.0-only */
> >> /*
> >> - * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
> >> + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
> >> * Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
> >> * Copyright (C) 2013 Red Hat
> >> * Author: Rob Clark <robdclark@...il.com>
> >> @@ -188,6 +188,13 @@ void dpu_encoder_update_topology(struct drm_encoder *drm_enc,
> >> */
> >> bool dpu_encoder_needs_modeset(struct drm_encoder *drm_enc, struct drm_atomic_state *state);
> >>
> >> +/**
> >> + * dpu_encoder_set_cwb_mask - set the CWB blocks mask for the encoder
> >> + * @drm_enc: Pointer to previously created drm encoder structure
> >> + * @cwb_mask: CWB blocks mask to set for the encoder
> >> + */
> >> +void dpu_encoder_set_cwb_mask(struct drm_encoder *drm_enc, u32 cwb_mask);
> >> +
> >> /**
> >> * dpu_encoder_prepare_wb_job - prepare writeback job for the encoder.
> >> * @drm_enc: Pointer to previously created drm encoder structure
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >> index e77ebe3a68da..f0e5c5b073e5 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >> @@ -1,6 +1,6 @@
> >> /* SPDX-License-Identifier: GPL-2.0-only */
> >> /*
> >> - * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
> >> + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
> >> * Copyright (c) 2015-2018 The Linux Foundation. All rights reserved.
> >> */
> >>
> >> @@ -339,6 +339,22 @@ static inline enum dpu_3d_blend_mode dpu_encoder_helper_get_3d_blend_mode(
> >> */
> >> unsigned int dpu_encoder_helper_get_dsc(struct dpu_encoder_phys *phys_enc);
> >>
> >> +/**
> >> + * dpu_encoder_get_cwb_mask - get CWB blocks mask for DPU encoder
> >> + * This helper function is used by physical encoders to get the CWB blocks
> >> + * mask used for this encoder.
> >> + * @enc: Pointer to DRM encoder structure
> >> + */
> >> +u32 dpu_encoder_get_cwb_mask(struct drm_encoder *enc);
> >> +
> >> +/**
> >> + * dpu_encoder_in_clone_mode - determine if DPU encoder is in clone mode
> >> + * This helper is used by physical encoders to determine if the encoder is in
> >> + * clone mode.
> >> + * @enc: Pointer to DRM encoder structure
> >> + */
> >> +bool dpu_encoder_in_clone_mode(struct drm_encoder *enc);
> >> +
> >> /**
> >> * dpu_encoder_get_dsc_config - get DSC config for the DPU encoder
> >> * This helper function is used by physical encoder to get DSC config
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >> index 882c717859ce..e1ec64ffc742 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >> @@ -1,6 +1,6 @@
> >> // SPDX-License-Identifier: GPL-2.0-only
> >> /*
> >> - * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
> >> + * Copyright (c) 2022-2024 Qualcomm Innovation Center, Inc. All rights reserved.
> >> */
> >>
> >> #define pr_fmt(fmt) "[drm:%s:%d] " fmt, __func__, __LINE__
> >> @@ -264,6 +264,45 @@ static void dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc)
> >> }
> >> }
> >>
> >> +static void dpu_encoder_phys_wb_setup_cwb(struct dpu_encoder_phys *phys_enc)
> >> +{
> >> + struct dpu_hw_wb *hw_wb;
> >> + struct dpu_hw_ctl *hw_ctl;
> >> + struct dpu_hw_pingpong *hw_pp;
> >> + u32 cwb_mask, cwb_idx;
> >> +
> >> + if (!phys_enc)
> >> + return;
> >> +
> >> + hw_wb = phys_enc->hw_wb;
> >> + hw_ctl = phys_enc->hw_ctl;
> >> + hw_pp = phys_enc->hw_pp;
> >> +
> >> + if (!hw_wb || !hw_ctl || !hw_pp) {
> >> + DPU_DEBUG("[wb:%d] no ctl or pp assigned\n", hw_wb->idx - WB_0);
> >> + return;
> >> + }
> >> +
> >> + cwb_mask = dpu_encoder_get_cwb_mask(phys_enc->parent);
> >> +
> >> + for (int i = 0; i < hweight32(cwb_mask); i++) {
> >> + if (!(cwb_mask & (1 << i)))
> >> + continue;
> >> +
> >> + cwb_idx = i + CWB_0;
> >> +
> >> + if (hw_wb->ops.setup_input_ctrl)
> >> + hw_wb->ops.setup_input_ctrl(hw_wb, cwb_idx, hw_pp->idx + i);
>
> Just wanted to note that the value being programmed here is incorrect.
>
> Instead of passing in the index of the dedicated cwb pingpong, it should
> instead be the index of the *real time display* pingpong.
>
> To grab the correct value, I'm thinking of adding a DPU encoder API to
> query the RM for (non-dedicated CWB) HW_BLK_PINGPONG to grab the
> pingpong indices for the real-time encoder. I'll then call that API here
> and pass the resulting real-time hw_pp index.
Hmm. Having SDM845 and below and LM_2 / LM_5 pair in mind I thought
that we might need to get a list of PPs. But then I noticed that this
pair uses PP_2 / PP_3 pair of pingpong blocks. Yes, the proposed API
sounds good.
>
> Please let me know if you have any feedback on this proposal or
> alternate solutions.
>
> Thanks,
>
> Jessica Zhang
>
> >> +
> >> + /*
> >> + * The CWB mux supports using LM or DSPP as tap points. For now,
> >> + * always use DSPP tap point
> >> + */
> >> + if (hw_wb->ops.setup_input_mode)
> >> + hw_wb->ops.setup_input_mode(hw_wb, cwb_idx, INPUT_MODE_DSPP_OUT);
> >> + }
> >> +}
> >> +
> >> /**
> >> * _dpu_encoder_phys_wb_update_flush - flush hardware update
> >> * @phys_enc: Pointer to physical encoder
> >> @@ -342,6 +381,8 @@ static void dpu_encoder_phys_wb_setup(
> >>
> >> dpu_encoder_helper_phys_setup_cdm(phys_enc, dpu_fmt, CDM_CDWN_OUTPUT_WB);
> >>
> >> + dpu_encoder_phys_wb_setup_cwb(phys_enc);
> >> +
> >> dpu_encoder_phys_wb_setup_ctl(phys_enc);
> >> }
> >>
> >>
> >> --
> >> 2.34.1
> >>
> >
> > --
> > With best wishes
> > Dmitry
--
With best wishes
Dmitry
Powered by blists - more mailing lists