linux-kernel - RE: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM6PR04MB65755C69AD3D64BC5B1E93D8FC6A9@DM6PR04MB6575.namprd04.prod.outlook.com>
Date:   Wed, 17 Mar 2021 14:22:42 +0000
From:   Avri Altman <Avri.Altman@....com>
To:     Can Guo <cang@...eaurora.org>
CC:     "James E . J . Bottomley" <jejb@...ux.vnet.ibm.com>,
        "Martin K . Petersen" <martin.petersen@...cle.com>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        Bart Van Assche <bvanassche@....org>,
        yongmyung lee <ymhungry.lee@...sung.com>,
        Daejun Park <daejun7.park@...sung.com>,
        "alim.akhtar@...sung.com" <alim.akhtar@...sung.com>,
        "asutoshd@...eaurora.org" <asutoshd@...eaurora.org>,
        Zang Leigang <zangleigang@...ilicon.com>,
        Avi Shchislowski <Avi.Shchislowski@....com>,
        Bean Huo <beanhuo@...ron.com>,
        "stanley.chu@...iatek.com" <stanley.chu@...iatek.com>
Subject: RE: [PATCH v5 06/10] scsi: ufshpb: Add hpb dev reset response

 
> 
> On 2021-03-17 20:22, Avri Altman wrote:
> >>
> >> On 2021-03-17 19:23, Avri Altman wrote:
> >> >>
> >> >> On 2021-03-02 21:24, Avri Altman wrote:
> >> >> > The spec does not define what is the host's recommended response
> when
> >> >> > the device send hpb dev reset response (oper 0x2).
> >> >> >
> >> >> > We will update all active hpb regions: mark them and do that on the
> >> >> > next
> >> >> > read.
> >> >> >
> >> >> > Signed-off-by: Avri Altman <avri.altman@....com>
> >> >> > ---
> >> >> >  drivers/scsi/ufs/ufshpb.c | 47
> >> >> ++++++++++++++++++++++++++++++++++++---
> >> >> >  drivers/scsi/ufs/ufshpb.h |  2 ++
> >> >> >  2 files changed, 46 insertions(+), 3 deletions(-)
> >> >> >
> >> >> > diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> >> >> > index 0744feb4d484..0034fa03fdc6 100644
> >> >> > --- a/drivers/scsi/ufs/ufshpb.c
> >> >> > +++ b/drivers/scsi/ufs/ufshpb.c
> >> >> > @@ -642,7 +642,8 @@ int ufshpb_prep(struct ufs_hba *hba, struct
> >> >> > ufshcd_lrb *lrbp)
> >> >> >               if (rgn->reads == ACTIVATION_THRESHOLD)
> >> >> >                       activate = true;
> >> >> >               spin_unlock_irqrestore(&rgn->rgn_lock, flags);
> >> >> > -             if (activate) {
> >> >> > +             if (activate ||
> >> >> > +                 test_and_clear_bit(RGN_FLAG_UPDATE, &rgn->rgn_flags)) {
> 
> Other than this place, do we also need to clear this bit in places like
> ufshpb_map_req_compl_fn() and/or ufshpb_cleanup_lru_info()? Otherwise,
> this flag may be left there even after the rgn is inactivated.
I don't think so - may cause a race if device reset arrives when map request just finished.
Better to be in one place.

> 
> >> >> >                       spin_lock_irqsave(&hpb->rsp_list_lock, flags);
> >> >> >                       ufshpb_update_active_info(hpb, rgn_idx, srgn_idx);
> >> >> >                       hpb->stats.rb_active_cnt++;
> >> >> > @@ -1480,6 +1481,20 @@ void ufshpb_rsp_upiu(struct ufs_hba
> *hba,
> >> >> > struct ufshcd_lrb *lrbp)
> >> >> >       case HPB_RSP_DEV_RESET:
> >> >> >               dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
> >> >> >                        "UFS device lost HPB information during PM.\n");
> >> >> > +
> >> >> > +             if (hpb->is_hcm) {
> >> >> > +                     struct scsi_device *sdev;
> >> >> > +
> >> >> > +                     __shost_for_each_device(sdev, hba->host) {
> >> >> > +                             struct ufshpb_lu *h = sdev->hostdata;
> >> >> > +
> >> >> > +                             if (!h)
> >> >> > +                                     continue;
> >> >> > +
> >> >> > +                             schedule_work(&hpb->ufshpb_lun_reset_work);
> >> >> > +                     }
> >> >> > +             }
> >> >> > +
> >> >> >               break;
> >> >> >       default:
> >> >> >               dev_notice(&hpb->sdev_ufs_lu->sdev_dev,
> >> >> > @@ -1594,6 +1609,25 @@ static void
> >> >> > ufshpb_run_inactive_region_list(struct ufshpb_lu *hpb)
> >> >> >       spin_unlock_irqrestore(&hpb->rsp_list_lock, flags);
> >> >> >  }
> >> >> >
> >> >> > +static void ufshpb_reset_work_handler(struct work_struct *work)
> >> >>
> >> >> Just curious, directly doing below things inside ufshpb_rsp_upiu()
> >> >> does
> >> >> not
> >> >> seem a problem to me, does this really deserve a separate work?
> >> > I don't know, I never even consider of doing this.
> >> > The active region list may contain up to few thousands of regions -
> >> > It is not rare to see configurations that covers the entire device.
> >> >
> >>
> >> Yes, true, it can be a huge list. But what does the ops
> >> "HPB_RSP_DEV_RESET"
> >> really mean? The specs says "Device reset HPB Regions information",
> >> but
> >> I
> >> don't know what is really happening. Could you please elaborate?
> > It means that the device informs the host that the L2P cache is no
> > longer valid.
> > The spec doesn't say what to do in that case.
> 
> Then it means that all the clean (without DIRTY flag set) HPB entries
> (ppns)
> in active rgns in host memory side may not be valid to the device
> anymore.
> Please correct me if I am wrong.
> 
> > We thought that in host mode, it make sense to update all the active
> > regions.
> 
> But current logic does not set the state of the sub-regions (in active
> regions) to
> INVALID, it only marks all active regions as UPDATE.
> 
> Although one of subsequent read cmds shall put the sub-region back to
> activate_list,
> ufshpb_test_ppn_dirty() can still return false, thus these read cmds
> still think the
> ppns are valid and they shall move forward to send HPB Write Buffer
> (buffer id = 0x2,
> in case of HPB2.0) and HPB Read cmds.
> 
> HPB Read cmds with invalid ppns will be treated as normal Read cmds by
> device as the
> specs says, but what would happen to HPB Write Buffer cmds (buffer id =
> 0x2, in case
> of HPB2.0) with invalid ppns? Can this be a real problem?
No need to control the ppn dirty / invalid state for this case.
The device send device reset so it is aware that all the L2P cache is invalid.
Any HPB_READ is treated like normal READ10.

Only once HPB-READ-BUFFER is completed,
the device will relate back to the physical address.

> 
> >
> > I think I will go with your suggestion.
> > Effectively, in host mode, since it is deactivating "cold" regions,
> > the lru list is kept relatively small, and contains only "hot" regions.
> 
> hmm... I don't really have a idea on this, please go with whatever you
> and Daejun think is fine here.
I will take your advice and remove the worker.


Thanks,
Avri

> 
> Thanks,
> Can Guo.
> 
> >
> > Thanks,
> > Avri
> >
> >>
> >> Thanks,
> >> Can Guo.
> >>
> >> > But yes, I can do that.
> >> > Better to get ack from Daejun first.
> >> >
> >> > Thanks,
> >> > Avri
> >> >
> >> >>
> >> >> Thanks,
> >> >> Can Guo.
> >> >>
> >> >> > +{
> >> >> > +     struct ufshpb_lu *hpb;
> >> >> > +     struct victim_select_info *lru_info;
> >> >> > +     struct ufshpb_region *rgn;
> >> >> > +     unsigned long flags;
> >> >> > +
> >> >> > +     hpb = container_of(work, struct ufshpb_lu,
> ufshpb_lun_reset_work);
> >> >> > +
> >> >> > +     lru_info = &hpb->lru_info;
> >> >> > +
> >> >> > +     spin_lock_irqsave(&hpb->rgn_state_lock, flags);
> >> >> > +
> >> >> > +     list_for_each_entry(rgn, &lru_info->lh_lru_rgn, list_lru_rgn)
> >> >> > +             set_bit(RGN_FLAG_UPDATE, &rgn->rgn_flags);
> >> >> > +
> >> >> > +     spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> >> >> > +}
> >> >> > +
> >> >> >  static void ufshpb_normalization_work_handler(struct work_struct
> >> >> > *work)
> >> >> >  {
> >> >> >       struct ufshpb_lu *hpb;
> >> >> > @@ -1798,6 +1832,8 @@ static int ufshpb_alloc_region_tbl(struct
> >> >> > ufs_hba *hba, struct ufshpb_lu *hpb)
> >> >> >               } else {
> >> >> >                       rgn->rgn_state = HPB_RGN_INACTIVE;
> >> >> >               }
> >> >> > +
> >> >> > +             rgn->rgn_flags = 0;
> >> >> >       }
> >> >> >
> >> >> >       return 0;
> >> >> > @@ -2012,9 +2048,12 @@ static int ufshpb_lu_hpb_init(struct
> ufs_hba
> >> >> > *hba, struct ufshpb_lu *hpb)
> >> >> >       INIT_LIST_HEAD(&hpb->list_hpb_lu);
> >> >> >
> >> >> >       INIT_WORK(&hpb->map_work, ufshpb_map_work_handler);
> >> >> > -     if (hpb->is_hcm)
> >> >> > +     if (hpb->is_hcm) {
> >> >> >               INIT_WORK(&hpb->ufshpb_normalization_work,
> >> >> >                         ufshpb_normalization_work_handler);
> >> >> > +             INIT_WORK(&hpb->ufshpb_lun_reset_work,
> >> >> > +                       ufshpb_reset_work_handler);
> >> >> > +     }
> >> >> >
> >> >> >       hpb->map_req_cache = kmem_cache_create("ufshpb_req_cache",
> >> >> >                         sizeof(struct ufshpb_req), 0, 0, NULL);
> >> >> > @@ -2114,8 +2153,10 @@ static void ufshpb_discard_rsp_lists(struct
> >> >> > ufshpb_lu *hpb)
> >> >> >
> >> >> >  static void ufshpb_cancel_jobs(struct ufshpb_lu *hpb)
> >> >> >  {
> >> >> > -     if (hpb->is_hcm)
> >> >> > +     if (hpb->is_hcm) {
> >> >> > +             cancel_work_sync(&hpb->ufshpb_lun_reset_work);
> >> >> >               cancel_work_sync(&hpb->ufshpb_normalization_work);
> >> >> > +     }
> >> >> >       cancel_work_sync(&hpb->map_work);
> >> >> >  }
> >> >> >
> >> >> > diff --git a/drivers/scsi/ufs/ufshpb.h b/drivers/scsi/ufs/ufshpb.h
> >> >> > index 84598a317897..37c1b0ea0c0a 100644
> >> >> > --- a/drivers/scsi/ufs/ufshpb.h
> >> >> > +++ b/drivers/scsi/ufs/ufshpb.h
> >> >> > @@ -121,6 +121,7 @@ struct ufshpb_region {
> >> >> >       struct list_head list_lru_rgn;
> >> >> >       unsigned long rgn_flags;
> >> >> >  #define RGN_FLAG_DIRTY 0
> >> >> > +#define RGN_FLAG_UPDATE 1
> >> >> >
> >> >> >       /* region reads - for host mode */
> >> >> >       spinlock_t rgn_lock;
> >> >> > @@ -217,6 +218,7 @@ struct ufshpb_lu {
> >> >> >       /* for selecting victim */
> >> >> >       struct victim_select_info lru_info;
> >> >> >       struct work_struct ufshpb_normalization_work;
> >> >> > +     struct work_struct ufshpb_lun_reset_work;
> >> >> >
> >> >> >       /* pinned region information */
> >> >> >       u32 lu_pinned_start;