Message-ID: <MW2PR2101MB1052D1FD22F1A91082843EC0D7760@MW2PR2101MB1052.namprd21.prod.outlook.com>
Date: Thu, 23 Jul 2020 02:26:00 +0000
From: Michael Kelley <mikelley@...rosoft.com>
To: "boqun.feng@...il.com" <boqun.feng@...il.com>
CC: "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-input@...r.kernel.org" <linux-input@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
KY Srinivasan <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>, Jiri Kosina <jikos@...nel.org>,
Benjamin Tissoires <benjamin.tissoires@...hat.com>,
Dmitry Torokhov <dmitry.torokhov@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>
Subject: RE: [RFC 11/11] scsi: storvsc: Support PAGE_SIZE larger than 4K
From: boqun.feng@...il.com <boqun.feng@...il.com> Sent: Wednesday, July 22, 2020 6:52 PM
>
> On Thu, Jul 23, 2020 at 12:13:07AM +0000, Michael Kelley wrote:
> > From: Boqun Feng <boqun.feng@...il.com> Sent: Monday, July 20, 2020 6:42 PM
> > >
> > > Hyper-V always uses a 4K page size (HV_HYP_PAGE_SIZE), so when
> > > communicating with Hyper-V, a guest should always use HV_HYP_PAGE_SIZE
> > > as the unit for page-related data. For storvsc, the data is
> > > vmbus_packet_mpb_array. And since in scsi_cmnd, sglist of pages (in unit
> > > of PAGE_SIZE) is used, we need to convert pages in the sglist of scsi_cmnd
> > > into Hyper-V pages in vmbus_packet_mpb_array.
> > >
> > > This patch does the conversion by dividing pages in sglist into Hyper-V
> > > pages; offsets and indexes in vmbus_packet_mpb_array are recalculated
> > > accordingly.
> > >
> > > Signed-off-by: Boqun Feng <boqun.feng@...il.com>
> > > ---
> > > drivers/scsi/storvsc_drv.c | 27 +++++++++++++++++++++------
> > > 1 file changed, 21 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> > > index fb41636519ee..c54d25f279bc 100644
> > > --- a/drivers/scsi/storvsc_drv.c
> > > +++ b/drivers/scsi/storvsc_drv.c
> > > @@ -1561,7 +1561,7 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
> > > struct hv_host_device *host_dev = shost_priv(host);
> > > struct hv_device *dev = host_dev->dev;
> > > struct storvsc_cmd_request *cmd_request = scsi_cmd_priv(scmnd);
> > > - int i;
> > > + int i, j, k;
> > > struct scatterlist *sgl;
> > > unsigned int sg_count = 0;
> > > struct vmscsi_request *vm_srb;
> > > @@ -1569,6 +1569,8 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
> > > struct vmbus_packet_mpb_array *payload;
> > > u32 payload_sz;
> > > u32 length;
> > > + int subpage_idx = 0;
> > > + unsigned int hvpg_count = 0;
> > >
> > > if (vmstor_proto_version <= VMSTOR_PROTO_VERSION_WIN8) {
> > > /*
> > > @@ -1643,23 +1645,36 @@ static int storvsc_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *scmnd)
> > > payload_sz = sizeof(cmd_request->mpb);
> > >
> > > if (sg_count) {
> > > - if (sg_count > MAX_PAGE_BUFFER_COUNT) {
> > > + hvpg_count = sg_count * (PAGE_SIZE / HV_HYP_PAGE_SIZE);
> >
> > The above calculation doesn't take into account the offset in the
> > first sglist or the overall length of the transfer, so the value of hvpg_count
> > could be quite a bit bigger than it needs to be. For example, with a 64K
> > page size and an 8 Kbyte transfer size that starts at offset 60K in the
> > first page, hvpg_count will be 32 when it really only needs to be 2.
> >
> > The nested loops below that populate the pfn_array take the
> > offset into account when starting, so that's good. But it will potentially
> > leave allocated entries unused. Furthermore, the nested loops could
> > terminate early when enough Hyper-V size pages are mapped to PFNs
> > based on the length of the transfer, even if all of the last guest size
> > page has not been mapped to PFNs. Like the offset at the beginning of
> > first guest size page in the sglist, there's potentially an unused portion
> > at the end of the last guest size page in the sglist.
> >
>
> Good point. I think we could calculate the exact hvpg_count as follows:
>
> hvpg_count = 0;
> cur_sgl = sgl;
>
> for (i = 0; i < sg_count; i++) {
> 	hvpg_count += HVPFN_UP(cur_sgl->length);
> 	cur_sgl = sg_next(cur_sgl);
> }
>
The downside would be going around that loop a lot of times when
the page size is 4K bytes and the I/O transfer size is something like
256K bytes. I think this gives the right result in constant time: the
starting offset within a Hyper-V page, plus the transfer length,
rounded up to a Hyper-V page size, and divided by the Hyper-V
page size.
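
A minimal standalone sketch of that calculation, for illustration only.
The helper name hvpg_count_for() is hypothetical, and the two macros are
assumed to mirror the kernel's Hyper-V definitions (4K Hyper-V pages).
It also assumes that only the first sglist entry has a non-zero offset
and only the last one can be partially used:

```c
#include <stdint.h>

/* Assumed to mirror the kernel's definitions: Hyper-V pages are 4 KiB. */
#define HV_HYP_PAGE_SHIFT	12
#define HV_HYP_PAGE_SIZE	(1UL << HV_HYP_PAGE_SHIFT)

/*
 * Hypothetical helper (not part of the patch): number of Hyper-V pages
 * spanned by a transfer of `length` bytes that starts `first_offset`
 * bytes into the first guest page of the sglist.
 */
static inline uint32_t hvpg_count_for(uint32_t first_offset, uint32_t length)
{
	/* Offset within the first Hyper-V page, not within the guest page. */
	uint64_t offset_in_hvpg = first_offset & (HV_HYP_PAGE_SIZE - 1);

	/* Round (offset + length) up to a whole number of Hyper-V pages. */
	return (uint32_t)((offset_in_hvpg + length + HV_HYP_PAGE_SIZE - 1) >>
			  HV_HYP_PAGE_SHIFT);
}
```

With the earlier example (64K guest pages, an 8 Kbyte transfer starting
at offset 60K), hvpg_count_for(60 * 1024, 8 * 1024) evaluates to 2.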
> > > + if (hvpg_count > MAX_PAGE_BUFFER_COUNT) {
> > >
> > > - payload_sz = (sg_count * sizeof(u64) +
> > > + payload_sz = (hvpg_count * sizeof(u64) +
> > > sizeof(struct vmbus_packet_mpb_array));
> > > payload = kzalloc(payload_sz, GFP_ATOMIC);
> > > if (!payload)
> > > return SCSI_MLQUEUE_DEVICE_BUSY;
> > > }
> > >
> > > + /*
> > > + * sgl is a list of PAGEs, and payload->range.pfn_array
> > > + * expects the page number in the unit of HV_HYP_PAGE_SIZE (the
> > > + * page size that Hyper-V uses, so here we need to divide PAGEs
> > > + * into HV_HYP_PAGE in case that PAGE_SIZE > HV_HYP_PAGE_SIZE.
> > > + */
> > > payload->range.len = length;
> > > - payload->range.offset = sgl[0].offset;
> > > + payload->range.offset = sgl[0].offset & ~HV_HYP_PAGE_MASK;
> > > + subpage_idx = sgl[0].offset >> HV_HYP_PAGE_SHIFT;
> > >
> > > cur_sgl = sgl;
> > > + k = 0;
> > > for (i = 0; i < sg_count; i++) {
> > > - payload->range.pfn_array[i] =
> > > - page_to_pfn(sg_page((cur_sgl)));
> > > + for (j = subpage_idx; j < (PAGE_SIZE / HV_HYP_PAGE_SIZE); j++) {
> >
> > In the case where PAGE_SIZE == HV_HYP_PAGE_SIZE, would it help the compiler
> > eliminate the loop if local variable j is declared as unsigned? In that case the test in the
> > for statement will always be false.
> >
>
> Good point! I did the following test:
>
> test.c:
>
> int func(unsigned int input, int *arr)
> {
> 	unsigned int i;
> 	int result = 0;
>
> 	for (i = input; i < 1; i++)
> 		result += arr[i];
>
> 	return result;
> }
>
> If I define i as "int", I get:
>
> 0000000000000000 <func>:
> 0: 85 ff test %edi,%edi
> 2: 7f 2c jg 30 <func+0x30>
> 4: 48 63 d7 movslq %edi,%rdx
> 7: f7 df neg %edi
> 9: 45 31 c0 xor %r8d,%r8d
> c: 89 ff mov %edi,%edi
> e: 48 8d 04 96 lea (%rsi,%rdx,4),%rax
> 12: 48 01 d7 add %rdx,%rdi
> 15: 48 8d 54 be 04 lea 0x4(%rsi,%rdi,4),%rdx
> 1a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
> 20: 44 03 00 add (%rax),%r8d
> 23: 48 83 c0 04 add $0x4,%rax
> 27: 48 39 d0 cmp %rdx,%rax
> 2a: 75 f4 jne 20 <func+0x20>
> 2c: 44 89 c0 mov %r8d,%eax
> 2f: c3 retq
> 30: 45 31 c0 xor %r8d,%r8d
> 33: 44 89 c0 mov %r8d,%eax
> 36: c3 retq
>
> and when I define i as "unsigned int", I get:
>
> 0000000000000000 <func>:
> 0: 85 ff test %edi,%edi
> 2: 75 03 jne 7 <func+0x7>
> 4: 8b 06 mov (%rsi),%eax
> 6: c3 retq
> 7: 31 c0 xor %eax,%eax
> 9: c3 retq
>
> So clearly it helps. I will change this in the next version.
Wow! The compiler is good ....
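
My reading of why the unsigned version collapses, as a small sketch
rather than a statement about what the compiler does internally: with
an unsigned counter, "j < 1" can only be true when j == 0, so the loop
body runs at most once. With a signed counter, a negative start value
is legal and the loop may run many times, so a real loop has to be
emitted. The function names below are hypothetical and exist only to
count iterations:

```c
/* Loop shape from the test above, with an unsigned counter. */
static unsigned int iterations_unsigned(unsigned int start)
{
	unsigned int j, n = 0;

	/* j < 1 holds only for j == 0, so this runs zero or one time. */
	for (j = start; j < 1; j++)
		n++;

	return n;
}

/* Same loop shape with a signed counter. */
static unsigned int iterations_signed(int start)
{
	int j;
	unsigned int n = 0;

	/* A negative start is possible, so this runs (1 - start) times. */
	for (j = start; j < 1; j++)
		n++;

	return n;
}
```

For example, iterations_unsigned() returns 1 for a start of 0 and 0 for
any other start value, while iterations_signed(-3) returns 4.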
>
> Regards,
> Boqun
>
> > > + payload->range.pfn_array[k] =
> > > + page_to_hvpfn(sg_page((cur_sgl))) + j;
> > > + k++;
> > > + }
> > > cur_sgl = sg_next(cur_sgl);
> > > + subpage_idx = 0;
> > > }
> > > }
> > >
> > > --
> > > 2.27.0
> >