linux-kernel - Re: [PATCH 1/2] PCI: hv: Use IDR to generate transaction IDs for VMBus hardening

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20220322125122.GA2158@anparri>
Date:   Tue, 22 Mar 2022 13:51:22 +0100
From:   Andrea Parri <parri.andrea@...il.com>
To:     "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
Cc:     KY Srinivasan <kys@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
        Wei Hu <weh@...rosoft.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Rob Herring <robh@...nel.org>,
        Krzysztof Wilczynski <kw@...ux.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] PCI: hv: Use IDR to generate transaction IDs for
 VMBus hardening

> > I think I should elaborate on the design underlying this submission;
> > roughly, the present solution diverges from the 'generic' requestor
> > mechanism you mentioned above in two main aspects:
> > 
> >   A) it 'moves' the ID removal into hv_compose_msi_msg() and other
> >      functions,
> 
> Right.  A key implication is that this patch allows the completion
> function to be called multiple times, if Hyper-V were to be malicious
> and send multiple responses with the same requestID.  This is OK as
> long as the completion functions are idempotent, which after looking,
> I think they are in this driver.
> 
> Furthermore, this patch allows the completion function to run anytime
> between when the requestID is created and when it is deleted.  This
> patch creates the requestID just before calling vmbus_sendpacket(),
> which is good.  The requestID is deleted later in the various functions.
> I saw only one potential problem, which is in new_pcichild_device(),
> where the new hpdev is added to a global list before the requestID is
> deleted. There's a window where the completion function could run
> and update the probed_bar[] values asynchronously after the hpdev is
> on the global list.  I don't know if this is a problem or not, but it could
> be prevented by deleting the requestID a little earlier in the function.
> 
> > 
> >   B) it adopts some ad-hoc locking scheme in the channel callback.
> > 
> > AFAICT, such changes preserve the 'confidentiality' and correctness
> > guarantees of the generic approach (modulo the issue discussed here
> > with Saurabh).
> 
> Yes, I agree, assuming the current functionality of the completion
> functions.
> 
> > 
> > These changes are justified by the bug/fix discussed in 2/2.  For
> > concreteness, consider a solution based on the VMbus requestor as
> > reported at the end of this email.
> > 
> > AFAICT, this solution can't fix the bug discussed in 2/2.  Moreover
> > (and looking back at (A-B)), we observe that:
> > 
> >   1) locking in the channel callback is not quite as desired: we'd
> >      want a request_addr_callback_nolock() say and 'protected' it
> >      together with ->completion_func();
> 
> I'm not understanding this point.  Could you clarify?

Basically (on top of the previous diff):

@@ -2700,6 +2725,7 @@ static void hv_pci_onchannelcallback(void *context)
 	int ret;
 	struct hv_pcibus_device *hbus = context;
 	struct vmbus_channel *chan = hbus->hdev->channel;
+	struct vmbus_requestor *rqstor = &chan->requestor;
 	u32 bytes_recvd;
 	u64 req_id, req_addr;
 	struct vmpacket_descriptor *desc;
@@ -2713,6 +2739,7 @@ static void hv_pci_onchannelcallback(void *context)
 	struct pci_dev_inval_block *inval;
 	struct pci_dev_incoming *dev_message;
 	struct hv_pci_dev *hpdev;
+	unsigned long flags;
 
 	buffer = kmalloc(bufferlen, GFP_ATOMIC);
 	if (!buffer)
@@ -2747,8 +2774,10 @@ static void hv_pci_onchannelcallback(void *context)
 		switch (desc->type) {
 		case VM_PKT_COMP:
 
-			req_addr = chan->request_addr_callback(chan, req_id);
+			spin_lock_irqsave(&rqstor->req_lock, flags);
+			req_addr = __hv_pci_request_addr(chan, req_id);
 			if (!req_addr || req_addr == VMBUS_RQST_ERROR) {
+				spin_unlock_irqrestore(&rqstor->req_lock, flags);
 				dev_warn_ratelimited(&hbus->hdev->device,
 						     "Invalid request ID\n");
 				break;
@@ -2758,6 +2787,7 @@ static void hv_pci_onchannelcallback(void *context)
 			comp_packet->completion_func(comp_packet->compl_ctxt,
 						     response,
 						     bytes_recvd);
+			spin_unlock_irqrestore(&rqstor->req_lock, flags);
 			break;
 
 		case VM_PKT_DATA_INBAND:


where I renamed request_addr_callback_nolock() to __hv_pci_request_addr()
(this being as in vmbus_request_addr() but without the requestor lock).
A "locked" callback would still be wanted and used in, e.g., the failure
path of hv_ringbuffer_write().


> >   2) hv_compose_msi_msg() doesn't know the value of the request ID
> >      it has allocated (hv_compose_msi_msg() -> vmbus_sendpacket();
> >      cf. also remove_request_id() in the current submission).
> 
> Agreed.  This would have to be addressed by adding another version of
> vmbus_sendpacket() that returns the request ID.

Indeed...  Some care would be needed to make sure that that "ID removal"
can't "race" with hv_pci_onchannelcallback() (which could have removed
the ID now), but yes...


> > Hope this helps clarify the problems at stake, and move forward to a
> > 'final' solution...
> 
> I think there's a reasonable way for the vmbus_next_request_id()
> mechanism to solve the problem in Patch 2/2 (if a new version of
> vmbus_sendpacket is added).  To me, that mechanism seems safer
> in that it restricts the completion function to running just once
> per requestID.  With this patch, we must remember that the
> completion functions must remain idempotent.

Fair enough.  Thank you for bearing with me and patiently reviewing these
matters.  Working out the details...

Thanks,
  Andrea