lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAE9FiQXDHHJ0rG6hd6a3B+f0ZyJmjJ3rJGNWTsCwn21o+84C_Q@mail.gmail.com>
Date:	Mon, 23 Jan 2012 11:36:21 -0800
From:	Yinghai Lu <yinghai@...nel.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Jesse Barnes <jbarnes@...tuousgeek.org>,
	Kenji Kaneshige <kaneshige.kenji@...fujitsu.com>,
	Matthew Wilcox <matthew@....cx>, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/7] PCI: Make sriov work with hotplug remove

On Mon, Jan 23, 2012 at 10:45 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Mon, Jan 23, 2012 at 10:30 AM, Yinghai Lu <yinghai@...nel.org> wrote:
>> in list_for_each_safe()
>>
>> #define list_for_each_safe(pos, n, head) \
>>        for (pos = (head)->next, n = pos->next; pos != (head); \
>>                pos = n, n = pos->next)
>>
>> n is saved before, and safe only mean pos could be freed from the
>> list, but n still can be used for next loop.
>>
>> in our case, the list have PF and several VFs, when
>> pci_stop_bus_device() is called for PF, pos are still valid, but
>> VFs are removed from the list. so n will not be valid.
>
> That still doesn't explain anything.
>
> The whole point of the safe version is that if the entry that is being
> worked on is removed, then we can still use the next one.
>
> Why isn't this magically true in this case? If some *other* random
> entry than the one that is being iterated over can magically be
> removed, then the whole thing is just pure and utter crap, and no
> amount of list maintenance can ever fix it.
>
> So explain.

Now current design with SRIOV is:
when driver for PF is loaded, it will enable sriov on PF, and VFs are
created and added to
bus->devices list.

when driver for PF is removed in pci_stop_bus_device(), it will
disable sriov on PF, and VFs are removed for the list.

>
>> https://lkml.org/lkml/2011/10/15/141
>> or from attached one.
>
> This still doesn't make sense. Why do that stupid allocation? Why not
> just move the entry? Why doesn't just the sane approach work?
>
> Exactly why does not the obvious
>
>    /* stop bus device for phys_fn at first */
>    list_for_each_safe(l, n, &bus->devices) {
>        struct pci_dev *dev = pci_dev_b(l);
>        if (!dev->is_virtfn)
>            pci_stop_bus_device(dev);
>    }
>
> work? What the f*^& does that pci_stop_bus_device() function do to
> that bus->devices list that isn't just a "list_del()" of that single
> entry? And if it does anything else, it should damn well stop doing
> that.

it does not work.  because pci_stop_bus_device for PF will remove VF,
and n is not
valid.

>
> The *exact* same loop is then apparently working for the virtual
> device case, so what the hell is going on that it wouldn't work for
> the physical one?
>
> What's the confusion? Why all the crazy idiotic code that makes no sense?

Maybe we can put VF and PF in bus->devices like:
VF come first than PF?

something like:

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 0321fa3..3f63b55 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -116,11 +116,11 @@ static int virtfn_add(struct pci_dev *dev, int
id, int reset)
 	if (reset)
 		__pci_reset_function(virtfn);

+	virtfn->is_virtfn = 1;
 	pci_device_add(virtfn, virtfn->bus);
 	mutex_unlock(&iov->dev->sriov->lock);

 	virtfn->physfn = pci_dev_get(dev);
-	virtfn->is_virtfn = 1;

 	rc = pci_bus_add_device(virtfn);
 	if (rc)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 7cc9e2f..e43d81c 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1218,7 +1218,10 @@ void pci_device_add(struct pci_dev *dev, struct
pci_bus *bus)
 	 * and the bus list for fixup functions, etc.
 	 */
 	down_write(&pci_bus_sem);
-	list_add_tail(&dev->bus_list, &bus->devices);
+	if (dev->is_virtfn)
+		list_add(&dev->bus_list, &bus->devices);
+	else
+		list_add_tail(&dev->bus_list, &bus->devices);
 	up_write(&pci_bus_sem);
 }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ