Message-ID: <m1tzlqj5kz.fsf@ebiederm.dsl.xmission.com>
Date: Mon, 07 Jan 2008 03:25:00 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: bcrl@...ck.org
Cc: linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
David Miller <davem@...emloft.net>
Subject: Re: regression: sysctl_check changes in 2.6.24 are O(n) resulting in slow creation of 10000 network interfaces
David Miller <davem@...emloft.net> writes:
> From: ebiederm@...ssion.com (Eric W. Biederman)
> Date: Sun, 06 Jan 2008 23:57:57 -0700
>
>> Why do we need 10000 interfaces? Why isn't network device creation a
>> slow path?
>
> Because people create virtual devices like mad.
>
>> So is this a bug report telling me that there are users with
>> 10k or 100k interfaces that care. So we need to fix sysctl.
>
> Unquestionably, we do, it's a major regression.
>
> People create thousands of VLAN devices, as one of many examples, all
> the time.
>
> That's why we even bother hashing network devices in the network code.
Cool, thanks. Although I think that was only a 256-way hash, so it
is a bit stretched at 10,000 (an average chain length of about 39) and
approaching ugly at 100,000. Still, it should perform much better than sysctl.
I think someone failed to notice that using /proc/sys slowed to a crawl
in that event, and now that I am doing a lookup on register it seems to
be showing up in the benchmarks.
At the 256 or fewer devices that the network device hash is optimized for,
the sysctl data structures didn't look to be falling over too badly:
2s vs .2s, if I read Benjamin's numbers right.
Given that it is late in the release cycle (so we can't do the
surgery that the internal sysctl data structures appear to need),
I propose doing something like the patch below.
Benjamin, can you test the patch below and tell me if it also
keeps the network device performance at acceptable levels?
I really don't want to remove the check for invalid binary sysctl names
or the other ones if I can help it. But those checks should be a small
constant cost and not make things progressively worse.
Eric
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index a68425a..d69ef6d 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -1343,6 +1343,7 @@ static void sysctl_repair_table(struct ctl_table *table)
}
}
+#if 0
static struct ctl_table *sysctl_check_lookup(struct ctl_table *table)
{
struct ctl_table_header *head;
@@ -1385,6 +1386,7 @@ out:
sysctl_head_finish(head);
return ref;
}
+#endif
static void set_fail(const char **fail, struct ctl_table *table, const char *str)
{
@@ -1397,6 +1399,10 @@ static void set_fail(const char **fail, struct ctl_table *table, const char *str
*fail = str;
}
+#if 0
+/* Temporarily disabled to improve network device creation speed
+ * Reenable after we have fixed the sysctl data structures.
+ */
static int sysctl_check_dir(struct ctl_table *table)
{
struct ctl_table *ref;
@@ -1436,6 +1442,15 @@ static void sysctl_check_leaf(struct ctl_table *table, const char **fail)
if (ref && (ref != table))
set_fail(fail, table, "Sysctl already exists");
}
+#else
+static int sysctl_check_dir(struct ctl_table *table)
+{
+ return 0;
+}
+static void sysctl_check_leaf(struct ctl_table *table, const char **fail)
+{
+}
+#endif
static void sysctl_check_bin_path(struct ctl_table *table, const char **fail)
{