Message-ID: <20100629202643.00a80cfb@leibniz>
Date:	Tue, 29 Jun 2010 20:26:43 +0400
From:	Dan Kruchinin <kruchinin@...ell.ru>
To:	LKML <linux-kernel@...r.kernel.org>
Cc:	Steffen Klassert <steffen.klassert@...unet.com>,
	Herbert Xu <herbert@...dor.apana.org.au>
Subject: [PATCH 0/2] padata: Separate cpumasks for cb_cpus and parallel
 workers

Hello.

The main point of my patches is to introduce two separate cpumasks: one
for parallel workers and another for serial workers (callback CPUs).
This makes it possible to bind non-intersecting groups of CPUs to serial
and parallel workers and to tune the padata subsystem more finely.

My tests show that a proper configuration of the serial and parallel
cpumasks gives somewhat better performance. For example (aes-asm,
sha1-generic, two 16-core machines):
1) One point-to-point connection:
Unmodified padata gives ~650 Mbit/s of TCP and ~780 Mbit/s of UDP.
When I exclude the callback CPUs from the parallel cpumask, padata gives
~750 Mbit/s of TCP and ~900 Mbit/s of UDP.
2) Two IPsec tunnels between the 16-core machines and 4 clients
communicating with each other via the tunnels:
Unmodified padata gives ~1.5 Gbit/s of UDP.
padata with non-intersecting cpumasks for parallel and serial workers
gives ~1.8 Gbit/s.
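
To put the numbers above in relative terms, here is a small sketch that
computes the gain for each measurement. The labels and the helper name
relative_gain are only illustrative; the figures are the ones quoted
above.

```python
# Throughput figures quoted above: (label, unmodified, separate cpumasks).
# The first two rows are in Mbit, the last one in Gbit.
results = [
    ("1 connection, TCP", 650, 750),
    ("1 connection, UDP", 780, 900),
    ("2 tunnels, UDP", 1.5, 1.8),
]


def relative_gain(before, after):
    """Return the improvement of `after` over `before` in percent."""
    return (after - before) / before * 100.0


gains = {label: relative_gain(before, after) for label, before, after in results}

for label, gain in gains.items():
    print(f"{label}: +{gain:.1f}%")
```

This works out to a gain of roughly 15% for the single connection and
20% for the two-tunnel setup.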

Besides the performance gain, there may be situations where the serial
job takes a lot of time. For example, if I add several dozen firewall
rules, the serial worker will run slower, while padata_do_parallel will
continue to enqueue requests into the queue of the CPU the serial worker
executes on.
This may significantly slow down parallelization and reordering, because
a CPU that is shared by both parallel and serial workers will always
have more requests in its parallel queue than the other CPUs (because
serialization takes a lot of time). In such cases the user may exclude
the callback CPUs from the cpumask for parallel workers.
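
As an illustration of what "excluding callback CPUs" means, here is a
minimal userspace sketch that derives two non-intersecting masks. It
does not use the padata interface from the patches (which is not shown
in this message); the function names, the 16-CPU machine and the choice
of CPUs 0 and 1 as callback CPUs are all assumptions made for the
example.

```python
def split_cpumasks(all_cpus, callback_cpus):
    """Return (parallel, serial) CPU sets that share no CPU.

    Callback CPUs that are not present in all_cpus are ignored.
    """
    serial = set(callback_cpus) & set(all_cpus)
    parallel = set(all_cpus) - serial
    if not serial or not parallel:
        raise ValueError("both masks must contain at least one CPU")
    return parallel, serial


def to_hex_mask(cpus):
    """Encode a set of CPU numbers as a hex bitmask (bit n stands for CPU n)."""
    return format(sum(1 << cpu for cpu in cpus), "x")


# Hypothetical setup: 16 CPUs, with CPUs 0 and 1 reserved for serial callbacks.
parallel, serial = split_cpumasks(range(16), {0, 1})

print("parallel cpumask:", to_hex_mask(parallel))
print("serial cpumask:  ", to_hex_mask(serial))
```

With these assumed inputs the parallel mask covers CPUs 2-15 and the
serial mask covers CPUs 0-1, so a slow serial worker no longer competes
with parallel work queued on the same CPU.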

-- 
W.B.R.
Dan Kruchinin
