Message-Id: <1395753505-13180-1-git-send-email-amirv@mellanox.com>
Date:	Tue, 25 Mar 2014 15:18:23 +0200
From:	Amir Vadai <amirv@...lanox.com>
To:	"David S. Miller" <davem@...emloft.net>
Cc:	linux-pm@...r.kernel.org, netdev@...r.kernel.org,
	Pavel Machek <pavel@....cz>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Len Brown <len.brown@...el.com>, yuvali@...lanox.com,
	Or Gerlitz <ogerlitz@...lanox.com>,
	Yevgeny Petrilin <yevgenyp@...lanox.com>, idos@...lanox.com,
	Amir Vadai <amirv@...lanox.com>
Subject: [RFC 0/2] pm,net: Introduce QoS requests per CPU

Hi,

This patch is preliminary work to add per-core power management requests.
The patch compiles, but it needs more work before it can drop the RFC
tag. Please treat it as a reference; I would like to prepare the final
implementation after discussing it with the community.

The problem
-----------
I'm maintaining Mellanox's network driver (mlx4_en) in the kernel.

The current pm_qos implementation has a problem. During a short pause in
high bandwidth traffic, the kernel can move the CPU to a deeper c-state to
save energy. When the pause ends and the traffic resumes, the NIC hardware
buffers may overflow before the CPU starts to process the traffic, because of
the CPU wake-up latency.

The driver can add a request to constrain the c-state during high bandwidth
traffic, but pm_qos only allows a global constraint that applies to all the
CPUs. While this fixes the wake-up latency problem, it hurts the power
consumption of the server.

Suggested solution
------------------
The idea is to extend the current pm_qos_request API to support a
pm_qos_request per core.
The networking driver will add a request on the CPU which handles the traffic,
to prevent it from entering a deep c-state. The request will be deleted once
there is no need to keep the CPU active.

The governor selects the next idle state by looking at the target value of the
specific core in addition to the global target value, instead of using only the
global one.
If a global request is added/removed/updated, the target values of all the CPUs
are re-calculated.

When a CPU specific request is added/removed/updated, the target value of the
specific core is re-calculated to be the min/max (according to the constraint
type) value of all the global and the CPU specific constraints.

During initialization, before the CPU specific data structures are allocated
and initialized, only the global target value is used.
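
To make the re-calculation rules above concrete, here is a minimal userspace
sketch in C. It is not the code from the patch: the names (qos_model,
qos_add_cpu, qos_target_for, ...), the fixed array sizes and the use of
INT_MAX for "no constraint" are all assumptions made for illustration, and it
models only a "min" type constraint such as CPU wake-up latency.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NR_CPUS       4        /* hypothetical: small fixed CPU count */
#define MAX_REQS      8        /* hypothetical: capacity of each request list */
#define NO_CONSTRAINT INT_MAX  /* assumption: target when no request is active */

/* A fixed-size set of active request values (illustrative only). */
struct req_list {
    int  value[MAX_REQS];
    bool used[MAX_REQS];
};

struct qos_model {
    struct req_list global;
    struct req_list percpu[NR_CPUS];
    int  global_target;
    int  cpu_target[NR_CPUS];
    bool percpu_ready;   /* false until the per-CPU structures are set up */
};

/* Minimum over the active requests, or NO_CONSTRAINT if the list is empty. */
static int list_min(const struct req_list *l)
{
    int min = NO_CONSTRAINT;
    for (int i = 0; i < MAX_REQS; i++)
        if (l->used[i] && l->value[i] < min)
            min = l->value[i];
    return min;
}

/* Store a value in the first free slot; returns the slot, or -1 if full. */
static int list_add(struct req_list *l, int value)
{
    for (int i = 0; i < MAX_REQS; i++) {
        if (!l->used[i]) {
            l->used[i] = true;
            l->value[i] = value;
            return i;
        }
    }
    return -1;
}

/* Per-CPU target = min over the global and that CPU's own requests. */
static void recalc_cpu(struct qos_model *q, int cpu)
{
    int t = list_min(&q->percpu[cpu]);
    q->cpu_target[cpu] = t < q->global_target ? t : q->global_target;
}

/* A global change re-calculates the target of every CPU. */
static void recalc_global(struct qos_model *q)
{
    q->global_target = list_min(&q->global);
    if (q->percpu_ready)
        for (int cpu = 0; cpu < NR_CPUS; cpu++)
            recalc_cpu(q, cpu);
}

void qos_init(struct qos_model *q)
{
    *q = (struct qos_model){0};
    q->global_target = NO_CONSTRAINT;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        q->cpu_target[cpu] = NO_CONSTRAINT;
}

/* Called once the per-CPU structures are usable. */
void qos_percpu_init(struct qos_model *q)
{
    q->percpu_ready = true;
    recalc_global(q);
}

int qos_add_global(struct qos_model *q, int value)
{
    int slot = list_add(&q->global, value);
    recalc_global(q);
    return slot;
}

void qos_del_global(struct qos_model *q, int slot)
{
    assert(slot >= 0 && slot < MAX_REQS);
    q->global.used[slot] = false;
    recalc_global(q);
}

/* A CPU specific change re-calculates only that CPU's target. */
int qos_add_cpu(struct qos_model *q, int cpu, int value)
{
    assert(q->percpu_ready && cpu >= 0 && cpu < NR_CPUS);
    int slot = list_add(&q->percpu[cpu], value);
    recalc_cpu(q, cpu);
    return slot;
}

void qos_del_cpu(struct qos_model *q, int cpu, int slot)
{
    assert(q->percpu_ready && cpu >= 0 && cpu < NR_CPUS);
    assert(slot >= 0 && slot < MAX_REQS);
    q->percpu[cpu].used[slot] = false;
    recalc_cpu(q, cpu);
}

/* What an idle governor would read: the global value before per-CPU init. */
int qos_target_for(const struct qos_model *q, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return q->percpu_ready ? q->cpu_target[cpu] : q->global_target;
}
```

In this model, a driver that handles traffic on CPU 1 would call
qos_add_cpu(&q, 1, 20) when the traffic becomes heavy and qos_del_cpu() when it
stops. Only the target of CPU 1 drops to 20; the other CPUs keep following
the global target.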

I added to this patchset preliminary work on mlx4_en. In this version the
driver restricts the c-state of all the CPUs during high bandwidth traffic. In
the final version this patch will use the new API and restrict only the
relevant CPU's c-state.

TODO's
------
- Use cpumask instead of int, to enable add/del/modify request for
  cpusets in order to specify a set of CPUs, e.g. a NUMA node.
- Update Documentation/tracing

Thanks,
Amir

Amir Vadai (2):
  pm: Introduce QoS requests per CPU
  net/mlx4_en: Use pm_qos API to avoid packet loss in high CPU c-states

 Documentation/trace/events-power.txt            |   2 +
 drivers/base/power/qos.c                        |   6 +-
 drivers/cpuidle/governors/menu.c                |   2 +-
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c |  37 ++++
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c  |  40 +++++
 drivers/net/ethernet/mellanox/mlx4/en_rx.c      |   7 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h    |  13 ++
 include/linux/pm_qos.h                          |  22 ++-
 include/trace/events/power.h                    |  20 ++-
 kernel/power/qos.c                              | 221 ++++++++++++++++++------
 10 files changed, 302 insertions(+), 68 deletions(-)

-- 
1.8.3.4
