linux-kernel - Re: [RFC PATCH v3 2/7] ktask: multithread CPU-intensive kernel work

Message-Id: <20171205142102.8b53c7d6eca231b07dbf422e@linux-foundation.org>
Date:   Tue, 5 Dec 2017 14:21:02 -0800
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Daniel Jordan <daniel.m.jordan@...cle.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        aaron.lu@...el.com, dave.hansen@...ux.intel.com,
        mgorman@...hsingularity.net, mhocko@...nel.org,
        mike.kravetz@...cle.com, pasha.tatashin@...cle.com,
        steven.sistare@...cle.com, tim.c.chen@...el.com
Subject: Re: [RFC PATCH v3 2/7] ktask: multithread CPU-intensive kernel work

On Tue,  5 Dec 2017 14:52:15 -0500 Daniel Jordan <daniel.m.jordan@...cle.com> wrote:

> ktask is a generic framework for parallelizing CPU-intensive work in the
> kernel.  The intended use is for big machines that can use their CPU power to
> speed up large tasks that can't otherwise be multithreaded in userland.  The
> API is generic enough to add concurrency to many different kinds of tasks--for
> example, zeroing a range of pages or evicting a list of inodes--and aims to
> save its clients the trouble of splitting up the work, choosing the number of
> threads to use, maintaining an efficient concurrency level, starting these
> threads, and load balancing the work between them.
> 
> The Documentation patch earlier in this series has more background.
> 
> Introduces the ktask API; consumers appear in subsequent patches.
> 
> Based on work by Pavel Tatashin, Steve Sistare, and Jonathan Adams.
>
> ...
>
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -319,6 +319,18 @@ config AUDIT_TREE
>  	depends on AUDITSYSCALL
>  	select FSNOTIFY
>  
> +config KTASK
> +	bool "Multithread cpu-intensive kernel tasks"
> +	depends on SMP
> +	depends on NR_CPUS > 16

Why the NR_CPUS > 16 dependency?

It would make sense to relax (or eliminate) this at least for the
development/test period, so more people actually run and test the new
code.

> +	default n
> +	help
> +	  Parallelize expensive kernel tasks such as zeroing huge pages.  This
> +          feature is designed for big machines that can take advantage of their
> +          cpu count to speed up large kernel tasks.
> +
> +          If unsure, say 'N'.
> +
>  source "kernel/irq/Kconfig"
>  source "kernel/time/Kconfig"
>  
>
> ...
>
> +/*
> + * Initialize internal limits on work items queued.  Work items submitted to
> + * cmwq capped at 80% of online cpus both system-wide and per-node to maintain
> + * an efficient level of parallelization at these respective levels.
> + */
> +bool ktask_rlim_init(void)

Why not static __init?

> +{
> +	int node;
> +	unsigned nr_node_cpus;
> +
> +	spin_lock_init(&ktask_rlim_lock);

The spinlock can be initialized at compile time.  Unless there's a real
reason for ktask_rlim_init() to be non-static and non-__init, in which
case I'm worried: reinitializing a static spinlock is weird.
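
For illustration only, here is a userspace sketch of the distinction: a lock
whose initial state is fixed at compile time needs no init call, so nothing
can reinitialize it later by accident.  In the kernel the equivalent would
be declaring the lock with DEFINE_SPINLOCK().  The names below
(demo_spinlock, demo_lock_acquire, demo_lock_release, demo_increment) are
hypothetical stand-ins built on C11 atomic_flag and are not part of the
patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace lock type; NOT the kernel's spinlock_t. */
struct demo_spinlock {
	atomic_flag locked;
};

/*
 * Compile-time initialization: the lock is valid from program start,
 * so no runtime init function has to touch it.
 */
static struct demo_spinlock demo_lock = { ATOMIC_FLAG_INIT };

/* Shared state protected by demo_lock. */
static unsigned long demo_counter;

/* Spin until the flag was previously clear, i.e. until we own the lock. */
void demo_lock_acquire(struct demo_spinlock *l)
{
	assert(l != NULL);
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		; /* busy-wait */
}

/* Clear the flag so another thread can take the lock. */
void demo_lock_release(struct demo_spinlock *l)
{
	assert(l != NULL);
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Increment the counter under the lock and return the new value. */
unsigned long demo_increment(void)
{
	unsigned long v;

	demo_lock_acquire(&demo_lock);
	v = ++demo_counter;
	demo_lock_release(&demo_lock);
	return v;
}
```

Calling demo_increment() works without any prior setup call, which is the
property a compile-time initialized lock would give ktask_rlim_lock.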

> +	ktask_rlim_node_cur = kcalloc(num_possible_nodes(),
> +					       sizeof(size_t),
> +					       GFP_KERNEL);
> +	if (!ktask_rlim_node_cur) {
> +		pr_warn("can't alloc rlim counts (ktask disabled)");
> +		return false;
> +	}
> +
> +	ktask_rlim_node_max = kmalloc_array(num_possible_nodes(),
> +						     sizeof(size_t),
> +						     GFP_KERNEL);
> +	if (!ktask_rlim_node_max) {
> +		kfree(ktask_rlim_node_cur);
> +		pr_warn("can't alloc rlim maximums (ktask disabled)");
> +		return false;
> +	}
> +
> +	ktask_rlim_max = mult_frac(num_online_cpus(), KTASK_CPUFRAC_NUMER,
> +						      KTASK_CPUFRAC_DENOM);
> +	for_each_node(node) {
> +		nr_node_cpus = cpumask_weight(cpumask_of_node(node));
> +		ktask_rlim_node_max[node] = mult_frac(nr_node_cpus,
> +						      KTASK_CPUFRAC_NUMER,
> +						      KTASK_CPUFRAC_DENOM);
> +	}
> +
> +	return true;
> +}
>
> ...
>
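
As a rough illustration of the limits ktask_rlim_init() computes, the
sketch below reproduces the arithmetic in userspace.  It assumes
KTASK_CPUFRAC_NUMER/KTASK_CPUFRAC_DENOM is 4/5, which is only inferred from
the "80%" in the patch comment since the definitions are not shown here.
scale_by_frac() is a hypothetical stand-in that mirrors the
quotient-plus-remainder approach of the kernel's mult_frac(); ktask_cap()
is likewise a made-up name.

```c
#include <assert.h>

/*
 * ASSUMPTION: 4/5 matches the "80%" in the patch comment; the real
 * KTASK_CPUFRAC_NUMER and KTASK_CPUFRAC_DENOM values are not shown.
 */
#define CPUFRAC_NUMER 4u
#define CPUFRAC_DENOM 5u

/*
 * Userspace stand-in for mult_frac(): computes x * numer / denom by
 * scaling the quotient and the remainder separately, which keeps the
 * intermediate product smaller than a plain x * numer.
 */
unsigned int scale_by_frac(unsigned int x, unsigned int numer,
			   unsigned int denom)
{
	unsigned int quot, rem;

	assert(denom != 0);
	quot = x / denom;
	rem = x % denom;
	return quot * numer + (rem * numer) / denom;
}

/* Work-item cap for a given CPU count (system-wide or per node). */
unsigned int ktask_cap(unsigned int nr_cpus)
{
	return scale_by_frac(nr_cpus, CPUFRAC_NUMER, CPUFRAC_DENOM);
}
```

Under that assumption a 20-CPU system gets a cap of 16 and a 17-CPU system
gets 13, because the result rounds down.  The same rounding gives a
single-CPU node a cap of 0.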