Message-ID: <m1k0qcglol.fsf@fess.ebiederm.org>
Date:   Fri, 12 Mar 2021 14:29:46 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Jim Newsome <jnewsome@...project.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Christian Brauner <christian@...uner.io>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5] do_wait: make PIDTYPE_PID case O(1) instead of O(n)

Jim Newsome <jnewsome@...project.org> writes:

> do_wait is an internal function used to implement waitpid, waitid,
> wait4, etc. To handle the general case, it does an O(n) linear scan of
> the thread group's children and tracees.
>
> This patch adds a special-case when waiting on a pid to skip these scans
> and instead do an O(1) lookup. This improves performance when waiting on
> a pid from a thread group with many children and/or tracees.

I am going to kibitz just a little bit more.

When I looked at this a second time it became apparent that using
pid_task twice should actually be faster as it removes a dependent load
caused by thread_group_leader, and replaces it by accessing two adjacent
pointers in the same cache line.

I know the algorithmic improvement is the main advantage, but removing
60ns or so for a dependent load can't hurt.

Plus I think using the two pid types really makes it clear that one
is always a process and the other is always potentially a thread.

/*
 * Optimization for waiting on PIDTYPE_PID. No need to iterate through child
 * and tracee lists to find the target task.
 */
static int do_wait_pid(struct wait_opts *wo)
{
	bool ptrace;
	struct task_struct *target;
	int retval;

	ptrace = false;
	target = pid_task(wo->wo_pid, PIDTYPE_TGID);
	if (target && is_effectively_child(wo, ptrace, target)) {
		retval = wait_consider_task(wo, ptrace, target);
		if (retval)
			return retval;
	}

	ptrace = true;
	target = pid_task(wo->wo_pid, PIDTYPE_PID);
	if (target && target->ptrace &&
	    is_effectively_child(wo, ptrace, target)) {
		retval = wait_consider_task(wo, ptrace, target);
		if (retval)
			return retval;
	}

	return 0;
}
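
To illustrate the dependent-load point, here is a minimal user-space sketch, not kernel code. The names struct task_model, struct pid_model, pid_task_model() and the is_leader field are hypothetical stand-ins; the real struct pid and thread_group_leader() are more involved. It only shows the shape of the two approaches: variant A looks up a task and then has to read through that task pointer to decide whether it is a leader, while variant B reads a second, adjacent slot of the same pid object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two pid types used in do_wait_pid(). */
enum pid_type_model { MODEL_PID, MODEL_TGID, MODEL_MAX };

/* Hypothetical stand-in for task_struct. */
struct task_model {
	int id;
	int is_leader;	/* assumed stand-in for the thread_group_leader() test */
	int ptrace;
};

/* Hypothetical stand-in for struct pid: one slot per pid type. */
struct pid_model {
	struct task_model *tasks[MODEL_MAX];	/* adjacent pointers */
};

/* Simplified lookup: index the slot for the requested type. */
static struct task_model *pid_task_model(const struct pid_model *pid,
					 enum pid_type_model type)
{
	return pid ? pid->tasks[type] : NULL;
}

/*
 * Variant A: one lookup, then a load through the returned task
 * pointer. The second load cannot start until the first completes.
 */
static struct task_model *leader_via_task(const struct pid_model *pid)
{
	struct task_model *t = pid_task_model(pid, MODEL_PID);

	return (t && t->is_leader) ? t : NULL;
}

/*
 * Variant B: a second lookup in the neighbouring slot. Both slots
 * live in the same pid object, so no task field has to be read.
 */
static struct task_model *leader_via_tgid(const struct pid_model *pid)
{
	return pid_task_model(pid, MODEL_TGID);
}
```

In this model a leader's pid has both slots filled and a non-leader thread's pid has only the MODEL_PID slot filled, so both variants return the same answer; the difference is only in which memory they have to touch to get it.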

Since the patch probably needs to be respun to include the improved
description, can we look at my micro performance improvement?

Eric
