lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <m1ac5woube.fsf@ebiederm.dsl.xmission.com>
Date:	Wed, 23 Aug 2006 00:11:17 -0600
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	linux-kernel@...r.kernel.org, akpm@...l.org, pj@....com,
	saito.tadashi@...t.fujitsu.com, ak@...e.de
Subject: Re: [RFC][PATCH] ps command race fix take2 [1/4] list token

KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> writes:

> On Tue, 22 Aug 2006 10:56:08 -0600
> ebiederm@...ssion.com (Eric W. Biederman) wrote:
>
>> KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> writes:
>> 
>> > This is ps command race fix take2. Unfortunately, against 2.6.18-rc4.
>> > I'll rebase this to appropriate kernel if O.K. (I think this is RFC)
>> >
>> > This patch implements Paul Jackson's idea, 'inserting false link in task
> list'.
>> 
>> Currently the tasklist_lock is one of the more highly contended locks in
>> the kernel.  Adding an extra place it is taken is undesirable.
> yes. taking lock is a probem.
> I know current readdir() uses 8192 bytes buffer for getdents64(). Then,
> maybe write-lock will be acquired all-tgids/400+ times for inserting token
> (in 32bit system).
>  
>> If could see a better algorithm for sending a signal to all processes
>> in a process groups we could remove the tasklist_lock entirely.
>> 
> ??
> Sorry, could you explain more ?

The core problem is not when there is a single user.  The problem is
that no matter how large the system gets we have a single lock.  So it
gets increasingly contended.

I almost removed the tasklist_lock from all read paths.  But as it
happens sending a signal to a process group is an atomic operation
with respect to fork so that path has to take the lock, or else
we get places where "kill -9 -pgrp" fails to kill every process in
the process group.  Which is even worse.


>> In addition you only solves half the readdir problems.  You don't solve
>> the seek problem which is returning to an offset you had been to
>> before.  A relatively rare case but...
>> 
> Ah, I should add lseek handler for proc root. Okay.

Hmm.  Possibly.  Mostly what I was thinking is that a token in the
list simply cannot solve the problem of a guaranteeing lseek to a
previous position works.  I really haven't looked closely on
how you handle that case.

>> > Good point of this approach is cost of searching task is O(N) (N=num of
> tgids).
>> > Bad point is lock and kmalloc/kfree.
>> > I didin't modified thread_list and cpuset's proc list, maybe future work.
>> >
>> > If searching pid bitmap is better, please take Erics.
>> 
>> My patch at least needs a good changelog but I believe it will work
>> better and can be further improved with a better pid data structure
>> if there is actually a problem there.  Given that I don't take
>> any locks it should be much friendlier at scale, and the code
>> was simpler.
> yes. it has several good points and simple.
> My patch's point is just using task_list if we can, because it exists for
> keeping
> all tasks(tgids).

One of the reasons I have an issue with it, is that with the
impending introduction of multiple pid spaces is that the task list
really isn't what we want to traverse.

>> However I will miss a few newly forked processes and I don't think your
>> technique will miss any.  Still neither will miss a process that
>> existed the entire time.
>> 
>> If nothing else I think it was worth posting so we could contrast the two.
>> 
> please post again. I think comparing the two is good.
> I will post take3 with improved comments and lseek handler, and so
> on.

I intend to, I'm unfortunately busy in another direction at the
moment.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ