lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c62985530901210514o75a8ec67y1d48eedc996ee67b@mail.gmail.com>
Date:	Wed, 21 Jan 2009 14:14:23 +0100
From:	Frédéric Weisbecker <fweisbec@...il.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Mandeep Singh Baines <msb@...gle.com>,
	linux-kernel@...r.kernel.org, rientjes@...gle.com,
	mbligh@...gle.com, thockin@...gle.com
Subject: Re: [PATCH] softlockup: remove hung_task_check_count

2009/1/21 Ingo Molnar <mingo@...e.hu>:
>
> * Mandeep Singh Baines <msb@...gle.com> wrote:
>
>>       read_lock(&tasklist_lock);
>>       do_each_thread(g, t) {
>> -             if (!--max_count)
>> -                     goto unlock;
>> +             if (!--max_count) {
>> +                     /*
>> +                      * Drop the lock every once in a while and resched if
>> +                      * necessary. Don't want to hold the lock too long.
>> +                      */
>> +                     get_task_struct(t);
>> +                     read_unlock(&tasklist_lock);
>> +                     max_count = HUNG_TASK_CHECK_COUNT;
>> +                     if (need_resched())
>> +                             schedule();
>> +                     read_lock(&tasklist_lock);
>> +                     put_task_struct(t);
>> +                     /*
>> +                      * t was unlinked from tasklist. Can't continue in this
>> +                      * case. Exit and try again next time.
>> +                      */
>> +                     if (t->state == TASK_DEAD)
>> +                             goto unlock;
>> +             }
>
> firstly, this bit should move into a helper function. Also, why dont you
> do the need_resched() check first (it's very lighweight) - and thus only
> do the heavy ops (get-task-struct & tasklist_lock unlock) if that is set?
>
> But most importantly, isnt the logic above confused? --max_count will
> reach zero exactly once, and then we'll loop for a long time.
>
>        Ingo
>

Thanks Mandeep for this patch.

By reading Ingo's review, I notice that the checks done by
check_hung_task() are quite lightweight and then quick.
I guess your loop is able to go through a good number of tasks before
actually needing to be resched.
Perhaps the max_count can be dropped actually, since it is a kind of
arbitrary chosen constant.
So you would just need to check need_resched() at each iteration.

ie (should be split in helper functions):

bool retry = false;

do {
	read_lock(&tasklist_lock);
	do_each_thread(g, t) {
		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
		if (t->state == TASK_UNINTERRUPTIBLE)
			check_hung_task(t, now, timeout);
                if (need_resched()) {
                    get_task_struct(t);
                    read_unlock(&tasklist_lock);
                    schedule();
                    read_lock(&tasklist_lock);
                    put_task_struct(t);

                    if (t->state == TASK_DEAD) {
                        retry = true;
                        goto unlock;
                    }
                }

	} while_each_thread(g, t);
 unlock:
	read_unlock(&tasklist_lock);
} while(retry);


There are good chances that your "t" task hasn't died, but if so, you
can just retry the whole thing.
Perhaps that should require some tests to see if there is a good
average of t->stat == TASK_DEAD path not taken,
but of course, it depends on what the system is doing most.

What do you think?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ