Date:	Sat, 29 May 2010 06:31:32 -0400
From:	Albert Cahalan <acahalan@...il.com>
To:	linux-kernel <linux-kernel@...r.kernel.org>,
	sukadev@...ux.vnet.ibm.com, randy.dunlap@...cle.com,
	linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v21 011/100] eclone (11/11): Document sys_eclone

Sukadev Bhattiprolu writes:

> Randy Dunlap [randy.dunlap at oracle.com] wrote:
>>> base of the region allocated for stack. These architectures
>>> must pass in the size of the stack-region in ->child_stack_size.
>>
>>                               stack region
>>
>> Seems unfortunate that different architectures use
>> the fields differently.
>
> Yes and no. The field still has a single purpose, just that
> some architectures may not need it. We enforce that if unused
> on an architecture, the field must be 0. It looked like
> the easiest way to keep the API common across architectures.

Yuck. You're forcing userspace to have #ifdef messes or,
more likely, to just not work on all architectures. There is
no reason to have field usage vary by architecture. The
original clone syscall was not designed with ia64 and hppa
in mind, and has been causing trouble ever since. Let's not
perpetuate the problem.

Given code like this:

    stack_base = malloc(stack_size);

stack_base and stack_size are what the kernel needs.
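
To make the point concrete, here is a minimal sketch of how a single (base, size) pair is enough on any architecture, because the initial stack pointer can be derived from it wherever that derivation is done. The struct, the helper name initial_sp, the grows_down parameter and the 16-byte alignment are all made up for illustration; none of this is the eclone ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor, NOT the eclone ABI: just what malloc() gives you. */
struct stack_region {
    void  *base;  /* lowest address of the region */
    size_t size;  /* length of the region in bytes */
};

/*
 * Hypothetical helper: derive the initial stack pointer from (base, size).
 * grows_down is nonzero where the stack grows toward lower addresses
 * (e.g. i386) and zero where it grows toward higher addresses (e.g. hppa).
 */
static void *initial_sp(const struct stack_region *r, int grows_down)
{
    const uintptr_t align = 16;  /* assumed alignment, for illustration only */
    uintptr_t lo = (uintptr_t)r->base;
    uintptr_t hi = lo + r->size;

    assert(r->size >= 2 * align);
    if (grows_down)
        return (void *)(hi & ~(align - 1));            /* round the top down */
    return (void *)((lo + align - 1) & ~(align - 1));  /* round the base up */
}
```

With this shape the caller never needs an #ifdef: it passes the same two values everywhere, and only the code that consumes them knows the growth direction.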

I suspect that you chose the defective method for some reason
related to restarting processes that were created with the
older system calls. I can't say most of us even care, but in
that broken-already case your process restarter can make up
some numbers that will work. (for i386, the base could be the
lowest address in the vma in which %esp lies, or even address 0)
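
As a rough userspace illustration of "the lowest address in the vma in which %esp lies", the sketch below takes one line in /proc/<pid>/maps format and a stack pointer value, and reports the containing mapping as a (base, size) pair. The helper name region_from_maps_line is hypothetical, and an in-kernel restarter would walk the vma list directly rather than parse text; this only shows that the numbers are easy to make up after the fact.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse one line in /proc/<pid>/maps format
 * ("start-end perms offset dev inode [path]") and check whether sp
 * lies inside that mapping. On a hit, *base and *size describe it.
 * Returns 1 on a hit, 0 otherwise.
 */
static int region_from_maps_line(const char *line, unsigned long sp,
                                 unsigned long *base, unsigned long *size)
{
    unsigned long start, end;

    assert(line && base && size);
    if (sscanf(line, "%lx-%lx", &start, &end) != 2)
        return 0;               /* not a mapping line */
    if (sp < start || sp >= end)
        return 0;               /* sp is in some other vma */
    *base = start;              /* lowest address in the vma */
    *size = end - start;
    return 1;
}
```

A caller would feed it each line of the maps file until it returns 1, then use the resulting base and size as the stack region for the restarted thread.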

A related issue is that stack allocation and deallocation can
be quite painful: it is difficult (some assembly required) to
free one's own stack, and impossible if one is already dead.
We could use a flag to let the kernel handle allocation, with
the stack getting freed just after any ptracer gets a last look.
This issue is especially troublesome for me because the syscall
essentially requires per-thread memory to work; it is currently
extremely difficult to use the syscall in code which lacks that.