linux-kernel - Re: [PATCH 2 of 4] Introduce i386 fibril scheduling

Message-Id: <9BE4FD9B-5829-46D1-B9BA-B475261A4116@oracle.com>
Date:	Tue, 6 Feb 2007 17:28:31 -0500
From:	Zach Brown <zach.brown@...cle.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	David Miller <davem@...emloft.net>, kent.overstreet@...il.com,
	davidel@...ilserver.org, mingo@...e.hu,
	linux-kernel@...r.kernel.org, linux-aio@...ck.org,
	suparna@...ibm.com, bcrl@...ck.org
Subject: Re: [PATCH 2 of 4] Introduce i386 fibril scheduling

> That's not how the patches work right now, but yes, I at least personally
> think that it's something we should aim for (ie the interface shouldn't
> _require_ us to always wait for things even if perhaps an early
> implementation might make everything be delayed at first)

I agree that we shouldn't require a separate syscall just to get the
return code from ops that didn't block.

It doesn't seem like much of a stretch to imagine a setup where we  
can specify completion context as part of the submission itself.

	declare_empty_ring(ring);
	struct submission sub;

	sub.ring = &ring;
	sub.nr = SYS_fstat64;
	sub.args = ...

	ret = submit(&sub, 1);
	if (ret == 0) {
		wait_for_elements(&ring, 1);
		printf("stat gave %d\n", ring[ring->head].rc);
	}

You get the idea, it's just an outline.

wait_for_elements() could obviously check the ring before falling
back to kernel sync.  I'm pretty keen on the notion of
producer/consumer rings where userspace writes the head as it plucks
completions and the kernel writes the tail as it adds them.
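
A minimal sketch of that head/tail split, to make the ownership rule
concrete.  Everything here is hypothetical: the struct layout, RING_SIZE,
the cookie field and the function names are invented for illustration and
are not from the patches.  ring_produce() stands in for the kernel side,
and C11 atomics stand in for whatever barriers a real shared mapping
would need.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u	/* hypothetical; power of two so indices can be masked */

struct completion {
	uint64_t cookie;	/* hypothetical tag identifying the submitted op */
	long rc;		/* return code of the completed call */
};

struct ring {
	_Atomic uint32_t head;	/* written only by the consumer (userspace) */
	_Atomic uint32_t tail;	/* written only by the producer (the kernel) */
	struct completion ev[RING_SIZE];
};

/* Producer side: stands in for the kernel adding a completion. */
static bool ring_produce(struct ring *r, uint64_t cookie, long rc)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (tail - head == RING_SIZE)
		return false;	/* ring is full */

	r->ev[tail & (RING_SIZE - 1)] = (struct completion){ cookie, rc };
	/* Publish the slot only after its contents are written. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return true;
}

/* Consumer side: userspace plucks one completion, if there is one. */
static bool ring_consume(struct ring *r, struct completion *out)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	assert(out != NULL);
	if (head == tail)
		return false;	/* empty: the point where we'd fall back to the kernel */

	*out = r->ev[head & (RING_SIZE - 1)];
	/* Hand the slot back to the producer. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return true;
}
```

Since each index has exactly one writer, neither side needs a lock.  A
wait_for_elements() built on this would loop on ring_consume() and only
enter the kernel when it returns false.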

We might want per-call ring pointers, instead of per submission, to  
help submitters wait for a group of ops to complete without having to  
do their own tracking on event completion.  That only makes sense if  
we have the waiting mechanics let you only be woken as the number of  
events in the ring crosses some threshold.  Which I think we want  
anyway.
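
The threshold wakeup could be sketched roughly as follows.  This is only
the bookkeeping, single-threaded and without the barriers a real ring
would need, and the names (ring_idx, ring_pending, add_and_check_wake,
min_events) are made up for the example.  The idea is that the producer
wakes the waiter only when the count of pending events crosses the
waiter's threshold, not on every completion.

```c
#include <stdbool.h>
#include <stdint.h>

struct ring_idx {
	uint32_t head;	/* consumer index, advanced by userspace */
	uint32_t tail;	/* producer index, advanced by the kernel */
};

/* Completions currently in the ring; unsigned math survives wraparound. */
static uint32_t ring_pending(const struct ring_idx *r)
{
	return r->tail - r->head;
}

/*
 * Hypothetical producer step: add n completions and report whether a
 * waiter asking for at least min_events should be woken.  True only on
 * the transition from "below threshold" to "at or above threshold".
 */
static bool add_and_check_wake(struct ring_idx *r, uint32_t n,
			       uint32_t min_events)
{
	uint32_t before = ring_pending(r);

	r->tail += n;
	return before < min_events && ring_pending(r) >= min_events;
}
```

A submitter waiting on a group of three ops would pass min_events = 3 and
sleep through the first two completions.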

We'd be trading building up a specific completion state with syscalls  
for some complexity during submission that pins (and kmaps on  
completion) the user pages.  Submission could return failure if  
pinning these new pages would push us over some rlimit.  We'd have to  
be *awfully* careful not to let userspace corrupt (munmap?) the ring  
and confuse the hell out of the kernel.

Maybe not worth it, but if we *really* cared about making the
non-blocking case almost identical to the sync case and wanted to use the
same interface for batch submission and async completion then this
seems like a possibility.

- z