Date: Fri, 17 Sep 2021 15:13:24 -0400
From: Jeff King <peff@...f.net>
To: Rolf Eike Beer <eb@...ix.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	Junio C Hamano <gitster@...ox.com>,
	Git List Mailing <git@...r.kernel.org>,
	Tobias Ulmer <tu@...ix.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: data loss when doing ls-remote and piped to command

On Fri, Sep 17, 2021 at 08:59:07AM +0200, Rolf Eike Beer wrote:

> What you need is a _fast_ git server. kernel.org or github.com seem to be too
> slow for this if you don't sit somewhere in their datacenter. Use something in
> your local network, a Xeon E5 with lot's of RAM and connected with 1GBit/s
> Ethernet in my case.

One thing that puzzled me here: is the bad output between the server and
ls-remote, or between ls-remote and its output pipe? I'd guess it has to
be the latter, since otherwise ls-remote itself would barf with an error
message.

In that case, I'd think "git ls-remote ." would give you the fastest
outcome, because it's talking to upload-pack on the local box.

But I'm also confused how the speed could matter, as ls-remote reads the
entire input into an in-memory array, and then formats it. We do the
write using printf(). Is it possible your libc's stdio may drop bytes
when the pipe is full, rather than blocking?

In general, I'd expect write() to block, so libc doesn't have to care at
all. But might there be something in your environment putting the pipe
into non-blocking mode, and we get EAGAIN or something? If so, I'd
expect stdio to return the error.
Maybe patching Git like this would help:

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index f4fd823af8..5936b2b42c 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -146,7 +146,8 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 		const struct ref_array_item *ref = ref_array.items[i];
 		if (show_symref_target && ref->symref)
 			printf("ref: %s\t%s\n", ref->symref, ref->refname);
-		printf("%s\t%s\n", oid_to_hex(&ref->objectname), ref->refname);
+		if (printf("%s\t%s\n", oid_to_hex(&ref->objectname), ref->refname) < 0)
+			die_errno("printf failed");
 		status = 0; /* we found something */
 	}

> And the reader must be "somewhat" slow. Using sha256sum works reliably for me.
> Using "wc -l" does not, also md5sum and sha1sum are too fast as it seems.

If a slow pipe is involved, maybe:

  git ls-remote . | (sleep 5; cat) | sha256sum

would help reproduce. Assuming ls-remote's output is bigger than your
system pipe buffer (which is another interesting thing to check), then
it should block for 5 seconds on write() midway through the output,
which you can verify with strace.

-Peff
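For the pipe buffer check mentioned above, one portable way to measure the capacity is to fill a fresh pipe in non-blocking mode and count the bytes accepted before write() fails with EAGAIN. This is only a sketch under that assumption; measure_pipe_capacity() is a hypothetical name, not an existing API.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: return how many bytes a fresh pipe accepts before
 * a non-blocking write() fails with EAGAIN, or -1 on any other error.
 */
static long measure_pipe_capacity(void)
{
	int fds[2];
	char chunk[512] = {0};
	long total = 0;
	int flags;

	if (pipe(fds) < 0)
		return -1;

	/* Make only the write end non-blocking; nobody reads from fds[0]. */
	flags = fcntl(fds[1], F_GETFL);
	if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
		total = -1;
		goto out;
	}

	for (;;) {
		ssize_t n = write(fds[1], chunk, sizeof(chunk));

		if (n > 0) {
			total += n;	/* accepted; keep filling */
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;	/* interrupted; retry */
		if (n < 0 && errno != EAGAIN)
			total = -1;	/* unexpected failure */
		break;			/* EAGAIN: the pipe is full */
	}
out:
	close(fds[0]);
	close(fds[1]);
	return total;
}
```

The result can be compared against the size of the output, for example from "git ls-remote . | wc -c". On Linux, pipe(7) documents a default capacity of 16 pages (65,536 bytes with 4 KiB pages), but the value may differ on other systems or configurations.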