Date: Tue, 01 Apr 2008 17:38:40 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: David Newall <davidn@...idnewall.com>
CC: Jan Engelhardt <jengelh@...putergmbh.de>, "John T." <j.thomast@...oo.com>, linux-kernel@...r.kernel.org
Subject: Re: UTF-8 and Alt key in the console

David Newall wrote:
> Jan Engelhardt wrote:
>> Hence the proposal of using definite start and end markers:
>>
>> echo -e '\x1B43m\x1D wonderful \x1B0m\x1D' | cosmicrays | cat
>
> I see no merit in the idea. Most seriously, there isn't any real-world
> problem being solved. In addition, it proposes creating yet another
> type of terminal emulation. If there's something you don't like about
> VT escape codes, use a different emulation. For example, Televideo
> terminals used almost exclusively single-character control codes,
> reducing the scope of being mid-sequence to, well much closer to zero.
>
> You need to make quite clear that your proposal is to discontinue use of
> VT terminal emulation.

Okay, let's put this to rest once and for all:

*** ISO 6429 sequences are self-terminating. ***

No, you can't tell you're inside one if you miss the leading CSI, but as
has been pointed out, there really isn't a huge case for it.

The standard is available for free under the name ECMA-48:

http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-048.pdf

It references ISO 2022, a.k.a. ECMA-35:

http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-035.pdf

These standards use a decimalized hexadecimal notation, so if you see
"05/10" it means 0x5a. A "column" refers to a 16-character set, so
"column 4" refers to bytes 0x40 to 0x4f.

The structure defined in section 5.4 of ISO 6429/ECMA-48:

-----------
5.4 Control sequences

A control sequence is a string of bit combinations starting with the
control function CONTROL SEQUENCE INTRODUCER (CSI) followed by one or
more bit combinations representing parameters, if any, and by one or
more bit combinations identifying the control function.
The control function CSI itself is an element of the C1 set.

The format of a control sequence is

    CSI P ... P I ... I F

where

a) CSI is represented by bit combinations 01/11 (representing ESC) and
   05/11 in a 7-bit code or by bit combination 09/11 in an 8-bit code,
   see 5.3;

b) P ... P are Parameter Bytes, which, if present, consist of bit
   combinations from 03/00 to 03/15;

c) I ... I are Intermediate Bytes, which, if present, consist of bit
   combinations from 02/00 to 02/15. Together with the Final Byte F,
   they identify the control function;

   NOTE
   The number of Intermediate Bytes is not limited by this Standard; in
   practice, one Intermediate Byte will be sufficient since with sixteen
   different bit combinations available for the Intermediate Byte over
   one thousand control functions may be identified.

d) F is the Final Byte; it consists of a bit combination from 04/00 to
   07/14; it terminates the control sequence and together with the
   Intermediate Bytes, if present, identifies the control function. Bit
   combinations 07/00 to 07/14 are available as Final Bytes of control
   sequences for private (or experimental) use.
-----------

Note: DEC added nonstandard control sequences initiated with SS3 (ESC O)
as well as CSI (ESC [); otherwise they use the same format.

The Final Byte is easy enough to spot, as is writing a generic parser
that can pick this apart, including parameter handling.

	-hpa
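
For illustration, the CSI P ... P I ... I F structure quoted from section 5.4 can be picked apart in a few lines. The following is only a minimal sketch in Python, not anything taken from the kernel or from a terminal emulator. The names cell, parse_csi and split_params are hypothetical, and treating ';' (03/11) as the parameter separator with an empty field meaning "default" is an assumption drawn from common usage; it is not part of the text quoted above.

```python
def cell(notation: str) -> int:
    """Convert the standards' column/row notation, e.g. "05/10", to a byte value."""
    col, row = notation.split("/")
    return int(col) * 16 + int(row)


def parse_csi(data: bytes, pos: int = 0):
    """Parse one control sequence starting at data[pos].

    Returns (params, intermediates, final, next_pos), or None if data[pos:]
    does not begin with a complete control sequence.
    """
    # CSI is ESC [ (01/11 05/11) in a 7-bit code, or 09/11 in an 8-bit code.
    if data[pos:pos + 2] == b"\x1b[":
        i = pos + 2
    elif data[pos:pos + 1] == b"\x9b":
        i = pos + 1
    else:
        return None

    # Parameter Bytes: 03/00 to 03/15.
    start = i
    while i < len(data) and cell("03/00") <= data[i] <= cell("03/15"):
        i += 1
    params = data[start:i]

    # Intermediate Bytes: 02/00 to 02/15.
    start = i
    while i < len(data) and cell("02/00") <= data[i] <= cell("02/15"):
        i += 1
    intermediates = data[start:i]

    # Final Byte: 04/00 to 07/14; it terminates the sequence.
    if i < len(data) and cell("04/00") <= data[i] <= cell("07/14"):
        return params, intermediates, data[i:i + 1], i + 1
    return None  # truncated or malformed


def split_params(params: bytes):
    """Split parameters on ';' (assumed separator).

    Fields that are empty or not purely decimal digits are returned as None.
    """
    if not params:
        return []
    return [int(p) if p.isdigit() else None for p in params.split(b";")]
```

Called as parse_csi(b"\x1b[1;31mhello"), the sketch returns the Parameter Bytes b"1;31", no Intermediate Bytes, the Final Byte b"m", and the offset of the first byte after the sequence. Input that lacks the leading CSI, such as b"31m", yields None, which matches the point made above: the end of a sequence is always detectable, but being mid-sequence is not.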