Message-ID: <868D3980-3323-4E4A-8A7A-B9C26F123A1E@zytor.com>
Date: Wed, 27 Dec 2023 15:58:19 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: Elizabeth Figura <zfigura@...eweavers.com>, x86@...nel.org,
    Linux Kernel <linux-kernel@...r.kernel.org>,
    Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
    Borislav Petkov <bp@...en8.de>,
    Dave Hansen <dave.hansen@...ux.intel.com>,
    Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
    wine-devel@...ehq.org
Subject: Re: x86 SGDT emulation for Wine

On December 27, 2023 2:20:37 PM PST, Elizabeth Figura <zfigura@...eweavers.com> wrote:
>Hello all,
>
>There is a Windows 98 program, a game called Nuclear Strike, which wants to do
>some amount of direct VGA access. Part of this is port I/O, which naturally
>throws SIGILL that we can trivially catch and emulate in Wine. The other part
>is direct access to the video memory at 0xa0000, which in general isn't a
>problem to catch and virtualize as well.
>
>However, this program is a bit creative about how it accesses that memory;
>instead of just writing to 0xa0000 directly, it looks up a segment descriptor
>whose base is at 0xa0000 and then uses the %es override to write bytes. In
>pseudo-C, what it does is:
>
>int get_vga_selector()
>{
>    sgdt(&gdt_size, &gdt_ptr);
>    sldt(&ldt_segment);
>    ++gdt_size;
>    descriptor = gdt_ptr;
>    while (descriptor->base != 0xa0000)
>    {
>        ++descriptor;
>        gdt_size -= sizeof(*descriptor);
>        if (!gdt_size)
>            break;
>    }
>
>    if (gdt_size)
>        return (descriptor - gdt_ptr) << 3;
>
>    ldt_ptr = gdt_ptr[ldt_segment >> 3]->base;
>    descriptor = ldt_ptr;
>    ldt_size = gdt_ptr[ldt_segment >> 3]->limit + 1;
>    while (descriptor->base != 0xa0000)
>    {
>        ++descriptor;
>        ldt_size -= sizeof(*descriptor);
>        if (!ldt_size)
>            break;
>    }
>
>    if (ldt_size)
>        return (descriptor - ldt_ptr) << 3;
>
>    return 0;
>}
>
>
>Currently we emulate IDT access. On a read fault, we execute sidt ourselves,
>check if the read address falls within the IDT, and return some dummy data
>from the exception handler if it does [1]. We can easily enough implement GDT
>access as well this way, and there is even an out-of-tree patch written some
>years ago that does this, and helps the game run.
>
>However, there are two problems that I have observed or anticipated:
>
>(1) On systems with UMIP, the kernel emulates sgdt instructions and returns a
>consistent address which we can guarantee is invalid. However, it also returns
>a size of zero. The program doesn't expect this (cf. the way the loop is
>written above) and I believe will effectively loop forever in that case, or
>until it finds the VGA selector or hits invalid memory.
>
>    I see two obvious ways to fix this: either adjust the size of the fake
>kernel GDT, or provide a switch to stop emulating and let Wine handle it. The
>latter may very well be a more sustainable option in the long term (although I'll
>admit I can't immediately come up with a reason why, other than "we might need
>to raise the size yet again".)
>
>    Does anyone have opinions on this particular topic? I can look into
>writing a patch but I'm not sure what the best approach is.
>
>(2) On 64-bit systems without UMIP, sgdt returns a truncated address when in
>32-bit mode. This truncated address in practice might point anywhere in the
>address space, including to valid memory.
>
>    In order to fix this, we would need the kernel to guarantee that the GDT
>base points to an address whose bottom 32 bits we can guarantee are
>inaccessible. This is relatively easy to achieve ourselves by simply mapping
>those pages as noaccess, but it also means that those pages can't overlap
>something we need; we already go to pains to make sure that certain parts of
>the address space are free. Broadly anything above the 2G boundary *should* be
>okay though. Is this feasible?
>
>    We could also just decide we don't care about systems without UMIP, but
>that seems a bit unfortunate; it's not that old of a feature. But I also have
>no idea how hard it would be to make this kind of a guarantee on the kernel
>side.
>
>    This is also, theoretically, a problem for the IDT, except that on the
>machines I've tested, the IDT is always at 0xfffffe0000000000. That's not
>great either (it's certainly caused some weirdness and confusion when
>debugging, when we unexpectedly catch an unrelated null pointer access) but it
>seems to work in practice.
>
>--Zeb
>
>[1] https://source.winehq.org/git/wine.git/blob/HEAD:/dlls/krnl386.exe16/instr.c#l702
>

A prctl() to set the UMIP-emulated return values or disable it (giving
SIGILL) would be easy enough.

For the non-UMIP case, and probably for a lot of other corner cases like
relying on certain magic selector values and what not, the best option
really would be to wrap the code in a lightweight KVM container. I do *not*
mean running the Qemu user space part of KVM; instead have Wine interface
with /dev/kvm directly.

Non-KVM-capable hardware is basically historic at this point.
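
The termination problem from point (1) of the quoted message can be checked with a small sketch of just the size-counter arithmetic. This is an illustration under stated assumptions, not code from Wine, the kernel, or the game: it assumes the program keeps the size in a 16-bit variable (the width of the SGDT limit field) and steps it by one 8-byte descriptor per iteration, and `scan_iterations` is a hypothetical helper name.

```c
#include <stdint.h>

/* Size of one x86 segment descriptor in bytes. */
#define DESCRIPTOR_SIZE 8u

/* A 16-bit counter stepped by 8 repeats after 65536 / 8 steps. */
#define COUNTER_PERIOD (65536u / DESCRIPTOR_SIZE)

/*
 * Hypothetical helper: model only the size check of the quoted loop,
 * assuming no descriptor ever matches 0xa0000.
 *
 * Returns the number of descriptors visited before the counter reaches
 * zero, or -1 if the counter goes through a full period without ever
 * reaching zero (the size check can then never end the loop).
 *
 * Assumption: the counter is 16 bits wide, like the SGDT limit field.
 */
long scan_iterations(uint16_t limit)
{
    uint16_t remaining = (uint16_t)(limit + 1u);  /* ++gdt_size */
    long visited;

    for (visited = 1; visited <= (long)COUNTER_PERIOD; ++visited) {
        /* gdt_size -= sizeof(*descriptor), with 16-bit wraparound */
        remaining = (uint16_t)(remaining - DESCRIPTOR_SIZE);
        if (remaining == 0)
            return visited;
    }
    return -1;
}
```

Under those assumptions, scan_iterations(0x7f) returns 16, since a 128-byte table holds 16 descriptors, while scan_iterations(0) returns -1: the counter starts at 1 and stays odd after every subtraction of 8, so only a matching descriptor or a fault on invalid memory would end the loop.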