Message-ID: <60A7475A-6C11-4070-9044-C6A64CCB4337@zytor.com>
Date: Sat, 24 Jan 2026 15:40:44 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: David Laight <david.laight.linux@...il.com>
CC: "Maciej W. Rozycki" <macro@...am.me.uk>, Thomas Gleixner <tglx@...nel.org>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Uros Bizjak <ubizjak@...il.com>, Petr Mladek <pmladek@...e.com>,
Andrew Morton <akpm@...ux-foundation.org>, Kees Cook <kees@...nel.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Nathan Chancellor <nathan@...nel.org>,
Kiryl Shutsemau <kas@...nel.org>,
Rick Edgecombe <rick.p.edgecombe@...el.com>,
linux-kernel@...r.kernel.org, linux-coco@...ts.linux.dev,
x86@...nel.org
Subject: Re: [PATCH v1 12/14] x86/boot: tweak a20.c for better code generation

On January 24, 2026 3:16:18 PM PST, "H. Peter Anvin" <hpa@...or.com> wrote:
>On January 24, 2026 3:07:41 PM PST, David Laight <david.laight.linux@...il.com> wrote:
>>On Fri, 23 Jan 2026 20:24:55 -0800
>>"H. Peter Anvin" <hpa@...or.com> wrote:
>>
>>> On 2026-01-23 19:00, Maciej W. Rozycki wrote:
>>> > On Wed, 21 Jan 2026, David Laight wrote:
>>> >
>>> >> No loops needed.
>>> >
>>> > A loop is needed because there can be a considerable delay from issuing
>>> > the I/O request to flip the A20 gate till the circuitry responding. This
>>> > is particularly true with the command issued to the 8042 device, which is
>>> > a microcontroller running its own firmware that needs time to process
>>> > an incoming request to drive one of the microcontroller's GPIOs. There
>>> > was a reason for port 0x92 circuitry later added to the PC architecture
>>> > with the IBM PS/2 being called the "fast A20 gate".
>>> >
>>>
>>> Indeed. I thought I had responded to this already but I hadn't, apparently.
>>>
>>> Note that the "long" delay is 2^21 loops! That number wasn't taken out of the
>>> air, either; we found machines that actually needed that many iterations.
>>
>>Ok, so you need a loop because it might take ages for the value read from
>>0x100200 to change.
>>But there is no need to keep changing the value.
>>The comments in the code don't really stress that.
>>
>>> In the case where A20 is enabled already, the loop terminates on either the
>>> first or second iteration (the second iteration is when the value at 0x100200
>>> is exactly 1 higher than the value at 0x200).
>>>
>>> Modern machines (Nehalem+) already have A20 enabled, and most machines of the
>>> i686+ generation implement int 0x15 function 0x2401.
>>
>>I know some of the history.
>>And just read some more of the gory details...
>>
>>A20 being disabled is there to make a 286 compatible with the older 8086 PCs
>>and any software that relied on address wrapping (rather than using it to get
>>an extra ~64kB in real mode).
>>That would be for DOS and Windows 3.11...
>>
>>The only 8088 and 286 CPUs I used were on I/O cards.
>>
>>>
>>> -hpa
>>>
>>
>
>No, there is a reason to keep changing the value: you have no idea what is currently stored in that memory, *and you have no way of knowing*.
>
>Whatever value you write might purely accidentally be the value that already is stored at that memory location.

The other thing about this code is that performance is irrelevant (it is a busy-wait loop!), but consistency (hence the io_delay()) and code size matter.
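
To make the argument above concrete, here is a minimal user-space sketch. It is not the actual a20.c code: struct sim_machine, sim_read_high(), a20_test_changing() and a20_test_fixed() are hypothetical names, the hardware is simulated, and the alias address is assumed to be 1 MiB above 0x200. The enable_after field stands in for a slow gate such as the 8042. The sketch compares a test that writes a new value on every iteration with one that writes a single fixed pattern and then only polls.

```c
#include <stdint.h>

/* Hypothetical simulated machine; nothing here touches real hardware. */
struct sim_machine {
	uint32_t low;       /* dword at 0x200 */
	uint32_t high;      /* dword 1 MiB above it (assumed alias address) */
	int enable_after;   /* reads of the high address before the gate responds */
	int polls;          /* reads of the high address so far */
};

/* Read the high address: it aliases the low one until A20 takes effect. */
uint32_t sim_read_high(struct sim_machine *m)
{
	int a20_on = m->polls >= m->enable_after;

	m->polls++;
	return a20_on ? m->high : m->low;
}

/* Write a different value on every iteration, then compare. */
int a20_test_changing(struct sim_machine *m, int loops)
{
	uint32_t saved, ctr;
	int ok = 0;

	saved = ctr = m->low;
	while (loops--) {
		m->low = ++ctr;
		ok = sim_read_high(m) != ctr;
		if (ok)
			break;
	}
	m->low = saved;     /* restore the original contents */
	return ok;
}

/* Write one fixed pattern up front, then only poll. */
int a20_test_fixed(struct sim_machine *m, uint32_t pattern, int loops)
{
	uint32_t saved = m->low;
	int ok = 0;

	m->low = pattern;
	while (loops--) {
		ok = sim_read_high(m) != pattern;
		if (ok)
			break;
	}
	m->low = saved;     /* restore the original contents */
	return ok;
}
```

With A20 already enabled and the high dword happening to hold the value that gets written, a20_test_fixed() never sees a difference and reports failure no matter how long it polls. a20_test_changing() misses at most once, because the next write uses a different value, so it succeeds on the second iteration.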