Robert is a design verification manager at Texas Instruments' Microprocessor Design Center. He can be reached via e-mail at rcollins@ti.com.
It's been more than three years since Intel first published the Pentium Family User's Manual. The Manual omitted discussion of some new, advanced programming features. Intel originally planned to release this information in its manuals, but instead, put this information in a document commonly referred to as "Appendix H" (formally known as the Supplement to the Pentium Processor User's Manual) and required recipients to sign a 15-year nondisclosure agreement (NDA). This decision has been the focus of a controversy concerning Intel's right to protect its intellectual property versus the rights of all programmers to have access to information that will benefit their programs. Another point of contention is the NDA itself. Intel claims that anybody needing this information will never be denied it, as long as they sign the NDA. But several stories have circulated regarding programmers being denied because Intel claims they don't need the information. This has spawned a community of programmers dedicated to reverse engineering these features and publishing their findings on Internet newsgroups and the World Wide Web. But is all of this necessary?
Intel has promised that the not-yet-released Pentium Pro Processor Family Developer's Manual will contain information on many of these advanced features, perhaps even a description of 4-MB paging.
Four-MB paging lets the operating system map a very large data structure with a single entry in the Translation Lookaside Buffer (TLB), instead of constantly reloading TLB entries. The processor uses the TLB to cache virtual-to-physical address translations for the most recently used pages of memory. This feature is most useful to operating-system developers who want a single page of memory dedicated to the OS kernel or to a large data structure, such as a video-frame buffer. Intel has publicly documented 4-MB paging, but you need to know where to look. For a complete description of the Pentium's 4-MB pages, you need to read both the Pentium Family User's Manual, Volume 3 (P/N 241430) and the i860 XP Microprocessor Data Book (P/N 240874).
In the Pentium manuals, there are at least nine references to 4-MB pages. This is a good start to reverse engineering 4-MB pages. These references give you the necessary clues to write software that unlocks the secrets of page-size extensions (PSE). However, such an effort is unnecessary. The Intel i860 XP processor documentation claims the i860 XP is page-level compatible with the Intel 386, Intel 486, and Pentium processors. This compatibility is noteworthy because the i860 XP also supports 4-MB pages, and its documentation provides a complete description of the 4-MB paging mechanism (see i860 XP Microprocessor Data Book, section 2.4). All you need for an Appendix H description of 4-MB pages is a few references from the Pentium manuals and the description of 4-MB pages from the i860 XP manual.
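When paging is enabled, linear addresses (program-visible addresses) are mapped to physical addresses (bus addresses). Paging makes it possible to execute programs much larger than the computer's available memory. When a program references a page that is not present in main memory, the microprocessor generates a page fault so that the operating system can swap that portion of memory between the hard disk and main memory. Memory is partitioned into contiguous blocks, called "page frames." Each page frame is 4 KB. The Pentium paging mechanism consists of the CR3 register, which holds the base address of the page directory; the page directory, whose entries (PDEs) each point to a page table and therefore each control 4 MB of memory; the page tables, whose entries (PTEs) each point to a page frame; and the page frames themselves.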
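Linear addresses are converted to physical addresses by using a 20-bit pointer in a page table and combining it with the low-order 12 bits of the linear address to form a 32-bit physical address. For purposes of conversion, the linear address is broken into three parts: the high-order 10 bits (bits 31-22) index the page directory, the next 10 bits (bits 21-12) index the page table, and the low-order 12 bits (bits 11-0) form the page-frame index, the offset within the 4-KB page frame.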
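The 4-KB split (10-bit directory index, 10-bit table index, 12-bit page-frame index) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the listings; the function names and the sample PTE value are made up for the example.

```python
def split_linear_4k(linear):
    """Split a 32-bit linear address into its three 4-KB paging fields."""
    dir_index = (linear >> 22) & 0x3FF    # bits 31-22: page-directory index
    table_index = (linear >> 12) & 0x3FF  # bits 21-12: page-table index
    offset = linear & 0xFFF               # bits 11-0: page-frame index
    return dir_index, table_index, offset


def physical_4k(pte, linear):
    """Combine the 20-bit frame pointer of a PTE with the 12-bit offset."""
    return (pte & 0xFFFFF000) | (linear & 0xFFF)


if __name__ == "__main__":
    linear = 0x00403025
    print([hex(field) for field in split_linear_4k(linear)])
    # Hypothetical PTE: frame at 0x0009A000 with some low attribute bits set.
    print(hex(physical_4k(0x0009A067, linear)))
```

The low 12 bits of the PTE hold attribute bits, which is why they are masked off before the offset is merged in.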
With an understanding of the 4-KB paging mechanism, it's not difficult to deduce the 4-MB paging mechanism. Recall that each page-directory entry controls 4 MB of memory. Now imagine how Figure 1 would look if the page-table lookup were eliminated. The page-frame index would increase from 12 bits to 22 bits, thus allowing direct control of a 4-MB page. The 20-bit pointer in the page directory would be reduced to a 10-bit pointer that points directly to the 4-MB page frame of memory. This describes how 4-MB pages are implemented in the i860 XP (i860 XP Microprocessor Data Book, section 2.4). But the question remains: Are 4-MB i860 XP pages compatible with 4-MB Pentium pages? To answer that question, we need to compare the i860 and Pentium manuals.
The i860 manual claims that the i860 4-KB paging mechanism is compatible with the x86 implementation. A comparison of page-directory format and page-table format substantiates this claim. The page-size (PS) bit of the i860 page directory shares the same location as the Pentium's PS bit (see i860 XP Microprocessor Data Book, Figure 2.13). With this information, you can assume they are compatible, and look more closely at the Pentium manual for the mechanics of enabling and using 4-MB pages.
Volume 3 of the Pentium manual describes how CR4.PSE enables PSEs and 4-MB pages, but refers you to Appendix H for more information. Later in the Pentium manual, bit 7 of the PDE is identified as the PS bit. Without CR4.PSE=1, the Pentium will always use Intel 486-compatible (4-KB) paging, regardless of the setting of the PDE.PS bit. Similarly, when CR4.PSE=1, and PDE.PS=0, Pentium still uses Intel 486-compatible 4-KB pages. But when CR4.PSE=1, and PDE.PS=1, Pentium uses an i860 XP-compatible 4-MB paging translation.
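The three cases above reduce to a single test on two bits. The following Python sketch models that decision; it is an illustration, not code from the listings. PDE.PS as bit 7 comes from the text, while CR4.PSE as bit 4 is the standard bit position and is stated here as an assumption because the article does not give it. The function name is invented.

```python
CR4_PSE = 1 << 4  # page-size extension enable in CR4 (bit position assumed)
PDE_PS = 1 << 7   # page-size bit in a page-directory entry (bit 7)


def page_size(cr4, pde):
    """Return the page size in bytes selected by CR4.PSE and PDE.PS."""
    if (cr4 & CR4_PSE) and (pde & PDE_PS):
        return 4 * 1024 * 1024  # i860 XP-compatible 4-MB translation
    return 4 * 1024             # Intel 486-compatible 4-KB translation


if __name__ == "__main__":
    for cr4, pde in [(0x00, 0x87), (0x10, 0x07), (0x10, 0x87)]:
        print(hex(cr4), hex(pde), page_size(cr4, pde))
```

Note that PDE.PS alone never selects a 4-MB page; the CR4 bit acts as the master enable.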
The linear address for a 4-MB page is converted to a physical address in much the same manner as 4-KB pages. However, the access to the page table is omitted. The high-order 10 bits form an index into the page directory. The page directory no longer contains a 20-bit pointer to a page table, but instead contains a 10-bit pointer to the 4-MB page frame of memory. This convention mandates that all 4-MB pages reside on 4-MB boundaries. The 10-bit pointer in the page directory then is combined with the low-order 22 bits of the linear address to form the 32-bit physical address.
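As a sketch of the 4-MB translation just described, the Python below builds a 4-MB PDE and combines its 10-bit pointer with the 22-bit offset. The helper names are hypothetical. The flag values P, W, and U (bits 0-2) are the standard PDE attribute bits and PS is bit 7, so a frame at physical address 0 with all four set yields the value 87h that the listing writes into the page directory.

```python
PDE_P = 0x01   # present
PDE_W = 0x02   # writable
PDE_U = 0x04   # user
PDE_PS = 0x80  # page size: entry maps a 4-MB page

FOUR_MB = 1 << 22


def make_pde_4m(frame_base, flags=PDE_P | PDE_W | PDE_U):
    """Build a 4-MB PDE; the frame must sit on a 4-MB boundary."""
    if frame_base % FOUR_MB:
        raise ValueError("4-MB page frames must be 4-MB aligned")
    return (frame_base & 0xFFC00000) | PDE_PS | flags


def translate_4m(pde, linear):
    """Combine the 10-bit pointer in the PDE with the low 22 bits."""
    return (pde & 0xFFC00000) | (linear & 0x003FFFFF)


if __name__ == "__main__":
    print(hex(make_pde_4m(0)))
    pde = make_pde_4m(0x00800000)  # hypothetical frame at 8 MB
    print(hex(translate_4m(pde, 0x00C12345)))
```

The alignment check mirrors the rule that all 4-MB pages reside on 4-MB boundaries: the low 22 bits of the frame address have nowhere to live in the PDE.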
Figure 2 describes the 4-MB and 4-KB paging translation mechanism. Ironically, Figure 11-16 in Pentium Processor Family Developer's Manual, Volume 3, 1993 edition, contained a virtually identical picture. Intel obviously recognized the significance of this pictorial representation of 4-MB pages. Subsequent editions of the Pentium manual were substantially modified to remove the visual representation of the 4-MB paging mechanism.
There are side-effects and caveats to enabling 4-MB pages. Consider the following excerpt from the Pentium Processor Family Developer's Manual, Volume 3, section 23.2.14.1, which discusses compatibility with previous Intel processors:
A Page Fault exception occurs when a 1 is detected in any of the reserved bit positions of a page table entry, page directory entry, or page directory pointer during address translation by the Pentium processor.
In other words, if any reserved bit in the PDE or PTE is 1, a page fault will occur. This does not occur when CR4.PSE=0, but does when PSEs are enabled (CR4.PSE=1). Every bit in CR4 enables a behavioral extension to the Intel 486 processor. In essence, CR4 bits enable/disable incompatibilities with the Intel 486. Therefore, it is a natural extension of enabling 4-MB pages to enable more rigorous type checking of the PDE and PTE. Unfortunately, even then, the aforementioned reference isn't completely accurate. Setting some reserved bits does generate an exception, while setting others does not. This behavior contradicts the Intel documentation. If the Pentium was originally intended to behave as documented, then the documentation didn't get modified to accurately reflect the correct behavior when relaxed type checking for reserved bits was implemented. Table 1 shows all of the Pentium paging structures. All positions in the PDE and PTE marked as reserved will generate a page-fault exception when CR4.PSE=1. All positions in CR3, the PDE, and PTE marked as "0" are reserved, but don't generate a page fault when CR4.PSE=1. Table 2 describes the meaning of all of the fields listed in Table 1.
It might be tempting to believe that the "page-directory pointer" is another name for the CR3 register. This assumption would be incorrect. Actually, the mention of the page-directory pointer is a mistake. This refers to a paging structure for a new paging feature that was to be implemented in the Pentium. This new paging feature was allegedly implemented in beta silicon, but removed before production, and now appears in the Pentium Pro. I'll discuss this in my next column.
The Intel documentation also doesn't tell the whole story of the error code generated by page faults. When CR4.PSE=1, and a 1 is detected in a reserved-bit position of the PDE or PTE, the page-fault error code indicates that an attempt was made to set a reserved bit in a paging structure. This indication is reflected in bit 3 of the page-fault error code. If set to 1, then an attempt was made to set a reserved bit in the PDE or PTE. In Figure 14-7 of the Pentium Processor Family Developer's Manual, Volume 3, 1993 edition, this behavior was correctly documented, but it was removed in subsequent editions. Table 3 shows an accurate representation of the page-fault error code, as shown in the 1993 edition of the Pentium manual.
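A small decoder makes the error-code layout concrete. This Python sketch is illustrative only: bit 3 (RSV) is taken from the text above, while the meanings of bits 0-2 (P, W/R, U/S) are the standard page-fault error-code bits and are an assumption here, since the table itself is not reproduced in this text. The function and key names are invented.

```python
def decode_page_fault(error_code):
    """Decode the low four bits of a page-fault error code."""
    return {
        "protection_violation": bool(error_code & 0x1),  # bit 0 (P)
        "write_access": bool(error_code & 0x2),          # bit 1 (W/R)
        "user_mode": bool(error_code & 0x4),             # bit 2 (U/S)
        "reserved_bit": bool(error_code & 0x8),          # bit 3 (RSV)
    }


if __name__ == "__main__":
    for name, value in decode_page_fault(0x9).items():
        print(name, value)
```

A handler that sees the RSV flag set can tell a malformed PDE or PTE apart from an ordinary not-present or protection fault.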
According to the 1995 edition of the Pentium user's manual, the Pentium has one code TLB and two data TLBs (Pentium Processor Family Developer's Manual, Volume 1, 1995 edition, section 33.2.1.2). The data TLBs consist of a 64-entry TLB for 4-KB page translations, and an 8-entry TLB for 4-MB page translation. The code TLB is a single 32-entry TLB which is shared by 4-KB and 4-MB page translations. The 4-MB code pages are cached in multiples of 4 KB. When the Pentium caches a 4-MB code page in the TLB, it initially uses only a single TLB entry. A code access beyond the initial 4 KB of memory associated with this TLB accesses the PDE as if it were a 4-KB page, and is given its own TLB entry.
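To put those TLB sizes in perspective, the following Python sketch computes how much memory each TLB can map at once (its "reach"). It assumes each entry maps exactly one page of the stated size, and it treats the code TLB at 4-KB granularity because 4-MB code pages are cached in multiples of 4 KB. The function name is invented for the example.

```python
KB = 1024
MB = 1024 * KB


def tlb_reach(entries, page_bytes):
    """Bytes of memory mapped when every TLB entry is in use."""
    return entries * page_bytes


if __name__ == "__main__":
    print("data TLB, 4-KB pages:", tlb_reach(64, 4 * KB) // KB, "KB")
    print("data TLB, 4-MB pages:", tlb_reach(8, 4 * MB) // MB, "MB")
    print("code TLB, 4-KB units:", tlb_reach(32, 4 * KB) // KB, "KB")
```

Under these assumptions the eight-entry 4-MB data TLB covers far more memory than the 64-entry 4-KB data TLB, which is the practical payoff of large pages for data such as a frame buffer.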
You'd assume that enabling and disabling 4-MB pages (CR4.PSE) would invalidate the TLB, as writing to CR3 does. However, this does not occur. A potentially dangerous situation arises when a user wants to disable 4-MB pages when a 4-MB page is still cached in the TLB. Suppose the PDEs were modified with a different paging translation and point to a different area of physical memory than the 4-MB pages (this would be natural to assume, as it complies with the whole purpose of paging). Once CR4.PSE is cleared, then any 4-MB TLB entries still cached remain in effect until they are evicted or until the TLB is invalidated. (Once CR4.PSE=0, TLB entries for 4-MB data pages will never get evicted, since they have their own dedicated TLB.) Any subsequent memory (or code) accesses while the old 4-MB TLB still is cached would retrieve incorrect data. Therefore, before 4-MB paging can be disabled, all 4-MB PDEs must be modified back to 4-KB PDEs. Once the PDEs are modified, CR4.PSE can be cleared, or the TLB invalidated (which effectively disables 4-MB paging). Some could consider this a bug, but Intel's documentation states that it's the operating-system writer's responsibility to manage the paging mechanism, including invalidating the TLB (Pentium Processor Family Developer's Manual, Volume 3, section 11.3.5).
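Now that we understand 4-MB paging, it should be easy to write characterization code that confirms our hypothesis. To detect whether 4-MB pages are implemented in the Pentium as they are in the i860 XP, you could follow the steps taken by the 4MPAGES.ASM listing: confirm that the processor reports support for 4-MB pages; enter protected mode and enable paging with ordinary 4-KB pages; store a signature in memory; rewrite one PDE as a 4-MB entry (PS=1); set CR4.PSE and reload CR3 to flush the TLB; read back through the affected linear address; and compare what was read against the signature and the saved page-table entry to see which translation the processor used.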
You could write more characterization code to prove whether or not any other functional extensions are enabled by setting CR4.PSE. The listings available electronically demonstrate the page-faulting behavior of PSE. I've also included a program that detects the TLB size and associativity. Finally, another program demonstrates that writing any values to CR4.PSE will not invalidate the TLB.
Field   Description
RSV     Reserved. If set (RSV=1), may cause a page fault when CR4.PSE=1.
        Setting this bit causes a page fault only during page translation.
        If the referenced page entry is in the TLB, then setting this bit
        and referencing the page will not cause a page fault. If the entry
        is not in the TLB, or gets flushed from the TLB, then the next
        reference to this page will cause a page fault. The page-fault
        error code on the stack will have the RSV bit set (bit 3).
AVL     Available for systems programmer use.
PS*     This bit is always set (PS=1). When set, this page-directory
        entry points to a 4-MB page.
PS**    This bit is always clear (PS=0). When clear, this page-directory
        entry points to a page table.
D       Dirty.
A       Accessed.
PCD     Page Cache Disable.
PWT     Page Write Through.
U       User.
W       Writable.
P       Present.
page 60,132
;-----------------------------------------------------------------------------
; 4MPAGES.ASM Copyright (c) 1996 Robert Collins
; You have my permission to copy and distribute this software for
; non-commercial purposes. Any commercial use of this software or
; source code is allowed, so long as the appropriate copyright
; attributions (to me) are intact, *AND* my email address is properly
; displayed. Basically, give me credit, where credit is due, and
; show my email address.
;-----------------------------------------------------------------------------
; Robert R. Collins email: rcollins@metronet.com
; 7201 Avalon Dr.
; Plano, TX 75025
;-----------------------------------------------------------------------------
;-----------------------------------------------------------------------------
; Build instructions:
; Assembled using Microsoft MASM 6.11.
; To compile without the makefile:
; ML /c /DINCLUDEDIR=[YOUR FAVORITE INCLUDE DIRECTORY] 4MPAGES.ASM
; ML /c /DINCLUDEDIR=[YOUR FAVORITE INCLUDE DIRECTORY] PAGEFNS.ASM
; LINK /NON 4MPAGES.OBJ PAGEFNS.OBJ;
;-----------------------------------------------------------------------------
;-----------------------------------------------------------------------------
; Assembler directives
;-----------------------------------------------------------------------------
.xlist ; disable list file
.586P
.ALPHA
;-----------------------------------------------------------------------------
; Include file section
;-----------------------------------------------------------------------------
% Include INCLUDEDIR\macros.inc ; Include macros
% Include INCLUDEDIR\struct.inc ; Include structures
;-----------------------------------------------------------------------------
; Public declarations
;-----------------------------------------------------------------------------
Public GDT_PTR, ext_mem_blocks
;-----------------------------------------------------------------------------
; External declarations
;-----------------------------------------------------------------------------
Extern Init4M_Pages : Near16
Extern Check4M_Pages : Near16
Extern GetLinear4M : Near16
Extern GetPDBR : Near16
Extern PDBR : DWord
.list
;-----------------------------------------------------------------------------
; Dummy segments
;-----------------------------------------------------------------------------
INTSEG segment at 0
int0 dd ?
INTSEG ends
_DATA segment para public use16 'DATA'
;-----------------------------------------------------------------------------
; Data segment
;-----------------------------------------------------------------------------
GDT_386 label fword
GDT_PTR Descriptor <>
SEL_RMCS equ $-GDT_386
GDT_RMCS Descriptor <-1,,,9bh,0,> ; CS Descriptor
SEL_RMDS equ $-GDT_386
GDT_RMDS Descriptor <-1,,,93h,0,> ; DS Descriptor
SEL_4G equ $-GDT_386
GDT_4G Descriptor <-1h,0,0h,93h,8fh,0h> ; 4G Descriptor
GDT_Len equ ($-GDT_386) - 1
;-----------------------------------------------------------------------------
; All other data
;-----------------------------------------------------------------------------
Failure1_Msg db "4M page translation didn't work.",CRLF$
Failure2_Msg db "Unknown page translation (this should never occur).",CRLF$
Passed_Msg db "4M page translation behaves as expected.",CRLF$
ext_mem_blocks dw 0
align
OrigPTE dd 0
OrigINT0 dd 0
OrigSentinal dd 0
_DATA ENDS
_TEXT segment para public use16 'CODE'
ASSUME CS:_TEXT, DS:_DATA, ES:_DATA, SS:STACK
;-----------------------------------------------------------------------------
; Code starts here
;-----------------------------------------------------------------------------
_4MPAGES proc far
mov ax,seg STACK ; setup stack segment
mov ss,ax
mov sp,sizeof StackPtr
xor ax,ax ; clear it
pushf
push ds ; save far return on stack
push ax
;-----------------------------------------------------------------------------
; Set segments to normal data segment
;-----------------------------------------------------------------------------
mov ax,seg _DATA ; get original data segment
mov ds,ax
mov es,ax
;-----------------------------------------------------------------------------
; Check that this processor supports 4M pages.
;-----------------------------------------------------------------------------
call Check4M_Pages ; does this processor support 4M pages?
jnc @F ; yes, continue
@ErrorExit:
mov ah,9
int 21h ; print message
mov ax,4c01h ; set error code
int 21h
iret ; go split, just in case
;-----------------------------------------------------------------------------
; Setup descriptor table
;-----------------------------------------------------------------------------
@@: mov eax,ds ; make pointer to GDT table
shl eax,4 ; have physical address of segment
add eax,offset GDT_386 ; now have physical addr of table
mov GDT_PTR.Seg_limit,GDT_Len ; set length
mov GDT_PTR.Base_A15_A00,ax
shr eax,10h ; get other address bits
mov GDT_PTR.Base_A23_A16,al
mov GDT_PTR.Access_rights,ah ; base bits 31-24 (LGDT reads limit + 32-bit base)
mov eax,cs ; get CS
shl eax,4 ; now have physical address
mov GDT_RMCS.Base_A15_A00,ax
shr eax,10h ; get other address bits
mov GDT_RMCS.Base_A23_A16,al
mov GDT_RMCS.Base_A31_A24,ah
mov eax,ds ; get DS
shl eax,4 ; now have physical address
mov GDT_RMDS.Base_A15_A00,ax
shr eax,10h ; get other address bits
mov GDT_RMDS.Base_A23_A16,al
mov GDT_RMDS.Base_A31_A24,ah
;-----------------------------------------------------------------------------
; Initialize page mode
;-----------------------------------------------------------------------------
call GetPDBR ; get address of page directory
jc @ErrorExit ; oops
mov PDBR,edx ; save it
;-----------------------------------------------------------------------------
; Read CMOS to determine the amount of extended memory.
;-----------------------------------------------------------------------------
mov al,18h
out 70h,al
IO_Delay
in al,71h
mov ah,al
mov al,17h
out 70h,al
IO_Delay
in al,71h
shr ax,6
mov ext_mem_blocks,ax
;-----------------------------------------------------------------------------
; Enter protected mode.
;-----------------------------------------------------------------------------
cli
lgdt GDT_386
mov eax,cr0 ; get control register
or al,1 ; set PE bit (protection enable)
mov cr0,eax
push cs ; push return selector on stack
push offset PMRET ; set return offset
JMPFAR @F,SEL_RMCS
@@: mov ax,SEL_RMDS ; get DS selector
mov ds,ax
mov ax,SEL_4G ; get GS selector
mov gs,ax
;-----------------------------------------------------------------------------
; Enable page mode
;-----------------------------------------------------------------------------
call Init4M_Pages
mov ebx,PDBR ; initialize CR3
mov cr3,ebx
mov ebx,cr0 ; get 386 control register
or ebx,80000000h ; set PG bit
mov cr0,ebx ; paging is now enabled
jmp short @F
Align
;-----------------------------------------------------------------------------
; This is the body of the test.
;-----------------------------------------------------------------------------
; Save a signature in memory so we can see if 4M pages work as expected.
;-----------------------------------------------------------------------------
@@: mov esi,FARCS
mov edi,gs:Int0[esi] ; get original memory contents
mov OrigSentinal,edi ; save it
mov gs:Int0[esi],Signature ; save signature in memory
mov edi,gs:Int0 ; get original interrupt vector
mov OrigINT0,edi ; save it
;-----------------------------------------------------------------------------
; Modify the PDE for a 4 MB page.
;-----------------------------------------------------------------------------
mov edx,SEL_4G ; get selector
lea eax,Int0[esi] ; get INT0 offset
call GetLinear4M ; get linear address
mov dword ptr gs:[edx],87h ; modify to 4M PDE (87h = PS+U+W+P, frame 0)
;-----------------------------------------------------------------------------
; Save original contents of signature location
;-----------------------------------------------------------------------------
mov edi,gs:[eax] ; get original PTE
mov OrigPTE,edi ; save it
mov gs:Int0,edi ; write it over the INT0 vector (restored later)
;-----------------------------------------------------------------------------
; Enable 4M paging.
;-----------------------------------------------------------------------------
mov ecx,cr3 ; get CR3
mov ebx,cr4 ; get CR4
or bl,PSE ; enable 4M pages
mov cr4,ebx
mov cr3,ecx ; flush TLB
;-----------------------------------------------------------------------------
; The next memory read will read the signature or the PTE depending upon
; whether 4M paging even works.
;-----------------------------------------------------------------------------
mov ebp,gs:Int0[esi] ; try to read from 4M
;-----------------------------------------------------------------------------
; Get out of paging
;-----------------------------------------------------------------------------
mov ecx,cr3 ; clear TLB by loading
mov ebx,cr4 ; get PSE
and bl,not PSE ; turn off PSE
mov cr4,ebx
mov cr3,ecx ; CR3 with any value
;-----------------------------------------------------------------------------
; Restore original value to signature location.
;-----------------------------------------------------------------------------
mov edi,OrigSentinal
mov gs:Int0[esi],edi ; restore sentinel
mov edi,OrigINT0
mov gs:Int0,edi ; restore original INT0 handler
;-----------------------------------------------------------------------------
; Split from this program
;-----------------------------------------------------------------------------
mov cx,SEL_RMDS ; get DS selector
mov gs,cx
mov ecx,cr3 ; clear TLB by loading
mov ebx,cr0 ; get 386 control register
and ebx,not 80000001h ; clear PG (paging) and PE (protection) bits
mov cr0,ebx ; and store in CR0
mov cr3,ecx ; CR3 with any value
retf
PMRET:
mov ax,seg _DATA
mov ds,ax
mov gs,ax
;-----------------------------------------------------------------------------
; Determine whether or not our test passed.
;-----------------------------------------------------------------------------
mov dx,offset Failure1_Msg ;
cmp ebp,Signature ; did we get a bogus signature?
je @ErrorExit ; yep
mov dx,offset Failure2_Msg
cmp ebp,OrigPTE ; did we read back our original PTE?
jne @ErrorExit ; nope
mov dx,offset Passed_Msg
mov ah,9
int 21h ; print message
mov ax,4c00h ; exit code 0 (success)
int 21h
iret ; go split, just in case
_4MPAGES endp
_TEXT ends
STACK segment para public 'STACK'
;-----------------------------------------------------------------------------
; Stack segment
;-----------------------------------------------------------------------------
StackPtr db 400h dup (?)
STACK ends
_ZSEG segment para public 'DATA'
_ZSEG ends
end _4MPAGES