Chapter 1  Introduction to the 80386
1.2  Related Literature
1.3  Notational Conventions

                     PART I    APPLICATIONS PROGRAMMING

Chapter 2  Basic Programming Model
      2.3.4  Flags Register
2.4  Instruction Format
           2.5.3.2  Effective-Address Computation

Chapter 3  Applications Instruction Set

3.2  Binary Arithmetic Instructions
      3.2.1  Addition and Subtraction Instructions
      3.2.2  Comparison and Sign Change Instruction
      3.2.3  Multiplication Instructions
      3.2.4  Division Instructions

3.4  Logical Instructions
      3.4.1  Boolean Operation Instructions
      3.4.2  Bit Test and Modify Instructions
      3.4.3  Bit Scan Instructions
      3.4.4  Shift and Rotate Instructions
              3.4.4.1  Shift Instructions
              3.4.4.2  Double-Shift Instructions
              3.4.4.3  Rotate Instructions
              3.4.4.4  Fast"bit-blt" Using Double Shift
                         Instructions
              3.4.4.5  Fast Bit-String Insert and Extract

      3.4.5  Byte-Set-On-Condition Instructions
      3.4.6  Test Instruction

3.5  Control Transfer Instructions
      3.5.1  Unconditional Transfer Instructions
              3.5.1.1  Jump Instruction
              3.5.1.2  Call Instruction
              3.5.1.3  Return and Return-From-Interrupt Instruction

      3.5.2  Conditional Transfer Instructions
              3.5.2.1  Conditional Jump Instructions
              3.5.2.2  Loop Instructions
              3.5.2.3  Executing a Loop or Repeat Zero Times

      3.5.3  Software-Generated Interrupts

3.6  String and Character Translation Instructions
      3.6.1  Repeat Prefixes
      3.6.2  Indexing and Direction Flag Control
      3.6.3  String Instructions

3.7  Instructions for Block-Structured Languages
3.8  Flag Control Instructions
      3.8.1  Carry and Direction Flag Control Instructions
      3.8.2  Flag Transfer Instructions

3.9  Coprocessor Interface Instructions
3.10 Segment Register Instructions
      3.10.1  Segment-Register Transfer Instructions
      3.10.2  Far Control Transfer Instructions
      3.10.3  Data Pointer Instructions

3.11  Miscellaneous Instructions
       3.11.1  Address Calculation Instruction
       3.11.2  No-Operation Instruction
       3.11.3  Translate Instruction


Chapter 1  Introduction to the 80386
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

The 80386 is an advanced 32-bit microprocessor optimized for multitasking
operating systems and designed for applications needing very high
performance. The 32-bit registers and data paths support 32-bit addresses
and data types. The processor can address up to four gigabytes of physical
memory and 64 terabytes (2^(46) bytes) of virtual memory. The on-chip
memory-management facilities include address translation registers,
advanced multitasking hardware, a protection mechanism, and paged virtual
memory. Special debugging registers provide data and code breakpoints even
in ROM-based software.

1.2  Related Literature

The following books contain additional material concerning the 80386
microprocessor:

  þ  Introduction to the 80386, order number 231252
  þ  80386 Hardware Reference Manual, order number 231732
  þ  80386 System Software Writer's Guide, order number 231499
  þ  80386 High Performance 32-bit Microprocessor with Integrated Memory
     Management (Data Sheet), order number 231630


1.3  Notational Conventions

This manual uses special notations for data-structure formats, for symbolic
representation of instructions, for hexadecimal numbers, and for super- and
sub-scripts. Subscript characters are surrounded by {curly brackets}, for
example 10{2} = 10 base 2. Superscript characters are preceeded by a caret
and enclosed within (parentheses), for example 10^(3) = 10 to the third
power. A review of these notations will make it easier to read the
manual.

Chapter 2  Basic Programming Model
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

2.3.4  Flags Register

The flags register is a 32-bit register named EFLAGS. Figure 2-8 defines
the bits within this register. The flags control certain operations and
indicate the status of the 80386.

The low-order 16 bits of EFLAGS is named FLAGS and can be treated as a
unit. This feature is useful when executing 8086 and 80286 code, because
this part of EFLAGS is identical to the FLAGS register of the 8086 and the
80286.

The flags may be considered in three groups: the status flags, the control
flags, and the systems flags. Discussion of the systems flags is delayed
until Part II.

Figure 2-8.  EFLAGS Register
                                              16-BIT FLAGS REGISTER
                                                         A
                                     ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÁÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
  31                  23                  15               7            0
 ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÑÍÑÍÑÍÑÍÑÍØÍÑÍÑÍÑÍÑÍÑÍÑÍÑÍÑÍÑÍÑÍÑÍÑÍ»
 º                                   ³V³R³ ³N³ IO³O³D³I³T³S³Z³ ³A³ ³P³ ³Cº
 º 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ³ ³ ³0³ ³   ³ ³ ³ ³ ³ ³ ³0³ ³0³ ³1³ º
 º                                   ³M³F³ ³T³ PL³F³F³F³F³F³F³ ³F³ ³F³ ³Fº
 ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍØÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÏÑÏÑÏÍÏÑÏÍØÍÏÑÏÑÏÑÏÑÏÑÏÑÏÍÏÑÏÍÏÑÏÍÏÑ¼
                                      ³ ³   ³  ³  ³ ³ ³ ³ ³ ³   ³   ³   ³
       VIRTUAL 8086 MODEÄÄÄXÄÄÄÄÄÄÄÄÄÄÙ ³   ³  ³  ³ ³ ³ ³ ³ ³   ³   ³   ³
             RESUME FLAGÄÄÄXÄÄÄÄÄÄÄÄÄÄÄÄÙ   ³  ³  ³ ³ ³ ³ ³ ³   ³   ³   ³
        NESTED TASK FLAGÄÄÄXÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ  ³  ³ ³ ³ ³ ³ ³   ³   ³   ³
     I/O PRIVILEGE LEVELÄÄÄXÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ  ³ ³ ³ ³ ³ ³   ³   ³   ³
                OVERFLOWÄÄÄSÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³ ³ ³ ³ ³   ³   ³   ³
          DIRECTION FLAGÄÄÄCÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³ ³ ³ ³   ³   ³   ³
        INTERRUPT ENABLEÄÄÄXÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³ ³ ³   ³   ³   ³
               TRAP FLAGÄÄÄSÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³ ³   ³   ³   ³
               SIGN FLAGÄÄÄSÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³   ³   ³   ³
               ZERO FLAGÄÄÄSÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ   ³   ³   ³
         AUXILIARY CARRYÄÄÄSÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ   ³   ³
             PARITY FLAGÄÄÄSÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ   ³
              CARRY FLAGÄÄÄSÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ

          S = STATUS FLAG, C = CONTROL FLAG, X = SYSTEM FLAG

          NOTE: 0 OR 1 INDICATES INTEL RESERVED. DO NOT DEFINE

2.4  Instruction Format

The information encoded in an 80386 instruction includes a specification of
the operation to be performed, the type of the operands to be manipulated,
and the location of these operands. If an operand is located in memory, the
instruction must also select, explicitly or implicitly, which of the
currently addressable segments contains the operand.

  þ  Prefixes ÄÄ one or more bytes preceding an instruction that modify the
     operation of the instruction.

  þ  Opcode ÄÄ specifies the operation performed by the instruction. Some
     operations have several different opcodes, each specifying a different
     variant of the operation.

  þ  Register specifier ÄÄ an instruction may specify one or two register
     operands. Register specifiers may occur either in the same byte as the
     opcode or in the same byte as the addressing-mode specifier.

  þ  Addressing-mode specifier ÄÄ when present, specifies whether an operand
     is a register or memory location; if in memory, specifies whether a
     displacement, a base register, an index register, and scaling are to be
     used.

  þ  SIB (scale, index, base) byte ÄÄ when the addressing-mode specifier
     indicates that an index register will be used to compute the address of
     an operand, an SIB byte is included in the instruction to encode the
     base register, the index register, and a scaling factor.

  þ  Displacement ÄÄ when the addressing-mode specifier indicates that a
     displacement will be used to compute the address of an operand, the
     displacement is encoded in the instruction. A displacement is a signed
     integer of 32, 16, or eight bits. The eight-bit form is used in the
     common case when the displacement is sufficiently small. The processor
     extends an eight-bit displacement to 16 or 32 bits, taking into
     account the sign.

  þ  Immediate operand ÄÄ when present, directly provides the value of an
     operand of the instruction. Immediate operands may be 8, 16, or 32 bits
     wide. In cases where an eight-bit immediate operand is combined in some
     way with a 16- or 32-bit operand, the processor automatically extends
     the size of the eight-bit operand, taking into account the sign.

2.5.3.2  Effective-Address Computation

The modR/M byte provides the most flexible of the addressing methods, and
instructions that require a modR/M byte as the second byte of the
instruction are the most common in the 80386 instruction set. For memory
operands defined by modR/M, the offset within the desired segment is
calculated by taking the sum of up to three components:

  þ  A displacement element in the instruction.

  þ  A base register.

  þ  An index register. The index register may be automatically multiplied
     by a scaling factor of 2, 4, or 8.

The offset that results from adding these components is called an effective
address. Each of these components of an effective address may have either a
positive or negative value. If the sum of all the components exceeds 2^(32),
the effective address is truncated to 32 bits.Figure 2-10 illustrates the
full set of possibilities for modR/M addressing.

The displacement component, because it is encoded in the instruction, is
useful for fixed aspects of addressing; for example:

  þ  Location of simple scalar operands.
  þ  Beginning of a statically allocated array.
  þ  Offset of an item within a record.

The base and index components have similar functions. Both utilize the same
set of general registers. Both can be used for aspects of addressing that
are determined dynamically; for example:

  þ  Location of procedure parameters and local variables in stack.

  þ  The beginning of one record among several occurrences of the same
     record type or in an array of records.

  þ  The beginning of one dimension of multiple dimension array.

  þ  The beginning of a dynamically allocated array.

The uses of general registers as base or index components differ in the
following respects:

  þ  ESP cannot be used as an index register.

  þ  When ESP or EBP is used as the base register, the default segment is
     the one selected by SS. In all other cases the default segment is DS.

The scaling factor permits efficient indexing into an array in the common
cases when array elements are 2, 4, or 8 bytes wide. The shifting of the
index register is done by the processor at the time the address is evaluated
with no performance loss. This eliminates the need for a separate shift or
multiply instruction.

The base, index, and displacement components may be used in any
combination; any of these components may be null. A scale factor can be used
only when an index is also used. Each possible combination is useful for
data structures commonly used by programmers in high-level languages and
assembly languages. Following are possible uses for some of the various
combinations of address components.

DISPLACEMENT

   The displacement alone indicates the offset of the operand. This
   combination is used to directly address a statically allocated scalar
   operand. An 8-bit, 16-bit, or 32-bit displacement can be used.

BASE

   The offset of the operand is specified indirectly in one of the general
   registers, as for "based" variables.

BASE + DISPLACEMENT

   A register and a displacement can be used together for two distinct
   purposes:

   1.  Index into static array when element size is not 2, 4, or 8 bytes.
       The displacement component encodes the offset of the beginning of
       the array. The register holds the results of a calculation to
       determine the offset of a specific element within the array.

   2.  Access item of a record. The displacement component locates an
       item within record. The base register selects one of several
       occurrences of record, thereby providing a compact encoding for
       this common function.

   An important special case of this combination, is to access parameters
   in the procedure activation record in the stack.  In this case, EBP is
   the best choice for the base register, because when EBP is used as a
   base register, the processor automatically uses the stack segment
   register (SS) to locate the operand, thereby providing a compact
   encoding for this common function.

(INDEX * SCALE) + DISPLACEMENT

   This combination provides efficient indexing into a static array when
   the element size is 2, 4, or 8 bytes. The displacement addresses the
   beginning of the array, the index register holds the subscript of the
   desired array element, and the processor automatically converts the
   subscript into an index by applying the scaling factor.

BASE + INDEX + DISPLACEMENT

   Two registers used together support either a two-dimensional array (the
   displacement determining the beginning of the array) or one of several
   instances of an array of records (the displacement indicating an item
   in the record).

BASE + (INDEX * SCALE) + DISPLACEMENT

   This combination provides efficient indexing of a two-dimensional array
   when the elements of the array are 2, 4, or 8 bytes wide.


Figure 2-10.  Effective Address Computation

      SEGMENT +    BASE   +    (INDEX * SCALE)  +     DISPLACEMENT

                 Ú     ¿
                 ³ --- ³     Ú     ¿     Ú   ¿
      Ú    ¿     ³ EAX ³     ³ EAX ³     ³ 1 ³
      ³ CS ³     ³ ECX ³     ³ ECX ³     ³   ³     Ú                     ¿
      ³ SS ³     ³ EDX ³     ³ EDX ³     ³ 2 ³     ³     NO DISPLACEMENT ³
     Ä´ DS ÃÄ + Ä´ EBX ÃÄ + Ä´ EBX ÃÄ * Ä´   ÃÄ + Ä´  8-BIT DISPLACEMENT ÃÄ
      ³ ES ³     ³ ESP ³     ³ --- ³     ³ 4 ³     ³ 32-BIT DISPLACEMENT ³
      ³ FS ³     ³ EBP ³     ³ EBP ³     ³   ³     À                     Ù
      ³ GS ³     ³ ESI ³     ³ ESI ³     ³ 6 ³
      À    Ù     ³ EDI ³     ³ EDI ³     À   Ù
                 À     Ù     À     Ù

Chapter 3  Applications Instruction Set
ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ

This chapter presents an overview of the instructions which programmers can
use to write application software for the 80386 executing in protected
virtual-address mode. The instructions are grouped by categories of related
functions.

The instructions not discussed in this chapter are those that are normally
used only by operating-system programmers. Part II describes the operation
of these instructions.

The descriptions in this chapter assume that the 80386 is operating in
protected mode with 32-bit addressing in effect; however, all instructions
discussed are also available when 16-bit addressing is in effect in
protected mode, real mode, or virtual 8086 mode. For any differences of
operation that exist in the various modes, refer to Chapter 13,
Chapter 14, or Chapter 15.

The instruction dictionary in Chapter 17 contains more detailed
descriptions of all instructions, including encoding, operation, timing,
effect on flags, and exceptions.


3.1  Data Movement Instructions

3.1.3  Type Conversion Instructions

The type conversion instructions convert bytes into words, words into
doublewords, and doublewords into 64-bit items (quad-words). These
instructions are especially useful for converting signed integers, because
they automatically fill the extra bits of the larger item with the value of
the sign bit.  This is called sign extension.

There are two classes of type conversion instructions:

  1.  The forms CWD, CDQ, CBW, and CWDE which operate only on data in the
      EAX register.

  2.  The forms MOVSX and MOVZX, which permit one operand to be in any
      general register while permitting the other operand to be in memory or
      in a register.

CWD (Convert Word to Doubleword) and CDQ (Convert Doubleword to Quad-Word)
double the size of the source operand. CWD extends the sign of the
word in register AX throughout register DX. CDQ extends the sign of the
doubleword in EAX throughout EDX. CWD can be used to produce a doubleword
dividend from a word before a word division, and CDQ can be used to produce
a quad-word dividend from a doubleword before doubleword division.

CBW (Convert Byte to Word) extends the sign of the byte in register AL
throughout AX.

CWDE (Convert Word to Doubleword Extended) extends the sign of the word in
register AX throughout EAX.

MOVSX (Move with Sign Extension) sign-extends an 8-bit value to a 16-bit
value and a 8- or 16-bit value to 32-bit value.

MOVZX (Move with Zero Extension) extends an 8-bit value to a 16-bit value
and an 8- or 16-bit value to 32-bit value by inserting high-order zeros.

3.2  Binary Arithmetic Instructions

Many of the arithmetic instructions operate on both signed and unsigned
integers. These instructions update the flags ZF, CF, SF, and OF in such a
manner that subsequent instructions can interpret the results of the
arithmetic as either signed or unsigned. CF contains information relevant to
unsigned integers; SF and OF contain information relevant to signed
integers. ZF is relevant to both signed and unsigned integers; ZF is set
when all bits of the result are zero.

If the integer is unsigned, CF may be tested after one of these arithmetic
operations to determine whether the operation required a carry or borrow of
a one-bit in the high-order position of the destination operand. CF is set
if a one-bit was carried out of the high-order position (addition
instructions ADD, ADC, AAA, and DAA) or if a one-bit was carried (i.e.
borrowed) into the high-order bit (subtraction instructions SUB, SBB, AAS,
DAS, CMP, and NEG).

If the integer is signed, both SF and OF should be tested. SF always has
the same value as the sign bit of the result. The most significant bit (MSB)
of a signed integer is the bit next to the signÄÄbit 6 of a byte, bit 14 of
a word, or bit 30 of a doubleword. OF is set in either of these cases:

  þ  A one-bit was carried out of the MSB into the sign bit but no one bit
     was carried out of the sign bit (addition instructions ADD, ADC, INC,
     AAA, and DAA). In other words, the result was greater than the greatest
     positive number that could be contained in the destination operand.

  þ  A one-bit was carried from the sign bit into the MSB but no one bit
     was carried into the sign bit (subtraction instructions SUB, SBB, DEC,
     AAS, DAS, CMP, and NEG). In other words, the result was smaller that
     the smallest negative number that could be contained in the destination
     operand.

3.2.3  Multiplication Instructions

The 80386 has separate multiply instructions for unsigned and signed
operands. MUL operates on unsigned numbers, while IMUL operates on signed
integers as well as unsigned.

MUL (Unsigned Integer Multiply) performs an unsigned multiplication of the
source operand and the accumulator. If the source is a byte, the processor
multiplies it by the contents of AL and returns the double-length result to
AH and AL. If the source operand is a word, the processor multiplies it by
the contents of AX and returns the double-length result to DX and AX. If the
source operand is a doubleword, the processor multiplies it by the contents
of EAX and returns the 64-bit result in EDX and EAX. MUL sets CF and OF
when the upper half of the result is nonzero; otherwise, they are cleared.

IMUL (Signed Integer Multiply) performs a signed multiplication operation.
IMUL has three variations:

  1.  A one-operand form. The operand may be a byte, word, or doubleword
      located in memory or in a general register. This instruction uses EAX
      and EDX as implicit operands in the same way as the MUL instruction.

  2.  A two-operand form. One of the source operands may be in any general
      register while the other may be either in memory or in a general
      register. The product replaces the general-register operand.

  3.  A three-operand form; two are source and one is the destination
      operand. One of the source operands is an immediate value stored in
      the instruction; the second may be in memory or in any general
      register. The product may be stored in any general register. The
      immediate operand is treated as signed. If the immediate operand is a
      byte, the processor automatically sign-extends it to the size of the
      second operand before performing the multiplication.

The three forms are similar in most respects:

  þ  The length of the product is calculated to twice the length of the
     operands.

  þ  The CF and OF flags are set when significant bits are carried into the
     high-order half of the result. CF and OF are cleared when the
     high-order half of the result is the sign-extension of the low-order
     half.

However, forms 2 and 3 differ in that the product is truncated to the
length of the operands before it is stored in the destination register.
Because of this truncation, OF should be tested to ensure that no
significant bits are lost. (For ways to test OF, refer to the INTO and PUSHF
instructions.)

Forms 2 and 3 of IMUL may also be used with unsigned operands because,
whether the operands are signed or unsigned, the low-order half of the
product is the same.


3.2.4  Division Instructions

The 80386 has separate division instructions for unsigned and signed
operands. DIV operates on unsigned numbers, while IDIV operates on signed
integers as well as unsigned. In either case, an exception (interrupt zero)
occurs if the divisor is zero or if the quotient is too large for AL, AX, or
EAX.

DIV (Unsigned Integer Divide) performs an unsigned division of the
accumulator by the source operand. The dividend (the accumulator) is twice
the size of the divisor (the source operand); the quotient and remainder
have the same size as the divisor, as the following table shows.

Size of Source Operand
      (divisor)             Dividend       Quotient      Remainder

Byte                        AX             AL            AH
Word                        DX:AX          AX            DX
Doubleword                  EDX:EAX        EAX           EDX

Non-integral quotients are truncated to integers toward 0. The remainder is
always less than the divisor. For unsigned byte division, the largest
quotient is 255. For unsigned word division, the largest quotient is 65,535.
For unsigned doubleword division the largest quotient is 2^(32) -1.

IDIV (Signed Integer Divide) performs a signed division of the accumulator
by the source operand. IDIV uses the same registers as the DIV instruction.

For signed byte division, the maximum positive quotient is +127, and the
minimum negative quotient is -128. For signed word division, the maximum
positive quotient is +32,767, and the minimum negative quotient is -32,768.
For signed doubleword division the maximum positive quotient is 2^(31) -1,
the minimum negative quotient is -2^(31). Non-integral results are truncated
towards 0. The remainder always has the same sign as the dividend and is
less than the divisor in magnitude.

3.4  Logical Instructions

The group of logical instructions includes:

  þ  Bit test and modify instructions
  þ  Bit scan instructions
  þ  Rotate and shift instructions
  þ  Byte set on condition

3.4.2  Bit Test and Modify Instructions

This group of instructions operates on a single bit which can be in memory
or in a general register. The location of the bit is specified as an offset
from the low-order end of the operand. The value of the offset either may be
given by an immediate byte in the instruction or may be contained in a
general register.

These instructions first assign the value of the selected bit to CF, the
carry flag. Then a new value is assigned to the selected bit, as determined
by the operation. OF, SF, ZF, AF, PF are left in an undefined state. Table
3-1 defines these instructions.


Table 3-1. Bit Test and Modify Instructions

Instruction                      Effect on CF            Effect on
                                                         Selected Bit

Bit (Bit Test)                   CF  BIT                (none)
BTS (Bit Test and Set)           CF  BIT                BIT  1
BTR (Bit Test and Reset)         CF  BIT                BIT  0
BTC (Bit Test and Complement)    CF  BIT                BIT  NOT(BIT)

3.4.3  Bit Scan Instructions

These instructions scan a word or doubleword for a one-bit and store the
index of the first set bit into a register.  The bit string being scanned
may be either in a register or in memory. The ZF flag is set if the entire
word is zero (no set bits are found); ZF is cleared if a one-bit is found.
If no set bit is found, the value of the destination register is undefined.

BSF (Bit Scan Forward) scans from low-order to high-order (starting from
bit index zero).

BSR (Bit Scan Reverse) scans from high-order to low-order (starting from
bit index 15 of a word or index 31 of a doubleword).

3.4.4  Shift and Rotate Instructions

The shift and rotate instructions reposition the bits within the specified
operand.

These instructions fall into the following classes:

  þ  Shift instructions
  þ  Double shift instructions
  þ  Rotate instructions

3.4.4.1  Shift Instructions

Figure 3-9.  Using SAR to Simulate IDIV

    ; assuming N is in ECX, and the dividend is in EAX
    ;                                               CLOCKS
    CMP     EAX, 0      ; to set sign flag          2
    JGE     NoAdjust    ; jump if sign is zero      3 or 9
    ADD     EAX, ECX    ;                           2
    DEC     EAX         ; EAX := EAX + (N-1)        2
NoAdjust:
    SAR     EAX, CL     ;                           3
    ;                       TOTAL CLOCKS           12 or 18]


3.4.4.4  Fast "BIT BLT" Using Double Shift Instructions

One purpose of the double shifts is to implement a bit string move, with
arbitrary misalignment of the bit strings.  This is called a "bit blt" (BIT
BLock Transfer.)  A simple example is to move a bit string from an arbitrary
offset into a doubleword-aligned byte string.  A left-to-right string is
moved 32 bits at a time if a double shift is used inside the move loop.

     MOV   ESI,ScrAddr
     MOV   EDI,DestAddr
     MOV   EBX,WordCnt
     MOV   CL,RelOffset      ; relative offset Dest-Src
     MOV   EDX,[ESI]         ; load first word of source
     ADD   ESI,4             ; bump source address
BltLoop:
     LODS                    ; new low order part
     SHLD  EDX,EAX,CL        ; EDX overwritten with aligned stuff
     XCHG  EDX,EAS           ; Swap high/low order parts
     STOS                    ; Write out next aligned chunk
     DEC   EBX
     JA    BltLoop

This loop is simple yet allows the data to be moved in 32-bit pieces for
the highest possible performance. Without a double shift, the best that can
be achieved is 16 bits per loop iteration by using a 32-bit shift and
replacing the XCHG with a ROR by 16 to swap high and low order parts of
registers. A more general loop than shown above would require some extra
masking on the first doubleword moved (before the main loop), and on the
last doubleword moved (after the main loop), but would have the same basic
32-bits per loop iteration as the code above.


3.4.4.5  Fast Bit-String Insert and Extract

The double shift instructions also enable:

  þ  Fast insertion of a bit string from a register into an arbitrary bit
     location in a larger bit string in memory without disturbing the bits
     on either side of the inserted bits.

  þ  Fast extraction of a bits string into a register from an arbitrary bit
     location in a larger bit string in memory without disturbing the bits
     on either side of the extracted bits.

The following coded examples illustrate bit insertion and extraction under
variousconditions:

  1.  Bit String Insert into Memory (when bit string is 1-25 bits long,
      i.e., spans four bytes or less):

      ; Insert a right-justified bit string from register into
      ; memory bit string.
      ;
      ; Assumptions:
      ; 1) The base of the string array is dword aligned, and
      ; 2) the length of the bit string is an immediate value
      ;    but the bit offset is held in a register.
      ;
      ; Register ESI holds the right-justified bit string
      ; to be inserted.
      ; Register EDI holds the bit offset of the start of the
      ; substring.
      ; Registers EAX and ECX are also used by this
      ; "insert" operation.
      ;
      MOV   ECX,EDI      ; preserve original offset for later use
      SHR   EDI,3        ; signed divide offset by 8 (byte address)
      AND   CL,7H        ; isolate low three bits of offset in CL
      MOV   EAX,[EDI]strg_base      ; move string dword into EAX
      ROR   EAX,CL       ; right justify old bit field
      SHRD  EAX,ESI,length          ; bring in new bits
      ROL   EAX,length   ; right justify new bit field
      ROL   EAX,CL       ; bring to final position
      MOV   [EDI]strg_base,EAX      ; replace dword in memory

  2.  Bit String Insert into Memory (when bit string is 1-31 bits long, i.e.
      spans five bytes or less):

      ; Insert a right-justified bit string from register into
      ; memory bit string.
      ;
      ; Assumptions:
      ; 1) The base of the string array is dword aligned, and
      ; 2) the length of the bit string is an immediate value
      ;    but the bit offset is held in a register.
      ;
      ; Register ESI holds the right-justified bit string
      ; to be inserted.
      ; Register EDI holds the bit offset of the start of the
      ; substring.
      ; Registers EAX, EBX, ECX, and EDI are also used by
      ; this "insert" operation.
      ;
      MOV   ECX,EDI     ; temp storage for offset
      SHR   EDI,5       ; signed divide offset by 32 (dword address)
      SHL   EDI,2       ; multiply by 4 (in byte address format)
      AND   CL,1FH      ; isolate low five bits of offset in CL
      MOV   EAX,[EDI]strg_base      ; move low string dword into EAX
      MOV   EDX,[EDI]strg_base+4    ; other string dword into EDX
      MOV   EBX,EAX     ; temp storage for part of string     ¿ rotate
      SHRD  EAX,EDX,CL  ; double shift by offset within dword Ã EDX:EAX
      SHRD  EAX,EBX,CL  ; double shift by offset within dword Ù right
      SHRD  EAX,ESI,length          ; bring in new bits
      ROL   EAX,length  ; right justify new bit field
      MOV   EBX,EAX     ; temp storage for part of string         ¿ rotate
      SHLD  EAX,EDX,CL  ; double shift back by offset within word Ã EDX:EAX
      SHLD  EDX,EBX,CL  ; double shift back by offset within word Ù left
      MOV   [EDI]strg_base,EAX      ; replace dword in memory
      MOV   [EDI]strg_base+4,EDX    ; replace dword in memory

  3.  Bit String Insert into Memory (when bit string is exactly 32 bits
      long, i.e., spans five or four types of memory):

      ; Insert right-justified bit string from register into
      ; memory bit string.
      ;
      ; Assumptions:
      ; 1) The base of the string array is dword aligned, and
      ; 2) the length of the bit string is 32
      ;    but the bit offset is held in a register.
      ;
      ; Register ESI holds the 32-bit string to be inserted.
      ; Register EDI holds the bit offset of the start of the
      ; substring.
      ; Registers EAX, EBX, ECX, and EDI are also used by
      ; this "insert" operation.
      ;
      MOV   EDX,EDI     ; preserve original offset for later use
      SHR   EDI,5       ; signed divide offset by 32 (dword address)
      SHL   EDI,2       ; multiply by 4 (in byte address format)
      AND   CL,1FH      ; isolate low five bits of offset in CL
      MOV   EAX,[EDI]strg_base      ; move low string dword into EAX
      MOV   EDX,[EDI]strg_base+4    ; other string dword into EDX
      MOV   EBX,EAX     ; temp storage for part of string     ¿ rotate
      SHRD  EAX,EDX     ; double shift by offset within dword Ã EDX:EAX
      SHRD  EDX,EBX     ; double shift by offset within dword Ù right
      MOV   EAX,ESI     ; move 32-bit bit field into position
      MOV   EBX,EAX     ; temp storage for part of string         ¿ rotate
      SHLD  EAX,EDX     ; double shift back by offset within word Ã EDX:EAX
      SHLD  EDX,EBX     ; double shift back by offset within word Ù left
      MOV   [EDI]strg_base,EAX      ; replace dword in memory
      MOV   [EDI]strg_base,+4,EDX   ; replace dword in memory

  4.  Bit String Extract from Memory (when bit string is 1-25 bits long,
      i.e., spans four bytes or less):

      ; Extract a right-justified bit string from memory bit
      ; string into register
      ;
      ; Assumptions:
      ; 1) The base of the string array is dword aligned, and
      ; 2) the length of the bit string is an immediate value
      ;    but the bit offset is held in a register.
      ;
      ; Register EAX holds the right-justified, zero-padded
      ; bit string that was extracted.
      ; Register EDI holds the bit offset of the start of the
      ; substring.
      ; Registers EDI, and ECX are also used by this "extract."
      ;
      MOV  ECX,EDI      ; temp storage for offset
      SHR  EDI,3        ; signed divide offset by 8 (byte address)
      AND  CL,7H        ; isolate low three bits of offset
      MOV  EAX,[EDI]strg_base       ; move string dword into EAX
      SHR  EAX,CL       ; shift by offset within dword
      AND  EAX,mask     ; extracted bit field in EAX

  5.  Bit String Extract from Memory (when bit string is 1-32 bits long, 
      i.e., spans five bytes or less):

      ; Extract a right-justified bit string from memory bit
      ; string into register.
      ;
      ; Assumptions:
      ; 1) The base of the string array is dword aligned, and
      ; 2) the length of the bit string is an immediate
      ;    value but the bit offset is held in a register.
      ;
      ; Register EAX holds the right-justified, zero-padded
      ; bit string that was extracted.
      ; Register EDI holds the bit offset of the start of the
      ; substring.
      ; Registers EAX, EBX, and ECX are also used by this "extract."
      MOV   ECX,EDI     ; temp storage for offset
      SHR   EDI,5       ; signed divide offset by 32 (dword address)
      SHL   EDI,2       ; multiply by 4 (in byte address format)
      AND   CL,1FH      ; isolate low five bits of offset in CL
      MOV   EAX,[EDI]strg_base      ; move low string dword into EAX
      MOV   EDX,[EDI]strg_base+4    ; other string dword into EDX
      SHRD  EAX,EDX,CL  ; double shift right by offset within dword
      AND   EAX,mask    ; extracted bit field in EAX


3.4.5  Byte-Set-On-Condition Instructions

This group of instructions sets a byte to zero or one depending on any of
the 16 conditions defined by the status flags. The byte may be in memory or
may be a one-byte general register. These instructions are especially useful
for implementing Boolean expressions in high-level languages such as Pascal.

3.6  String and Character Translation Instructions

The power of 80386 string operations derives from the following features of
the architecture:

1.  A set of primitive string operations

   MOVS   ÄÄ Move String
   CMPS   ÄÄ Compare string
   SCAS   ÄÄ Scan string
   LODS   ÄÄ Load string
   STOS   ÄÄ Store string

3.6.1  Repeat Prefixes

All three prefixes causes the hardware to automatically repeat the
associated string primitive until ECX=0. The differences among the repeat
prefixes have to do with the second termination condition. REPE/REPZ and
REPNE/REPNZ are used exclusively with the SCAS (Scan String) and CMPS
(Compare String) primitives. When these prefixes are used, repetition of the
next instruction depends on the zero flag (ZF) as well as the ECX register.
ZF does not require initialization before execution of a repeated string
instruction, because both SCAS and CMPS set ZF according to the results of
the comparisons they make. The differences are summarized in the
accompanying table.

Prefix                      Termination         Termination
                            Condition 1         Condition 2

REP                           ECX = 0             (none)
REPE/REPZ                     ECX = 0             ZF = 0
REPNE/REPNZ                   ECX = 0             ZF = 1

3.6.2  String Instructions

CMPS (Compare Strings) subtracts the destination string element (at ES:EDI)
from the source string element (at ESI) and updates the flags AF, SF, PF, CF
and OF. If the string elements are equal, ZF=1; otherwise, ZF=0. 

SCAS (Scan String) subtracts the destination string element at ES:EDI from
EAX, AX, or AL and updates the flags AF, SF, ZF, PF, CF and OF. If the
values are equal, ZF=1; otherwise, ZF=0. 

When either the REPE or REPNE prefix modifies either the SCAS or CMPS
primitives, the processor compares the value of the current string element
with the value in EAX for doubleword elements, in AX for word elements, or
in AL for byte elements. Termination of the repeated operation depends on
the resulting state of ZF as well as on the value in ECX.

3.7  Instructions for Block-Structured Languages

ENTER (Enter Procedure) creates a stack frame that may be used to implement
the scope rules of block-structured high-level languages. A LEAVE
instruction at the end of a procedure complements an ENTER at the beginning
of the procedure to simplify stack management and to control access to
variables for nested procedures.

The ENTER instruction includes two parameters. The first parameter
specifies the number of bytes of dynamic storage to be allocated on the
stack for the routine being entered. The second parameter corresponds to the
lexical nesting level (0-31) of the routine. (Note that the lexical level
has no relationship to either the protection privilege levels or to the I/O
privilege level.)

The specified lexical level determines how many sets of stack frame
pointers the CPU copies into the new stack frame from the preceding frame.
This list of stack frame pointers is sometimes called the display. The first
word of the display is a pointer to the last stack frame. This pointer
enables a LEAVE instruction to reverse the action of the previous ENTER
instruction by effectively discarding the last stack frame.

   Example: ENTER 2048,3

   Allocates 2048 bytes of dynamic storage on the stack and sets up pointers
   to two previous stack frames in the stack frame that ENTER creates for
   this procedure.

After ENTER creates the new display for a procedure, it allocates the
dynamic storage space for that procedure by decrementing ESP by the number
of bytes specified in the first parameter. This new value of ESP serves as a
starting point for all PUSH and POP operations within that procedure.

To enable a procedure to address its display, ENTER leaves EBP pointing to
the beginning of the new stack frame. Data manipulation instructions that
specify EBP as a base register implicitly address locations within the stack
segment instead of the data segment.

The ENTER instruction can be used in two ways: nested and non-nested. If
the lexical level is 0, the non-nested form is used. Since the second
operand is 0, ENTER pushes EBP, copies ESP to EBP and then subtracts the
first operand from ESP. The nested form of ENTER occurs when the second
parameter (lexical level) is not 0.

Figure 3-16 gives the formal definition of ENTER.

The main procedure (with other procedures nested within) operates at the
highest lexical level, level 1. The first procedure it calls operates at the
next deeper lexical level, level 2. A level 2 procedure can access the
variables of the main program which are at fixed locations specified by the
compiler. In the case of level 1, ENTER allocates only the requested
dynamic storage on the stack because there is no previous display to copy.

A program operating at a higher lexical level calling a program at a lower
lexical level requires that the called procedure should have access to the
variables of the calling program. ENTER provides this access through a
display that provides addressability to the calling program's stack frame.

A procedure calling another procedure at the same lexical level implies
that they are parallel procedures and that the called procedure should not
have access to the variables of the calling procedure. In this case, ENTER
copies only that portion of the display from the calling procedure which
refers to previously nested procedures operating at higher lexical levels.
The new stack frame does not include the pointer for addressing the calling
procedure's stack frame.

ENTER treats a reentrant procedure as a procedure calling another procedure
at the same lexical level. In this case, each succeeding iteration of the
reentrant procedure can address only its own variables and the variables of
the calling procedures at higher lexical levels. A reentrant procedure can
always address its own variables; it does not require pointers to the stack
frames of previous iterations.

By copying only the stack frame pointers of procedures at higher lexical
levels, ENTER makes sure that procedures access only those variables of
higher lexical levels, not those at parallel lexical levels (see Figure
3-17). Figures 3-18 through 3-21 demonstrate the actions of the ENTER
instruction if the modules shown in Figure 3-17 were to call one another in
alphabetic order.

Block-structured high-level languages can use the lexical levels defined by
ENTER to control access to the variables of previously nested procedures.
Referring to Figure 3-17 for example, if PROCEDURE A calls PROCEDURE B
which, in turn, calls PROCEDURE C, then PROCEDURE C will have access to the
variables of MAIN and PROCEDURE A, but not PROCEDURE B because they operate
at the same lexical level. Following is the complete definition of access to
variables for Figure 3-17.

  1.  MAIN PROGRAM has variables at fixed locations.

  2.  PROCEDURE A can access only the fixed variables of MAIN.

  3.  PROCEDURE B can access only the variables of PROCEDURE A and MAIN.
      PROCEDURE B cannot access the variables of PROCEDURE C or PROCEDURE D.

  4.  PROCEDURE C can access only the variables of PROCEDURE A and MAIN.
      PROCEDURE C cannot access the variables of PROCEDURE B or PROCEDURE D.

  5.  PROCEDURE D can access the variables of PROCEDURE C, PROCEDURE A, and
      MAIN. PROCEDURE D cannot access the variables of PROCEDURE B.

ENTER at the beginning of the MAIN PROGRAM creates dynamic storage space
for MAIN but copies no pointers. The first and only word in the display
points to itself because there is no previous value for LEAVE to return to
EBP. See Figure 3-18.

After MAIN calls PROCEDURE A, ENTER creates a new display for PROCEDURE A
with the first word pointing to the previous value of EBP (BPM for LEAVE to
return to the MAIN stack frame) and the second word pointing to the current
value of EBP. Procedure A can access variables in MAIN since MAIN is at
level 1. Therefore the base for the dynamic storage for MAIN is at [EBP-2].
All dynamic variables for MAIN are at a fixed offset from this value. See
Figure 3-19.

After PROCEDURE A calls PROCEDURE B, ENTER creates a new display for
PROCEDURE B with the first word pointing to the previous value of EBP, the
second word pointing to the value of EBP for MAIN, and the third word
pointing to the value of EBP for A and the last word pointing to the current
EBP. B can access variables in A and MAIN by fetching from the display the
base addresses of the respective dynamic storage areas. See Figure 3-20.

After PROCEDURE B calls PROCEDURE C, ENTER creates a new display for
PROCEDURE C with the first word pointing to the previous value of EBP, the
second word pointing to the value of EBP for MAIN, and the third word
pointing to the EBP value for A and the third word pointing to the current
value of EBP. Because PROCEDURE B and PROCEDURE C have the same lexical
level, PROCEDURE C is not allowed access to variables in B and therefore
does not receive a pointer to the beginning of PROCEDURE B's stack frame.
See Figure 3-21.

LEAVE (Leave Procedure) reverses the action of the previous ENTER
instruction. The LEAVE instruction does not include any operands. LEAVE
copies EBP to ESP to release all stack space allocated to the procedure by
the most recent ENTER instruction. Then LEAVE pops the old value of EBP from
the stack. A subsequent RET instruction can then remove any arguments that
were pushed on the stack by the calling program for use by the called
procedure.


Figure 3-16.  Formal Definition of the ENTER Instruction

The formal definition of the ENTER instruction for all cases is given by the
following listing. LEVEL denotes the value of the second operand.

Push EBP
Set a temporary value FRAME_PTR := ESP
If LEVEL > 0 then
      Repeat (LEVEL-1) times:
          EBP :=EBP - 4
          Push the doubleword pointed to by EBP
      End repeat
      Push FRAME_PTR
End if
EBP := FRAME_PTR
ESP := ESP - first operand.


Figure 3-17.  Variable Access in Nested Procedures

      ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
      º                MAIN PROCEDURE (LEXICAL LEVEL 1)                º
      º   ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»   º
      º   º              PROCEDURE A (LEXICAL LEVEL 2)             º   º
      º   º  ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»  º   º
      º   º  º           PROCEDURE B (LEXICAL LEVEL 3)          º  º   º
      º   º  ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ¼  º   º
      º   º  ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»  º   º
      º   º  º           PROCEDURE C (LEXICAL LEVEL 3)          º  º   º
      º   º  º  ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»  º  º   º
      º   º  º  º        PROCEDURE D (LEXICAL LEVEL 4)       º  º  º   º
      º   º  º  ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ¼  º  º   º
      º   º  ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ¼  º   º
      º   ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ¼   º
      ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ¼


Figure 3-18.  Stack Frame for MAIN at Level 1

                                      31          0 
                D  O                 º               º
                I  F              ÚÄ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                R                 ³  º    OLD ESP    º
                E  E     DISPLAY Ä´  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹ÄÄEBP FOR
                C  X              ³  º      EBPM
EBPM = EBP VALUE FOR MAIN    º    MAIN
                T  P              ÆÍ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                I  A              ³  º               º
                O  N              ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                N  S     DYNAMIC Ä´  º               º
                   I     STORAGE  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                 ³ O              ³  º               º
                 ³ N              ÀÄ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹ÄÄESP
                 ³                   º               º
                                                   


Figure 3-19.  Stack Frame for Procedure A

                                      31          0 
                D  O                 º               º
                I  F                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                R                    º    OLD ESP    º
                E  E                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                C  X                 º      EBPM
EBPM = EBP VALUE FOR MAIN    º
                T  P                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                I  A                 º               º
                O  N                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                N  S                 º               º
                   I                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                 ³ O                 º               º
                 ³ N              ÚÄ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                 ³                ³  º      EBPM     º
                                 ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹ÄÄEBP FOR A
                         DISPLAY Ä´  º      EBPM     º
                                  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º      EBPA
EBPA = EBP VALUE FOR PROCEDURE A    º
                                  ÆÍ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º               º
                                  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                         DYNAMIC Ä´  º               º
                         STORAGE  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º               º
                                  ÀÄ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹ÄÄESP
                                     º               º
                                                    


Figure 3-20.  Stack Frame for Procedure B at Level 3 Called from A

                                      31          0 
                D  O                 º               º
                I  F                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                R                    º    OLD ESP    º
                E  E                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                C  X                 º      EBPM
EBPM = EBP VALUE FOR MAIN    º
                T  P                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                I  A                 º               º
                O  N                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                N  S                 º               º
                   I                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                 ³ O                 º               º
                 ³ N                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                 ³                   º      EBPM     º
                                    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º      EBPM     º
                                     ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º      EBPA     º
                                     ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º               º
                                     ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º               º
                                     ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º               º
                                  ÚÄ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º      EBPA     º
                                  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹ÄÄEBP
                                  ³  º      EBPM     º
                         DISPLAY Ä´  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º      EBPA     º
                                  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º      EBPB
EBPB = EBP VALUE FOR PROCEDURE B    º
                                  ÆÍ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º               º
                                  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                         DYNAMIC Ä´  º               º
                         STORAGE  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º               º
                                  ÀÄ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹ÄÄESP
                                     º               º
                                                    


Figure 3-21.  Stack Frame for Procedure C at Level 3 Called from B

                                      31          0 
                D  O                 º               º
                I  F                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                R                    º    OLD ESP    º
                E  E                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                C  X                 º      EBPM
EBPM = EBP VALUE FOR MAIN    º
                T  P                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                I  A                 º               º
                O  N                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                N  S                 º               º
                   I                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                 ³ O                 º               º
                 ³ N                 ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                 ³                   º      EBPM     º
                                    ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º      EBPM     º
                                     ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º      EBPA
EBPA = EBP VALUE FOR PROCEDURE A    º
                                     ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º               º
                                     ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º               º
                                     ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                     º               º
                                  ÚÄ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º      EBPA     º
                                  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹ÄÄEBP
                                  ³  º      EBPM     º
                         DISPLAY Ä´  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º      EBPA     º
                                  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º      EBPB
EBPB = EBP VALUE FOR PROCEDURE B    º
                                  ÆÍ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º               º
                                  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                         DYNAMIC Ä´  º               º
                         STORAGE  ³  ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹
                                  ³  º               º
                                  ÀÄ ÌÍÍÍÍÍÍÍØÍÍÍÍÍÍÍ¹ÄÄESP
                                     º               º
                                                    


3.8  Flag Control Instructions

3.8.2  Flag Transfer Instructions

Though specific instructions exist to alter CF and DF, there is no direct
method of altering the other applications-oriented flags. The flag transfer
instructions allow a program to alter the other flag bits with the bit
manipulation instructions after transferring these flags to the stack or the
AH register.

The instructions LAHF and SAHF deal with five of the status flags, which
are used primarily by the arithmetic and logical instructions.

LAHF (Load AH from Flags) copies SF, ZF, AF, PF, and CF to AH bits 7, 6, 4,
2, and 0, respectively (see Figure 3-22). The contents of the remaining bits
(5, 3, and 1) are undefined. The flags remain unaffected.

SAHF (Store AH into Flags) transfers bits 7, 6, 4, 2, and 0 from AH into
SF, ZF, AF, PF, and CF, respectively (see Figure 3-22).

The PUSHF and POPF instructions are not only useful for storing the flags
in memory where they can be examined and modified but are also useful for
preserving the state of the flags register while executing a procedure.

PUSHF (Push Flags) decrements ESP by two and then transfers the low-order
word of the flags register to the word at the top of stack pointed to by ESP
(see Figure 3-23). The variant PUSHFD decrements ESP by four, then
transfers both words of the extended flags register to the top of the stack
pointed to by ESP (the VM and RF flags are not moved, however).

POPF (Pop Flags) transfers specific bits from the word at the top of stack
into the low-order byte of the flag register (see Figure 3-23), then
increments ESP by two. The variant POPFD transfers specific bits from the
doubleword at the top of the stack into the extended flags register (the RF
and VM flags are not changed, however), then increments ESP by four.


Figure 3-22.  LAHF and SAHF

                     7    6    5    4    3    2    1    0
                   ÉÍÍÍÍËÍÍÍÍËÍÍÍÍËÍÍÍÍËÍÍÍÍËÍÍÍÍËÍÍÍÍËÍÍÍÍ»
                   º SF º ZF º UU º AF º UU º PF º UU º CF º
                   ÈÍÍÍÍÊÍÍÍÍÊÍÍÍÍÊÍÍÍÍÊÍÍÍÍÊÍÍÍÍÊÍÍÍÍÊÍÍÍÍ¼

     LAHF LOADS FIVE FLAGS FROM THE FLAG REGISTER INTO REGISTER AH. SAHF
     STORES THESE SAME FIVE FLAGS FROM AH INTO THE FLAG REGISTER. THE BIT
     POSITION OF EACH FLAG IS THE SAME IN AH AS IT IS IN THE FLAG REGISTER.
     THE REMAINING BITS (MARKED UU) ARE RESERVED; DO NOT DEFINE.

3.9  Coprocessor Interface Instructions

A numerics coprocessor (e.g., the 80387 or 80287) provides an extension to
the instruction set of the base architecture. The coprocessor extends the
instruction set of the base architecture to support high-precision integer
and floating-point calculations. This extended instruction set includes
arithmetic, comparison, transcendental, and data transfer instructions. The
coprocessor also contains a set of useful constants to enhance the speed of
numeric calculations.

ESC (Escape) is a 5-bit sequence that begins the opcodes that identify
floating point numeric instructions. The ESC pattern tells the 80386 to send
the opcode and addresses of operands to the numerics coprocessor. The
numerics coprocessor uses the escape instructions to perform
high-performance, high-precision floating point arithmetic that conforms to
the IEEE floating point standard 754.

WAIT (Wait) is an 80386 instruction that suspends program execution until
the 80386 CPU detects that the BUSY pin is inactive. This condition
indicates that the coprocessor has completed its processing task and that
the CPU may obtain the results.

3.10  Segment Register Instructions

This category actually includes several distinct types of instructions.
These various types are grouped together here because, if systems designers
choose an unsegmented model of memory organization, none of these
instructions is used by applications programmers. 

3.11  Miscellaneous Instructions

3.11.1  Address Calculation Instruction

LEA (Load Effective Address) transfers the offset of the source operand
(rather than its value) to the destination operand.  The source operand must
be a memory operand, and the destination operand must be a general register.
This instruction is especially useful for initializing registers before the
execution of the string primitives (ESI, EDI) or the XLAT instruction (EBX).
The LEA can perform any indexing or scaling that may be needed.

Example: LEA EBX, EBCDIC_TABLE

Causes the processor to place the address of the starting location of the
table labeled EBCDIC_TABLE into EBX.