- Assembly Language Programming and Organization of the IBM PC
- ASSEMBLY LANGUAGE PROGRAMMING & ORGANIZATION OF THE IBM PC International Editions 1992
- Dedication
- Preface
- Hardware and Software Requirements
- Balanced Presentation
- Features of the Book
- Writing programs early
- Handling input and output
- Structured code
- Definitions
- Advanced applications
- Numeric processor
- Advanced processors
- Acknowledgments
- Elements of Assembly Language Programming
- Microcomputer Systems
- Overview
- 1.1
- The System Board
- 1.1.1
- Bytes and Words
- Bit Position
- Memory Operations
- RAM and ROM
- Buses
- 1.1.2
- Execution Unit (EU)
- Bus Interface Unit (BIU)
- Serial and Parallel Ports
- Execute
- Timing
- 1.3 I/O Devices
- Magnetic Disks
- Keyboard
- Display Monitor
- Printers
- 1.4 Programming Languages
- Machine Language
- Assembly Language
- Assembly language instruction
- Comment
- High-Level Languages
- Advantages of High-Level Languages
- Advantages of Assembly Languages
- 1.5 An Assembly Language Program
- Program Listing PGM1_1.ASM
- Glossary
- mega
- Exercises
- Representation of Numbers and Characters
- Overview
- 2.1 Number Systems
- Binary Number System
- Hexadecimal Number System
- 2.2 Conversion Between Number Systems
- Converting Binary and Hex to Decimal
- Converting Decimal to Binary and Hex
- Conversions Between Hex and Binary
- 2.3 Addition and Subtraction
- Addition
- Table 2.3B Binary Addition Table
- Subtraction
- 2.4
- How Integers Are Represented in the Computer
- One's Complement
- Two's Complement
- Subtraction as Two's Complement Addition
- Decimal Interpretation
- Table 2.4B Signed and Unsigned Decimal Interpretations of a Byte
- ASCII Code
- The Keyboard
- Special Control Characters
- SUMMARY
- Glossary
- Exercises
- Organization of the IBM Personal Computers
- The 8086 and 8088 Microprocessors
- The 80186 and 80188 Microprocessors
- The 80286 Microprocessor
- The 80386 and 80386SX Microprocessors
- The 80486 and 80486SX Microprocessors
- 3.2.2
- AX (Accumulator Register)
- BX (Base Register)
- CX (Count Register)
- DX (Data Register)
- Memory Segment
- Segment:Offset Address
- Location of Segments
- Program Segments
- 3.2.4
- SP (Stack Pointer)
- BP (Base Pointer)
- SI (Source Index)
- DI (Destination Index)
- 3.2.5 Instruction Pointer: IP
- 3.2.6 FLAGS Register
- 3.3 Organization of the PC
- 3.3.1
- The Operating System
- BIOS
- Table 3.1 Some Common I/O Ports for the PC
- Port Address
- Description
- 3.3.3
- I/O Port Addresses
- 3.3.4
- Start-up Operation
- Summary
- Glossary
- Exercises
- Introduction to IBM PC Assembly Language
- Overview
- 4.1 Assembly Language Syntax
- Statements
- name operation operand(s) comment
- 4.1.1 Name Field
- Examples of legal names
- Examples of illegal names
- 4.1.2
- Operation Field
- 4.1.3
- Operand Field
- 4.1.4
- Comment Field
- 4.2 Program Data
- Numbers
- Number
- Characters
- Table 4.1 Data-Defining Pseudo-ops
- Variables
- Byte Variables
- Word Variables
- 4.3.3
- High and Low Bytes of a Word
- Character Strings
- 4.4 Named Constants
- EQU (Equates)
- 4.5
- A Few Basic Instructions
- 4.5.1 MOV and XCHG
- Restrictions on MOV and XCHG
- Type Agreement of Operands
- 4.7 Program Structure
- 4.7.1 Memory Models
- Table 4.4 Memory Models
- Description
- 4.7.2
- Data Segment
- 4.7.3
- Stack Segment
- 4.7.4
- Code Segment
- 4.7.5
- Putting It Together
- Input and Output Instructions
- The INT Instruction
- Function number
- 4.9 A First Program
- Program Listing PGM4_1.ASM
- Terminating a Program
- 4.10 Creating and Running a Program
- Step 1. Create the Source Program File
- Step 2. Assemble the Program
- The Source Listing File
- The Cross-Reference File
- Step 3. Link the Program
- Step 4. Run the Program
- 4.11 Displaying a String
- The LEA Instruction
- Program Segment Prefix
- 4.12
- Program Listing PGM4_3.ASM
- Summary
- Glossary
- object file
- New Instructions
- New Pseudo-Ops
- Exercises
- Programming Exercises
- 5
- The Processor Status and the FLAGS Register
- Overview
- The Status Flags
- Carry Flag (CF)
- Parity Flag (PF)
- Auxiliary Carry Flag (AF)
- Zero Flag (ZF)
- Sign Flag (SF)
- Overflow Flag (OF)
- 5.2
- Overflow
- Examples of Overflow
- 0111 1111 1111 1111 + 0111 1111 1111 1111 = 1111 1111 1111 1110 = FFFEh
- How the Processor Indicates Overflow
- How the Processor Determines that Overflow Occurred
- Unsigned Overflow
- Signed Overflow
- 5.3
- How Instructions Affect the Flags
- Instruction Affects flags
- 5.4 The DEBUG Program
- Program Listing PGM5_1.ASM
- Table 5.2 DEBUG Flag Symbols
- Summary
- Glossary
- Exercises
- Flow Control Instructions
- Overview
- Program Listing PGM6_1.ASM
- 6.2 Conditional Jumps
- Range of a Conditional Jump
- How the CPU Implements a Conditional Jump
- The CMP Instruction
- Table 6.1 Conditional Jumps
- Signed Jumps
- Unsigned Conditional Jumps
- Single-Flag Jumps
- Interpreting the Conditional Jumps
- Signed Versus Unsigned Jumps
- Working with Characters
- Solution:
- 6.3 The JMP Instruction
- 6.4 High-Level Language Structures
- 6.4.1 Branching Structures
- IF-THEN
- IF-THEN-ELSE
- Solution:
- CASE
- Solution:
- Solution:
- Branches with Compound Conditions
- AND Conditions
- Solution:
- OR Conditions
- Solution:
- 6.4.2
- FOR LOOP
- Solution:
- WHILE LOOP
- Solution:
- REPEAT LOOP
- Solution:
- The code is
- WHILE Versus REPEAT
- 6.5 Programming with High-Level Structures
- Problem
- First refinement
- Step 1. Display the opening message.
- Step 2. Read and process a line of text.
- Step 3. Display the results.
- Program Listing PGM6_2.ASM
- Summary
- Glossary
- New Instructions
- Exercises
- 2. Use a CASE structure to code the following:
- Programming Exercises
- Logic, Shift, and Rotate Instructions
- Overview
- 7.1 Logic Instructions
- Solutions:
- 7.1.1
- AND, OR, and XOR Instructions
- Converting an ASCII Digit to a Number
- Converting a Lowercase Letter to Upper Case
- Clearing a Register
- Testing a Register for Zero
- 7.1.3 TEST Instruction
- Examining Bits
- 7.2 Shift Instructions
- 7.2.1
- Left Shift Instructions
- The SHL Instruction
- Multiplication by Left Shift
- The SAL instruction
- Overflow
- 7.2.2 Right Shift Instructions
- The SHR Instruction
- The SAR Instruction
- Division by Right Shift
- Signed and Unsigned Division
- More General Multiplication and Division
- 7.3 Rotate Instructions
- Rotate Left
- Rotate Right
- Solution:
- Rotate Carry Left
- Rotate Carry Right
- Solution:
- Effect of the rotate instructions on the flags
- An Application: Reversing a Bit Pattern
- 7.4 Binary and Hex I/O
- Binary Input
- Algorithm for Binary Input
- Demonstration (for input 110)
- Binary Output
- Algorithm for Binary Output
- Hex Input
- Algorithm for Hex Input
- Demonstration (for input 6AB)
- Hex Output
- Algorithm for Hex Output
- Demonstration (BX Contains 4CA9h)
- Summary
- New Instructions
- Exercises
- Programming Exercises
- The Stack and Introduction to Procedures
- Overview
- 8.1 The Stack
- PUSH and PUSHF
- POP and POPF
- 8.2 A Stack Application
- Algorithm to Reverse Input
- Program Listing PGM8_1.ASM
- 8.3 Terminology of Procedures
- Procedure Declaration
- RET
- Communication Between Procedures
- Procedure Documentation
- 8.4 CALL and RET
- 8.5 An Example of a Procedure
- Multiplication algorithm:
- Program Listing PGM8_2.ASM
- Summary
- Glossary
- Exercises
- Programming Exercises
- Multiplication and Division Instructions
- Overview
- Signed Versus Unsigned Multiplication
- Word Form
- Effect of MUL/IMUL on the status flags
- Examples
- 9.2
- Simple Applications of MUL and IMUL
- Solution:
- Byte Form
- Word Form
- Divide Overflow
- 9.4 Sign Extension of the Dividend
- Word Division
- Solution:
- Byte Division
- Solution:
- Decimal Input and Output Procedures
- Decimal Output
- Algorithm for Decimal Output
- Line 6
- Line 7
- Program Listing PGM9_1.ASM
- The INCLUDE Pseudo-op
- Program Listing PGM9_2.ASM
- Decimal Input
- Decimal Input Algorithm (first version)
- Decimal Input Algorithm (second version)
- The algorithm can be coded as follows:
- Program Listing PGM9_3.ASM
- Testing INDEC
- Sample execution:
- Overflow
- Decimal Input Algorithm (third version)
- Summary
- New Instructions
- New Pseudo-Ops
- Exercises
- Programming Exercises
- Arrays and Addressing Modes
- 10.1 One-Dimensional Arrays
- The DUP Operator
- Location of Array Elements
- 10.2 Addressing Modes
- 10.2.1
- Register Indirect Mode
- Example 10.2 Suppose that
- Solution:
- Source offset
- Program Listing PGM10_1.ASM
- Solution:
- Solution:
- 10.2.3
- Using PTR to Override a Type
- The LABEL Pseudo-Op
- Instruction
- Solution:
- 10.2.4
- Segment Override
- 10.2.5
- Accessing the Stack
- Solution:
- 10.3 An Application: Sorting an Array
- Selectsort algorithm
- Program Listing PGM10_2.ASM
- Program Listing PGM10_3.ASM
- 10.4 Two-Dimensional Arrays
- How Two-Dimensional Arrays Are Stored
- Locating an Element in a Two-Dimensional Array
- Solution:
- 10.5 Based Indexed Addressing Mode
- Solution:
- Algorithm
- Program Listing PGM10_4.ASM
- The XLAT Instruction
- Example: Coding and Decoding a Secret Message
- Algorithm for Coding and Decoding a Secret Message
- Program Listing PGM10_5.ASM
- Summary
- Glossary
- Programming Exercises
- Demonstration
- 11
- The String Instructions
- Overview
- CLD and STD
- MOVSW
- 11.3 Store String
- The STOSB Instruction
- Reading and Storing a Character String
- Algorithm for READ_STR
- Program Listing PGM11_1.ASM
- 11.4 Load String
- The LODSB instruction
- Displaying a Character String
- Algorithm for DISP_STR
- Program Listing PGM11_2.ASM
- For example, if the string
- Algorithm for Counting Vowels and Consonants
- Program Listing PGM11_4.ASM
- The CMPSB Instruction
- REPE and REPZ
- 11.6.1
- Finding a Substring of a String
- Algorithm for Substring Search
- Program Listing PGM11_5.ASM
- 11.7 General Form of the String Instructions
- Instruction
- Summary
- Glossary
- New Instructions
- String Instruction Prefixes
- Exercises
- Programming Exercises
- Input
- Output
- Output
- Advanced Topics
- Text Display and Keyboard Programming
- Overview
- 12.1 The Monitor
- 12.2 Video Adapters and Display Modes
- Video Adapters
- Display Modes
- Table 12.1 Video Adapters
- Kinds of Video Adapters
- Mode Numbers
- 12.3 Text Mode Programming
- Display Pages
- The Active Display Page
- 12.3.1 The Attribute Byte
- 16-Color Display
- Monochrome Display
- Table 12.6 Monochrome Attributes
- 12.3.2
- A Display Page Demonstration
- Program Listing PGM12_1.ASM
- 12.3.3
- Solution:
- Solution:
- Solution:
- Solution:
- Solution:
- INT 10h, Function 9: Display Character at the Cursor with Any Attribute
- Solution:
- Solution:
- Program Listing PGM12_2.ASM
- Scan Codes
- The Keyboard Buffer
- Keyboard Operation
- INT 16H
- Solution:
- 12.5 A Screen Editor
- Screen Editor Algorithm
- DO_FUNCTION Algorithm
- Program Listing PGM12_3.ASM
- Summary
- Glossary
- Exercises
- Programming Exercises
- 13
- Macros
- 13.1 Macro Definition and Invocation
- Solution:
- Illegal Macro Invocations
- Restoring Registers
- Macro Expansion in the .LST File
- Program Listing PGM13_1.ASM
- .LST File Options
- Finding Assembly Errors
- Solution:
- Solution:
- 13.4 A Macro Library
- The IF1 Conditional
- INCLUDE MACROS
- Examples of Useful Macros
- Including a Macro Library
- Program Listing PGM13_2.ASM
- 13.5 Repetition Macros
- Solution:
- Solution:
- The IRP Macro
- Solutions:
- 13.6 An Output Macro
- Algorithm for Hex Output (of BX)
- Program Listing PGM13_3.ASM
- Table 13.1 Conditional Pseudo-Ops
- A Macro that Uses IF
- Solution:
- A Macro that Uses IFNB:
- Solution:
- The .ERR Directive
- 13.8 Macros and Procedures
- Assembly Time
- Execution Time
- Program Size
- Other Considerations
- Summary
- Glossary
- Exercises
- Memory Management
- Overview
- 14.1 .COM Programs
- .COM Program Format
- The ORG Directive
- .COM Program Stack
- An Example of a .COM Program
- Program Listing PGM14_1.ASM (a repeat of PGM4_2.ASM)
- Program Listing PGM14_2.ASM
- 14.2 Program Modules
- Assembly and Object Modules
- NEAR and FAR Procedures
- EXTRN
- PUBLIC
- Linking Object Modules
- Program Listing 14_3.ASM: First Module
- Program Listing 14_3A.ASM: Second Module
- A>C:MASM PGM14_3;
- A>C:MASM PGM14_3A;
- The Segment Directive
- Align Type
- Table 14.1 Align Types
- Combine Type
- Table 14.2 Combine Types
- Class Type
- 14.3.1
- The ASSUME Directive
- Program Listing 14_4A.ASM: Second Module
- Note the following:
- Program Listing 14_5.ASM: First Module
- Program Listing 14_5A.ASM: Second Module
- 14.4
- More About the Simplified Segment Definitions
- 14.5 Passing Data Between Procedures
- Program Listing 14_6.ASM: First Module
- Program Listing 14_6A.ASM: Second Module
- 14.5.2
- Passing the Addresses of the Data
- Program Listing 14_7.ASM: First Module
- Program Listing 14_7A.ASM: Second Module
- Program Listing 14_8.ASM: First Module
- Program Listing 14_8A.ASM: Second Module
- Summary
- Glossary
- Exercises
- Programming Exercises
- BIOS and DOS Interrupts
- Overview
- 15.1 Interrupt Service
- Hardware Interrupt
- Software Interrupt
- Processor Exception
- 15.1.1
- Interrupt Vector
- Table 15.1 Interrupt Types
- 15.1.2 Interrupt Routines
- The Control Flags IF and TF
- 15.2 BIOS Interrupts
- Interrupt Types 0-7
- Interrupt Types 8h-Fh
- Interrupt Types 10h-1Fh
- Table 15.2 Equipment Check
- Table 15.3 Printer Status
- 15.3 DOS Interrupts
- 15.4 A Time Display Program
- Program Listing PGM15_1.ASM
- Program Listing PGM15_1A.ASM
- Set Interrupt Vector
- Program Listing PGM15_2A.ASM
- Cursor Control
- Interrupt Procedure
- Program Listing PGM15_2.ASM
- Program Listing PGM15_3A.ASM
- Program Listing PGM15_3B.ASM
- Program Listing PGM15_3.ASM
- Summary
- New Instructions
- New Pseudo-Ops
- Exercises
- Programming Exercises
- Color Graphics
- Overview
- 16.1 Graphics Modes
- Pixels
- Mode Selection
- Table 16.1 Video Adapter Graphics Display Modes
- 16.2 CGA Graphics
- Table 16.2 Sixteen Standard CGA Colors
- Medium-Resolution Mode
- Reading and Writing Pixels
- High-Resolution Mode
- Program Listing PGM16_1.ASM
- Writing Directly to Memory
- Displaying Text
- Solution:
- 16.4 VGA Graphics
- Solution:
- 16.5 Animation
- Ball Display
- Program Listing PGM16_2A.ASM
- Program Listing PGM16_2B.ASM
- Ball Bounce
- Program Listing PGM16_2C.ASM
- Program Listing PGM16_2.ASM
- 16.6 An Interactive Video Game
- 16.6.1 Adding Sound
- Program Listing PGM16_3A.ASM
- Program Listing PGM16_3B.ASM
- 16.6.2 Adding a Paddle
- Program Listing PGM16_3C.ASM
- Program Listing PGM16_3D.ASM
- Program Listing PGM16_3.ASM
- Summary
- Glossary
- Exercises
- Programming Exercises
- Recursion
- Overview
- 17.1 The Idea of Recursion
- 17.3 Passing Parameters on the Stack
- Program Listing PGM17_1.ASM
- Program Listing PGM17_2.ASM
- Program Listing PGM17_3.ASM
- 17.6 More Complex Recursion
- Solution:
- Program Listing PGM17_4.ASM
- Summary
- Glossary
- Exercises
- Programming Exercises
- 18
- Advanced Arithmetic
- Overview
- 18.1 Double-Precision Numbers
- 18.1.1 Double-Precision Addition, Subtraction, and Negation
- Program Listing PGM18_1.ASM
- 18.1.2
- 18.2
- 18.2.1
- Packed and
- Unpacked BCD
- 18.2.2
- BCD Addition and the AAA Instruction
- The AAA Instruction
- BCD Subtraction and the AAS Instruction
- The AAS Instruction
- 18.2.4
- BCD Multiplication and the AAM Instruction
- The AAM Instruction
- 18.2.5
- BCD Division and the AAD Instruction
- The AAD Instruction
- 18.3
- Floating-Point Numbers
- 18.3.1
- 18.3.2
- Floating-Point Representation
- 18.3.3
- Floating-Point Operations
- 18.4 The 8087 Numeric Processor
- 18.4.1 Data Types
- 18.4.2 8087 Registers
- 18.4.3 Instructions
- Load and Store
- Add, Subtract, Multiply, and Divide
- 18.4.4
- Algorithm for Converting ASCII Digits to Packed BCD
- Program Listing PGM18_2.ASM
- Algorithm for Printing Packed BCD Numbers
- Program Listing PGM18_3.ASM
- Program Listing PGM18_4.ASM
- 18.4.5
- Real-Number I/O
- Algorithm for Reading Real Numbers
- Program Listing PGM18_5.ASM
- Algorithm for Printing Real Numbers with a Four-Digit Fraction
- Program Listing PGM18_6.ASM
- Summary
- Glossary
- New Instructions
- New Pseudo-Ops
- Exercises
- Programming Exercises
- Disk and File Operations
- Floppy Disks
- Hard Disks
- 19.2 Disk Structure
- 19.2.1 Disk Capacity
- The File Directory
- Clusters
- The FAT
- How DOS Reads a File
- Table 19.2 The First Byte of the FAT for Some Disks
- How DOS Stores a File
- 19.3 File Processing
- 19.3.1 File Handle
- 19.3.2 File Errors
- 19.3.3 Opening and Closing a File
- Solution:
- 19.3.4
- 19.3.5
- Writing a File
- 19.3.6
- Algorithm for displaying a file
- Program Listing PGM19_1.ASM
- 19.3.7 The File Pointer
- INT 21H, Function 42h:
- Application: Appending Records to a File
- Algorithm for Main Program
- Algorithm for GET_NAME Procedure
- Program Listing PGM19_2.ASM
- Solution:
- Logical Sector Numbers
- Reading a Sector
- Program Listing PGM19_3.ASM
- C>DEBUG PGM19_3.EXE
- Examining a File Allocation Table
- Summary
- Glossary
- Exercises
- Programming Exercises
- Intel's Advanced Microprocessors
- Overview
- 20.1 The 80286 Microprocessor
- 20.1.1
- Extended Instruction Set
- PUSH and POP
- Multiply
- Shifts and Rotates
- String I/O
- High-Level Instructions
- 20.1.2 Real Address Mode
- Address Generation
- Programs Running Under DOS
- Virtual Addresses
- Tasks
- 20.1.4 Extended Memory
- 20.2 Protected-Mode Systems
- Multitasking
- 20.2.1
- Windows and OS/2
- Windows 3
- OS/2
- Threads and Processes
- 20.2.2
- Programming
- "Hello" Program
- Echo Program
- Program Listing PGM20_3.ASM
- Page-Oriented Virtual Memory
- Virtual 8086 Mode
- 20.3.3 Programming the 80386
- Sixteen-Bit Programming
- Thirty-Two-Bit Programming
- Mixing 16- and 32-Bit Instructions
- Program Listing PGM20_4.ASM
- Summary
- Glossary
- Part Three
- Appendices
- IBM Display Codes
- DOS Commands
- BACKUP
- COPY
- DATE
- DIR (Directory)
- ERASE (or DEL)
- FORMAT
- RENAME (or REN)
- RESTORE
- TIME
- B.1 Tree-Structured Directories
- MKDIR (or MD)
- RMDIR (or RD)
- BIOS and DOS Interrupts
- Interrupt 10: Video
- Table C.1 Interrupts 0 to 0Fh
- Interrupt Type Usage
- Function 2h:
- Move Cursor
- Function 3h:
- Get Cursor Position and Size
- Function 5h:
- Select Active Display Page
- Function 6h:
- Function 7h:
- Scroll Window Down
- Function 8h:
- Read Character and Attribute at Cursor
- Function 9h:
- Write Character and Attribute at Cursor
- Function 0Ah:
- Write Character at Cursor
- Function 0Bh:
- Function 10h, Subfunction 15h: Get Color Register
- Function 10h, Subfunction 17h: Get Block of Color Registers
- Interrupt 11h: Get Equipment Configuration
- Interrupt 12h: Get Conventional Memory Size
- Interrupt 13h: Disk I/O
- Interrupt 16h: Keyboard
- Interrupt 17h: Printer
- C.3 DOS Interrupts
- Interrupt 21h
- Function 0h:
- Program Terminate
- Function 1h:
- Function 2h:
- Function 5h:
- Function 09h:
- Function 2Ah:
- Function 2Bh:
- Function 2Ch:
- Get Time
- Function 2Dh:
- Function 30h:
- Function 31h:
- Function 33h:
- Function 35h:
- Function 36h:
- Get Disk Free Space
- Function 39h:
- Create Subdirectory (MKDIR)
- Function 3Ah:
- Remove Subdirectory (RMDIR)
- Function 3Bh:
- Change the Current Directory(CHDIR)
- Function 3Ch:
- Function 3Dh:
- Function 48h:
- Allocate Memory
- Function 49h:
- Free Allocated Memory
- Function 4Ch:
- Terminate a Process (EXIT)
- Interrupt 25h: Absolute Disk Read
- Interrupt 26h: Absolute Disk Write
- Interrupt 27h: Terminate but Stay Resident
- MASM and LINK Options
- D.1 MASM
- MASM Command Line
- C>MASM A:FIRST
- Options
- Table D.1 Some MASM Options
- Option
- Action
- A MASM Demonstration
- Program Listing PGMD_1.ASM
- Two-Pass Assembly and the SYMBOL TABLE
- The Cross-Reference File
- LINK Command Line
- A LINK Demonstration
- DEBUG and CODEVIEW
- E.1 Introduction
- E.2 DEBUG
- A Debug Demonstration
- Program Listing PGM4_2.ASM:
- Flag
- Table E.1 DEBUG Commands
- Command
- Action:
- E.3 CODEVIEW
- Program Preparation
- Entering CODEVIEW
- Window Mode
- Controlling the Display
- Controlling Program Execution
- Selecting from the Menus
- Table E.2 Display Commands
- Table E.3 Function Key Commands
- The RUN Menu
- Watch Commands
- Table E.4 RUN Menu Selections
- Watching Memory
- Watching the Stack
- Watching Expressions
- Register Indirection
- Removing Lines from the WATCH WINDOW
- Tracepoints
- and CODEVIEW would display
- Watchpoints
- DEBUG Commands in CODEVIEW
- Assembly Instruction Set
- Typical 8086 Instruction Format
- F.2 8086 Instructions
- AAA: ASCII Adjust for Addition
- AAD: ASCII Adjust for Division
- AAM: ASCII Adjust for Multiplication
- AAS: ASCII Adjust for Subtraction
- ADC: Add with Carry
- ADD: Addition
- AND: Logical AND
- CALL: Procedure Call
- CBW: Convert Byte to Word
- CLC: Clear Carry Flag
- CLI: Clear Interrupt Flag
- CMC: Complement Carry Flag
- CMP: Compare
- CMPS/CMPSB/CMPSW: Compare Byte or Word String
- CWD: Convert Word to Double Word
- DAA: Decimal Adjust for Addition
- DAS: Decimal Adjust for Subtraction
- DEC: Decrement
- DIV: Divide
- ESC: Escape
- HLT: Halt
- IMUL: Integer Multiply
- IN: Input Byte or Word
- INC: Increment
- INT: Interrupt
- INTO: Interrupt if Overflow
- IRET: Interrupt Return
- J(condition): Jump Short, If Condition Is Met
- JMP: Jump
- LAHF: Load AH from Flags
- LDS: Load Data Segment Register
- LEA: Load Effective Address
- LES: Load Extra Segment Register
- LOCK: Lock Bus
- LOOP
- LOOPE/LOOPZ: Loop If Equal/Loop If Zero
- LOOPNE/LOOPNZ: Loop If Not Equal/Loop If Not Zero
- MOV: Move
- MOVS/MOVSB/MOVSW: Move Byte or Word String
- MUL: Multiply
- NEG: Negate
- NOP: No Operation
- NOT: Logical Not
- OR: Logical Inclusive Or
- OUT: Output Byte or Word
- POP: Pop Word Off Stack to Destination
- POPF: Pop Flags Off Stack
- PUSH: Push Word onto Stack
- PUSHF: Push Flags onto Stack
- RCL: Rotate Left Through Carry
- RCR: Rotate Right Through Carry
- REP/REPZ/REPE/REPNE/REPNZ: Repeat String Operation
- RET: Return from Procedure
- ROL: Rotate Left
- ROR: Rotate Right
- SAHF: Store AH in FLAGS Register
- SAL/SHL: Shift Arithmetic Left/Shift Logical Left
- SAR: Shift Arithmetic Right
- SBB: Subtract with Borrow
- SCAS/SCASB/SCASW: Scan Byte or Word String
- SHR: Shift Logical Right
- STC: Set Carry Flag
- STD: Set Direction Flag
- STI: Set Interrupt Flag
- STOS/STOSB/STOSW: Store Byte or Word String
- SUB: Subtract
- TEST: Test (Logical Compare)
- WAIT
- XCHG: Exchange
- XLAT: Translate
- XOR: Exclusive OR
- F.3 8087 Instructions
- FADD: Add Real
- FBLD: Packed Decimal Load
- FBSTP: Packed BCD Store and Pop
- FDIV: Divide Real
- FIADD: Integer Add
- FIDIV: Integer Divide
- FILD: Integer Load
- FIMUL: Integer Multiply
- FIST: Integer Store
- FISTP: Integer Store and Pop
- FISUB: Integer Subtract
- FLD: Load Real
- FMUL: Multiply Real
- FST: Store Real
- FSTP: Store Real and Pop
- FSUB: Subtract Real
- F.4 80286 Instructions
- IMUL: Integer Immediate Multiply
- INS/INSB/INSW: Input from Port to String
- OUTS/OUTSB/OUTSW: Output String to Port
- POPA: Pop All General Registers
- PUSH: Push Immediate
- PUSHA: Push All General Registers
- F.5 80386 Instructions
- Bit Scan Instructions
- Bit Test Instructions
- Move with Extension Instructions
- Set Byte on Condition Instructions
- Double-Precision Shift Instructions
- Assembler Directives
- .ALPHA
- ASSUME
- .CODE
- COMMENT
- Examples:
- .CONST
- .CRF and .XCRF
- Example:
- .DATA and .DATA?
- Data-Defining Directives
- Directive Meaning
- DOSSEG
- ELSE
- EQU
- GROUP
- LABEL
- Example:
- .LIST and .XLIST
- LOCAL
- MACRO and ENDM
- .MODEL
- Model
- ORG
- PAGE
- PROC and ENDP
- Processor and Coprocessor Directives
- Directive Enables assembly of instructions for processors and coprocessors
- PUBLIC
- PURGE
- .SEQ
- STRUC and ENDS
- SUBTTL
- TITLE
- Keyboard Scan Codes
- Index
- A
- B
- C
- D
- E
- F
- G
- H
- I
- K
- L
- M
- N
- O
- Q
- R
- S
- T
- U
- V
- W
- X
Ytha Yu, Charles Marut
Ytha Yu Department of Mathematics and Computer Science California State University, Hayward, California
Charles Marut Department of Mathematics and Computer Science California State University, Hayward, California
Exclusive rights by McGraw-Hill Book Co-Singapore for manufacture and export. This book cannot be re-exported from the country to which it is consigned by McGraw-Hill.
Copyright © 1992 by McGraw-Hill, Inc. All rights reserved. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher.
567890 KKP 98765
ISBN 0-07-072692-2
Sponsoring editor: Stephen Mitchell
Editorial assistant: Denise Nickeson
Director of Production: Jane Somers
Production assistant: Richard De Vitto
Project management: BMR
Library of Congress Catalog Card Number 91-66269
IBM is a registered trademark of International Business Machines, Inc. Intel is a registered trademark of Intel Corporation. Microsoft is a registered trademark of Microsoft Corporation.
When ordering this title, use ISBN 0-07-112896-4
Printed in Singapore
In memory of my father, Ping Chau
To my mother, Monica
To my wife, Joanne and our children
Alan, Yolanda, and Brian
For my parents, George and Ruth Marut
For Beth
This book is the outgrowth of our experience in teaching assembly language at California State University, Hayward. Our goal is to write a textbook that is easy to read, yet covers the topics fully. We present the material in a logical order and explore the organization of the IBM PC with practical and interesting examples.
Assembly language is really just a symbolic form of machine language, the language of the computer. Because of this, assembly language instructions deal with computer hardware in a very intimate way. As you learn to program in assembly language, you also learn about computer organization. Also because of their close connection with the hardware, assembly language programs can run faster and take up less space in memory than high-level language programs, a vital consideration when writing computer game programs, for instance.
While this book is intended to be used in an assembly language programming class taught in a university or community college, it is written in a tutorial style and can be read by anyone who wants to learn about the IBM PC and how to get the most out of it. Instructors will find the topics covered in a pedagogical fashion with numerous examples and exercises.
It is not necessary to have prior knowledge of computer hardware or programming to read this book, although it helps if you have written programs in some high-level language like Basic, Fortran, or Pascal.
To do the programming assignments and demonstrations, you need to own or have access to the following:
- An IBM PC or compatible.
- The MS-DOS or PC-DOS operating system.
- Access to assembler and linker software, such as Microsoft's MASM and LINK, or Borland's TASM and TLINK.
- An editor or word processing program.
The world of IBM PCs and compatibles consists of many different computer models with different processors and structures. Similarly, there are different versions of assemblers and debuggers. We have taken the following approach to balance our presentation:
- Emphasis is on the architecture and instruction set for the 8086/8088 processors, with a separate chapter on the advanced processors. The reason is that the methods learned in programming the 8086/8088 are common to all the Intel 8086 family because the instruction set for the advanced processors is largely just an extension of the 8086/8088 instruction set. Programs written for the 8086/8088 will execute without modification on the advanced processors.
- Simplified segment definitions, introduced with MASM 5.0, are used whenever possible.
- The DOS environment is used, because it is still the most popular operating system on PCs.
- DEBUG is used for debugging demonstrations because it is part of DOS and its general features are common to all assembly debuggers. Microsoft's CODEVIEW is covered in Appendix E.
All the materials have been classroom tested. Some of the features that we believe make this book special are:
You are naturally eager to start writing programs as soon as possible. However, because assembly language instructions refer to the hardware, you first need to know the essentials of the machine architecture and the basics of the binary and hexadecimal number systems. The first program appears in Chapter 1, and by the end of Chapter 4 you will have the necessary tools to write simple but interesting programs.
Input and output in assembly language are difficult because the instruction set is so basic. Our approach is to program input and output by using DOS function calls. This enables us to present completely functioning programs early in the book.
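As an illustration of this approach, here is a minimal sketch of a complete program that displays one character through a DOS function call. It assumes MASM-style simplified segment definitions. The program name and the choice of character are ours, for illustration only.

```asm
TITLE SKETCH: display one character with a DOS function call
.MODEL SMALL
.STACK 100h
.CODE
MAIN    PROC
        MOV  AH,2        ; INT 21h function 2: display the character in DL
        MOV  DL,'?'      ; character to display (arbitrary choice)
        INT  21h         ; call DOS
        MOV  AH,4Ch      ; INT 21h function 4Ch: terminate the program
        INT  21h         ; return control to DOS
MAIN    ENDP
        END  MAIN
```

The function number goes in AH before each `INT 21h`, so one instruction reaches many DOS services. The program itself never touches the display hardware.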
The advantages of structured programming in high-level languages carry over to assembly language. In Chapter 6, we show how the standard high-level branching and looping structures can be implemented in assembly language; subsequent programs are developed from a high-level pseudocode in a top-down manner.
To have a clear understanding of the ideas of assembly language programming, it's important to have a firm grasp of the terminology. To facilitate this, new terms appear in boldface the first time they are used, and are included in a glossary at the end of the chapter.
One of the fun things that can easily be done in assembly language is manipulating the keyboard and screen. Two chapters are devoted to this topic; the high point is the development of a video game similar to Pong. Another interesting application is the development of a memory resident program that displays and updates the time.
The operations and instructions of the numeric processor are given detailed treatment.
The structure and operations of the advanced processors are covered in a separate chapter. Because DOS is still the dominant operating system for the PC, most examples are DOS applications.
Note to Instructors
The book is divided into two parts. Part One covers the topics that are basic to all applications of assembly language; Part Two is a collection of advanced topics. The following table shows how chapters in Part Two depend on material from earlier chapters:
| Chapter | Uses material from chapters |
|---|---|
| 12 | 1-10 |
| 13 | 1-11, 12 (some exercises) |
| 14 | 1-10 |
| 15 | 1-12, 14 |
| 16 | 1-15 |
| 17 | 1-10 |
| 18 | 1-10, 13 |
| 19 | 1-10 |
| 20 | 1-11, 13, 14 |
The chapters in Part One should be covered in sequence. If the students have strong backgrounds in computer science, Chapter 1 can be covered lightly or be assigned as independent reading. In a ten-week course that meets four hours a week, we are usually able to cover the first four chapters in two weeks, and make the first programming assignment at the end of the second week or the beginning of the third week. In ten weeks we are usually able to cover chapters 1-12, and then go on to choose topics from chapters 13-16 as time and interest allow.
Exercises
Every chapter ends with numerous exercises to reinforce the concepts and principles covered. The exercises are grouped into practice exercises and programming exercises.
Instructor's Manual
A comprehensive instructor's manual is available. It includes general comments, programming hints, and solutions to the practice exercises. It also includes a set of transparency masters for figures and program listings.
Student Data Disk
A student data disk containing the source code for the programs in the text is available with the accompanying instructor's manual.
We would like to thank our editor, Raleigh Wilson, and the staff at Mitchell McGraw-Hill, including Stephen Mitchell, Denise Nickeson, Jane Somers, and Richard de Vitto, for their support in this project. We would like to thank the staff at BMR, especially Matt Lusher, Jim Love, and Alex Leason, for their outstanding work in producing this book.
We would also like to thank our students for their patience, support, and criticism as the manuscript developed. Finally, our thanks go to the following reviewers, whose insights helped to make this a much better book:
David Hayes, San Jose State University, San Jose, California
Jim Ingram, Amarillo College, Amarillo, Texas
Linda Kieffer, Cheney, Washington
Paul LeCoq, Spokane Falls Community College, Washington
Thom Luce, Ohio University, Ohio
Eric Lundstrom, Diablo Valley College, Pleasant Hill, California
Mike Michaelson, Palomar College, San Marcos, California
Don Myers, Vincennes University, Vincennes, Indiana
Loren Radford, Baptist College, Charleston, South Carolina
Francis Rice, Oklahoma State University, Oklahoma
David Rosenlof, Sacramento City College, California
Paul W. Ross, Millersville State University, Pennsylvania
R.G. Shurtleff, Colorado Technical College, Colorado
Mel Stone, St. Petersburg Jr. College, Clearwater Campus, Florida
James VanSpuyvroeck, St. Ambrose College, Davenport, Iowa
Richard Weisgerber, Mankato State University, Mankato, Minnesota
We would appreciate any comments that you, the reader, may offer. Correspondence should be addressed to Ytha Yu or Charles Marut, Department of Mathematics and Computer Science, California State University, Hayward, Hayward, California 94542. Internet electronic mail should be addressed to [email protected] or [email protected].
Part One
This chapter provides an introduction to the architecture of microcomputers in general and to the IBM PC in particular. You will learn about the main hardware components: the central processor, memory, and the peripherals, and their relation to the software, or programs. We'll see exactly what the computer does when it executes an instruction, and discuss the main advantages (and disadvantages) of assembly language programming. If you are an experienced microcomputer user, you are already familiar with most of the ideas discussed here; if you are a novice, this chapter introduces many of the important terms used in the rest of the book.
The Components of a Microcomputer System
Figure 1.1 shows a typical microcomputer system, consisting of a system unit, a keyboard, a display screen, and disk drives. The system unit is often referred to as "the computer," because it houses the circuit boards of the computer. The keyboard, display screen, and disk drives are called I/O devices because they perform input/output operations for the computer. They are also called peripheral devices or peripherals.
Integrated-circuit (IC) chips are used in the construction of computer circuits. Each IC chip may contain hundreds or even thousands of transistors. These IC circuits are known as digital circuits because they operate on discrete voltage signal levels, typically a high voltage and a low voltage. We use the symbols 0 and 1 to represent the low-voltage and high-voltage signals, respectively. These symbols are called binary digits, or bits. All information processed by the computer is represented by strings of 0's and 1's; that is, by bit strings.
Figure 1.1 A Microcomputer System
Functionally, the computer circuits consist of three parts: the central processing unit (CPU), the memory circuits, and the I/O circuits. In a microcomputer, the CPU is a single-chip processor called a microprocessor. The CPU is the brain of the computer, and it controls all operations. It uses the memory circuits to store information, and the I/O circuits to communicate with I/O devices.
Inside the system unit is a main circuit board called the system board, which contains the microprocessor and memory circuits. The system board is also called a motherboard because it contains expansion slots, which are connectors for additional circuit boards called add-in boards or add-in cards. I/O circuits are usually located on add-in cards. Figure 1.2 shows a picture of a motherboard.
Memory
Information processed by the computer is stored in its memory. A memory circuit element can store one bit of data. However, the memory circuits are usually organized into groups that can store eight bits of data, and a string of eight bits is called a byte. Each memory byte circuit (or memory byte, for short) is identified by a number that is called its address, like the street address of a house. The first memory byte has address 0. The data stored in a memory byte are called its contents. When the contents of a memory byte are treated as a single number, we often use the term value to denote them.
It is important to understand the difference between address and contents. The address of a memory byte is fixed and is different from the address of any other memory byte in the computer. Yet the contents of a memory byte are not unique and are subject to change, because they denote the data currently being stored. Figure 1.3 shows the organization of memory bytes; the contents are arbitrary.
Figure 1.3 Memory Represented as Bytes
Another distinction between address and contents is that while the contents of a memory byte are always eight bits, the number of bits in an address depends on the processor. For example, the Intel 8086 microprocessor assigns a 20-bit address, and the Intel 80286 microprocessor uses a 24-bit address. The number of bits used in the address determines the number of bytes that can be accessed by the processor.
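Example 1.1 Suppose a processor uses 20 bits for an address. How many memory bytes can be accessed?
Solution: A bit can have two possible values, so in a 20-bit address there can be $2^{20} = 1{,}048{,}576$ different values, each of which is the potential address of a memory byte. A 20-bit address can therefore access 1,048,576 bytes, or 1 megabyte (MB).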
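The counting argument of Example 1.1 applies to any address width. The short Python sketch below is only an illustration of that arithmetic (Python is not used elsewhere in this book, and the function name `addressable_bytes` is our own invention); it evaluates the number of addressable bytes for the 20-bit and 24-bit addresses mentioned above.

```python
def addressable_bytes(address_bits):
    """Return how many distinct byte addresses fit in an address of this width."""
    # Every additional address bit doubles the number of possible addresses.
    return 2 ** address_bits


# Address widths taken from the text: 20 bits (8086) and 24 bits (80286).
for name, bits in [("8086", 20), ("80286", 24)]:
    count = addressable_bytes(bits)
    # 2**20 bytes is one megabyte, so integer division gives the size in MB.
    print(f"{name}: {bits}-bit address -> {count:,} bytes ({count // 2**20} MB)")
```

Under these assumptions, four extra address bits multiply the addressable memory by sixteen.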
In a typical microcomputer, two bytes form a word. To accommodate word data, the IBM PC allows any pair of successive memory bytes to be treated as a single unit, called a memory word. The lower address of the two memory bytes is used as the address of the memory word. Thus the memory word with the address 2 is made up of the memory bytes with the addresses 2 and 3. The microprocessor can always tell, by other information contained in each instruction, whether an address refers to a byte or a word.
In this book, we use the term memory location to denote either a memory byte or a memory word.
Figure 1.4 shows the bit positions in a microcomputer word and a byte. The positions are numbered from right to left, starting with 0. In a word, the bits 0 to 7 form the low byte and the bits 8 to 15 form the high byte. For a word stored in memory, its low byte comes from the memory byte with the lower address and its high byte is from the memory byte with the higher address.
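To make the low-byte and high-byte rule concrete, here is a minimal Python sketch. The memory contents are arbitrary sample values, and the helper names `read_word` and `bit` are hypothetical; the sketch only models the convention described above, in which the byte at the lower address supplies bits 0 to 7 of the word.

```python
# Arbitrary sample contents of memory bytes 0..3 (not taken from the text).
memory = [0x34, 0x12, 0xCD, 0xAB]


def read_word(mem, address):
    """Combine the bytes at address and address + 1 into one 16-bit word."""
    low = mem[address]        # lower address -> low byte (bits 0 to 7)
    high = mem[address + 1]   # higher address -> high byte (bits 8 to 15)
    return (high << 8) | low


def bit(value, position):
    """Return the bit at the given position, counting from 0 at the right."""
    return (value >> position) & 1


print(f"word at address 0 = {read_word(memory, 0):04X}")
print(f"word at address 2 = {read_word(memory, 2):04X}")
print(f"bit 12 of word 0  = {bit(read_word(memory, 0), 12)}")
```

Because any pair of successive bytes may be treated as a word, `read_word(memory, 1)` is also valid; it combines bytes 1 and 2.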
The processor can perform two operations on memory: read (fetch) the contents of a location and write (store) data at a location. In a read operation, the processor only gets a copy of the data; the original contents of the location are unchanged. In a write operation, the data written become the new contents of the location; the original contents are thus lost.
Figure 1.4 Bit Positions in a Byte and a Word
Figure 1.5 Bus Connections of a Microcomputer
There are two kinds of memory circuits: random access memory (RAM) and read-only memory (ROM). The difference is that RAM locations can be read and written, while, as the name implies, ROM locations can only be read. This is because the contents of ROM, once initialized, cannot be changed.
Program instructions and data are normally loaded into RAM. However, the contents of RAM are lost when the machine is turned off, so anything valuable in RAM must be saved on a disk or printed out beforehand. ROM circuits retain their values even when the power is off. Consequently, ROM is used by computer manufacturers to store system programs. These ROM-based programs are known as firmware. They are responsible for loading start-up programs from disk as well as for self-testing the computer when it is turned on.
A processor communicates with memory and I/O circuits by using signals that travel along a set of wires or connections called buses that connect the different components. There are three kinds of signals: address, data, and control. And there are three buses: address bus, data bus, and control bus. For example, to read the contents of a memory location, the CPU places the address of the memory location on the address bus, and it receives the data, sent by the memory circuits, on the data bus. A control signal is required to inform the memory to perform a read operation. The CPU sends the control signal on the control bus. Figure 1.5 is a diagram of the bus connections for a microcomputer.
The CPU
As stated, the CPU is the brain of the computer. It controls the computer by executing programs stored in memory. A program might be a system program or an application program written by a user. In any case, each instruction that the CPU executes is a bit string (for the Intel 8086, instructions are from one to six bytes long). This language of 0's and 1's is called machine language.
Figure 1.6 Intel 8086 Microprocessor Organization
The instructions performed by a CPU are called its instruction set, and the instruction set for each CPU is unique. To keep the cost of computers down, machine language instructions are designed to be simple; for example, adding two numbers or moving a number from one location to another. The amazing thing about computers is that the incredibly complex tasks they perform are, in the end, just a sequence of very basic operations.
In the following, we will use the Intel 8086 microprocessor as an example of a CPU. Figure 1.6 shows its organization. There are two main components: the execution unit and the bus interface unit.
As the name implies, the purpose of the execution unit (EU) is to execute instructions. It contains a circuit called the arithmetic and logic unit (ALU). The ALU can perform arithmetic and logic operations.
The bus interface unit (BIU) facilitates communication between the EU and the memory or I/O circuits. It is responsible for transmitting addresses, data, and control signals on the buses. Its registers are named CS, DS, ES, SS, and IP; they hold addresses of memory locations. The IP (instruction pointer) contains the address of the next instruction to be executed by the EU.
The EU and the BIU are connected by an internal bus, and they work together. While the EU is executing an instruction, the BIU fetches up to six bytes of the next instruction and places them in the instruction queue. This operation is called instruction prefetch. The purpose is to speed up the processor. If the EU needs to communicate with memory or the peripherals, the BIU suspends instruction prefetch and performs the needed operations.
1.1.3
I/O Ports
I/O devices are connected to the computer through I/O circuits. Each of these circuits contains several registers called I/O ports. Some are used for data while others are used for control commands. Like memory locations, the I/O ports have addresses and are connected to the bus system. However, these addresses are known as I/O addresses and can only be used in input or output instructions. This allows the CPU to distinguish between an I/O port and a memory location.
I/O ports function as transfer points between the CPU and I/O devices. Data to be input from an I/O device are sent to a port where they can be read by the CPU. On output, the CPU writes data to an I/O port. The I/O circuit then transmits the data to the I/O device.
The data transfer between an I/O port and an I/O device can be 1 bit at a time (serial), or 8 or 16 bits at a time (parallel). A parallel port requires more wiring connections, while a serial port tends to be slower. Slow devices, like the keyboard, always connect to a serial port, and fast devices, like the disk drive, always connect to a parallel port. But some devices, like the printer, can connect to either a serial or a parallel port.
1.2 Instruction Execution
To understand how the CPU operates, let's look at how an instruction is executed. First of all, a machine instruction has two parts: an opcode and operands. The opcode specifies the type of operation, and the operands are often given as the memory addresses of the data to be operated on. The CPU goes through the following steps to execute a machine instruction (the fetch-execute cycle):
Fetch
1. Fetch an instruction from memory.
2. Decode the instruction to determine the operation.
Execute
3. Fetch data from memory if necessary.
4. Perform the operation on the data.
5. Store the result in memory if needed.
To see what this entails, let's trace through the execution of a typical machine language instruction for the 8086. Suppose we look at the instruction that adds the contents of register AX to the contents of the memory word at address 0. The CPU actually adds the two numbers in the ALU and then stores the result back to memory word 0. The machine code is
00000001 00000110 00000000 00000000
Before execution, we assume that the first byte of the instruction is stored at the location indicated by the IP.
1. Fetch the instruction. To start the cycle, the BIU places a memory read request on the control bus and the address of the instruction on the address bus. Memory responds by sending the contents of the location specified (namely, the instruction code just given) over the data bus. Because the instruction code is four bytes and the 8086 can only read a word at a time, this involves two read operations. The CPU accepts the data and adds four to the IP so that the IP will contain the address of the next instruction.
2. Decode the instruction. On receiving the instruction, a decoder circuit in the EU decodes the instruction and determines that it is an ADD operation involving the word at address 0.
3. Fetch data from memory. The EU informs the BIU to get the contents of memory word 0. The BIU sends address 0 over the address bus and a memory read request is again sent over the control bus. The contents of memory word 0 are sent back over the data bus to the EU and are placed in a holding register.
4. Perform the operation. The contents of the holding register and the AX register are sent to the ALU circuit, which performs the required addition and holds the sum.
5. Store the result. The EU directs the BIU to store the sum at address 0. To do so, the BIU sends out a memory write request over the control bus, the address 0 over the address bus, and the sum to be stored over the data bus. The previous contents of memory word 0 are overwritten by the sum.
The cycle is now repeated for the instruction whose address is contained in the IP.
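The five steps can be imitated in a few lines of Python. This is a deliberately simplified sketch, not a description of the real 8086 circuitry: it ignores segment registers, the instruction queue, and bus timing, and it recognizes only the single ADD instruction traced above. The starting address of the instruction (4), the initial values of AX and memory word 0, and all of the names are assumptions made for the illustration.

```python
# A tiny 16-byte memory; every value here is an assumption for the illustration.
memory = bytearray(16)
memory[0:2] = (10).to_bytes(2, "little")   # memory word 0 initially holds 10
PROGRAM_START = 4                          # hypothetical address of the instruction
# Machine code for "add AX to the memory word at address 0".
memory[4:8] = bytes([0b00000001, 0b00000110, 0b00000000, 0b00000000])

registers = {"AX": 5, "IP": PROGRAM_START}


def read_word(address):
    """Read a 16-bit word; the low byte is at the lower address."""
    return memory[address] | (memory[address + 1] << 8)


def write_word(address, value):
    """Write a 16-bit word, low byte first."""
    memory[address] = value & 0xFF
    memory[address + 1] = (value >> 8) & 0xFF


def step():
    """Carry out one fetch-execute cycle for the instruction at IP."""
    ip = registers["IP"]
    # 1. Fetch the four instruction bytes and advance IP past them.
    instruction = memory[ip:ip + 4]
    registers["IP"] = ip + 4
    # 2. Decode: this sketch recognizes only the ADD instruction above.
    if (instruction[0], instruction[1]) != (0x01, 0x06):
        raise ValueError("instruction not modelled in this sketch")
    address = instruction[2] | (instruction[3] << 8)
    # 3. Fetch the data word into a holding register.
    holding = read_word(address)
    # 4. Perform the addition, keeping the sum to 16 bits.
    total = (holding + registers["AX"]) & 0xFFFF
    # 5. Store the sum back, overwriting the previous contents.
    write_word(address, total)


step()
print("memory word 0 =", read_word(0), " IP =", registers["IP"])
```

After one call to `step()`, memory word 0 holds the sum of its old contents and AX, and IP has advanced by four to the next instruction.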
The preceding example shows that even though machine instructions are very simple, their execution is actually quite complex. To ensure that the steps are carried out in an orderly fashion, a clock circuit controls the processor by generating a train of clock pulses as shown in Figure 1.7. The time interval between two pulses is known as a clock period, and the number of pulses per second is called the clock rate or clock speed, measured in megahertz (MHz). One megahertz is 1 million cycles (pulses) per second. The original IBM PC had a clock rate of 4.77 MHz, but the latest PS/2 model has a clock rate of 33 MHz.
The computer circuits are activated by the clock pulses; that is, the circuits perform an operation only when a clock pulse is present. Each step in the instruction fetch and execution cycle requires one or more clock periods. For example, the 8086 takes four clock periods to do a memory read and a multiplication operation may take more than seventy clock periods. If we speed up the clock circuit, a processor can be made to operate faster. However, each processor has a rated maximum clock speed beyond which it may not function properly.
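The relation between clock rate and clock period is simple arithmetic, sketched below in Python. The function names are our own; the 4.77 MHz and 33 MHz rates and the four-period memory read come from the text, while the resulting times are just the consequence of dividing one second by the number of pulses.

```python
def clock_period_ns(clock_rate_mhz):
    """Return the clock period in nanoseconds for a clock rate in MHz."""
    # 1 MHz is 1,000,000 pulses per second, so one pulse lasts 1000/MHz ns.
    return 1000.0 / clock_rate_mhz


def operation_time_ns(clock_periods, clock_rate_mhz):
    """Return the time taken by an operation that needs this many clock periods."""
    return clock_periods * clock_period_ns(clock_rate_mhz)


# Clock rates mentioned in the text.
for rate in (4.77, 33.0):
    print(f"{rate} MHz -> clock period of {clock_period_ns(rate):.1f} ns")

# An 8086 memory read takes four clock periods (figure from the text).
print(f"memory read at 4.77 MHz: {operation_time_ns(4, 4.77):.1f} ns")
```

The same operation count finishes sooner at a higher clock rate, which is why speeding up the clock speeds up the processor, within its rated limit.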
I/O devices are needed to get information into and out of the computer. The primary I/O devices are magnetic disks, the keyboard, the display monitor, and the printer.
We've seen that the contents of RAM are lost when the computer is turned off, so magnetic disks are used for permanent storage of programs and data. There are two kinds of disks: floppy disks (also called diskettes) and hard disks. The device that reads and writes data on a disk is called a disk drive.
Floppy disks come in 5¼-inch or 3½-inch diameter sizes. They are lightweight and portable; it is easy to put a diskette away for safekeeping or use it on different computers. The amount of data a floppy disk can hold depends on the type; it ranges from 360 kilobytes to 1.44 megabytes. A kilobyte (KB) is $2^{10}$ = 1,024 bytes.
A hard disk and its disk drive are enclosed in a hermetically sealed container that is not removable from the computer; thus, it is also called a fixed disk. It can hold a lot more data than a floppy disk, typically 20, 40, to over 100 megabytes. A program can also access information on a hard disk much faster than on a floppy disk.
Disk operations are covered in Chapter 19.
The keyboard allows the user to enter information into the computer. It has the keys usually found on a typewriter, plus a number of control and function keys. It has its own microprocessor that sends a coded signal to the computer whenever a key is pressed or released.
When a key is pressed, the corresponding key character normally appears on the screen. But interestingly enough, there is no direct connection between the keyboard and the screen. The data from the keyboard are received by the current running program. The program must send the data to the screen before a character is displayed. In Chapter 12 you will learn how to control the keyboard.
The display monitor is the standard output device of the computer. The information displayed on the screen is generated by a circuit in the computer called a video adapter. Most adapters can generate both text characters and graphics images. Some monitors are capable of displaying in color.
We discuss text mode operations in Chapter 12, and cover graphics mode in Chapter 16.
Although monitors give fast visual feedback, the information is not permanent. Printers, however, are slow but provide more permanent output. Printer outputs are known as hardcopies.
The three common kinds of printers are daisy wheel, dot matrix, and laser printers. The output of a daisy wheel printer is similar to that of a typewriter. A dot matrix printer prints characters composed of dots; depending on the number of dots used per character, some dot matrix printers can generate near-letter-quality printing. The advantage of dot matrix printers is that they can print characters with different fonts as well as graphics.
The laser printer also prints characters composed of dots; however, the resolution is so high (300 dots per inch) that it has typewriter quality. The laser printer is expensive, but in the field of desktop publishing it is indispensable. It is also quiet compared to the other printers.
The operations of the computer's hardware are controlled by its software. When the computer is on, it is always in the process of executing instructions. To fully understand the computer's operations, we must also study its instructions.
A CPU can only execute machine language instructions. As we've seen, they are bit strings. The following is a short machine language program for the IBM PC:
| Machine instruction | Operation |
| 10100001 00000000 00000000 | Fetch the contents of memory word 0 and put it in register AX. |
| 00000101 00000100 00000000 | Add 4 to AX. |
| 10100011 00000000 00000000 | Store the contents of AX in memory word 0. |
As you can well imagine, writing programs in machine language is tedious and subject to error!
A more convenient language to use is assembly language. In assembly language, we use symbolic names to represent operations, registers, and memory locations. If location 0 is symbolized by A, the preceding program expressed in IBM PC assembly language would look like this:
        MOV AX, A       ; fetch the contents of location A and put it in register AX
        ADD AX, 4       ; add 4 to AX
        MOV A, AX       ; move the contents of AX into location A
A program written in assembly language must be converted to machine language before the CPU can execute it. A program called the assembler translates each assembly language statement into a single machine language instruction.
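To suggest what this translation involves, here is a toy "assembler" written in Python. It is only a sketch under strong assumptions: it handles just the three statement forms of the short program above, it takes location A to be address 0 as in the text, and the names `SYMBOLS`, `assemble`, and `as_bits` are hypothetical. A real assembler handles the full instruction set, directives, and much more.

```python
# Symbol table: the text lets A stand for memory location 0.
SYMBOLS = {"A": 0x0000}


def word_bytes(value):
    """Split a 16-bit value into its low byte and high byte, in that order."""
    return [value & 0xFF, (value >> 8) & 0xFF]


def assemble(statement):
    """Translate one of the three statement forms used above into machine bytes."""
    mnemonic, operands = statement.split(None, 1)
    dest, src = [part.strip() for part in operands.split(",")]
    if mnemonic == "MOV" and dest == "AX" and src in SYMBOLS:
        return [0xA1] + word_bytes(SYMBOLS[src])    # load AX from memory
    if mnemonic == "ADD" and dest == "AX" and src.isdigit():
        return [0x05] + word_bytes(int(src))        # add a constant to AX
    if mnemonic == "MOV" and dest in SYMBOLS and src == "AX":
        return [0xA3] + word_bytes(SYMBOLS[dest])   # store AX to memory
    raise ValueError(f"statement not handled by this sketch: {statement}")


def as_bits(code):
    """Format machine bytes as the bit strings shown in the text."""
    return " ".join(f"{byte:08b}" for byte in code)


for statement in ["MOV AX, A", "ADD AX, 4", "MOV A, AX"]:
    print(f"{statement:<10}  {as_bits(assemble(statement))}")
```

Each statement yields exactly one three-byte instruction, and the bit strings agree with the machine language program given earlier.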
Even though it's easier to write programs in assembly language than in machine language, it's still difficult because the instruction set is so primitive. That is why high-level languages such as FORTRAN, Pascal, C, and others were developed. Different high-level languages are designed for different applications, but they generally allow programmers to write programs that look more like natural language text than is possible in assembly language.
A program called a compiler is needed to translate a high-level language program into machine code. Translation is more involved than assembling because it entails the translation of complex mathematical expressions and natural language commands into simple machine operations. A high-level language statement typically translates into many machine language instructions.
There are many reasons why a programmer might choose to write a program in a high-level language rather than in assembly language.
First, because high-level languages are closer to natural languages, it's easier to convert a natural language algorithm to a high-level language program than to an assembly language program. For the same reason, it's easier to read and understand a high-level language program than an assembly language program.
Second, an assembly language program generally contains more statements than an equivalent high-level language program, so more time is needed to code the assembly language program.
Third, because each computer has its own unique assembly language, assembly language programs are limited to one machine, but a high-level language program can be executed on any machine that has a compiler for that language.
The main reason for writing assembly language programs is efficiency: because assembly language is so close to machine language, a well-written assembly language program produces a faster, shorter machine language program. Also, some operations, such as reading or writing to specific memory locations and I/O ports, can be done easily in assembly language but may be impossible at a higher level.
Actually, it is not always necessary for a programmer to choose between assembly language and high-level languages, because many high-level languages accept subprograms written in assembly language. This means that crucial parts of a program can be written in assembly language, with the rest written in a high-level language.
In addition to these considerations, there is another reason for learning assembly language. Only by studying assembly language is it possible to gain a feeling for the way the computer "thinks" and why certain things happen the way they do inside the computer. High-level languages tend to obscure the details of the compiled machine language program that the computer actually executes. Sometimes a slight change in a program produces a major increase in the run time of that program, or arithmetic overflow unexpectedly occurs. Such things can be understood on the assembly language level.
Even though here you will study assembly language specifically for the IBM PC, the techniques you will learn are typical of those used in any assembly language. Learning other assembly languages should be relatively easy after you have read this book.
To give an idea of what an assembly language program looks like, here is a simple example. The following program adds the contents of two memory locations, symbolized by A and B. The sum is stored in location SUM.
TITLE PGM1_1: SAMPLE PROGRAM
.MODEL SMALL
.STACK 100H
.DATA
A       DW  2
B       DW  5
SUM     DW  ?
.CODE
MAIN    PROC
;initialize DS
        MOV AX,@DATA
        MOV DS,AX
;add the numbers
        MOV AX,A        ;AX has A
        ADD AX,B        ;AX has A+B
        MOV SUM,AX      ;SUM = A+B
;exit to DOS
        MOV AX,4C00H
        INT 21H
MAIN    ENDP
        END MAIN
Assembly language programs consist of statements. A statement is either an instruction to be executed when the program is run, or a directive for the assembler. For example, .MODEL SMALL is an assembler directive that specifies the size of the program. MOV AX,A is an instruction. Anything that follows a semicolon is a comment, and is ignored by the assembler.
The preceding program consists of three parts, or segments: the stack segment, the data segment, and the code segment. They begin with the directives .STACK, .DATA, and .CODE, respectively.
The stack segment is used for temporary storage of addresses and data. If no stack segment is declared, an error message is generated, so there must be a stack segment even if the program doesn't utilize a stack.
Variables are declared in the data segment. Each variable is assigned space in memory and may be initialized. For example, A DW 2 sets aside a memory word for a variable called A and initializes it to 2 (DW stands for "Define Word"). Similarly, B DW 5 sets aside a word for variable B and initializes it to 5 (these initial values were chosen arbitrarily). SUM DW ? sets aside an uninitialized word for SUM.
A program's instructions are placed in the code segment. Instructions are usually organized into units called procedures. The preceding program has only one procedure, called MAIN, which begins with the line MAIN PROC and ends with the line MAIN ENDP.
The main procedure begins and ends with instructions that are needed to initialize the DS register and to return to the DOS operating system. Their purpose is explained in Chapter 4. The instructions for adding A and B and putting the answer in SUM are as follows:
        MOV AX,A        ;AX has A
        ADD AX,B        ;AX has A+B
        MOV SUM,AX      ;SUM = A+B
MOV AX,A copies the contents of word A into register AX. ADD AX,B adds the contents of B to it, so that AX now holds the total. MOV SUM,AX stores the answer in variable SUM.
Before this program could be run on the computer, it would have to be assembled into a machine language program. The steps are explained in Chapter 4. Because there were no output instructions, we could not see the answer on the screen, but we could trace the program's execution in a debugger such as the DEBUG program.
| Term | Definition |
| add-in board or card | Circuit board that connects to the motherboard; usually contains I/O circuits or additional memory |
| address | A number that identifies a memory location |
| address bus | The set of electrical pathways for address signals |
| arithmetic and logic unit, ALU | CPU circuit where arithmetic and logic operations are done |
| assembler | A program that translates an assembly language program into machine language |
| assembly language | Symbolic representation of machine language |
| binary digit | A symbol that can have value 0 or 1 |
| bit | Binary digit |
| bus | A set of wires or connections connecting the CPU, memory, and I/O ports |
| bus interface unit, BIU | Part of the CPU that facilitates communication between the CPU, memory, and I/O ports |
| byte | 8 bits |
| central processing unit, CPU | The main processor circuit of a computer |
| clock period | The time interval between two clock pulses |
| clock pulse | An electrical signal that rises from a low voltage to a high voltage and down again to a low voltage, used to synchronize computer circuit operations |
| clock rate | The number of clock pulses per second, measured in megahertz (MHz) |
| clock speed | Clock rate |
| compiler | A program that translates a high-level language to machine language |
| contents | The data stored in a register or memory location |
| control bus | The set of electrical paths for control signals |
| data bus | The set of electrical paths for data signals |
| digital circuits | Circuits that operate on discrete voltage levels |
| disk drive | The device that reads and writes data on a disk |
| execution unit, EU | Part of the CPU that executes instructions |
| expansion slots | Connectors in the motherboard where other circuit boards can be attached |
| fetch-execute cycle | Cycle the CPU goes through to execute an instruction |
| firmware | Software supplied by the computer manufacturer, usually stored in ROM |
| fixed disk | Nonremovable disk, made of metal |
| floppy disk | Removable, flexible disk |
| hardcopy | Printer output |
| hard disk | Fixed disk |
| I/O devices | Devices that handle input and output data of the computer; typical I/O devices are display monitor, disk drive, and printer |
| I/O ports | Circuits that function as transfer points between the CPU and I/O devices |
| instruction pointer, IP | A CPU register that contains the address of the next instruction |
| instruction set | The instructions the CPU is capable of performing |
| kilobyte, KB | $2^{10}$ (1,024) bytes |
| machine language | Instructions coded as bit strings: the language of the computer |
| mega | A unit that usually denotes 1 million, but in computer terminology 1 mega is $2^{20}$ = 1,048,576 |
| megabyte, MB | $2^{20}$ (1,048,576) bytes |
| megahertz, MHz | 1,000,000 cycles per second |
| memory byte (circuit) | A memory circuit that can store one byte |
| memory location | A memory byte or memory word |
| memory word | Two memory bytes |
| microprocessor | A processing unit fabricated on a single circuit chip |
| motherboard | The main circuit board of the computer |
| opcode | Numeric or symbolic code denoting the type of operation for an instruction |
| operand | The data specified in an instruction |
| peripheral (device) | I/O device |
| random access memory, RAM | Memory circuits that can be read or written |
| read-only memory, ROM | Memory circuits that can only be read |
| register | A CPU circuit for storing information |
| system board | Motherboard |
| video adapter | Computer circuit that converts computer data into video signals for the display monitor |
| word | 16 bits |
1. Suppose memory bytes 0-4 have the following contents:
| Address | Contents |
| 0 | 01101010 |
| 1 | 11011101 |
| 2 | 00010001 |
| 3 | 11111111 |
| 4 | 01010101 |
   a. Assuming that a word is 2 bytes, what are the contents of
      - the memory word at address 2?
      - the memory word at address 3?
      - the memory word whose high byte is the byte at address 2?
   b. What is
      - bit 7 of byte 2?
      - bit 0 of word 3?
      - bit 4 of byte 2?
      - bit 11 of word 2?
2. A nibble is four bits. Each byte is composed of a high nibble and a low nibble, similar to the high and low bytes of a word. Using the data in exercise 1, give the contents of
   a. the low nibble of byte 1.
   b. the high nibble of byte 4.
3. The two kinds of memory are RAM and ROM. Which kind of memory
   a. holds a user's program?
   b. holds the program used to start the machine?
   c. can be changed by the user?
   d. retains its contents, even when the power is turned off?
4. What is the function of
   a. the microprocessor?
   b. the buses?
5. The two parts of the microprocessor are the EU and the BIU.
   a. What is the function of the EU?
   b. What is the function of the BIU?
6. In the microprocessor, what is the function of
   a. the IP?
   b. the ALU?
7. a. What are the I/O ports used for?
   b. How are they different from memory locations?
8. What is the maximum length (in bytes) of an instruction for the 8086-based IBM PC?
9. Consider a machine language instruction that moves a copy of the contents of register AX in the CPU to a memory word. What happens during
   a. the fetch cycle?
   b. the execute cycle?
10. Give
   a. three advantages of high-level language programming.
   b. the primary advantage of assembly language programming.
You saw in Chapter 1 that computer circuits are capable of processing only binary information. In this chapter, we show how numbers can be expressed in binary; this is called the binary number system. We also introduce a very compact way of representing binary information called the hexadecimal number system.
Conversions between binary, decimal, and hexadecimal numbers are covered in section 2.2. Section 2.3 treats addition and subtraction in these number systems.
Section 2.4 shows how negative numbers are represented and what effects the fixed physical size of a byte or word has on number representation.
We conclude the chapter by exploring how characters are encoded and used by the computer.
Before we look at how numbers are represented in binary, it is instructive to look at the familiar decimal system. It is an example of a positional number system; that is, each digit in the number is associated with a power of 10, according to its position in the number. For example, the decimal number 3932 represents 3 thousands, 9 hundreds, 3 tens, and 2 ones. In other words,
$$3932 = 3 \times 10^3 + 9 \times 10^2 + 3 \times 10^1 + 2 \times 10^0$$
In a positional system, some number b is selected as the base and symbols are assigned to numbers between 0 and b - 1. For example, in the decimal system there are ten basic symbols (digits): 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. The base ten is represented as 10.
In the binary number system, the base is two and there are only two digits, 0 and 1. For example, the binary string 11010 represents the number
$1 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 0 \times 2^0 = 26$
The base two is represented in binary as 10.
Numbers written in binary tend to be long and difficult to express. For example, 16 bits are needed to represent the contents of a memory word in an 8086-based computer. Decimal numbers, on the other hand, are difficult to convert into binary. When we write assembly language programs we tend to use binary, decimal, and a third number system called hexadecimal, or hex for short. The advantage of using hex numbers is that the conversion between binary and hex is easy.
The hexadecimal (hex) system is a base sixteen system. The hex digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. The hex letters A through F denote numbers ten to fifteen, respectively. After F comes the base sixteen, represented in hex by 10.
Because sixteen is 2 to the power of 4, each hex digit corresponds to a unique four-bit number, as shown in Table 2.1. This means that the contents of a byte (eight bits) may be expressed neatly as two hex digits, which makes hex numbers useful with byte-oriented computers.
Table 2.2 shows the relations among binary, decimal, and hexadecimal numbers. It is a good idea to take a few minutes and memorize the first 16 or so lines of the table, because you will often need to express small numbers in all three systems.
Table 2.1 Hex Digits and Binary Equivalent
| Hex Digits | Binary |
| 0 | 0000 |
| 1 | 0001 |
| 2 | 0010 |
| 3 | 0011 |
| 4 | 0100 |
| 5 | 0101 |
| 6 | 0110 |
| 7 | 0111 |
| 8 | 1000 |
| 9 | 1001 |
| A | 1010 |
| B | 1011 |
| C | 1100 |
| D | 1101 |
| E | 1110 |
| F | 1111 |
Table 2.2 Decimal, Binary, and Hexadecimal Numbers
| Decimal | Binary | Hexadecimal |
| 0 | 0 | 0 |
| 1 | 1 | 1 |
| 2 | 10 | 2 |
| 3 | 11 | 3 |
| 4 | 100 | 4 |
| 5 | 101 | 5 |
| 6 | 110 | 6 |
| 7 | 111 | 7 |
| 8 | 1000 | 8 |
| 9 | 1001 | 9 |
| 10 | 1010 | A |
| 11 | 1011 | B |
| 12 | 1100 | C |
| 13 | 1101 | D |
| 14 | 1110 | E |
| 15 | 1111 | F |
| 16 | 10000 | 10 |
| 17 | 10001 | 11 |
| 18 | 10010 | 12 |
| 19 | 10011 | 13 |
| 20 | 10100 | 14 |
| 21 | 10101 | 15 |
| 22 | 10110 | 16 |
| 23 | 10111 | 17 |
| 24 | 11000 | 18 |
| 25 | 11001 | 19 |
| 26 | 11010 | 1A |
| 27 | 11011 | 1B |
| 28 | 11100 | 1C |
| 29 | 11101 | 1D |
| 30 | 11110 | 1E |
| 31 | 11111 | 1F |
| 32 | 100000 | 20 |
| 256 | 100000000 | 100 |
| 1024 | | 400 |
| 32767 | | 7FFF |
| 32768 | | 8000 |
| 65535 | | FFFF |
A problem in working with different number systems is the meaning of the symbols used. For example, as you have seen, 10 means ten in the decimal system, sixteen in hex, and two in binary. In this book, the following convention is used whenever confusion may arise: hex numbers are followed by the letter h; for example, 1A34h. Binary numbers are followed by the letter b; for example, 101b. Decimal numbers are followed by the letter d; for example, 79d.
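The suffix convention can be made concrete with a short Python sketch. The function name `parse_literal` is hypothetical, and treating an unsuffixed string as decimal is an assumption made here, not something the convention above states.

```python
def parse_literal(text):
    """Parse a number written with an h (hex), b (binary), or d (decimal) suffix."""
    bases = {"h": 16, "b": 2, "d": 10}
    suffix = text[-1].lower()
    if suffix in bases:
        # Strip the suffix letter and interpret the rest in the matching base.
        return int(text[:-1], bases[suffix])
    # Assumption for this sketch: no suffix means decimal.
    return int(text, 10)


print(parse_literal("1A34h"))  # 6708
print(parse_literal("101b"))   # 5
print(parse_literal("79d"))    # 79
```

The same digits "10" give three different values depending on the suffix: `parse_literal("10d")`, `parse_literal("10h")`, and `parse_literal("10b")` return ten, sixteen, and two.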
In working with assembly language, it is often necessary to take a number expressed in one system and write it in a different system.
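Consider the hex number 82AD. It can be written as
$82\mathrm{ADh} = 8 \times 16^3 + 2 \times 16^2 + \mathrm{A} \times 16^1 + \mathrm{D} \times 16^0 = 8 \times 16^3 + 2 \times 16^2 + 10 \times 16^1 + 13 \times 16^0 = 33453\mathrm{d}$
Similarly, the binary number 11101 may be written as
$11101\mathrm{b} = 1 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 = 29\mathrm{d}$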
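This gives one way to convert a binary or hex number to decimal, but an easier way is to use nested multiplication. For example,
$82\mathrm{ADh} = ((8 \times 16 + 2) \times 16 + \mathrm{A}) \times 16 + \mathrm{D} = ((8 \times 16 + 2) \times 16 + 10) \times 16 + 13 = 33453\mathrm{d}$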
This can be easily implemented with a calculator: Multiply the first hex digit by 16, and add the second hex digit. Multiply that result by 16, and add the third hex digit. Multiply the result by 16, add the next hex digit, and so on.
The same procedure converts binary to decimal. Just multiply each result by 2 instead of 16.
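The calculator procedure just described can be sketched in Python as follows. The names `DIGITS` and `to_decimal` are illustrative choices, not part of the text; the loop multiplies the running result by the base and adds the next digit.

```python
DIGITS = "0123456789ABCDEF"


def to_decimal(digits, base):
    """Convert a digit string in the given base (2 to 16) by nested multiplication."""
    result = 0
    for ch in digits.upper():
        value = DIGITS.index(ch)  # raises ValueError for a character that is not a digit
        if value >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        # Multiply the running result by the base, then add the next digit.
        result = result * base + value
    return result


print(to_decimal("11101", 2))   # 29
print(to_decimal("82AD", 16))   # 33453
```

Processing the digits from left to right means no powers of the base ever have to be computed explicitly.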
Example 2.1 Convert 11101b to decimal.
Solution: $11101\mathrm{b} = (((1 \times 2 + 1) \times 2 + 1) \times 2 + 0) \times 2 + 1 = 29\mathrm{d}$
Example 2.2 Convert 2BD4h to decimal.
Solution: $2\mathrm{BD4h} = ((2 \times 16 + \mathrm{B}) \times 16 + \mathrm{D}) \times 16 + 4 = ((2 \times 16 + 11) \times 16 + 13) \times 16 + 4 = 11220\mathrm{d}$
where we have used the fact that Bh = 11 and Dh = 13.
Suppose we want to convert 11172 to hex. The answer 2BA4h may be obtained as follows. First, divide 11172 by 16. We get a quotient of 698 and a remainder of 4. Thus
$11172 = 698 \times 16 + 4$
The remainder 4 is the unit's digit in the hex representation of 11172. Now divide 698 by 16. The quotient is 43, and the remainder is 10 = Ah. Thus
$698 = 43 \times 16 + \mathrm{Ah}$
The remainder Ah is the sixteen's digit in the hex representation of 11172. We just continue this process, each time dividing the most recent quotient by 16, until we get a 0 quotient. The remainder each time is a digit in the hex representation of 11172. Here are the calculations:
$11172 = 698 \times 16 + 4$
$698 = 43 \times 16 + 10\ (\mathrm{Ah})$
$43 = 2 \times 16 + 11\ (\mathrm{Bh})$
$2 = 0 \times 16 + 2$
Now just convert the remainders to hex and put them together in reverse order to get 2BA4h.
This same process may be used to convert decimal to binary. The only difference is that we repeatedly divide by 2.
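As a rough illustration, the repeated-division procedure might be written in Python like this. The name `from_decimal` is hypothetical, and the sketch assumes a nonnegative integer input.

```python
DIGITS = "0123456789ABCDEF"


def from_decimal(n, base):
    """Convert a nonnegative integer to a digit string by repeated division."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        # Each division yields the next digit, least significant first.
        n, r = divmod(n, base)
        remainders.append(DIGITS[r])
    # The remainders come out in reverse order, so flip them.
    return "".join(reversed(remainders))


print(from_decimal(11172, 16))  # 2BA4
print(from_decimal(95, 2))      # 1011111
```

The loop stops when the quotient reaches 0, which matches the stopping rule given in the text.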
Example 2.3 Convert 95 to binary.
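Solution:
$95 = 47 \times 2 + 1$
$47 = 23 \times 2 + 1$
$23 = 11 \times 2 + 1$
$11 = 5 \times 2 + 1$
$5 = 2 \times 2 + 1$
$2 = 1 \times 2 + 0$
$1 = 0 \times 2 + 1$
Taking the remainders in reverse order, we get 95 = 1011111b.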
To convert a hex number to binary, we need only express each hex digit in binary.
Example 2.4 Convert 2B3Ch to binary.
Solution: 2 B 3 C = 0010 1011 0011 1100 = 0010101100111100b.
To go from binary to hex, just reverse this process; that is, group the binary digits in fours starting from the right. Then convert each group to a hex digit.
Example 2.5 Convert 1110101010 to hex.
Solution: 1110101010 = 11 1010 1010 = 3AAh.
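The digit-by-digit correspondence between hex and binary can be sketched in Python. The function names below are hypothetical; the binary-to-hex direction pads on the left with zeros so that the grouping into fours starts from the right, as described above.

```python
def hex_to_binary(hex_digits):
    """Express each hex digit as its four-bit binary equivalent."""
    return " ".join(format(int(d, 16), "04b") for d in hex_digits)


def binary_to_hex(bits):
    """Group bits in fours from the right and convert each group to a hex digit."""
    bits = bits.replace(" ", "")
    # Pad on the left to a multiple of four so grouping starts from the right.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(format(int(g, 2), "X") for g in groups)


print(hex_to_binary("2B3C"))        # 0010 1011 0011 1100
print(binary_to_hex("1110101010"))  # 3AA
```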
2.3 Addition and Subtraction
Sometimes you will want to do binary or hex addition and subtraction. Because these operations are done by rote in decimal, let's review the process to see what is involved.
Consider the following decimal addition
To get the unit's digit in the sum, we just compute
A reason that decimal addition is easy for us is that we memorized the addition table for small numbers a long time ago. Table 2.3A is an addition table for small hex numbers. To compute
By using the addition table, hex addition may be done in exactly the same way as decimal addition. Suppose we want to compute the following hex sum:
Table 2.3A Hexadecimal Addition Table
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
| 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
| 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | 10 |
| 2 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | 10 | 11 |
| 3 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | 10 | 11 | 12 |
| 4 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | 10 | 11 | 12 | 13 |
| 5 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | 10 | 11 | 12 | 13 | 14 |
| 6 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | 10 | 11 | 12 | 13 | 14 | 15 |
| 7 | 7 | 8 | 9 | A | B | C | D | E | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
| 8 | 8 | 9 | A | B | C | D | E | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
| 9 | 9 | A | B | C | D | E | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
| A | A | B | C | D | E | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
| B | B | C | D | E | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A |
| C | C | D | E | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B |
| D | D | E | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C |
| E | E | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D |
| F | F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D | 1E |
In the unit's column, we compute
Binary addition is done the same way as decimal and hex addition, but is a good deal easier because the binary addition table is so small (Table 2.3B). To do the sum:
Compute
Let's begin with the decimal subtraction
in the unit's column, we compute
Hex subtraction may be done the same way as decimal subtraction. To compute the hex difference
we start with
The easy way to figure this is to go to row 9 in Table 2.3A, and notice that 16 appears in column D. This means that 9h + Dh = 16h, so 16h - 9h = Dh.
Now let us look at binary subtraction, for example,
The unit's column is easy,
The hardware of a computer necessarily restricts the size of numbers that can be stored in a register or memory location. In this section, we will see how integers can be stored in an 8-bit byte or a 16-bit word. In Chapter 18 we talk about how real numbers can be stored.
In the following, we'll need to refer to two particular bits in a byte or word: the most significant bit, or msb, is the leftmost bit. In a word, the msb is bit 15; in a byte, it is bit 7. Similarly, the least significant bit, or lsb, is the rightmost bit; that is, bit 0.
2.4.1 Unsigned Integers
An unsigned integer is an integer that represents a magnitude, so it is never negative. Unsigned integers are appropriate for representing quantities that can never be negative, such as addresses of memory locations, counters, and ASCII character codes (see later). Because unsigned integers are by definition nonnegative, none of the bits are needed to represent the sign, and so all 8 bits in a byte, or 16 bits in a word, are available to represent the number.
The largest unsigned integer that can be stored in a byte is 11111111b = FFh = 255. In a 16-bit word, the largest is 1111111111111111b = FFFFh = 65535.
Note that if the least significant bit of an integer is 1, the number is odd, and it is even if the lsb is 0.
2.4.2 Signed Integers
A signed integer can be positive or negative. The most significant bit is reserved for the sign: 1 means negative and 0 means positive. Negative integers are stored in the computer in a special way known as two's complement. To explain it, we first define one's complement, as follows.
The one's complement of an integer is obtained by complementing each bit; that is, replace each 0 by a 1 and each 1 by a 0. In the following, we assume numbers are 16 bits.
Example 2.6 Find the one's complement of 5 = 0000000000000101.
Solution:
5 = 0000000000000101
one's complement of 5 = 1111111111111010
Note that if we add 5 and its one's complement, we get 1111111111111111.
To get the two's complement of an integer, just add 1 to its one's complement.
Example 2.7 Find the two's complement of 5.
Solution: From above,
one's complement of 5 = 1111111111111010
+1
two's complement of 5 = 1111111111111011
Now look what happens when we add 5 and its two's complement:
5 = 0000000000000101
+ two's complement of 5 = 1111111111111011
sum = 1 0000000000000000
We end up with a 17-bit number. Because a computer word circuit can only hold 16 bits, the 1 carried out from the most significant bit is lost, and the 16-bit result is 0. As 5 and its two's complement add up to 0, the two's complement of 5 must be a correct representation of -5.
It is easy to see why the two's complement of any integer N must represent -N: Adding N and its one's complement gives 16 ones; adding 1 to this produces 16 zeros with a 1 carried out and lost. The result stored is always 0000000000000000.
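The argument above can be checked with a small Python sketch. The function names and the `bits` parameter are assumptions for illustration; masking with `(1 << bits) - 1` plays the role of the fixed-size word that discards the carry out of the most significant bit.

```python
def ones_complement(n, bits=16):
    """Flip every bit of n within a word of the given size."""
    mask = (1 << bits) - 1
    return ~n & mask


def twos_complement(n, bits=16):
    """Add 1 to the one's complement, discarding any carry out of the word."""
    mask = (1 << bits) - 1
    return (ones_complement(n, bits) + 1) & mask


neg5 = twos_complement(5)
print(format(ones_complement(5), "016b"))  # 1111111111111010
print(format(neg5, "016b"))                # 1111111111111011
print((5 + neg5) & 0xFFFF)                 # 0
```

Applying `twos_complement` twice returns the original value, and passing `bits=8` gives the byte-sized representation.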
The following example shows what happens when a number is complemented two times.
Example 2.8 Find the two's complement of the two's complement of 5.
Solution: We would guess that after complementing 5 two times, the result should be 5. To verify this, from above,
two's complement of 5 = 1111111111111011
one's complement of 1111111111111011 = 0000000000000100
+1
two's complement of 1111111111111011 = 0000000000000101 = 5
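Example 2.9 Show how the decimal integer -97 would be represented (a) in 8 bits, and (b) in 16 bits. Express the answers in hex.
Solution: A decimal-to-hex conversion using repeated division by 16 yields
$97 = 6 \times 16 + 1$
$6 = 0 \times 16 + 6$
Thus 97 = 61h. To represent -97, we express 61h in binary and take the two's complement.
a. In 8 bits, we get
61h = 0110 0001
one's complement = 1001 1110
+1
two's complement = 1001 1111 = 9Fh
b. In 16 bits, we get
61h = 0000 0000 0110 0001
one's complement = 1111 1111 1001 1110
+1
two's complement = 1111 1111 1001 1111 = FF9Fh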
The advantage of two's complement representation of negative integers in the computer is that subtraction can be done by bit complementation and addition, and circuits that add and complement bits are easy to design.
Example 2.10 Suppose AX contains 5ABCh and BX contains 21FCh. Find the difference of AX minus BX by using complementation and addition.
Solution:
AX contains 5ABCh = 0101 1010 1011 1100
BX contains 21FCh = 0010 0001 1111 1100
5ABCh = 0101 1010 1011 1100
+ one's complement of 21FCh = 1101 1110 0000 0011
+1
difference = 1 0011 1000 1100 0000 = 38C0h
A one is carried out of the most significant bit and is lost. The answer stored, 38C0h, is correct, as may be verified by hex subtraction.
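The method of Example 2.10 can be sketched in Python. The function name `subtract_16` is hypothetical, and returning the carry alongside the result is a choice made here to show the bit that the hardware discards.

```python
def subtract_16(a, b):
    """Compute a - b in a 16-bit word by adding the two's complement of b."""
    mask = 0xFFFF
    # One's complement of b, plus 1, added to a.
    total = a + (~b & mask) + 1
    carry_out = total >> 16   # the bit carried out of the msb
    return total & mask, carry_out


result, carry = subtract_16(0x5ABC, 0x21FC)
print(format(result, "04X"), carry)  # 38C0 1
```

Only complementing and adding are needed, which is the point made in the text about circuit design.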
2.4.3 Decimal Interpretation
In the last section, we saw how signed and unsigned decimal integers may be represented in the computer. The reverse problem is to interpret the contents of a byte or word as a signed or unsigned decimal integer.
Unsigned decimal interpretation: Just do a binary-to-decimal conversion. It's usually easier to convert binary to hex first, and then convert hex to decimal.
Signed decimal interpretation: If the most significant bit is 0, the number is positive, and the signed decimal is the same as the unsigned decimal. If the msb is 1, the number is negative, so call it -N. To find N, just take the two's complement and then convert to decimal as before.
Example 2.11 Suppose AX contains FE0Ch. Give the unsigned and signed decimal interpretations.
Table 2.4A Signed and Unsigned Decimal Interpretations of 16-Bit Register/Memory Contents
| Hex | Unsigned decimal | Signed decimal |
| 0000 | 0 | 0 |
| 0001 | 1 | 1 |
| 0002 | 2 | 2 |
| . | . | . |
| . | . | . |
| . | . | . |
| 0009 | 9 | 9 |
| 000A | 10 | 10 |
| . | . | . |
| . | . | . |
| . | . | . |
| 7FFE | 32766 | 32766 |
| 7FFF | 32767 | 32767 |
| 8000 | 32768 | -32768 |
| 8001 | 32769 | -32767 |
| . | . | . |
| . | . | . |
| FFFE | 65534 | -2 |
| FFFF | 65535 | -1 |
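Solution: Conversion of FE0Ch to decimal yields 65036, which is the unsigned decimal interpretation.
For the signed interpretation, note that the sign bit of FE0Ch is 1, so the number is negative. Its magnitude is found by taking the two's complement:
FE0Ch = 1111 1110 0000 1100
one's complement = 0000 0001 1111 0011
+1
two's complement = 0000 0001 1111 0100 = 01F4h = 500
Thus, AX contains -500.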
Tables 2.4A and 2.4B give 16-bit word and 8-bit byte hex values and their signed and unsigned decimal interpretations. Note the following:
1. Because the most significant bit of a positive signed integer is 0, the leading hex digit of a positive signed integer is 0-7; integers beginning with 8-Fh have 1 in the sign bit, so they are negative.
2. The largest 16-bit positive signed integer is 7FFFh = 32767; the smallest negative integer is 8000h = -32768. For a byte, the largest positive integer is 7Fh = 127 and the smallest is 80h = -128.
3. The following relationship holds between the unsigned and signed decimal interpretations of the contents of a 16-bit word:
For 0000h-7FFFh, signed decimal = unsigned decimal.
For 8000h-FFFFh, signed decimal = unsigned decimal - 65536.
There are similar relations for the contents of an eight-bit byte:
For 00h-7Fh, signed decimal = unsigned decimal.
For 80h-FFh, signed decimal = unsigned decimal - 256.
Table 2.4B Signed and Unsigned Decimal Interpretations of a Byte
| Hex | Unsigned decimal | Signed decimal |
| 00 | 0 | 0 |
| 01 | 1 | 1 |
| 02 | 2 | 2 |
| . | . | . |
| 09 | 9 | 9 |
| 0A | 10 | 10 |
| . | . | . |
| 7E | 126 | 126 |
| 7F | 127 | 127 |
| 80 | 128 | -128 |
| 81 | 129 | -127 |
| . | . | . |
| FE | 254 | -2 |
| FF | 255 | -1 |
Example 2.12 Use observation 3, from the above, to rework example 2.11.
Solution: We saw that the unsigned decimal interpretation of FE0Ch is 65036. Because the leading hex digit is Fh, the content is negative in a signed sense. To interpret it, just subtract 65536 from the unsigned decimal. Thus
signed decimal interpretation = 65036 - 65536 = -500
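The rule used in this example, subtracting 65536 (or 256 for a byte) when the sign bit is set, can be sketched in Python. The name `signed_value` and the `bits` parameter are illustrative assumptions.

```python
def signed_value(unsigned, bits=16):
    """Return the signed decimal interpretation of an unsigned word or byte."""
    sign_bit = (unsigned >> (bits - 1)) & 1
    if sign_bit:
        # Sign bit set: subtract 2**bits (65536 for a word, 256 for a byte).
        return unsigned - (1 << bits)
    return unsigned


print(signed_value(0xFE0C))      # -500
print(signed_value(0x7FFF))      # 32767
print(signed_value(0x80, 8))     # -128
```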
2.5 Character Representation
Not all data processed by the computer are treated as numbers. I/O devices such as the video monitor and printer are character oriented, and programs such as word processors deal with characters exclusively. Like all data, characters must be coded in binary in order to be processed by the computer. The most popular encoding scheme for characters is ASCII (American Standard Code for Information Interchange) code. Originally used in communications by teletype, ASCII code is used by all personal computers today.
The ASCII code system uses seven bits to code each character, so there are a total of $2^7 = 128$ ASCII codes. Table 2.5 shows the ASCII codes and the characters associated with them.
Notice that only 95 ASCII codes, from 32 to 126, are considered to be printable. The codes 0 to 31 and also 127 were used for communication control purposes and do not produce printable characters. Most microcomputers use only the printable characters and a few control characters such as LF, CR, BS, and Bell.
Because each ASCII character is coded by only seven bits, the code of a single character fits into a byte, with the most significant bit set to zero. The printable characters can be displayed on the video monitor or printed by the printer, while the control characters are used to control the operations of these devices. For example, to display the character A on the screen, a program sends the ASCII code 41h to the screen; and to move the cursor back to the beginning of the line, a program sends the ASCII code 0Dh, which is the CR character, to the screen.
A computer may assign special display characters to some of the nonprinting ASCII codes. As you will see later, the screen controller for the IBM PC can actually display an extended set of 256 characters. Appendix A shows the 256 display characters of the IBM PC.
Example 2.13 Show how the character string "RG 2z" is stored in memory, starting at address 0.
Solution: From Table 2.5, we have
| Character | ASCII Code (hex) | ASCII Code (binary) |
| R | 52 | 0101 0010 |
| G | 47 | 0100 0111 |
| space | 20 | 0010 0000 |
| 2 | 32 | 0011 0010 |
| z | 7A | 0111 1010 |
So memory would look like this:
| Address | Contents |
| 0 | 01010010 |
| 1 | 01000111 |
| 2 | 00100000 |
| 3 | 00110010 |
| 4 | 01111010 |
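The memory layout of a character string can be reproduced with a short Python sketch. The function name `memory_image` and its `start` parameter are hypothetical; the sketch relies on Python's built-in ASCII encoding to look up each character code.

```python
def memory_image(text, start=0):
    """List (address, 8-bit binary contents) pairs for an ASCII string."""
    data = text.encode("ascii")  # one byte per character, msb always 0
    return [(start + i, format(b, "08b")) for i, b in enumerate(data)]


for address, contents in memory_image("RG 2z"):
    print(address, contents)
```

Each character occupies one byte, so a string of five characters fills addresses 0 through 4.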
It's reasonable to guess that the keyboard identifies a key by generating an ASCII code when the key is pressed. This was true for a class of keyboards known as ASCII keyboards used by some early microcomputers.
Table 2.5 ASCII Code
| Dec | Hex | Char | Dec | Hex | Char | Dec | Hex | Char | Dec | Hex | Char |
| 0 | 00 | <cc> | 32 | 20 | SP | 64 | 40 | @ | 96 | 60 | ` |
| 1 | 01 | <cc> | 33 | 21 | ! | 65 | 41 | A | 97 | 61 | a |
| 2 | 02 | <cc> | 34 | 22 | " | 66 | 42 | B | 98 | 62 | b |
| 3 | 03 | <cc> | 35 | 23 | # | 67 | 43 | C | 99 | 63 | c |
| 4 | 04 | <cc> | 36 | 24 | $ | 68 | 44 | D | 100 | 64 | d |
| 5 | 05 | <cc> | 37 | 25 | % | 69 | 45 | E | 101 | 65 | e |
| 6 | 06 | <cc> | 38 | 26 | & | 70 | 46 | F | 102 | 66 | f |
| 7 | 07 | <cc> | 39 | 27 | ' | 71 | 47 | G | 103 | 67 | g |
| 8 | 08 | <cc> | 40 | 28 | ( | 72 | 48 | H | 104 | 68 | h |
| 9 | 09 | <cc> | 41 | 29 | ) | 73 | 49 | I | 105 | 69 | i |
| 10 | 0A | <cc> | 42 | 2A | * | 74 | 4A | J | 106 | 6A | j |
| 11 | 0B | <cc> | 43 | 2B | + | 75 | 4B | K | 107 | 6B | k |
| 12 | 0C | <cc> | 44 | 2C | , | 76 | 4C | L | 108 | 6C | l |
| 13 | 0D | <cc> | 45 | 2D | - | 77 | 4D | M | 109 | 6D | m |
| 14 | 0E | <cc> | 46 | 2E | . | 78 | 4E | N | 110 | 6E | n |
| 15 | 0F | <cc> | 47 | 2F | / | 79 | 4F | O | 111 | 6F | o |
| 16 | 10 | <cc> | 48 | 30 | 0 | 80 | 50 | P | 112 | 70 | p |
| 17 | 11 | <cc> | 49 | 31 | 1 | 81 | 51 | Q | 113 | 71 | q |
| 18 | 12 | <cc> | 50 | 32 | 2 | 82 | 52 | R | 114 | 72 | r |
| 19 | 13 | <cc> | 51 | 33 | 3 | 83 | 53 | S | 115 | 73 | s |
| 20 | 14 | <cc> | 52 | 34 | 4 | 84 | 54 | T | 116 | 74 | t |
| 21 | 15 | <cc> | 53 | 35 | 5 | 85 | 55 | U | 117 | 75 | u |
| 22 | 16 | <cc> | 54 | 36 | 6 | 86 | 56 | V | 118 | 76 | v |
| 23 | 17 | <cc> | 55 | 37 | 7 | 87 | 57 | W | 119 | 77 | w |
| 24 | 18 | <cc> | 56 | 38 | 8 | 88 | 58 | X | 120 | 78 | x |
| 25 | 19 | <cc> | 57 | 39 | 9 | 89 | 59 | Y | 121 | 79 | y |
| 26 | 1A | <cc> | 58 | 3A | : | 90 | 5A | Z | 122 | 7A | z |
| 27 | 1B | <cc> | 59 | 3B | ; | 91 | 5B | [ | 123 | 7B | { |
| 28 | 1C | <cc> | 60 | 3C | < | 92 | 5C | \ | 124 | 7C | \| |
| 29 | 1D | <cc> | 61 | 3D | = | 93 | 5D | ] | 125 | 7D | } |
| 30 | 1E | <cc> | 62 | 3E | > | 94 | 5E | ^ | 126 | 7E | ~ |
| 31 | 1F | <cc> | 63 | 3F | ? | 95 | 5F | _ | 127 | 7F | <cc> |
<cc> denotes a control character; SP = blank space
| Dec | Hex | Char | Meaning |
| 7 | 07 | BEL | bell |
| 8 | 08 | BS | backspace |
| 9 | 09 | HT | horizontal tab |
| 10 | 0A | LF | line feed |
| 12 | 0C | FF | form feed |
| 13 | 0D | CR | carriage return |
However, modern keyboards have many control and function keys in addition to ASCII character keys, so other encoding schemes are used. For the IBM PC, each key is assigned a unique number called a scan code; when a key is pressed, the keyboard sends the key's scan code to the computer. Scan codes are discussed in Chapter 12.
- Numbers are represented in different ways, according to the basic symbols used. The binary system uses two symbols, 0 and 1. The decimal system uses 0-9. The hexadecimal system uses 0-9, A-F.
- Binary and hex numbers can be converted to decimal by a process of nested multiplication.
- A decimal number can be converted to hex by a process of repeated division by 16; similarly, a decimal number can be converted to binary by a process of repeated division by 2.
- Hex numbers can be converted to binary by converting each hex digit to binary; binary numbers are converted to hex by grouping the bits in fours, starting from the right, and converting each group to a hex digit.
- The process of adding and subtracting hex and binary numbers is the same as for decimal numbers, and can be done with the help of the appropriate addition table.
- Negative numbers are stored in two's complement form. To get the two's complement of a number, complement each bit and add 1 to the result.
- If A and B are stored integers, the processor computes A - B by adding the two's complement of B to A.
- The range of unsigned integers that can be stored in a byte is 0-255; in a 16-bit word, it is 0-65535.
- For signed numbers, the most significant bit is the sign bit; 0 means positive and 1 means negative. The range of signed numbers that can be stored in a byte is -128 to 127; in a word, it is -32768 to 32767.
- The unsigned decimal interpretation of a word is obtained by converting the binary value to decimal. If the sign bit is 0, this is also the signed decimal interpretation. If the sign bit is 1, the signed decimal interpretation may be obtained by subtracting 65536 from the unsigned decimal interpretation.
- The standard encoding scheme for characters is the ASCII code.
- A character requires seven bits to code, so it can be stored in a byte.
- The IBM screen controller can generate a character for each of the 256 possible numbers that can be stored in a byte.
ASCII (American Standard Code for Information Interchange) codes
The encoding scheme for characters used on all personal computers
binary number system
Base two system in which the digits are 0 and 1
hexadecimal number system
Base sixteen system in which the digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F
least significant bit, lsb
The rightmost bit in a word or byte; that is, bit 0
most significant bit, msb
The leftmost bit in a word or byte; that is, bit 15 in a word or bit 7 in a byte
one's complement of a binary number
Obtained by replacing each 0 bit by 1 and each 1 bit by 0
scan code
A number used to identify a key on the keyboard
signed integer
An integer that can be positive or negative
two's complement of a binary number
Obtained by adding 1 to the one's complement
unsigned integer
An integer representing a magnitude; that is, never negative
In many applications, it saves time to memorize the conversions among small binary, decimal, and hex numbers. Without referring to Table 2.2, fill in the blanks in the following table:
Binary Decimal Hex 191 9 10 10 110 12 111 14 8
- Convert the following binary and hex numbers to decimal:
a. 1110
b. 100101011101
c. 46Ah
d. FAE5Ch
- Convert the following decimal numbers:
a. 97 to binary.
b. 627 to binary
c. 921 to hex
d. 6120 to hex
- Convert the following numbers:
a. 1001011 to hex
b. 100101011011110 to hex
c. A2Ch to binary
d. B34Dh to binary
- Perform the following additions:
a. 100101b + 10111b
b. 100111101b + 10001111001b
c. B23611b + 17912b
d. FF:FEh + FBCADh
- Perform the following subtractions:
a. 11011b - 10110b
b. 10000101b - 111011b
c. 5FC12h - 3ABD1h
d. F001Eh - 1FF3Fh
- Give the 16-bit representation of each of the following decimal integers. Write the answer in hex.
a. 234
b. -16
c. 31634
d. -32216
- Do the following binary and hex subtractions by two's complement addition.
a. 10110100 - 10010111
b. 10001011 - 11110111
c. FE0Fh - 12ABh
d. 1ABCh - 113F:Ah
- Give the unsigned and signed decimal interpretations of each of the following 16-bit or 8-bit numbers.
a. 7FFEh
b. 8543h
c. FEh
d. 7Fh
- Show how the decimal integer -120 would be represented
a. in 16 bits.
b. in 8 bits.
- For each of the following decimal numbers, tell whether it could be stored (a) as a 16-bit number (b) as an 8-bit number.
a. 32767
b. -40000
c. 65536
d. 257
e. -128
- For each of the following 16-bit signed numbers, tell whether it is positive or negative.
a. 1010010010010100b
b. 78E3h
c. CB33h
d. 807Fh
e. 9AC4h
-
If the character string "$12.75" is being stored in memory starting at address 0, give the hex contents of bytes 0-5.
-
Translate the following secret message, which has been encoded in ASCII as 41 74 74 61 63 6B 20 61 74 20 44 61 77 6E.
-
Suppose that a byte contains the ASCII code of an uppercase letter. What hex number should be added to it to convert it to lower case?
-
Suppose that a byte contains the ASCII code of a decimal digit; that is, "0" ... "9". What hex number should be subtracted from the byte to convert it to the numerical form of the character?
-
It is not really necessary to refer to the hex addition table to do addition and subtraction of hex digits. To compute Eh + Ah, for example, first copy the hex digits:
0 1 2 3 4 5 6 7 8 9 A B C D E F
Now starting at Eh, move to the right Ah = 10 places. When you go off the right end, continue on at the left, attaching a 1 to each digit you pass:
10 11 12 13 14 15 16 17 18 9 A B C D E F (START at E, STOP at 18)
You get Eh + Ah = 18h. Subtraction can be done similarly. For example, to compute 15h - Ch, start at 15h and move left Ch = 12 places. When you go off the left end, continue on at the right:
10 11 12 13 14 15 6 7 8 9 A B C D E F (START at 15, STOP at 9)
You get 15h - Ch = 9h.
Rework exercises 5(c) and 6(c) by this method.
Overview
Chapter 1 described the organization of a typical microcomputer system. This chapter takes a closer look at the IBM personal computers. These machines are based on the Intel 8086 family of microprocessors.
After a brief survey of the 8086 family in section 3.1, section 3.2 concentrates on the architecture of the 8086. We introduce the registers and mention some of their special functions. In section 3.2.3, the important idea of segmented memory is discussed.
In section 3.3, we look at the overall structure of the IBM PC: the memory organization, I/O ports, and the DOS and BIOS routines.
3.1 The Intel 8086 Family of Microprocessors
The IBM personal computer family consists of the IBM PC, PC XT, PC AT, PS/1, and PS/2 models. They are all based on the Intel 8086 family of microprocessors, which includes the 8086, 8088, 80186, 80188, 80286, 80386, 80386SX, 80486, and 80486SX. The 8088 is used in the PC and PC XT; the 80286 is used in the PC AT and PS/1. The 80186 is used in some PC-compatible laptop models. The PS/2 models use either the 8086, 80286, 80386, or 80486.
Intel introduced the 8086 in 1978 as its first 16-bit microprocessor (a 16-bit processor can operate on 16 bits of data at a time). The 8088 was introduced in 1979. Internally, the 8088 is essentially the same as the 8086. Externally, the 8086 has a 16-bit data bus, while the 8088 has an 8-bit data bus. The 8086 also has a faster clock rate, and thus has better performance. IBM chose the 8088 over the 8086 for the original PC because it was less expensive to build a computer around the 8088.
The 8086 and 8088 have the same instruction set, and it forms the basic set of instructions for the other microprocessors in the family.
The 80186 and 80188 are enhanced versions of the 8086 and 8088, respectively. Their advantage is that they incorporate all the functions of the 8086 and 8088 microprocessors plus those of some support chips. They can also execute some new instructions called the extended instruction set. However, these processors offered no significant advantage over the 8086 and 8088 and were soon overshadowed by the development of the 80286.
The 80286, introduced in 1982, is also a 16-bit microprocessor. However, it can operate faster than the 8086 (12.5 MHz versus 10 MHz) and offers the following important advances over its predecessors:
-
Two modes of operation. The 80286 can operate in either real address mode or protected virtual address mode. In real address mode, the 80286 behaves like the 8086, and programs for the 8086 can be executed in this mode without modification. In protected virtual address mode, also called protected mode, the 80286 supports multitasking, which is the ability to execute several programs (tasks) at the same time, and memory protection, which is the ability to protect the memory used by one program from the actions of another program.
-
More addressable memory. The 80286 in protected mode can address 16 megabytes of physical memory (as opposed to 1 megabyte for the 8086 and 8088).
-
Virtual memory in protected mode. This means that the 80286 can treat external storage (that is, a disk) as if it were physical memory, and therefore execute programs that are too large to be contained in physical memory; such programs can be up to 1 gigabyte ($2^{30}$ bytes).
Intel introduced its first 32-bit microprocessor, the 80386 (or 386), in 1985. It is much faster than the 80286 because it has a 32-bit data path, a high clock rate (up to 33 MHz), and the ability to execute instructions in fewer clock cycles than the 80286.
Like the 80286, the 386 can operate in either real or protected mode. In real mode, it behaves like an 8086. In protected mode, it can emulate the 80286. It also has a virtual 8086 mode designed to run multiple 8086 applications under memory protection. The 386, in protected mode, can address 4 gigabytes of physical memory, and 64 terabytes ($2^{46}$ bytes) of virtual memory.
The 386SX has essentially the same internal structure as the 386, but it has only a 16-bit data bus.
The 80486 and 80486SX Microprocessors
Introduced in 1989, the 80486 (or 486) is another 32-bit microprocessor. It is the fastest and most powerful processor in the family. It incorporates the functions of the 386 together with those of other support chips, including the 80387 numeric processor, which performs floating-point number operations, and an 8-KB cache memory that serves as a fast memory area to buffer data coming from the slower memory unit. With its numeric processor, cache memory, and more advanced design, the 486 is three times faster than a 386 running at the same clock speed. The 486SX is similar to the 486 but without the floating-point processor.
3.2 Organization of the 8086/8088 Microprocessors
In the rest of this chapter we'll concentrate on the organization of the 8086 and 8088. These processors have the simplest structure, and most of the instructions we will study are 8086/8088 instructions. They also provide insight into the organization of the more advanced processors, discussed in Chapter 20.
Because the 8086 and 8088 have essentially the same internal structure, in the following, the name "8086" applies to both 8086 and 8088.
3.2.1 Registers
As noted in Chapter 1, information inside the microprocessor is stored in registers. The registers are classified according to the functions they perform. In general, data registers hold data for an operation, address registers hold the address of an instruction or data, and a status register keeps the current status of the processor.
The 8086 has four general data registers; the address registers are divided into segment, pointer, and index registers; and the status register is called the FLAGS register. In total, there are fourteen 16-bit registers, which we now briefly describe. See Figure 3.1. Note: You don't need to memorize the special functions of these registers at this time. They will become familiar with use.
Data Registers: AX, BX, CX, DX
These four registers are available to the programmer for general data manipulation. Even though the processor can operate on data stored in memory, the same instruction is faster (requires fewer clock cycles) if the data are stored in registers. This is why modern processors tend to have a lot of registers.
The high and low bytes of the data registers can be accessed separately. The high byte of AX is called AH, and the low byte is AL. Similarly, the high and low bytes of BX, CX, and DX are BH and BL, CH and CL, and DH and DL, respectively. This arrangement gives us more registers to use when dealing with byte-size data.
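The relationship between a 16-bit register and its two byte halves can be sketched in a few lines. The example below is in Python rather than assembly so that it can be run anywhere; the function names `split_word` and `join_bytes` are illustrative only and do not come from the text.

```python
def split_word(word):
    """Return (high_byte, low_byte) of a 16-bit value, like AH and AL for AX."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    return (word >> 8) & 0xFF, word & 0xFF


def join_bytes(high, low):
    """Rebuild the 16-bit value from its high and low bytes."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


ax = 0x1234
ah, al = split_word(ax)
print(f"AX={ax:04X}h AH={ah:02X}h AL={al:02X}h")
```

Changing only the low byte and rejoining leaves the high byte untouched, which mirrors how writing AL does not disturb AH.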
These four registers, in addition to being general-purpose registers, also perform special functions such as the following.
AX is the preferred register to use in arithmetic, logic, and data transfer instructions because its use generates the shortest machine code.
In multiplication and division operations, one of the numbers involved must be in AX or AL. Input and output operations also require the use of AL and AX.
BX also serves as an address register; an example is a table look-up instruction called XLAT (translate).
Program loop constructions are facilitated by the use of CX, which serves as a loop counter. Another example of using CX as a counter is REP (repeat), which controls a special class of instructions called string operations. CL is used as a count in instructions that shift and rotate bits.
DX is used in multiplication and division. It is also used in I/O operations.
3.2.3
Segment Registers: CS, DS, SS, ES
Address registers store addresses of instructions and data in memory. These values are used by the processor to access memory locations. We begin with the memory organization.
Chapter 1 explained that memory is a collection of bytes. Each memory byte has an address, starting with 0. The 8086 processor assigns a 20-bit physical address to its memory locations. Thus it is possible to address 2^20 = 1,048,576 bytes (one megabyte) of memory.
Because addresses are so cumbersome to write in binary, we usually express them as five hex digits, thus
00000 00001 00002 ... 00009 0000A 0000B
and so on. The highest address is FFFFFh.
In order to explain the function of the segment registers, we first need to introduce the idea of memory segments, which is a direct consequence of using a 20-bit address in a 16-bit processor. The addresses are too big to fit in a 16-bit register or memory word. The 8086 gets around this problem by partitioning its memory into segments.
A memory segment is a block of 2^16 (64 K) consecutive memory bytes. Each segment is identified by a segment number, starting with 0. A segment number is 16 bits, so the highest segment number is FFFFh.
Within a segment, a memory location is specified by giving an offset. This is the number of bytes from the beginning of the segment. With a 64-KB segment, the offset can be given as a 16-bit number. The first byte in a segment has offset 0. The last offset in a segment is FFFFh.
A memory location may be specified by providing a segment number and an offset, written in the form segment:offset; this is known as a logical address. For example, A4FB:4872h means offset 4872h within segment A4FBh. To obtain a 20-bit physical address, the 8086 microprocessor first shifts the segment address 4 bits to the left (this is equivalent to multiplying by 10h), and then adds the offset. Thus the physical address for A4FB:4872 is
  A4FB0h
+  4872h
= A9822h (20-bit physical address)
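The shift-and-add rule just described can be checked with a short Python sketch. The function name `physical_address` is illustrative. The final mask to 20 bits is an assumption added here, on the grounds that the 8086 has only 20 address lines; the text above does not discuss what happens when the sum exceeds FFFFFh.

```python
def physical_address(segment, offset):
    """Convert a segment:offset logical address to a 20-bit physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment 4 bits left (multiply by 10h), then add the offset.
    # Assumption: keep only the low 20 bits of the result.
    return ((segment << 4) + offset) & 0xFFFFF


print(f"{physical_address(0xA4FB, 0x4872):05X}h")  # A9822h
```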
It is instructive to see the layout of the segments in memory. Segment 0 starts at address 0000:0000 = 00000h and ends at 0000:FFFF = 0FFFFh. Segment 1 starts at address 0001:0000 = 00010h and ends at 0001:FFFF = 1000Fh. As we can see, there is a lot of overlapping between segments. Figure 3.2 shows the locations of the first three memory segments. The segments start every 10h = 16 bytes and the starting address of a segment always ends with a hex digit 0. We call 16 bytes a paragraph. We call an address that is divisible by 16 (ends with a hex digit 0) a paragraph boundary.
Because segments may overlap, the segment:offset form of an address is not unique, as the following example shows.
Example 3.1 For the memory location whose physical address is specified by 1256Ah, give the address in segment:offset form for segments 1256h and 1240h.
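Solution: Let X be the offset in segment 1256h and Y the offset in segment 1240h. We have
1256Ah = 12560h + X and 1256Ah = 12400h + Y
and so
X = 1256Ah - 12560h = Ah and Y = 1256Ah - 12400h = 16Ah
thus
1256Ah = 1256:000Ah = 1240:016Ah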
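The calculation in Example 3.1 amounts to subtracting the segment's starting address from the physical address. A minimal Python sketch follows; the name `offset_in_segment` is hypothetical, and returning `None` when the location lies outside the segment is a choice made for this illustration.

```python
def offset_in_segment(physical, segment):
    """Return the offset of a physical address within a segment, or None
    if the address does not fall inside that 64-KB segment."""
    offset = physical - (segment << 4)   # segment start = segment * 10h
    if 0 <= offset <= 0xFFFF:
        return offset
    return None


for seg in (0x1256, 0x1240):
    off = offset_in_segment(0x1256A, seg)
    print(f"{seg:04X}:{off:04X}")
```

Both logical addresses printed refer to the same physical byte, which is the non-uniqueness the example illustrates.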
It is also possible to calculate the segment number when the physical address and the offset are given.
Example 3.2 A memory location has physical address 80FD2h. In what segment does it have offset BFD2h?
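Solution: We know that
physical address = segment × 10h + offset
Thus
segment × 10h = physical address - offset
in this example
segment × 10h = 80FD2h - BFD2h = 75000h
So the segment must be 7500h.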
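The reverse calculation of Example 3.2 can be sketched the same way in Python. The function name `segment_for_offset` is illustrative; the check that the segment start falls on a paragraph boundary follows from the fact that segments begin every 16 bytes.

```python
def segment_for_offset(physical, offset):
    """Return the segment number in which `physical` has the given offset,
    or None if no such segment exists."""
    start = physical - offset            # this must equal segment * 10h
    if start < 0 or start % 0x10 != 0:   # not on a paragraph boundary
        return None
    segment = start >> 4
    return segment if segment <= 0xFFFF else None


print(f"{segment_for_offset(0x80FD2, 0xBFD2):04X}h")  # 7500h
```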
Now let us talk about the registers CS, DS, SS, and ES. A typical machine language program consists of instructions (code) and data. There is also a data structure called the stack used by the processor to implement procedure calls. The program's code, data, and stack are loaded into different memory segments; we call them the code segment, data segment, and stack segment.
To keep track of the various program segments, the 8086 is equipped with four segment registers to hold segment numbers. The CS, DS, and SS registers contain the code, data, and stack segment numbers, respectively. If a program needs to access a second data segment, it can use the ES (extra segment) register.
A program segment need not occupy the entire 64 kilobytes in a memory segment. The overlapping nature of the memory segments permits program segments that are less than 64 KB to be placed close together. Figure 3.3 shows a typical layout of the program segments in memory (the segment numbers and the relative placement of the program segments shown are arbitrary).
At any given time, only those memory locations addressed by the four segment registers are accessible; that is, only four memory segments are active. However, the contents of a segment register can be modified by a program to address different segments.
Pointer and Index Registers: SP, BP, SI, DI
The registers SP, BP, SI, and DI normally point to (contain the offset addresses of) memory locations. Unlike segment registers, the pointer and index registers can be used in arithmetic and other operations.
The SP (stack pointer) register is used in conjunction with SS for accessing the stack segment. Operations of the stack are covered in Chapter 8.
The BP (base pointer) register is used primarily to access data on the stack. However, unlike SP, we can also use BP to access data in the other segments.
The SI (source index) register is used to point to memory locations in the data segment addressed by DS. By incrementing the contents of SI, we can easily access consecutive memory locations.
The DI (destination index) register performs the same functions as SI. There is a class of instructions, called string operations, that use DI to access memory locations addressed by ES.
The memory registers covered so far are for data access. To access instructions, the 8086 uses the registers CS and IP. The CS register contains the segment number of the next instruction, and the IP contains the offset. IP is updated each time an instruction is executed so that it will point to the next instruction. Unlike the other registers, the IP cannot be directly manipulated by an instruction; that is, an instruction may not contain IP as its operand.
The purpose of the FLAGS register is to indicate the status of the microprocessor. It does this by the setting of individual bits called flags. There are two kinds of flags: status flags and control flags. The status flags reflect the result of an instruction executed by the processor. For example, when a subtraction operation results in a 0, the ZF (zero flag) is set to 1 (true). A subsequent instruction can examine the ZF and branch to some code that handles a zero result.
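As a rough illustration of how a status flag records the result of an operation, the Python sketch below models a 16-bit subtraction that updates only the zero flag. The names `sub16` and `ZF_MASK` are hypothetical. The bit position used for ZF (bit 6 of FLAGS) is a property of the 8086 that this section does not state, and a real subtraction also affects other status flags that are ignored here.

```python
ZF_MASK = 1 << 6   # ZF is bit 6 of FLAGS on the 8086 (not stated in this section)


def sub16(a, b, flags=0):
    """Subtract b from a as 16-bit values; return (result, new_flags).

    Only ZF is modeled: it is set to 1 when the result is zero and
    cleared otherwise."""
    result = (a - b) & 0xFFFF
    if result == 0:
        flags |= ZF_MASK
    else:
        flags &= ~ZF_MASK
    return result, flags


result, flags = sub16(5, 5)
if flags & ZF_MASK:
    print("zero result: branch to the code that handles it")
```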
The control flags enable or disable certain operations of the processor; for example, if the IF (interrupt flag) is cleared (set to 0), inputs from the keyboard are ignored by the processor. The status flags are covered in Chapter 5, and the control flags are discussed in Chapters 11 and 15.
A computer system is made up of both hardware and software. It is the software that controls the hardware operations. So, to fully understand the operations of the computer, you must also study the software that controls the computer.
The most important piece of software for a computer is the operating system. The purpose of the operating system is to coordinate the operations of all the devices that make up the computer system. Some of the operating system functions are
- reading and executing the commands typed by the user
- performing I/O operations
- generating error messages
- managing memory and other resources
At present, the most popular operating system for the IBM PC is the disk operating system (DOS), also referred to as PC DOS or MS DOS. DOS was designed for the 8086/8088-based computers. Because of this, it can manage only 1 megabyte of memory and it does not support multitasking. However, it can be used on 80286-, 80386-, and 80486-based machines when they run in real address mode.
One of the many functions performed by DOS is reading and writing information on a disk. Programs and other information stored on a disk are organized into files. Each file has a file name, which is made up of one to eight characters followed by an optional file extension of a period followed by one to three characters. The extension is commonly used to identify the type of file. For example, COMMAND.COM has a file name COMMAND and an extension .COM.
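The naming rule just given (one to eight characters, optionally followed by a period and one to three characters) can be expressed as a small Python check. The function name `split_dos_name` is illustrative, and the set of characters accepted here (letters, digits, and underscore) is a simplifying assumption; the text does not list which characters DOS permits.

```python
import re

# Assumption: only letters, digits, and underscore are accepted in this sketch.
_DOS_NAME = re.compile(r"[A-Za-z0-9_]{1,8}(\.[A-Za-z0-9_]{1,3})?")


def split_dos_name(filename):
    """Return (name, extension) for a well-formed file name, else None.

    The extension is returned with its leading period, or as an empty
    string when the file name has no extension."""
    if not _DOS_NAME.fullmatch(filename):
        return None
    name, dot, ext = filename.partition(".")
    return name, dot + ext


print(split_dos_name("COMMAND.COM"))   # ('COMMAND', '.COM')
```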
There are several versions of DOS, with each new version having more capabilities. Most commercial programs require the use of version 2.1 or later. DOS is not just one program; it consists of a number of service routines. The user requests a service by typing a command. The latest version, DOS 5.0, also supports a graphical user interface (gui), allowing the use of a mouse.
The DOS routine that services user commands is called COMMAND.COM. It is responsible for generating the DOS prompt (that is, C>) and reading user commands. There are two types of user commands, internal and external.
Internal commands are performed by DOS routines that have been loaded into memory; external commands may refer to DOS routines that have not been loaded or to application programs. In normal operations, many DOS routines are not loaded into memory so as to save memory space.
Because DOS routines reside on disk, a program must be operating when the computer is powered up to read the disk. In Chapter 1 we mentioned that there are system routines stored in ROM that are not destroyed when the power is off. In the PC, they are called BIOS (Basic Input/Output System) routines.
The BIOS routines perform I/O operations for the PC. Unlike the DOS routines, which operate over the entire PC family, the BIOS routines are machine specific. Each PC model has its own hardware configuration and its own BIOS routines, which invoke the machine's I/O port registers for input and output. The DOS I/O operations are ultimately carried out by the BIOS routines.
Other important functions performed by BIOS are circuit checking and loading of the DOS routines. In section 3.3.4, we discuss the loading of DOS routines.
Figure 3.4 Memory Partitioned into Disjoint Segments
To let DOS and other programs use the BIOS routines, the addresses of the BIOS routines, called interrupt vectors, are placed in memory, starting at 00000h. Some DOS routines also have their addresses stored there.
Because IBM has copyrighted its BIOS routines, IBM compatibles use their own BIOS routines. The degree of compatibility has to do with how well their BIOS routines match the IBM BIOS.
3.3.2.
Memory Organization of the PC
As indicated in section 3.2.3, the 8086/8088 processor is capable of addressing 1 megabyte of memory. However, not all the memory can be used by an application program. Some memory locations have special meaning for the processor. For example, the first kilobyte (00000h to 003FFh) is used for interrupt vectors.
Other memory locations are reserved by IBM for special purposes, such as for BIOS routines and video display memory. The display memory holds the data that are being displayed on the monitor.
To show the memory map of the IBM PC, it is useful to partition the memory into disjoint segments. We start with segment 0, which ends at location 0FFFFh, so the next disjoint segment would begin at 10000h = 1000:0000. Similarly, segment 1000h ends at 1FFFFh and the next disjoint segment begins at 20000h = 2000:0000. Therefore the disjoint segments are 0000h, 1000h, 2000h, ..., F000h, and so memory may be partitioned into 16 disjoint segments. See Figure 3.4.
Only the first 10 disjoint memory segments are used by DOS for loading and running application programs. These ten segments, 0000h to 9000h, give us 640 KB of memory. The memory sizes of 8086/8088-based PCs are given in terms of these memory segments. For example, a PC with a 512-KB memory has only eight of these memory segments.
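The arithmetic behind the 16 disjoint segments and the 640-KB figure can be verified with a short Python sketch. The constant and function names are illustrative only.

```python
SEGMENT_SIZE = 0x10000      # 64 KB per segment
MEMORY_SIZE = 1 << 20       # 1 megabyte addressable by the 8086/8088

# Segment numbers of the disjoint segments: each one starts where the
# previous one ends, so the starting addresses are 64 KB apart.
disjoint_segments = [start >> 4 for start in range(0, MEMORY_SIZE, SEGMENT_SIZE)]


def kilobytes(segment_count):
    """Memory, in KB, covered by the given number of disjoint segments."""
    return segment_count * SEGMENT_SIZE // 1024


print([f"{s:04X}h" for s in disjoint_segments])
print(kilobytes(10), "KB available in the first ten segments")
```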
Figure 3.5 Memory Map of the PC
Segments A000h and B000h are used for video display memory. Segments C000h to E000h are reserved. Segment F000h is a special segment because its circuits are ROM instead of RAM, and it contains the BIOS routines and ROM BASIC. Figure 3.5 shows the memory layout.
Table 3.1 Some Common I/O Port Addresses
Port address    Description
20h-21h         interrupt controller
60h-63h         keyboard controller
200h-20Fh       game controller
2F8h-2FFh       serial port (COM2)
320h-32Fh       hard disk
378h-37Fh       parallel printer port 1
3C0h-3CFh       EGA
3D0h-3DFh       CGA
3F8h-3FFh       serial port (COM1)
The 8086/8088 supports 64 KB of I/O ports. Some common port addresses are given in Table 3.1. In general, direct programming of I/O ports is not recommended because I/O port address usage may vary among computer models.
When the PC is powered up, the 8086/8088 processor is put in a reset state, the CS register is set to FFFFh, and IP is set to 0000h. So the first instruction it executes is located at FFFF0h. This memory location is in ROM, and it contains an instruction that transfers control to the starting point of the BIOS routines.
The BIOS routines first check for system and memory errors, and then initialize the interrupt vectors and BIOS data area. Finally, BIOS loads the operating system from the system disk. This is done in two steps; first, the BIOS loads a small program, called the boot program, then the boot program loads the actual operating system routines. The boot program is so named because it is part of the operating system; having it load the operating system is like the computer pulling itself up by the bootstraps. Using the boot program isolates the BIOS from any changes made to the operating system and lets it be smaller in size. After the operating system is loaded into memory, COMMAND.COM is then given control.
- The IBM personal computer family consists of the PC, PC XT, PC AT, PS/1, and the PS/2 models. They use the Intel 8086 family of microprocessors.
- The 8086 family of microprocessors consists of the 8086, 8088, 80186, 80188, 80286, 80386, 80386SX, 80486, and 80486SX.
- The 8086 and 8088 have the same instruction set, and this forms the basic set of instructions for the other microprocessors.
- The 8086 microprocessor contains 14 registers. They may be classified as data registers, segment registers, pointer and index registers, and the FLAGS register.
- The data registers are AX, BX, CX, and DX. These registers may be used for general purposes, and they also perform special functions. The high and low bytes can be addressed separately.
- Each byte in memory has a 20-bit physical address.
- A segment is a 64-KB block of memory. Addresses in memory may be given in segment:offset form. The physical address is obtained by multiplying the segment number by 10h, and adding the offset.
- The segment registers are CS, DS, SS, and ES. When a machine language program is executing, these registers contain the segment numbers of the code, data, stack, and extra data segments.
- The pointer and index registers are SP, BP, SI, DI, and IP. SP is used exclusively for the stack segment. BP can be used to access the stack segment. SI and DI may be used to access data in arrays.
- The IP contains the offset address of the next instruction to be executed.
- The FLAGS register contains the status and control flags. The status flags are set according to the result of an operation. The control flags may be used to enable or disable certain operations of the microprocessor.
- DOS is a collection of routines that coordinates the operations of the computer. The routine that executes user commands is COMMAND.COM.
- Information stored on disk is organized into files. A file has a name and an optional extension.
- The BIOS routines are used to perform I/O operations. The compatibility of PC clones with the IBM PC depends on how well their BIOS routines match those of the IBM PC.
- The BIOS routines are responsible for system testing and loading the operating system when the machine is turned on.
basic input/output system, BIOS: Routines that handle input and output operations
boot program: The routine that loads the operating system during start-up
code segment: Memory segment containing a machine language program's instructions
COMMAND.COM: The command processor for DOS
control flags: Flags that enable or disable certain actions of the processor
data segment: Memory segment containing a machine language program's data
disk operating system, DOS: The operating system for the IBM PC
external commands: Commands that correspond to routines residing on disk
file: An organized, named collection of data items treated as a single unit for storage on devices such as disks
file extension: A period followed by one to three characters; used to identify the kind of file
file name: A one- to eight-character name of a file
flags: Bits of the FLAGS register
graphical user interface, gui: A user interface with pointers and graphical symbols
internal commands: DOS commands that are executed by routines that are present in memory
interrupt vectors: Addresses of the BIOS and DOS routines
logical address: An address given in the form segment:offset
memory protection: The ability of a processor to protect the memory used by one program from being used by another running program
memory segment: A 64-KB block of memory
multitasking: The ability of a computer to execute several programs at the same time
offset (of a memory location): The number of bytes of the location from the beginning of a segment
operating system: A collection of programs that coordinate the operations of the devices that make up a computer system
paragraph: 16 bytes
paragraph boundary: A hex address ending in 0
physical address: Address of a memory location; 8086-based machines have 20-bit addresses
protected (virtual address) mode: A processor mode in which the memory used by one program is protected from the actions of another program
real address mode: A processor mode in which the addresses used in a program correspond to a physical memory address
segment number: Number that identifies a memory segment
stack: A data structure used by the processor to implement procedure calls
stack segment: Memory segment containing a machine language program's stack
status flags: Flags that reflect the actions of the processor
video display memory: Memory used for storing data for display on the monitor
virtual memory: The ability of the advanced processors to treat external storage as if it were real internal memory, and therefore execute programs that are too large to be contained in internal memory
- What are the main differences between the 80286 and the 8086 processors?
- What are the differences between a register and a memory location?
- List one special function for each of the data registers AX, BX, CX, and DX.
- Determine the physical address of a memory location given by 0A51:CD90h.
- A memory location has a physical address 4A37Bh. Compute
a. the offset address if the segment number is 40FFh.
b. the segment number if the offset address is 123Bh.
- What is a paragraph boundary?
- What determines how compatible an IBM PC clone is with an authentic IBM PC?
- What is the maximum amount of memory that DOS allocates for loading run files? Assume that DOS occupies up to the byte 0FFFFh.
For the following exercises, refer to Appendix B.
- Give DOS commands to do the following. Suppose that A is the logged drive.
a. Copy FILE1 in the current directory to FILE1A on the disk in drive B.
b. Copy all files with an .ASM extension to the disk in drive B.
c. Erase all files with a .BAK extension.
d. List all file names in the current directory that begin with A.
e. Set the date to September 21, 1991.
f. Print the file FILE5.ASM on the printer.
- Suppose that (a) the root directory has subdirectories A, B, and C; (b) A has subdirectories A1 and A2; (c) A1 has a subdirectory A1A.
Give DOS commands to
a. Create the preceding directory tree.
b. Make A1A the current directory.
c. Have DOS display the current directory.
d. Remove the preceding directory tree.
This chapter covers the essential steps in creating, assembling, and executing an assembly language program. By the chapter's end you will be able to write simple but interesting programs that carry out useful tasks, and run them on the computer.
As with any programming language, the first step is to learn the syntax, which for assembly language is relatively simple. Next we show how variables are declared, and introduce basic data movement and arithmetic instructions. Finally, we cover program organization; you'll see that assembly language programs are comprised of code, data, and the stack, just like a machine language program.
Because assembly language instructions are so basic, input/output is much harder in assembly language than in high-level languages. We use DOS functions for I/O; they are easy to invoke and are fast enough for all but the most demanding applications.
An assembly language program must be converted to a machine language program before it can be executed. Section 4.10 explains the steps. To demonstrate, we'll create sample programs. They illustrate some standard assembly language programming techniques and serve as models for the exercises.
Assembly language programs are translated into machine language instructions by an assembler, so they must be written to conform to the assembler's specifications. In this book we use the Microsoft Macro Assembler (MASM). Assembly language code is generally not case sensitive, but we use upper case to differentiate code from the rest of the text.
Programs consist of statements, one per line. Each statement is either an instruction, which the assembler translates into machine code, or an assembler directive, which instructs the assembler to perform some specific task, such as allocating memory space for a variable or creating a procedure. Both instructions and directives have up to four fields:
name operation operand(s) comment
At least one blank or tab character must separate the fields. The fields do not have to be aligned in a particular column, but they must appear in the above order.
An example of an instruction is
START: MOV CX, 5 ; initialize counter
Here, the name field consists of the label START. The operation is MOV, the operands are CX and 5, and the comment is ; initialize counter.
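To make the four-field structure concrete, here is a rough Python sketch that splits an instruction line into its fields. The function name `parse_instruction` is hypothetical and this is not how MASM itself works. It assumes the name field, if present, is a label ending in a colon, and it does not handle directive names (which have no colon) or quoted strings that contain semicolons or commas.

```python
def parse_instruction(line):
    """Split an instruction into name, operation, operands, and comment.

    Assumptions of this sketch: a name is recognized only when it ends
    with ':', and no quoted string contains ';' or ','."""
    code, _, comment = line.partition(";")
    parts = code.split(None, 1)
    name = ""
    if parts and parts[0].endswith(":"):
        name = parts[0][:-1]
        parts = parts[1].split(None, 1) if len(parts) > 1 else []
    operation = parts[0] if parts else ""
    operand_text = parts[1] if len(parts) > 1 else ""
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    return {
        "name": name,
        "operation": operation,
        "operands": operands,
        "comment": comment.strip(),
    }


print(parse_instruction("START: MOV CX, 5 ; initialize counter"))
```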
An example of an assembler directive is
MAIN PROC
MAIN is the name, and the operation field contains PROC. This particular directive creates a procedure called MAIN.
The name field is used for instruction labels, procedure names, and variable names. The assembler translates names into memory addresses.
Names can be from 1 to 31 characters long, and may consist of letters, digits, and the special characters ? . @ _ $ %. Embedded blanks are not allowed. If a period is used, it must be the first character. Names may not begin with a digit. The assembler does not differentiate between upper and lower case in a name.
Examples of legal names:
COUNTER1
@character
SUM_OF_DIGITS
$1000
DONE?
.TEST
Examples of illegal names:
TWO WORDS    contains a blank
2abc         begins with a digit
A45.28       . not first character
YOU&ME       contains an illegal character
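The naming rules above can be captured in a single regular expression. The Python sketch below is only an illustration of those rules as stated in the text; the function name `is_legal_name` is hypothetical, and the sketch does not check for clashes with reserved words.

```python
import re

# First character: a letter, one of ? @ _ $ %, or a period (a period is
# allowed only in this position). Remaining characters: letters, digits,
# or ? @ _ $ %. Total length: 1 to 31 characters.
_NAME = re.compile(r"[A-Za-z?@_$%.][A-Za-z0-9?@_$%]{0,30}")


def is_legal_name(name):
    """Return True if `name` satisfies the naming rules described above."""
    return _NAME.fullmatch(name) is not None


for candidate in ("COUNTER1", "DONE?", ".TEST", "TWO WORDS", "2abc", "A45.28", "YOU&ME"):
    print(f"{candidate!r}: {'legal' if is_legal_name(candidate) else 'illegal'}")
```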
For an instruction, the operation field contains a symbolic operation code (opcode). The assembler translates a symbolic opcode into a machine language opcode. Opcode symbols often describe the operation's function; for example, MOV, ADD, SUB.
In an assembler directive, the operation field contains a pseudo-operation code (pseudo-op). Pseudo-ops are not translated into machine code; rather, they simply tell the assembler to do something. For example, the PROC pseudo-op is used to create a procedure.
For an instruction, the operand field specifies the data that are to be acted on by the operation. An instruction may have zero, one, or two operands. For example,
NOP            ; no operands, does nothing
INC AX         ; one operand, adds 1 to the contents of AX
ADD WORD1,2    ; two operands, adds 2 to the contents of memory word WORD1
In a two-operand instruction, the first operand is the destination operand. It is the register or memory location where the result is stored (note: some instructions don't store the result). The second operand is the source operand. The source is usually not modified by the instruction.
For an assembler directive, the operand field usually contains more information about the directive.
The comment field of a statement is used by the programmer to say something about what the statement does. A semicolon marks the beginning of this field, and the assembler ignores anything typed after the semicolon. Comments are optional, but because assembly language is so low-level, it is almost impossible to understand an assembly language program without comments. In fact, good programming practice dictates a comment on almost every line. The art of good commentary is developed through practice. Don't say something obvious, like this:
MOV CX, 0 ; move 0 to CX
Instead, use comments to put the instruction into the context of the program:
MOV CX, 0 ; CX counts terms, initially 0
It is also permissible to make an entire line a comment, and to use them to create space in a program:
;
; initialize registers
;
MOV AX, 0
MOV BX, 0
The processor operates only on binary data. Thus, the assembler must translate all data representation into binary numbers. However, in an assembly language program we may express data as binary, decimal, or hex numbers, and even as characters.
A binary number is written as a bit string followed by the letter "B" or "b"; for example, 1010B.
A decimal number is a string of decimal digits, ending with an optional "D" or "d".
A hex number must begin with a decimal digit and end with the letter "H" or "h"; for example, 0ABCH (the reason for this is that the assembler would be unable to tell whether a symbol such as "ABCH" represents the variable name "ABCH" or the hex number ABC).
Any of the preceding numbers may have an optional sign.
Here are examples of legal and illegal numbers for MASM:
Number     Type
11011      decimal
11011B     binary
4223       decimal
-21843D    decimal
1,234      illegal: contains a nondigit character
1B4DH      hex
1B4D       illegal hex number: doesn't end in "H"
FFFH       illegal hex number: doesn't begin with a decimal digit
0FFFH      hex
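The rules for writing numbers can be summarized by a small Python sketch that classifies a literal and computes its value. The function names `classify_number` and `parse_number` are illustrative, and the sketch covers only the three forms described in this section; it makes no claim about other literal forms an assembler may accept.

```python
import re

_BINARY = re.compile(r"[+-]?[01]+[Bb]")
_HEX = re.compile(r"[+-]?[0-9][0-9A-Fa-f]*[Hh]")   # must begin with a decimal digit
_DECIMAL = re.compile(r"[+-]?[0-9]+[Dd]?")


def classify_number(text):
    """Return 'binary', 'hex', or 'decimal', or None for an illegal number."""
    if _BINARY.fullmatch(text):
        return "binary"
    if _HEX.fullmatch(text):
        return "hex"
    if _DECIMAL.fullmatch(text):
        return "decimal"
    return None


def parse_number(text):
    """Return the integer value of a legal number; raise ValueError otherwise."""
    kind = classify_number(text)
    if kind is None:
        raise ValueError(f"illegal number: {text!r}")
    # Drop the trailing B, H, or D; a decimal number may have no suffix.
    digits = text if kind == "decimal" and text[-1] not in "Dd" else text[:-1]
    return int(digits, {"binary": 2, "decimal": 10, "hex": 16}[kind])


for literal in ("11011", "11011B", "-21843D", "1,234", "0FFFH", "FFFH"):
    print(literal, "->", classify_number(literal))
```

Note that a string such as 184D is a legal decimal number under these rules, because the trailing "D" is read as the decimal suffix.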
Characters and character strings must be enclosed in single or double quotes: for example, "A" or 'hello'. Characters are translated into their ASCII codes by the assembler, so there is no difference between using "A" and 41h (the ASCII code for "A") in a program.
Table 4.1 Data-Defining Pseudo-ops
Pseudo-op    Stands for
DB           define byte
DW           define word
DD           define doubleword (two consecutive words)
DQ           define quadword (four consecutive words)
DT           define tenbytes (ten consecutive bytes)
4.3
Variables play the same role in assembly language that they do in high-level languages. Each variable has a data type and is assigned a memory address by the program. The data-defining pseudo-ops and their meanings are listed in Table 4.1. Each pseudo-op can be used to set aside one or more data items of the given type.
In this section we use DB and DW to define byte variables, word variables, and arrays of bytes and words. The other data-defining pseudo-ops are used in Chapter 18 in connection with multiple-precision and non-integer operations.
4.3.1
The assembler directive that defines a byte variable takes the following form:
name DB initial value
where the pseudo- op DB stands for "Define Byte".
For example,
ALPHA DB 4
This directive causes the assembler to associate a memory byte with the name ALPHA, and initialize it to 4. A question mark ("?") used in place of an initial value sets aside an uninitialized byte; for example,
BYT DB ?
The decimal range of initial values that can be specified is -128 to 127 if a signed interpretation is being given, or 0 to 255 for an unsigned interpretation. These are the ranges of values that fit in a byte.
4.3.2
The assembler directive for defining a word variable has the following form:
name DW initial value
The pseudo-op DW means "Define Word." For example,
WRD DW -2
As with byte variables, a question mark in place of an initial value means an uninitialized word. The decimal range of initial values that can be specified is -32768 to 32767 for a signed interpretation, or 0 to 65535 for an unsigned interpretation.
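The ranges quoted for byte and word initial values follow from the number of bits available. The Python sketch below checks a value against the combined signed and unsigned range and shows the bit pattern that would be stored, using the two's complement representation from Chapter 2. The function name `encode` is hypothetical.

```python
def encode(value, bits):
    """Return the unsigned bit pattern that stores `value` in `bits` bits.

    Accepts anything in the signed range or the unsigned range, that is,
    -2**(bits-1) through 2**bits - 1; raises ValueError otherwise."""
    lowest = -(1 << (bits - 1))
    highest = (1 << bits) - 1
    if not lowest <= value <= highest:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & highest          # two's complement for negative values


print(f"DB 4   -> {encode(4, 8):02X}h")
print(f"DW -2  -> {encode(-2, 16):04X}h")
```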
Arrays
In assembly language, an array is just a sequence of memory bytes or words. For example, to define a three-byte array called B_ARRAY, whose initial values are 10h, 20h, and 30h, we can write
B_ARRAY DB 10H,20H,30H
The name B_ARRAY is associated with the first of these bytes, B_ARRAY+1 with the second, and B_ARRAY+2 with the third. If the assembler assigns the offset address 0200h to B_ARRAY, then memory would look like this:
Symbol       Address    Contents
B_ARRAY      0200h      10h
B_ARRAY+1    0201h      20h
B_ARRAY+2    0202h      30h
In the same way, an array of words may be defined. For example,
W_ARRAY DW 1000,40,29887,329
sets up an array of four words, with initial values 1000, 40, 29887, and 329. The initial word is associated with the name W_ARRAY, the next one with W_ARRAY + 2, the next with W_ARRAY + 4, and so on. If the array starts at 0300h, it will look like this:
Symbol       Address    Contents
W_ARRAY      0300h      1000d
W_ARRAY+2    0302h      40d
W_ARRAY+4    0304h      29887d
W_ARRAY+6    0306h      329d
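The addresses of array elements follow directly from the element size: one byte per element for a DB array and two bytes per element for a DW array. A minimal Python sketch, with the illustrative function name `element_offsets`:

```python
def element_offsets(start, count, element_size):
    """Return the offset address of each element of an array."""
    return [start + i * element_size for i in range(count)]


# Three bytes starting at 0200h, as for B_ARRAY.
print([f"{a:04X}h" for a in element_offsets(0x0200, 3, 1)])
# Four words starting at 0300h, as for W_ARRAY.
print([f"{a:04X}h" for a in element_offsets(0x0300, 4, 2)])
```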
Sometimes we need to refer to the high and low bytes of a word variable. Suppose we define
WORD1 DW 1234H
The low byte of WORD1 contains 34h, and the high byte contains 12h. The low byte has symbolic address WORD1, and the high byte has symbolic address WORD1+1.
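The following sketch, offered as an illustration rather than as part of the original text, shows the same layout rule at work: each word is stored low byte first, so a word array can be written out byte by byte. W_BYTES is a name invented here; the program skeleton and the DS initialization are the standard forms explained later in this chapter.

```asm
TITLE LAYOUT: HOW ARRAYS AND WORDS SIT IN MEMORY (illustrative sketch)
.MODEL SMALL
.STACK 100H
.DATA
B_ARRAY DB  10H,20H,30H
W_ARRAY DW  1000,40,29887,329   ;03E8h, 0028h, 74BFh, 0149h
W_BYTES DB  0E8H,03H,28H,00H    ;the same eight bytes as W_ARRAY,
        DB  0BFH,74H,49H,01H    ;written out low byte first
WORD1   DW  1234H               ;34h at WORD1, 12h at WORD1+1
.CODE
MAIN    PROC
        MOV AX,@DATA
        MOV DS,AX               ;make the variables addressable
        MOV AL,B_ARRAY+1        ;AL = 20h, the second byte
        MOV BX,W_ARRAY+2        ;BX = 0028h (40), the second word
        MOV CX,W_ARRAY+6        ;CX = 0149h (329), the fourth word
        MOV AH,4CH              ;DOS exit function
        INT 21H                 ;return to DOS
MAIN    ENDP
        END MAIN
```

Word elements are two bytes apart, which is why the second word is at W_ARRAY+2 and not W_ARRAY+1.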
An array of ASCII codes can be initialized with a string of characters. For example,
LETTERS DB 'ABC'
is equivalent to
LETTERS DB 41H, 42H, 43H
Inside a string, the assembler differentiates between upper and lower case. Thus, the string "abc" is translated into three bytes with values 61h, 62h, and 63h.
It is possible to combine characters and numbers in one definition; for example,
MSG DB 'HELLO',0AH,0DH,'$'
is equivalent to
MSG DB 48H,45H,4CH,4CH,4FH,0AH,0DH,24H
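To see this equivalence on the screen, the sketch below defines the same six bytes twice, once with characters and once in hex, and displays both. MSG_A and MSG_B are names invented for this example; LEA and INT 21h, function 9 (display string), are introduced later in this chapter and are used here only to make the result visible.

```asm
TITLE STRBYTES: A STRING IS JUST A LIST OF BYTES (illustrative sketch)
.MODEL SMALL
.STACK 100H
.DATA
MSG_A   DB  'Hi!',0DH,0AH,'$'       ;characters and numbers mixed
MSG_B   DB  48H,69H,21H,0DH,0AH,24H ;the same six bytes in hex
.CODE
MAIN    PROC
        MOV AX,@DATA
        MOV DS,AX       ;initialize DS
        LEA DX,MSG_A    ;offset of first string
        MOV AH,9        ;display string function
        INT 21H         ;displays Hi! and moves to a new line
        LEA DX,MSG_B    ;offset of second string
        MOV AH,9        ;display string function
        INT 21H         ;displays Hi! again
        MOV AH,4CH      ;DOS exit function
        INT 21H         ;return to DOS
MAIN    ENDP
        END MAIN
```

Both strings should produce identical output, because the assembler stores identical bytes for them.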
To make assembly language code easier to understand, it is often desirable to use a symbolic name for a constant quantity.
To assign a name to a constant, we can use the EQU (equates) pseudo-op. The syntax is
name EQU constant
For example, the statement
LF EQU 0AH
assigns the name LF to 0Ah, the ASCII code of the line feed character. The name LF may now be used in place of 0Ah anywhere in the program. Thus, the assembler translates the instructions
MOV DL,0AH
and
MOV DL,LF
into the same machine instruction.
The symbol on the right of an EQU can also be a string. For example,
PROMPT EQU 'TYPE YOUR NAME'
Then instead of
MSG DB 'TYPE YOUR NAME'
we could say
MSG DB PROMPT
Note: no memory is allocated for EQU names.
There are over a hundred instructions in the instruction set for the 8086 CPU; there are also instructions designed especially for the more advanced processors (see Chapter 20). In this section we discuss six of the most useful instructions for transferring data and doing arithmetic. The instructions we present can be used with either byte or word operands.
In the following, WORD1 and WORD2 are word variables, and BYTE1 and BYTE2 are byte variables. Recall from Chapter 3 that AH is the high byte of register AX, and BL is the low byte of BX.
The MOV instruction is used to transfer data between registers, between a register and a memory location, or to move a number directly into a register or memory location. The syntax is
MOV destination, source
Here are some examples:
MOV AX,WORD1
This reads "Move WORD1 to AX". The contents of register AX are replaced by the contents of memory location WORD1. The contents of WORD1 are unchanged. In other words, a copy of WORD1 is sent to AX (Figure 4.1).
MOV AX,BX
AX gets what was previously in BX. BX is unchanged.
MOV AH,'A'
This is a move of the number 041h (the ASCII code of "A") into register AH. The previous value of AH is overwritten (replaced by the new value).
Table 4.2 Legal Combinations of Operands for MOV and XCHG
MOV
| | Destination Operand | | | |
| Source Operand | General register | Segment register | Memory location | Constant |
| General register | yes | yes | yes | no |
| Segment register | yes | no | yes | no |
| Memory location | yes | yes | no | no |
| Constant | yes | no | yes | no |
XCHG
| | Destination Operand | |
| Source Operand | General register | Memory location |
| General register | yes | yes |
| Memory location | yes | no |
The XCHG (exchange) operation is used to exchange the contents of two registers, or a register and a memory location. The syntax is
XCHG destination, source
An example is
XCHG AH, BL
This instruction swaps the contents of AH and BL, so that AH contains what was previously in BL and BL contains what was originally in AH (Figure 4.2). Another example is
XCHG AX, WORD1
which swaps the contents of AX and memory location WORD1.
For technical reasons, there are a few restrictions on the use of MOV and XCHG. Table 4.2 shows the allowable combinations. Note in particular that a MOV or XCHG between memory locations is not allowed. For example,
ILLEGAL: MOV WORD1,WORD2
but we can get around this restriction by using a register:
MOV AX, WORD2
MOV WORD1, AX
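The same idea handles an exchange of two memory words, which XCHG cannot do directly. The sketch below is one possible way, not the only one; the initial values are arbitrary, and the program skeleton around the three working instructions is the standard form explained later in this chapter.

```asm
TITLE SWAP: EXCHANGE TWO MEMORY WORDS (illustrative sketch)
.MODEL SMALL
.STACK 100H
.DATA
WORD1   DW  1234H
WORD2   DW  5678H
.CODE
MAIN    PROC
        MOV AX,@DATA
        MOV DS,AX       ;initialize DS
        MOV AX,WORD1    ;AX = 1234h
        XCHG AX,WORD2   ;AX = 5678h, WORD2 = 1234h
        MOV WORD1,AX    ;WORD1 = 5678h
        MOV AH,4CH      ;DOS exit function
        INT 21H         ;return to DOS
MAIN    ENDP
        END MAIN
```

Each instruction involves at most one memory operand, so every line is a legal combination in Table 4.2.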
Figure 4.3 ADD WORD1,AX
4.5.2 ADD, SUB, INC, and DEC
The ADD and SUB instructions are used to add or subtract the contents of two registers, a register and a memory location, or to add (subtract) a number to (from) a register or memory location. The syntax is
ADD destination, source
SUB destination, source
For example,
ADD WORD1,AX
This instruction, "Add AX to WORD1," causes the contents of AX and memory word WORD1 to be added, and the sum is stored in WORD1. AX is unchanged (Figure 4.3).
SUB AX,DX
In this example, "Subtract DX from AX," the value of DX is subtracted from the value of AX, with the difference being stored in AX. DX is unchanged (Figure 4.4).
Table 4.3 Legal Combinations of Operands for ADD and SUB
| | Destination Operand | |
| Source Operand | General register | Memory location |
| General register | yes | yes |
| Memory location | yes | no |
| Constant | yes | yes |
ADD BL, 5
This is an addition of the number 5 to the contents of register BL.
As was the case with MOV and XCHG, there are some restrictions on the combinations of operands allowable with ADD and SUB. The legal ones are summarized in Table 4.3. Direct addition or subtraction between memory locations is illegal; for example,
ILLEGAL: ADD BYTE1, BYTE2
A solution is to move BYTE2 to a register before adding, thus
MOV AL,BYTE2 ;AL gets BYTE2
ADD BYTE1,AL ;add it to BYTE1
INC (increment) is used to add 1 to the contents of a register or memory location and DEC (decrement) subtracts 1 from a register or memory location. The syntax is
INC destination
DEC destination
For example,
INC WORD1
adds 1 to the contents of WORD1 (Figure 4.5).
DEC BYTE1
subtracts 1 from variable BYTE1 (Figure 4.6).
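As a small worked example of these four instructions together, the sketch below follows a byte and a word through several operations, with the expected value after each step given in the comments. The initial values are arbitrary choices for illustration, and the surrounding skeleton is the standard program form explained later in this chapter.

```asm
TITLE ARITH: ADD, SUB, INC, AND DEC (illustrative sketch)
.MODEL SMALL
.STACK 100H
.DATA
BYTE1   DB  10
BYTE2   DB  3
WORD1   DW  100
.CODE
MAIN    PROC
        MOV AX,@DATA
        MOV DS,AX       ;initialize DS
        MOV AL,BYTE2    ;AL = 3
        ADD BYTE1,AL    ;BYTE1 = 10 + 3 = 13
        SUB BYTE1,5     ;BYTE1 = 13 - 5 = 8 (constant source is legal)
        INC BYTE1       ;BYTE1 = 9
        ADD WORD1,25    ;WORD1 = 100 + 25 = 125
        DEC WORD1       ;WORD1 = 124
        MOV AH,4CH      ;DOS exit function
        INT 21H         ;return to DOS
MAIN    ENDP
        END MAIN
```

In every case the result replaces the destination operand, and the source operand is left unchanged.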
Figure 4.7 NEG BX
4.5.3 NEG
NEG is used to negate the contents of the destination. NEG does this by replacing the contents by its two's complement. The syntax is
NEG destination
The destination may be a register or memory location. For example,
NEG BX
negates the contents of BX (Figure 4.7).
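A few concrete values may make the two's complement replacement easier to picture. The sketch below is illustrative only; the values are chosen for this example, and the last case is included because -128 is the one byte value whose negation does not fit in the signed byte range.

```asm
TITLE NEGDEMO: NEG ON REGISTERS AND MEMORY (illustrative sketch)
.MODEL SMALL
.STACK 100H
.DATA
WORD1   DW  5
.CODE
MAIN    PROC
        MOV AX,@DATA
        MOV DS,AX       ;initialize DS
        MOV BX,1        ;BX = 0001h
        NEG BX          ;BX = 0FFFFh, the 16-bit form of -1
        NEG BX          ;BX = 0001h again
        NEG WORD1       ;WORD1 = 0FFFBh, the 16-bit form of -5
        MOV AL,80H      ;AL = 80h, which is -128 as a signed byte
        NEG AL          ;AL is still 80h: +128 does not fit in a signed byte
        MOV AH,4CH      ;DOS exit function
        INT 21H         ;return to DOS
MAIN    ENDP
        END MAIN
```

Applying NEG twice restores the original value, as the two NEG BX instructions show.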
The operands of the preceding two-operand instructions must be of the same type; that is, both bytes or both words. Thus an instruction such as
MOV AX,BYTE1 ;illegal
is not allowed. However, the assembler will accept both of the following instructions:
MOV AH,'A'
and
MOV AX,'A'
In the former case, the assembler reasons that since the destination AH is a byte, the source must be a byte, and it moves 41h into AH. In the latter case, it assumes that because the destination is a word, so is the source, and it moves 0041h into AX.
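One common way around the type restriction, shown here as a sketch under the assumption that the byte holds an unsigned value, is to load the byte into the low half of a word register and clear the high half. The variable values are invented for the example, and the program skeleton is the standard form explained later in this chapter.

```asm
TITLE TYPES: MATCHING OPERAND SIZES (illustrative sketch)
.MODEL SMALL
.STACK 100H
.DATA
BYTE1   DB  200
WORD1   DW  ?
.CODE
MAIN    PROC
        MOV AX,@DATA
        MOV DS,AX       ;initialize DS
        MOV AL,BYTE1    ;byte to byte register: AL = 0C8h (200)
        MOV AH,0        ;clear the high byte, so AX = 00C8h
        MOV WORD1,AX    ;word to word: WORD1 = 200
        MOV AH,'A'      ;byte destination: AH = 41h
        MOV AX,'A'      ;word destination: AX = 0041h
        MOV AH,4CH      ;DOS exit function
        INT 21H         ;return to DOS
MAIN    ENDP
        END MAIN
```

Clearing AH is correct only for an unsigned interpretation of the byte; a negative signed byte would need its sign carried into the high byte instead.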
4.6 Translation of High-Level Language to Assembly Language
To give you a feeling for the preceding instructions, we'll translate some high-level language assignment statements into assembly language. Only MOV, ADD, SUB, INC, DEC, and NEG are used, although in some cases a better job could be done by using instructions that are covered later. In the discussion, A and B are word variables.
Statement       Translation
B = A           MOV AX,A        ;move A into AX
                MOV B,AX        ;and then into B
As was pointed out earlier, a direct memory-memory move is illegal, so we must move the contents of A into a register before moving it to B.
A = 5 - A       MOV AX,5        ;put 5 in AX
                SUB AX,A        ;AX contains 5 - A
                MOV A,AX        ;put it in A
This example illustrates one approach to translating assignment statements: do the arithmetic in a register—for example, AX—then move the result into the destination variable. In this case, there is another, shorter way:
NEG A           ;A = -A
ADD A,5         ;A = 5 - A
The next example shows how to do multiplication by a constant.
A = B - 2 x A MOV AX, B ; AX has B
SUB AX, A ; AX has B - A
SUB AX, A ; AX has B - 2 x A
MOV A, AX ; move result to A
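To check these translations against actual numbers, the sketch below runs the three statements one after another, starting from A = 3 and B = 10 (values chosen only for illustration). The comments trace the expected contents after each instruction.

```asm
TITLE XLATE: THE THREE TRANSLATIONS WITH SAMPLE VALUES (illustrative sketch)
.MODEL SMALL
.STACK 100H
.DATA
A       DW  3
B       DW  10
.CODE
MAIN    PROC
        MOV AX,@DATA
        MOV DS,AX       ;initialize DS
;B = A
        MOV AX,A        ;AX = 3
        MOV B,AX        ;B = 3
;A = 5 - A, using the shorter form
        NEG A           ;A = -3 (0FFFDh)
        ADD A,5         ;A = 2
;A = B - 2 x A
        MOV AX,B        ;AX = 3
        SUB AX,A        ;AX = 3 - 2 = 1
        SUB AX,A        ;AX = 1 - 2 = -1 (0FFFFh)
        MOV A,AX        ;A = -1
        MOV AH,4CH      ;DOS exit function
        INT 21H         ;return to DOS
MAIN    ENDP
        END MAIN
```

The final value agrees with the high-level statement: with B = 3 and A = 2, B - 2 x A is -1.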
Chapter 3 noted that machine language programs consist of code, data, and stack. Each part occupies a memory segment. The same organization is reflected in an assembly language program. This time, the code, data, and stack are structured as program segments. Each program segment is translated into a memory segment by the assembler.
We will use the simplified segment definitions that were introduced for the Microsoft Macro Assembler (MASM), version 5.0. They are discussed further in Chapter 14, along with the full segment definitions.
The size of code and data a program can have is determined by specifying a memory model using the .MODEL directive. The syntax is
.MODEL memory_model
The most frequently used memory models are SMALL, MEDIUM, COMPACT, and LARGE. They are described in Table 4.4. Unless there is a lot of code or data, the appropriate model is SMALL. The .MODEL directive should come before any segment definition.
Table 4.4 Memory Models
| Model | Description |
| SMALL | code in one segment; data in one segment |
| MEDIUM | code in more than one segment; data in one segment |
| COMPACT | code in one segment; data in more than one segment |
| LARGE | code in more than one segment; data in more than one segment; no array larger than 64k bytes |
| HUGE | code in more than one segment; data in more than one segment; arrays may be larger than 64k bytes |
A program's data segment contains all the variable definitions. Constant definitions are often made here as well, but they may be placed elsewhere in the program since no memory allocation is involved. To declare a data segment, we use the directive .DATA, followed by variable and constant declarations. For example,
.DATA
WORD1 DW 2
WORD2 DW 5
MSG DB 'THIS IS A MESSAGE'
MASK EQU 10010010B
The purpose of the stack segment declaration is to set aside a block of memory (the stack area) to store the stack. The stack area should be big enough to contain the stack at its maximum size. The declaration syntax is
.STACK size
where size is an optional number that specifies the stack area size in bytes. For example,
.STACK 100H
sets aside 100h bytes for the stack area (a reasonable size for most applications). If size is omitted, 1 KB is set aside for the stack area.
The code segment contains a program's instructions. The declaration syntax is
.CODE name
where name is the optional name of the segment (a name should be omitted in a SMALL program, because supplying one causes the assembler to generate an error).
Inside a code segment, instructions are organized as procedures. The simplest procedure definition is
name PROC
;body of the procedure
name ENDP
where name is the name of the procedure; PROC and ENDP are pseudo-ops that delineate the procedure.
Here is an example of a code segment definition:
.CODE
MAIN PROC
;main procedure instructions
MAIN ENDP
;other procedures go here
Now that you have seen all the program segments, we can construct the general form of a SMALL model program. With minor variations, this form may be used in most applications:
.MODEL SMALL
.STACK 100H
.DATA
;data definitions go here
.CODE
MAIN PROC
; instructions go here
MAIN ENDP
; other procedures go here
END MAIN
The last line in the program should be the END directive, followed by the name of the main procedure.
4.8
In Chapter 1, you saw that the CPU communicates with the peripherals through I/O registers called I/O ports. There are two instructions, IN and OUT, that access the ports directly. These instructions are used when fast I/O is essential; for example, in a game program. However, most applications programs do not use IN and OUT because (1) port addresses vary among computer models, and (2) it's much easier to program I/O with the service routines provided by the manufacturer.
There are two categories of I/O service routines: (1) the Basic Input/Output System (BIOS) routines and (2) the DOS routines. The BIOS routines are stored in ROM and interact directly with the I/O ports. In Chapter 12, we use them to carry out basic screen operations such as moving the cursor and scrolling the screen. The DOS routines can carry out more complex tasks; for example, printing a character string; actually they use the BIOS routines to perform direct I/O operations.
To invoke a DOS or BIOS routine, the INT (interrupt) instruction is used. It has the format
INT interrupt_number
where interrupt_number is a number that specifies a routine. For example, INT 16h invokes a BIOS routine that performs keyboard input. Chapter 15 covers the INT instruction in more detail. In the following, we use a particular DOS routine, INT 21h.
4.8.1
INT 21h
INT 21h may be used to invoke a large number of DOS functions (see Appendix C); a particular function is requested by placing a function number in the AH register and invoking INT 21h. Here we are interested in the following functions:
| Function number | Routine |
| 1 | single-key input |
| 2 | single-character output |
| 9 | character string output |
INT 21h functions expect input values to be in certain registers and return output values in other registers. These are listed as we describe each function.
Function 1:
Single-Key Input
Input: AH = 1
Output: AL = ASCII code if character key is pressed
           = 0 if non-character key is pressed
To invoke the routine, execute these instructions:
MOV AH,1 ;input key function
INT 21h ;ASCII code in AL
The processor will wait for the user to hit a key if necessary. If a character key is pressed, AL gets its ASCII code; the character is also displayed on the screen. If any other key is pressed, such as an arrow key, F1-F10, and so on, AL will contain 0. The instructions following the INT 21h can examine AL and take appropriate action.
Because INT 21h, function 1, doesn't prompt the user for input, he or she might not know whether the computer is waiting for input or is occupied by some computation. The next function can be used to generate an input prompt.
Function 2:
Display a character or execute a control function
Input: AH = 2
       DL = ASCII code of the display character or control character
Output: AL = ASCII code of the display character or control character
To display a character with this function, we put its ASCII code in DL. For example, the following instructions cause a question mark to appear on the screen:
MOV AH,2 ;display character function
MOV DL,'?' ;character is '?'
INT 21h ;display character
After the character is displayed, the cursor advances to the next position on the line (if at the end of the line, the cursor moves to the beginning of the next line).
Function 2 may also be used to perform control functions. If DL contains the ASCII code of a control character, INT 21h causes the control function to be performed. The principal control characters are as follows:
| ASCII code (Hex) | Symbol | Function |
| 7 | BEL | beep (sounds a tone) |
| 8 | BS | backspace |
| 9 | HT | tab |
| A | LF | line feed (new line) |
| D | CR | carriage return (start of current line) |
On execution, AL gets the ASCII code of the control character.
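The short program below is a sketch of how display characters and control characters can be mixed with function 2. It displays an asterisk, backs up over it, overwrites it with "#", and then sounds the beep. The choice of characters is ours, and the described screen behavior assumes the standard DOS console; the program skeleton and the exit through INT 21h, function 4Ch, are explained later in this chapter.

```asm
TITLE CTRLCHR: CONTROL CHARACTERS WITH FUNCTION 2 (illustrative sketch)
.MODEL SMALL
.STACK 100H
.CODE
MAIN    PROC
        MOV AH,2        ;display character function
        MOV DL,'*'      ;character is '*'
        INT 21H         ;display it
        MOV DL,8        ;BS, backspace
        INT 21H         ;cursor moves back onto the '*'
        MOV DL,'#'      ;character is '#'
        INT 21H         ;'#' replaces the '*' on the screen
        MOV DL,7        ;BEL
        INT 21H         ;sound the beep
        MOV AH,4CH      ;DOS exit function
        INT 21H         ;return to DOS
MAIN    ENDP
        END MAIN
```

AH is loaded only once because function 2 changes AL, not AH, so the function number is still in place for each following INT 21h.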
Our first program will read a character from the keyboard and display it at the beginning of the next line.
We start by displaying a question mark:
MOV AH,2 ;display character function
MOV DL,'?' ;character is '?'
INT 21h ;display character
The second instruction moves 3Fh, the ASCII code for "?", into DL. Next we read a character:
MOV AH,1 ;read character function
INT 21h ;character in AL
Now we would like to display the character on the next line. Before doing so, the character must be saved in another register. (We'll see why in a moment.)
MOV BL,AL ;save it in BL
To move the cursor to the beginning of the next line, we must execute a carriage return and line feed. We can perform these functions by putting the ASCII codes for them in DL and executing INT 21h.
MOV AH,2 ;display character function
MOV DL,0DH ;carriage return
INT 21h ;execute carriage return
MOV DL,0AH ;line feed
INT 21h ;execute line feed
The reason why we had to move the input character from AL to BL is that INT 21h, function 2, changes AL.
Finally we are ready to display the character:
MOV DL,BL ;get character
INT 21h ;and display it
Here is the complete program:
TITLE PGM4_1: ECHO PROGRAM
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
;display prompt
MOV AH,2 ;display character function
MOV DL, '?' ;character is '?'
INT 21H ;display it
;input a character
MOV AH, 1 ;read character function
INT 21H ;character in AL
MOV BL, AL ;save it in BL
;go to a new line.
MOV AH, 2 ;display character function
MOV DL, 0DH ;carriage return
INT 21H ;execute carriage return
MOV DL, 0AH ;line feed
INT 21H ;execute line feed
;display character
MOV DL, BL ;retrieve character
INT 21H ;and display it
;return to DOS
MOV AH, 4CH ;DOS exit function
INT 21H ;exit to DOS
MAIN ENDP
END MAIN
Because no variables were used, the data segment was omitted.
The last two lines in the MAIN procedure require some explanation. When a program terminates, it should return control to DOS. This can be accomplished by executing INT 21h, function 4Ch.
We are now ready to look at the steps involved in creating and running a program. The preceding program is used to demonstrate the process. The four steps are (Figure 4.8):
1. Use a text editor or word processor to create a source program file.
2. Use an assembler to create a machine language object file.
3. Use the LINK program (see description later) to link one or more object files to create a run file.
4. Execute the run file.
In this demonstration, the system files we need (assembler and linker) are in drive C and the programmer's disk is in drive A. We make A the default drive so that the files created will be stored on the programmer's disk.
We used an editor to create the preceding program, with file name PGM4_1.ASM. The .ASM extension is the conventional extension used to identify an assembly language source file.
We use the Microsoft Macro Assembler (MASM) to translate the source file PGM4_1.ASM into a machine language object file called PGM4_1.OBJ. The simplest command is (user's response appears in boldface):
A>C:MASM PGM4_1;
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
50060 + 418673 Bytes symbol space free
0 Warning Errors
0 Severe Errors
After printing copyright information, MASM checks the source file for syntax errors. If it finds any, it will display the line number of each error and a short description. Because there are no errors here, it translates the assembly language code into a machine language object file named PGM4_1.OBJ.
The semicolon after the preceding command means that we don't want certain optional files generated. Let's omit it and see what happens:
A>C:MASM PGM4_1
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
Object filename [PGM4_1.OBJ]:
Source listing [NUL.LST]: PGM4_1
Cross-reference [NUL.CRF]: PGM4_1
5066 - 418673 bytes symbol space free
0 Warning Errors
0 Severe Errors
This time MASM prints the names of the files it can create, then waits for us to supply names for the files. The default names are enclosed in square brackets. To accept a name, just press return. The default name NUL means that no file will be created unless the user does specify a name, so we reply with the name PGM4_1.
The source listing file (.LST file) is a line-numbered text file that displays assembly language code and the corresponding machine code side by side, and gives other information about the program. It is especially helpful for debugging purposes, because MASM's error messages refer to line numbers.
The cross-reference file (.CRF file) is a listing of names that appear in the program and the line numbers on which they occur. It is useful in locating variables and labels in a large program.
Examples of .LST and .CRF files are shown in Appendix D, along with other MASM options.
The .OBJ file created in step 2 is a machine language file, but it cannot be executed because it doesn't have the proper run file format. In particular,
- because it is not known where a program will be loaded in memory for execution, some machine code addresses may not have been filled in.
- some names used in the program may not have been defined in the program. For example, it may be necessary to create several files for a large program and a procedure in one file may refer to a name defined in another file.
The LINK program takes one or more object files, fills in any missing addresses, and combines the object files into a single executable file (.EXE file). This file can be loaded into memory and run. To link the program, type
C:LINK PGM4_1;
As before, if the semicolon is omitted, the linker will prompt you for names of the output files generated. See Appendix D.
To run it, just type the run file name, with or without the .EXE extension.
A>PGM4_1
?A
A
The program prints a "?" and waits for us to enter a character. We enter "A" and the program echoes it on the next line.
In our first program, we used INT 21h, functions 1 and 2, to read and display a single character. Here is another INT 21h function that can be used to display a character string:
INT 21h, Function 9:
Display a String
Input: DX = offset address of string.
The string must end with a '$' character.
The "$" marks the end of the string and is not displayed. If the string contains the ASCII code of a control character, the control function is performed.
To demonstrate this function, we will write a program that prints "HELLO!" on the screen. This message is defined in the data segment as
MSG DB 'HELLO!$'
INT 21h, function 9, expects the offset address of the character string to be in DX. To get it there, we use a new instruction:
LEA destination, source
where destination is a general register and source is a memory location. LEA stands for "Load Effective Address." It puts a copy of the source offset address into the destination. For example,
LEA DX,MSG
puts the offset address of the variable MSG into DX.
Because our second program contains a data segment, it will begin with instructions that initialize DS. The following paragraph explains why these instructions are needed.
When a program is loaded in memory, DOS prefaces it with a 256-byte program segment prefix (PSP). The PSP contains information about the program. So that programs may access this area, DOS places its segment number in both DS and ES before executing the program. The result is that DS does not contain the segment number of the data segment. To correct this, a program containing a data segment begins with these two instructions:
MOV AX,@DATA
MOV DS,AX
@DATA is the name of the data segment defined by .DATA. The assembler translates the name @DATA into a segment number. Two instructions are needed because a number (the data segment number) may not be moved directly into a segment register.
With DS initialized, we may print the "HELLO!" message by placing its address in DX and executing INT 21h:
LEA DX,MSG ;get message
MOV AH,9 ;display string function
INT 21h ;display string
Here is the complete program:
Program Listing PGM4_2.ASM
TITLE PGM4_2: PRINT STRING PROGRAM
.MODEL SMALL
.STACK 100H
.DATA
MSG DB 'HELLO!$'
.CODE
MAIN PROC
;initialize DS
MOV AX,@DATA
MOV DS,AX ;initialize DS
;display message
LEA DX,MSG ;get message
MOV AH,9 ;display string function
INT 21h ;display message
;return to DOS
MOV AH,4CH
INT 21h ;DOS exit
MAIN ENDP
END MAIN
And here is a sample execution:
A>PGM4_2
HELLO!
A Case Conversion Program
We will now combine most of the material covered in this chapter into a single program. This program begins by prompting the user to enter a lowercase letter, and on the next line displays another message with the letter in uppercase. For example,
ENTER A LOWER CASE LETTER: a
IN UPPER CASE IT IS: A
We use EQU to define CR and LF as names for the constants 0DH and 0AH.
CR EQU 0DH
LF EQU 0AH
The messages and the input character can be stored in the data segment like this:
MSG1 DB 'ENTER A LOWER CASE LETTER: $'
MSG2 DB CR,LF,'IN UPPER CASE IT IS: '
CHAR DB ?,'$'
In defining MSG2 and CHAR, we have used a helpful trick: because the program is supposed to display the second message and the letter (after conversion to upper case) on the next line, MSG2 starts with the ASCII codes for carriage return and line feed; when MSG2 is displayed with INT 21h, function 9, these control functions are executed and the output is displayed on the next line. Because MSG2 does not end with '$', INT 21h goes on and displays the character stored in CHAR.
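The trick can be seen in isolation in the sketch below, which is an illustration with invented names (PART1 and PART2) rather than part of the case conversion program. Because PART1 has no '$', displaying it with function 9 runs straight on into PART2, which the assembler places immediately after it.

```asm
TITLE RUNON: FUNCTION 9 STOPS ONLY AT '$' (illustrative sketch)
.MODEL SMALL
.STACK 100H
.DATA
PART1   DB  'ABC'       ;no '$', so the display does not stop here
PART2   DB  'DEF$'      ;the '$' ends the display
.CODE
MAIN    PROC
        MOV AX,@DATA
        MOV DS,AX       ;initialize DS
        LEA DX,PART1    ;start at PART1
        MOV AH,9        ;display string function
        INT 21H         ;displays ABCDEF
        MOV AH,2        ;display character function
        MOV DL,0DH      ;carriage return
        INT 21H
        MOV DL,0AH      ;line feed
        INT 21H
        LEA DX,PART2    ;start at PART2
        MOV AH,9        ;display string function
        INT 21H         ;displays DEF
        MOV AH,4CH      ;DOS exit function
        INT 21H         ;return to DOS
MAIN    ENDP
        END MAIN
```

The first call should display ABCDEF and the second DEF, which shows that the starting offset in DX and the position of the '$' alone decide what is printed.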
Our program begins by displaying the first message and reading the character:
LEA DX,MSG1 ;get first message
MOV AH,9 ;display string function
INT 21h ;display first message
MOV AH,1 ;read character function
INT 21h ;read a small letter into AL
Having read a lowercase letter, the program must convert it to upper case. In the ASCII character sequence, the lowercase letters begin at 61h and the uppercase letters start at 41h, so subtraction of 20h from the contents of AL does the conversion:
SUB AL,20H ;convert it to upper case
MOV CHAR,AL ;and store it
Now the program displays the second message and the uppercase letter:
LEA DX,MSG2 ;get second message
MOV AH,9 ;display string function
INT 21h ;display message and uppercase letter
Here is the complete program:
TITLE PGM4_3: CASE CONVERSION PROGRAM
.MODEL SMALL
.STACK 100H
.DATA
CR EQU 0DH
LF EQU 0AH
MSG1 DB 'ENTER A LOWER CASE LETTER: $'
MSG2 DB 0DH,0AH,'IN UPPER CASE IT IS: '
CHAR DB ?,'$'
.CODE
MAIN PROC
;initialize DS
MOV AX,@DATA ;get data segment
MOV DS,AX ;initialize DS
;print user prompt
LEA DX,MSG1 ;get first message
MOV AH,9 ;display string function
INT 21H ;display first message
;input a character and convert to upper case
MOV AH,1 ;read character function
INT 21H ;read a small letter into AL
SUB AL,20H ;convert it to upper case
MOV CHAR,AL ;and store it
;display on the next line
LEA DX,MSG2 ;get second message
MOV AH,9 ;display string function
INT 21H ;display message and upper case letter in front
;DOS exit
MOV AH,4CH
INT 21H ;DOS exit
MAIN ENDP
END MAIN
Assembly language programs are made up of statements. A statement is either an instruction to be executed by the computer, or a directive for the assembler.
Statements have name, operation, operand(s), and comment fields.
A symbolic name can contain up to 31 characters. The characters can be letters, digits, and certain special symbols.
Numbers may be written in binary, decimal, or hex.
Characters and character strings must be enclosed in single or double quotes.
Directives DB and DW are used to define byte and word variables, respectively. EQU can be used to give names to constants.
A program generally contains a code segment, a data segment, and a stack segment.
MOV and XCHG are used to transfer data. There are some restrictions for the use of these instructions; for example, they may not operate directly between memory locations.
ADD, SUB, INC, DEC, and NEG are some of the basic arithmetic instructions.
There are two ways to do input and output on the IBM PC: (1) by direct communication with I/O devices, (2) by using BIOS or DOS interrupt routines.
The direct method is fastest, but is tedious to program, and depends on specific hardware circuits.
Input and output of characters and strings may be done by the DOS routine INT 21h.
INT 21h, function 1, causes a keyboard character to be read into AL.
INT 21h, function 2, causes the character whose ASCII code is in DL to be displayed. If DL contains the code of a control character, the control function is performed.
INT 21h, function 9, causes the string whose offset address is in DX to be displayed. The string must end with a "$" character.
| Term | Definition |
| array | A sequence of memory bytes or words |
| assembler directive | Directs the assembler to perform some specific task |
| code segment | Part of the program that holds the instructions |
| .CRF file | A file created by the assembler that lists names that appear in a program and line numbers where they occur |
| data segment | Part of the program that holds variables |
| destination operand | First operand in an instruction; receives the result |
| .EXE file | Same as run file |
| instruction | A statement that the assembler translates to machine code |
| .LST file | A line-numbered file created by the assembler that displays assembly language code, machine code, and other information about a program |
| memory model | Organization of a program that indicates the amount of code and data |
| object file | The machine language file created by the assembler from the source program file |
| program segment prefix (PSP) | The 256-byte area that precedes the program in memory; contains information about the program |
| pseudo-op | Assembler directive |
| run file | The executable machine language file created by the LINK program |
| source operand | Second operand in an instruction; usually not changed by the instruction |
| source program file | A program text file created with a word processor or text editor |
| stack segment | Part of the program that holds the runtime stack |
| variable | Symbolic name for a memory location that stores data |
ADD
DEC
INC
NEG
SUB
XCHG
.CODE
.DATA
.MODEL
.STACK
- Which of the following names are legal in IBM PC assembly language?
a. TWO_WORDS
b. ?1
c. Two words
d. .e?
e. $145
f. LET'S_GO
g. T =
- Which of the following are legal numbers? If they are legal, tell whether they are binary, decimal, or hex numbers.
a. 246
b. 246h
c. 1001
d. 1,101
e. 2A3h
f. FFFEh
g. 0Ah
h. Bh
i. 1110b
- If it is legal, give data definition pseudo-ops to define each of the following.
a. A word variable A initialized to 52
b. A word variable WORD1, uninitialized
c. A byte variable B, initialized to 25h
d. A byte variable C1, uninitialized
e. A word variable WORD2, initialized to 65536
f. A word array ARRAY1, initialized to the first five positive integers (i.e. 1-5)
g. A constant BELL equal to 07h
h. A constant MSG equal to 'THIS IS A MESSAGE$'
- Suppose that the following data are loaded starting at offset 0000h:
A DB 7
B DW 1ABCh
C DB 'HELLO'
a. Give the offset address assigned to variables A, B, and C.
b. Give the contents of the byte at offset 0002h in hex.
c. Give the contents of the byte at offset 0004h in hex.
d. Give the offset address of the character "O" in "HELLO."
- Tell whether each of the following instructions is legal or illegal. W1 and W2 are word variables, and B1 and B2 are byte variables.
a. MOV DS,AX
b. MOV DS,1000h
c. MOV CS,ES
d. MOV W1,DS
e. XCHG W1,W2
f. SUB 5,B1
g. ADD B1,B2
h. ADD AL,258
i. MOV W1,B1
- Using only MOV, ADD, SUB, INC, DEC, and NEG, translate the following high-level language assignment statements into assembly language. A, B, and C are word variables.
a.
b.
c.
d.
e.
- Write instructions (not a complete program) to do the following.
a. Read a character, and display it at the next position on the same line.
b. Read an uppercase letter (omit error checking), and display it at the next position on the same line in lower case.
- Write a program to (a) display a “?”, (b) read two decimal digits whose sum is less than 10, (c) display them and their sum on the next line, with an appropriate message.
Sample execution:
?27
THE SUM OF 2 AND 7 IS 9
- Write a program to (a) prompt the user, (b) read first, middle, and last initials of a person's name, and (c) display them down the left margin.
Sample execution:
ENTER THREE INITIALS: JFK
J
F
K
- Write a program to read one of the hex digits A-F, and display it on the next line in decimal.
Sample execution:
ENTER A HEX DIGIT: C
IN DECIMAL IT IS 12
- Write a program to display a $10\times 10$ solid box of asterisks. Hint: declare a string in the data segment that specifies the box, and display it with INT 21h, function 9h.
- Write a program to (a) display “?”, (b) read three initials, (c) display them in the middle of an $11\times 11$ box of asterisks, and (d) beep the computer.
One important feature that distinguishes a computer from other machines is the computer's ability to make decisions. The circuits in the CPU can perform simple decision making based on the current state of the processor. For the 8086 processor, the processor state is implemented as nine individual bits called flags. Each decision made by the 8086 is based on the values of these flags.
The flags are placed in the FLAGS register and they are classified as either status flags or control flags. The status flags reflect the result of a computation. In this chapter, you will see how they are affected by the machine instructions. In Chapter 6, you will see how they are used to implement jump instructions that allow programs to have multiple branches and loops. The control flags are used to enable or to disable certain operations of the processor; they are covered in later chapters.
In Section 5.4 we introduce the DOS program DEBUG. We'll show how to use DEBUG to trace through a user program and to display registers, flags, and memory locations.
It is not important to remember which bit is which flag; Table 5.1 gives the names of the flags and their symbols. In this chapter, we concentrate on the status flags.
As stated earlier, the processor uses the status flags to reflect the result of an operation. For example, if SUB AX,AX is executed, the zero flag becomes 1, thereby indicating that a zero result was produced. Now let's get to know the status flags.
Table 5.1 Flag Names and Symbols

| Bit | Name | Symbol |
|---|---|---|
| Status Flags | | |
| 0 | Carry flag | CF |
| 2 | Parity flag | PF |
| 4 | Auxiliary carry flag | AF |
| 6 | Zero flag | ZF |
| 7 | Sign flag | SF |
| 11 | Overflow flag | OF |
| Control Flags | | |
| 8 | Trap flag | TF |
| 9 | Interrupt flag | IF |
| 10 | Direction flag | DF |

The phenomenon of overflow is associated with the fact that the range of numbers that can be represented in a computer is limited.
Chapter 2 explained that the (decimal) range of signed numbers that can be represented by a 16-bit word is -32768 to 32767; for an 8-bit byte the range is -128 to 127. For unsigned numbers, the range for a word is 0 to 65535; for a byte, it is 0 to 255. If the result of an operation falls outside these ranges, overflow occurs and the truncated result that is saved will be incorrect.
Signed and unsigned overflows are independent phenomena. When we perform an arithmetic operation such as addition, there are four possible outcomes: (1) no overflow, (2) signed overflow only, (3) unsigned overflow only, and (4) both signed and unsigned overflows.
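As an example of unsigned overflow but not signed overflow, suppose AX contains FFFFh, BX contains 0001h, and ADD AX,BX is executed. The binary result is
1111 1111 1111 1111
+ 0000 0000 0000 0001
= 1 0000 0000 0000 0000
If we are giving an unsigned interpretation, the correct answer is 10000h = 65536, but this is out of range for a word operation. A 1 is carried out of the msb and the answer stored in AX, 0000h, is wrong, so unsigned overflow occurred. The stored answer is correct as a signed number, however: FFFFh = -1, 0001h = 1, and -1 + 1 = 0, so signed overflow did not occur.
As an example of signed but not unsigned overflow, suppose AX and BX both contain 7FFFh, and we execute ADD AX,BX. The binary result is
0111 1111 1111 1111
+ 0111 1111 1111 1111
= 1111 1111 1111 1110 = FFFEh
The signed and unsigned decimal interpretation of 7FFFh is 32767. Thus for both signed and unsigned addition, 7FFFh + 7FFFh = 32767 + 32767 = 65534. This is out of range for signed numbers; the signed interpretation of the stored answer FFFEh is -2, so signed overflow occurred. However, the unsigned interpretation of FFFEh is 65534, which is the right answer, so there is no unsigned overflow.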
There are two questions to be answered in connection with overflow: (1) how does the CPU indicate overflow, and (2) how does it know that overflow occurred?
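The processor sets OF = 1 for signed overflow and CF = 1 for unsigned overflow. It is then up to the program to take appropriate action.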
In determining overflow, the processor does not interpret the result as either signed or unsigned. The action it takes is to use both interpretations for each operation and to turn on CF or OF for unsigned overflow or signed overflow, respectively.
It is the programmer who is interpreting the results. If a signed interpretation is being given, then only OF is of interest and CF can be ignored; conversely, for an unsigned interpretation CF is important but not OF.
Many instructions can cause overflow; for simplicity, we'll limit the discussion to addition and subtraction.
On addition, unsigned overflow occurs when there is a carry out of the msb. This means that the correct answer is larger than the biggest unsigned number; that is, FFFFh for a word and FFh for a byte. On subtraction, unsigned overflow occurs when there is a borrow into the msb. This means that the correct answer is smaller than 0.
On addition of numbers with the same sign, signed overflow occurs when the sum has a different sign. This happened in the preceding example when we were adding 7FFFh and 7FFFh (two positive numbers), but got FFFEh (a negative result).
Subtraction of numbers with different signs is like adding numbers of the same sign. For example, A - (-B) = A + B and -A - (+B) = -A + (-B). Signed overflow occurs if the result has a different sign than expected. See Example 5.3 in the next section.
In addition of numbers with different signs, overflow is impossible, because a sum like A + (-B) is really A - B, and because A and B are small enough to fit in the destination, so is A - B. For exactly the same reason, subtraction of numbers with the same sign cannot give overflow.
Actually, the processor uses the following method to set OF: if the carries into and out of the msb don't match (that is, there is a carry into the msb but no carry out, or there is a carry out but no carry in), then signed overflow has occurred, and OF is set to 1. See Example 5.2 in the next section.
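As a worked check of this rule, the two word additions discussed earlier can be written out with the carries at bit 15 (the msb) made explicit. The symbols $c_{in}$ and $c_{out}$ are notation introduced here for illustration; they do not appear in the text.

```latex
\begin{align*}
\text{7FFFh} + \text{7FFFh} &= \text{FFFEh}: & c_{in} = 1,\ c_{out} = 0 &\ \Rightarrow\ \mathrm{OF} = 1,\ \mathrm{CF} = 0\\
\text{FFFFh} + \text{0001h} &= \text{0000h}: & c_{in} = 1,\ c_{out} = 1 &\ \Rightarrow\ \mathrm{OF} = 0,\ \mathrm{CF} = 1
\end{align*}
```

In the first sum the low 15 bits overflow into bit 15 but nothing leaves it, so the carries differ and only signed overflow is flagged. In the second the carry passes straight through bit 15, so the carries match and only unsigned overflow is flagged.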
In general, each time the processor executes an instruction, the flags are altered to reflect the result. However, some instructions don't affect any of the flags, affect only some of them, or may leave them undefined. Because the jump instructions studied in Chapter 6 depend on the flag settings, it's important to know what each instruction does to the flags. Let's return to the seven basic instructions introduced in Chapter 4. They affect the flags as follows:
| Instruction | Affects flags |
|---|---|
| MOV/XCHG | none |
| ADD/SUB | all |
| INC/DEC | all except CF |
| NEG | all (CF = 1 unless result is 0; OF = 1 if word operand is 8000h, or byte operand is 80h) |

To get you used to seeing how these instructions affect the flags, we will do several examples. In each example, we give an instruction, the contents of the operands, and predict the result and the settings of CF, PF, ZF, SF, and OF (we ignore AF because it is used only for BCD arithmetic).
Example 5.1 ADD AX,BX, where AX contains FFFFh, BX contains FFFFh.
Solution:
FFFFh
+ FFFFh
= 1 FFFEh
The result stored in AX is FFFEh = 1111 1111 1111 1110.
SF = 1 because the msb is 1.
PF = 0 because there are 7 (an odd number) 1 bits in the low byte of the result.
ZF = 0 because the result is nonzero.
CF = 1 because there is a carry out of the msb on addition.
OF = 0 because the sign of the stored result is the same as that of the numbers being added (as a binary addition, there is a carry into the msb and also a carry out).
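Example 5.2 ADD AL,BL, where AL contains 80h, BL contains 80h.
Solution:
80h
+ 80h
= 100h
The result stored in AL is 00h.
SF = 0 because the msb is 0.
PF = 1 because all the bits of the result are 0.
ZF = 1 because the result is 0.
CF = 1 because there is a carry out of the msb on addition.
OF = 1 because the numbers being added are both negative, but the result is 0 (as a binary addition, there is no carry into the msb but there is a carry out).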
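Example 5.3 SUB AX,BX, where AX contains 8000h and BX contains 0001h.
Solution:
8000h
- 0001h
= 7FFFh = 0111 1111 1111 1111
The result stored in AX is 7FFFh.
SF = 0 because the msb is 0.
PF = 1 because there are 8 (an even number) 1 bits in the low byte of the result.
ZF = 0 because the result is nonzero.
CF = 0 because a smaller unsigned number is being subtracted from a larger one.
Now for OF. In a signed sense, we are subtracting a positive number from a negative one, which is like adding two negatives. Because the result is positive (the wrong sign), OF = 1.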
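Example 5.4 INC AL, where AL contains FFh.
Solution:
FFh
+ 01h
= 100h
The result stored in AL is 00h. SF = 0, PF = 1, and ZF = 1. Even though there is a carry out of the msb, CF is unaffected by INC; if CF was 0 before the instruction, it is still 0 afterward. OF = 0 because numbers of unlike sign are being added (there is a carry into the msb and also a carry out).
Example 5.5 MOV AX,-5
Solution: The result stored in AX is -5 = FFFBh.
None of the flags are affected by MOV.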
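Example 5.6 NEG AX, where AX contains 8000h.
Solution:
8000h = 1000 0000 0000 0000
one's complement = 0111 1111 1111 1111
add 1 = 1000 0000 0000 0000
The result stored in AX is 8000h.
SF = 1, PF = 1, and ZF = 0.
CF = 1 because for NEG, CF is always 1 unless the result is 0.
OF = 1 because the operand is 8000h. When a number is negated, we would expect a sign change, but because 8000h is its own two's complement, there is no sign change.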
In the next section, we introduce a program that lets us see the actual setting of the flags.
The DEBUG program provides an environment in which a program may be tested. The user can step through a program, and display and change the registers and memory. It is also possible to enter assembly code directly, which DEBUG converts to machine code and stores in memory. A tutorial for DEBUG and CODEVIEW, a more sophisticated debugger, may be found in Appendix E.
We use DEBUG to demonstrate the way instructions affect the flags. To that end, the following program has been created.
TITLE PGM5_1: CHECK FLAGS
;used in DEBUG to check flag settings
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
MOV AX,4000H ;AX = 4000h
ADD AX,AX ;AX = 8000h
SUB AX,0FFFFH ;AX = 8001h
NEG AX ;AX = 7FFFh
INC AX ;AX = 8000h
MOV AH,4CH
INT 21H ;DOS exit
MAIN ENDP
END MAIN
We assemble and link the program, producing the run file PGM5_1.EXE, which is on a disk in drive A. In the following, the user's responses are in boldface.
The DEBUG program is on the DOS disk, which is in drive C. To enter DEBUG with our demonstration program, we type
C>DEBUG A:PGM5_1.EXE
DEBUG responds with its prompt, "-", and waits for a command to be entered.
First, we can view the registers by typing "R".
The display shows the contents of the registers in hex. On the third line of the display, we see
0EE6:0000 B80040 MOV AX,4000
0EE6:0000 is the address of the next instruction to be executed, in segment:offset form. B80040h is the machine code of that instruction. Segment 0EE6h is where DOS decided to load the program; if you try this demonstration, you will probably see a different segment number.
The eight pairs of letters appearing on the second line at the right are the current flag settings. The flags appear in this order: OF, DF, IF, SF, ZF, AF, PF, and CF. Table 5.2 gives the symbols DEBUG uses for the flags. You can see that they have been cleared by DEBUG. The meanings of the control flag symbols are explained in Chapters 11 and 15.
To step through our program, we use the "T" (trace) command. Before doing so, let's display the registers again.
-R
AX=0000 BX=0000 CX=001F DX=0000 SP=000A BP=0000 SI=0000 DI=0000
DS=0ED5 ES=0ED5 SS=0EE5 CS=0EE6 IP=0000 NV UP DI PL NZ NA PO NC
0EE6:0000 B80040 MOV AX,4000
The first instruction is MOV AX,4000h.
-T
AX=4000 BX=0000 CX=001F DX=0000 SP=000A BP=0000 SI=0000 DI=0000
DS=0ED5 ES=0ED5 SS=0EE5 CS=0EE6 IP=0003 NV UP DI PL NZ NA PO NC
0EE6:0003 03C0 ADD AX,AX
Execution of MOV AX,4000h puts 4000h in AX. The flags are unchanged since a MOV doesn't affect them. Now let's execute ADD AX,AX:
-T
AX=8000 BX=0000 CX=001F DX=0000 SP=000A BP=0000 SI=0000 DI=0000
DS=0ED5 ES=0ED5 SS=0EE5 CS=0EE6 IP=0005 OV UP DI NG NZ NA PE NC
0EE6:0005 2DFFFF SUB AX,FFFF
Table 5.2 DEBUG Flag Symbols

| Status Flag | Set (1) Symbol | Clear (0) Symbol |
|---|---|---|
| CF | CY (carry) | NC (no carry) |
| PF | PE (even parity) | PO (odd parity) |
| AF | AC (auxiliary carry) | NA (no auxiliary carry) |
| ZF | ZR (zero) | NZ (nonzero) |
| SF | NG (negative) | PL (plus) |
| OF | OV (overflow) | NV (no overflow) |
| Control Flag | | |
| DF | DN (down) | UP (up) |
| IF | EI (enable interrupts) | DI (disable interrupts) |

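AX now contains 4000h + 4000h = 8000h. SF becomes 1 (NG) to indicate a negative result. Signed overflow is indicated by OF = 1 (OV). ZF = 0 (NZ) because the result is nonzero. CF = 0 (NC) because there is no carry out of the msb. PF = 1 (PE) because the low byte of the result contains no 1 bits.
Next we trace SUB AX,0FFFFh: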
-T
AX=8001 BX=0000 CX=001F DX=0000 SP=000A BP=0000 SI=0000 DI=0000
DS=0ED5 ES=0ED5 SS=0EE5 CS=0EE6 IP=0008 NV UP DI NG NZ AC PO CY
0EE6:0008 F7D8 NEG AX
AX gets 8000h - FFFFh = 8001h. OF changes back to 0 (NV), because we are subtracting numbers of like sign, so signed overflow is impossible. However, CF = 1 (CY) because there is a borrow into the msb. PF = 0 (PO) because the low byte of the result contains a single 1 bit.
Now let's trace NEG AX:
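-T
AX=7FFF BX=0000 CX=001F DX=0000 SP=000A BP=0000 SI=0000 DI=0000
DS=0ED5 ES=0ED5 SS=0EE5 CS=0EE6 IP=000A NV UP DI PL NZ AC PE CY
0EE6:000A 40 INC AX
AX gets the two's complement of 8001h = 7FFFh. For NEG, CF = 1 (CY) unless the result is 0, which is not the case here. OF = 0 (NV) because the operand is not 8000h.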
Finally, we execute INC AX:
-T
AX=8000 BX=0000 CX=001F DX=0000 SP=000A BP=0000 SI=0000 DI=0000
DS=0ED5 ES=0ED5 SS=0EE5 CS=0EE6 IP=000B OV UP DI NG NZ AC PE CY
0EE6:000B B44C MOV AH,4C
OF changes back to 1 (OV) because we added two positives (7FFFh and 1), and got a negative result. Even though there was no carry out of the msb, CF stays 1 because INC doesn't affect this flag.
To complete execution of the program, we can type "G" (go):
-G
Program terminated normally
and to exit DEBUG, type "Q" (quit)
- The FLAGS register is one of the registers in the 8086 microprocessor. Six of the bits are called status flags, and three are control flags.
- The status flags reflect the result of an operation. They are the carry flag (CF), parity flag (PF), auxiliary carry flag (AF), zero flag (ZF), sign flag (SF), and overflow flag (OF).
- CF is 1 if an add or subtract operation generates a carry out of or a borrow into the most significant bit position; otherwise, it is 0.
- PF is 1 if there is an even number of 1 bits in the low byte of the result; otherwise, it is 0.
- AF is 1 if there is a carry out of or a borrow into bit 3 in the result; otherwise, it is 0.
- ZF is 1 if the result is 0; otherwise, it is 0.
- SF is 1 if the most significant bit of the result is 1; otherwise, it is 0.
- OF is 1 if the correct signed result is too big to fit in the destination; otherwise, it is 0.
Overflow occurs when the correct result is outside the range of values represented by the computer. Unsigned overflow occurs if an unsigned interpretation is being given to the result, and signed overflow happens if a signed interpretation is being given.
The processor uses CF and OF to indicate overflow:
The processor sets CF if there is a carry out of the msb on addition, or a borrow into the msb on subtraction. In the latter case, this means that a larger unsigned number is being subtracted from a smaller one.
The processor sets OF if there is a carry into the msb but no carry out, or if there is a carry out of the msb but no carry in.
There is another way to tell whether signed overflow occurred on addition and subtraction. On addition of numbers of like sign, signed overflow occurs if the result has a different sign; subtraction of numbers of different sign is like adding numbers of the same sign, and signed overflow occurs if the result has a different sign.
On addition of numbers of different sign, or subtraction of numbers of the same sign, signed overflow is impossible.
Generally the execution of each instruction affects the flags, but some instructions don't affect any of the flags, and some affect only some of the flags.
The settings of the flags are part of the DEBUG display.
The DEBUG program may be used to trace a program. Some of its commands are "R", to display registers; "T", to trace an instruction; and "G", to execute a program.
control flags
Flags that are used to enable or disable certain operations of the CPU
flags
Bits of the FLAGS register that represent a condition of the CPU
FLAGS register
Register in the CPU whose bits are flags
status flags
Flags that reflect the result of an instruction executed by the CPU
- For each of the following instructions, give the new destination contents and the new settings of CF, SF, ZF, PF, and OF. Suppose that the flags are initially 0 in each part of this question.
a. ADD AX, BX where AX contains 7FFFh and BX contains 0001h
b. SUB AL, BL where AL contains 01h and BL contains FFh
c. DEC AL where AL contains 00h
d. NEG AL where AL contains 7Fh
e. XCHG AX,BX where AX contains 1ABCh and BX contains 712Ah
f. ADD AL,BL where AL contains 80h and BL contains FFh
g. SUB AX,BX where AX contains 0000h and BX contains 8000h
h. NEG AX where AX contains 0000h
- a. Suppose that AX and BX both contain positive numbers, and ADD AX,BX is executed. Show that there is a carry into the msb but no carry out of the msb if, and only if, signed overflow occurs.
b. Suppose AX and BX both contain negative numbers, and ADD AX,BX is executed. Show that there is a carry out of the msb but no carry into the msb if, and only if, signed overflow occurs.
- Suppose ADD AX,BX is executed. In each of the following parts, the first number being added is the contents of AX, and the second number is the contents of BX. Give the resulting value of AX and tell whether signed or unsigned overflow occurred.
a. 512Ch +4185h
b. FE12h +1ACBh
c. E1E4h +DAB3h
d. 7132h +7000h
e. 6389h +1176h
- Suppose SUB AX,BX is executed. In each of the following parts, the first number is the initial contents of AX and the second number is the contents of BX. Give the resulting value of AX and tell whether signed or unsigned overflow occurred.
a. 2143h -1986h
b. 81F1h -1986h
c. 19BCh -81FEh
d. 0002h -FE0Fh
e. 8BCDh -71ABh
For assembly language programs to carry out useful tasks, there must be a way to make decisions and repeat sections of code. In this chapter we show how these things can be accomplished with the jump and loop instructions.
The jump and loop instructions transfer control to another part of the program. This transfer can be unconditional or can depend on a particular combination of status flag settings.
After introducing the jump instructions, we'll use them to implement high-level language decision and looping structures. This application will make it much easier to convert a pseudocode algorithm to assembly code.
6.1 An Example of a Jump
To get an idea of how the jump instructions work, we will write a program to display the entire IBM character set.
1: TITLE PGM6_1: IBM CHARACTER DISPLAY
2: .MODEL SMALL
3: .STACK 100H
4: .CODE
5: MAIN PROC
6: MOV AH,2 ;display char function
7: MOV CX,256 ;no. of chars to display
8: MOV DL,0 ;DL has ASCII code of null char
9: PRINT_LOOP:
10: INT 21h ;display a char
11: INC DL ;increment ASCII code
12: DEC CX ;decrement counter
13: JNZ PRINT_LOOP ;keep going if CX not 0
14: ;DOS exit
15: MOV AH,4CH
16: INT 21h
17: MAIN ENDP
18: END MAIN
There are 256 characters in the IBM character set. Those with codes 32 to 127 are the standard ASCII display characters introduced in Chapter 2. IBM also provides a set of graphics characters with codes 0 to 31 and 128 to 255.
To display the characters, we use a loop (lines 9 to 13). Before entering the loop, AH is initialized to 2 (single-character display) and DL is set to 0, the initial ASCII code. CX is the loop counter; it is set to 256 before entering the loop and is decremented after each character is displayed.
The instruction that controls the loop is JNZ (Jump if Not Zero). If the result of the preceding instruction (DEC CX) is not zero, then the JNZ instruction transfers control to the instruction at label PRINT_LOOP. When CX finally contains 0, the program goes on to execute the DOS return instructions. Figure 6.1 shows the output of the program. Of course, the ASCII codes of backspace, carriage return, and so on cause a control function to be performed, rather than displaying a symbol.
Note: PRINT_LOOP is the first statement label we've used in a program. Labels are needed in situations where one instruction refers to another, as is the case here. Labels end with a colon, and to make labels stand out, they are usually placed on a line by themselves. If so, they refer to the instruction that follows.
JNZ is an example of a conditional jump instruction. The syntax is
Jxxx destination_label
If the condition for the jump is true, the next instruction to be executed is the one at destination_label, which may precede or follow the jump instruction itself. If the condition is false, the instruction immediately following the jump is done next. For JNZ, the condition is that the result of the previous operation is not zero.
The structure of the machine code of a conditional jump requires that destination_label must precede the jump instruction by no more than 126 bytes, or follow it by no more than 127 bytes (we'll show how to get around this restriction later).
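These limits follow from the encoding, assuming the usual two-byte form of an 8086 conditional jump: one opcode byte followed by a one-byte signed displacement $d$ that is added to the address of the next instruction. Writing $J$ for the address of the jump itself (a symbol introduced here for illustration), the reachable targets are:

```latex
\begin{align*}
\text{target} &= (J + 2) + d, \qquad -128 \le d \le 127\\
J - 126 &\le \text{target} \le (J + 2) + 127
\end{align*}
```

So a label can lie at most 126 bytes before the jump, or at most 127 bytes beyond the end of the jump.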
To implement a conditional jump, the CPU looks at the FLAGS register. You already know it reflects the result of the last thing the processor did. If the conditions for the jump (expressed as a combination of status flag settings) are true, the CPU adjusts the IP to point to the destination label, so that the instruction at this label will be done next. If the jump condition is false, then IP is not altered; this means that the next instruction in line will be done.
In the preceding program, the CPU executes JNZ PRINT_LOOP by inspecting ZF. If ZF = 0, control transfers to PRINT_LOOP; if ZF = 1, the program goes on to execute MOV AH,4CH.
Table 6.1 shows the conditional jumps. There are three categories: (1) the signed jumps are used when a signed interpretation is being given to results, (2) the unsigned jumps are used for an unsigned interpretation, and (3) the single-flag jumps, which operate on settings of individual flags. Note: the jump instructions themselves do not affect the flags.
The first column of Table 6.1 gives the opcodes for the jumps. Many of the jumps have two opcodes; for example, JG and JNLE. Both opcodes produce the same machine code. Use of one opcode or its alternate is usually determined by the context in which the jump appears.
The jump condition is often provided by the CMP (compare) instruction. It has the form
CMP destination, source
This instruction compares destination and source by computing destination contents minus source contents. The result is not stored, but the flags are affected. The operands of CMP may not both be memory locations. Destination may not be a constant. Note: CMP is just like SUB, except that destination is not changed.
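The relationship between CMP and SUB can be seen in a short fragment. This is an illustrative sketch, not taken from the text; the operand values are chosen arbitrarily.

```asm
        MOV  AX,5           ;AX = 5
        MOV  BX,5           ;BX = 5
        CMP  AX,BX          ;computes 5 - 5: ZF = 1, AX is still 5
        SUB  AX,BX          ;computes 5 - 5: ZF = 1, AX is now 0
```

Both instructions leave the flags in the same state; only SUB overwrites the destination.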
For example, suppose a program contains these lines:
CMP AX,BX
JG BELOW
where AX = 7FFFh, and BX = 0001h. The result of CMP AX,BX is 7FFFh - 0001h = 7FFEh. Table 6.1 shows that the jump condition for JG is satisfied, because ZF = SF = OF = 0, so control transfers to label BELOW.
Table 6.1 Conditional Jumps

| Symbol | Description | Condition for Jumps |
|---|---|---|
| Signed Jumps | | |
| JG/JNLE | jump if greater than; jump if not less than or equal to | ZF = 0 and SF = OF |
| JGE/JNL | jump if greater than or equal to; jump if not less than | SF = OF |
| JL/JNGE | jump if less than; jump if not greater than or equal to | SF <> OF |
| JLE/JNG | jump if less than or equal to; jump if not greater than | ZF = 1 or SF <> OF |
| Unsigned Jumps | | |
| JA/JNBE | jump if above; jump if not below or equal | CF = 0 and ZF = 0 |
| JAE/JNB | jump if above or equal; jump if not below | CF = 0 |
| JB/JNAE | jump if below; jump if not above or equal | CF = 1 |
| JBE/JNA | jump if below or equal; jump if not above | CF = 1 or ZF = 1 |
| Single-Flag Jumps | | |
| JE/JZ | jump if equal; jump if equal to zero | ZF = 1 |
| JNE/JNZ | jump if not equal; jump if not zero | ZF = 0 |
| JC | jump if carry | CF = 1 |
| JNC | jump if no carry | CF = 0 |
| JO | jump if overflow | OF = 1 |
| JNO | jump if no overflow | OF = 0 |
| JS | jump if sign negative | SF = 1 |
| JNS | jump if nonnegative sign | SF = 0 |
| JP/JPE | jump if parity even | PF = 1 |
| JNP/JPO | jump if parity odd | PF = 0 |

In the example just given, we determined by looking at the flags after CMP was executed that control transfers to label BELOW. This is how the CPU implements a conditional jump. But it's not necessary for a programmer to think about the flags; you can just use the name of the jump to decide if control transfers to the destination label. In the following,
CMP AX,BX
JG BELOW
if AX is greater than BX (in a signed sense), then JG (jump if greater than) transfers to BELOW.
Even though CMP is specifically designed to be used with the conditional jumps, they may be preceded by other instructions, as in PGM6_1. Another example is
DEC AX
JL THERE
Here, if the contents of AX, in a signed sense, is less than 0, control transfers to THERE.
Each of the signed jumps corresponds to an analogous unsigned jump; for example, the signed jump JG and the unsigned jump JA. Whether to use a signed or unsigned jump depends on the interpretation being given. In fact, Table 6.1 shows that these jumps operate on different flags: the signed jumps operate on ZF, SF, and OF, while the unsigned jumps operate on ZF and CF. Using the wrong kind of jump can lead to incorrect results.
For example, suppose we're giving a signed interpretation. If AX = 7FFFh, BX = 8000h, and we execute
CMP AX,BX
JA BELOW
then even though 7FFFh > 8000h in a signed sense, the program does not jump to BELOW. The reason is that 7FFFh < 8000h in an unsigned sense, and we are using the unsigned jump JA.
In working with the standard ASCII character set, either signed or unsigned jumps may be used, because the sign bit of a byte containing a character code is always zero. However, unsigned jumps should be used when comparing extended ASCII characters (codes 80h to FFh).
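The effect of choosing a signed or an unsigned jump can be sketched with the same operands used above. This fragment is an illustration, not part of the original text; the labels SIGNED_GT, UNSIGNED_GT, and DONE and the use of CX as a marker are made up for the example.

```asm
        MOV  AX,7FFFh       ;32767 signed and unsigned
        MOV  BX,8000h       ;-32768 signed, 32768 unsigned
        CMP  AX,BX          ;flags reflect 7FFFh - 8000h
        JG   SIGNED_GT      ;taken: 32767 > -32768 in a signed sense
        JMP  DONE           ;not reached for these values
SIGNED_GT:
        JA   UNSIGNED_GT    ;flags unchanged by JG; not taken: 32767 < 32768 unsigned
        MOV  CX,1           ;reached: AX is greater only in the signed sense
        JMP  DONE
UNSIGNED_GT:
        MOV  CX,2           ;not reached for these values
DONE:
```

The single CMP serves both tests because the jump instructions leave the flags alone. Here CMP gives ZF = 0, SF = 1, OF = 1, and CF = 1, so the JG condition (ZF = 0 and SF = OF) holds while the JA condition (CF = 0 and ZF = 0) does not.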
Example 6.1 Suppose AX and BX contain signed numbers. Write some code to put the biggest one in CX.
MOV CX,AX ;put AX in CX
CMP BX,CX ;is BX bigger?
JLE NEXT ;no, go on
MOV CX,BX ;yes, put BX in CX
NEXT:
The JMP (jump) instruction causes an unconditional transfer of control (unconditional jump). The syntax is
JMP destination
where destination is usually a label in the same segment as the JMP itself (see Appendix F for a more general description).
JMP can be used to get around the range restriction of a conditional jump. For example, suppose we want to implement the following loop:
TOP:
; body of the loop
DEC CX ; decrement counter
JNZ TOP ; keep looping if CX > 0
MOV AX,BX
and the loop body contains so many instructions that label TOP is out of range for JNZ (more than 126 bytes before JNZ TOP). We can do this:
TOP:
; body of the loop
DEC CX ; decrement counter
JNZ BOTTOM ; keep looping if CX > 0
JMP EXIT
BOTTOM:
JMP TOP
EXIT:
MOV AX, BX
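A shorter variant is possible by reversing the condition, so that the conditional jump only has to reach the nearby exit label. This is a sketch of that alternative, not the form used in the text; it assumes, as above, that the loop body is too long for a conditional jump back to TOP.

```asm
TOP:
        ;body of the loop (assumed too long for a conditional jump)
        DEC  CX             ;decrement counter
        JZ   EXIT           ;CX = 0: leave the loop (short forward jump)
        JMP  TOP            ;CX <> 0: JMP is not limited to 126 bytes
EXIT:
        MOV  AX,BX
```

This needs one label and one jump fewer than the BOTTOM/EXIT arrangement, at the cost of reading less like the original JNZ TOP.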
We've shown that the jump instructions can be used to implement branches and loops. However, because the jumps are so primitive, it is difficult, especially for beginning programmers, to code an algorithm with them without some guidelines.
Because you have probably had some experience with high-level language constructs (such as the IF-THEN-ELSE decision structure or WHILE loops), we'll show how these structures can be simulated in assembly language. In each case, we will first express the structure in a high-level pseudocode.
In high-level languages, branching structures enable a program to take different paths, depending on conditions. In this section, we'll look at three structures.
The IF-THEN structure may be expressed in pseudocode as follows:
IF condition is true
THEN
execute true-branch statements
END_IF
See Figure 6.2.
The condition is an expression that is true or false. If it is true, the true-branch statements are executed. If it is false, nothing is done, and the program goes on to whatever follows.
Example 6.2 Replace the number in AX by its absolute value.
Solution: A pseudocode algorithm is
IF AX < 0
THEN
replace AX by -AX
END_IF
It can be coded as follows:
;if AX < 0
CMP AX,0 ;AX < 0?
JNL END_IF ;no, exit
;then
NEG AX ;yes, change sign
END_IF:
The condition AX < 0 is expressed by CMP AX, 0. If AX is not less than 0, there is nothing to do, so we use a JNL (jump if not less) to jump around the NEG AX. If condition AX < 0 is true, the program goes on to execute NEG AX.
IF condition is true
THEN
execute true-branch statements
ELSE
execute false-branch statements
END_IF
See Figure 6.3.
In this structure, if condition is true, the true-branch statements are executed. If condition is false, the false-branch statements are done.
Example 6.3 Suppose AL and BL contain extended ASCII characters. Display the one that comes first in the character sequence.
IF AL <= BL
THEN
display the character in AL
ELSE
display the character in BL
END_IF
It can be coded like this:
MOV AH,2 ;prepare to display
;if AL <= BL
CMP AL,BL ;AL <= BL?
JNBE ELSE_ ;no, display char in BL
;then ;AL <= BL
MOV DL,AL ;move char to be displayed
JMP DISPLAY ;go to display
ELSE_: ;BL < AL
MOV DL,BL
DISPLAY:
INT 21h ;display it
END_IF:
Note: the label ELSE_ is used because ELSE is a reserved word.
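The condition AL <= BL is expressed by CMP AL,BL. JNBE is used because we're comparing extended characters.
If AL <= BL is true, the true-branch statements are done. Note that JMP DISPLAY is needed to skip the false branch; unlike a high-level IF-THEN-ELSE, assembly code does not skip the false-branch statements automatically once the true-branch statements are done.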
A CASE is a multiway branch structure that tests a register, variable, or expression for particular values or a range of values. The general form is as follows:
CASE expression
values_1: statements_1
values_2: statements_2
. . .
values_n: statements_n
END_CASE
See Figure 6.4.
In this structure, expression is tested; if its value is a member of the set values_i, then statements_i are executed. We assume that sets values_1, ..., values_n are disjoint.
Example 6.4 If AX contains a negative number, put -1 in BX; if AX contains 0, put 0 in BX; if AX contains a positive number, put 1 in BX.
CASE AX
<0: put -1 in BX
=0: put 0 in BX
>0: put 1 in BX
END_CASE
It can be coded as follows:
;case AX
CMP AX,0 ;test AX
JL NEGATIVE ;AX < 0
JE ZERO ;AX = 0
JG POSITIVE ;AX > 0
NEGATIVE:
MOV BX,-1 ;put -1 in BX
JMP END_CASE ;and exit
ZERO:
MOV BX,0 ;put 0 in BX
JMP END_CASE ;and exit
POSITIVE:
MOV BX,1 ;put 1 in BX
END_CASE:
Note: only one CMP is needed, because jump instructions do not affect the flags.
Example 6.5 If AL contains 1 or 3, display "o"; if AL contains 2 or 4, display "e".
CASE AL
1,3: display 'o'
2,4: display 'e'
END_CASE
The code is
;case AL
;1,3:
CMP AL,1 ;AL = 1?
JE ODD ;yes, display 'o'
CMP AL,3 ;AL = 3?
JE ODD ;yes, display 'o'
;2,4:
CMP AL,2 ;AL = 2?
JE EVEN ;yes, display 'e'
CMP AL,4 ;AL = 4?
JE EVEN ;yes, display 'e'
JMP END_CASE ;not 1..4
ODD: ;display 'o'
MOV DL,'o' ;get 'o'
JMP DISPLAY ;go to display
EVEN: ;display 'e'
MOV DL,'e' ;get 'e'
DISPLAY:
MOV AH,2
INT 21H ;display char
END_CASE:
Sometimes the branching condition in an IF or CASE takes the form
condition_1 AND condition_2
or
condition_1 OR condition_2
where condition_1 and condition_2 are either true or false. We will refer to the first of these as an AND condition and to the second as an OR condition.
An AND condition is true if and only if condition_1 and condition_2 are both true. Likewise, if either condition is false, then the whole thing is false.
Example 6.6 Read a character, and if it's an uppercase letter, display it.
Read a character (into AL)
IF ('A' <= character) and (character <= 'Z')
THEN
display character
END_IF
To code this, we first see if the character in AL follows "A" (or is "A") in the character sequence. If not, we can exit. If so, we still must see if the character precedes "Z" (or is "Z") before displaying it. Here is the code:
;read a character
MOV AH,1 ;prepare to read
INT 21H ;char in AL
;if ('A' <= char) and (char <= 'Z')
CMP AL,'A' ;char >= 'A'?
JNGE END_IF ;no, exit
CMP AL,'Z' ;char <= 'Z'?
JNLE END_IF ;no, exit
;then display char
MOV DL,AL ;get char
MOV AH,2 ;prepare to display
INT 21H ;display char
END_IF:
Condition_1 OR condition_2 is true if at least one of the conditions is true; it is only false when both conditions are false.
Example 6.7 Read a character. If it's "y" or "Y", display it; otherwise, terminate the program.
Read a character (into AL)
IF (character = 'y') OR (character = 'Y')
THEN
display it
ELSE
terminate the program
END_IF
To code this, we first see if character = "y". If so, the OR condition is true and we can execute the THEN statements. If not, there is still a chance the OR condition will be true. If character = "Y", it will be true, and we execute the THEN statements; if not, the OR condition is false and we do the ELSE statements. Here is the code:
;read a character
MOV AH,1 ;prepare to read
INT 21H ;char in AL
;if (character = 'y') or (character = 'Y')
CMP AL,'y' ;char = 'y'?
JE THEN ;yes, go to display it
CMP AL,'Y' ;char = 'Y'?
JE THEN ;yes, go to display it
JMP ELSE_ ;no, terminate
THEN:
MOV AH,2 ;prepare to display
MOV DL,AL ;get char
INT 21H ;display it
JMP END_IF ;and exit
ELSE_:
MOV AH,4CH
INT 21H ;DOS exit
END_IF:
Looping Structures
A loop is a sequence of instructions that is repeated. The number of times to repeat may be known in advance, or it may depend on conditions.
This is a loop structure in which the loop statements are repeated a known number of times (a count-controlled loop). In pseudocode,
FOR loop_count times DO
statements
END_FOR
See Figure 6.5.
The LOOP instruction can be used to implement a FOR loop. It has the form
LOOP destination_label
The counter for the loop is the register CX, which is initialized to loop_count. Execution of the LOOP instruction causes CX to be decremented automatically, and if CX is not 0, control transfers to destination_label. If CX = 0, the next instruction after LOOP is done. Destination_label must precede the LOOP instruction by no more than 126 bytes.
Using the instruction LOOP, a FOR loop can be implemented as follows:
;initialize CX to loop_count
TOP:
;body of the loop
LOOP TOP
Example 6.8 Write a count-controlled loop to display a row of 80 stars.
FOR 80 times DO
display '*'
END_FOR
The code is
MOV CX,80 ;number of stars to display
MOV AH,2 ;display character function
MOV DL,'*' ;character to display
TOP:
INT 21h ;display a star
LOOP TOP ;repeat 80 times
You may have noticed that a FOR loop, as implemented with a LOOP instruction, is executed at least once. Actually, if CX contains 0 when the loop is entered, the LOOP instruction causes CX to be decremented to FFFFh, and the loop is then executed FFFFh = 65535 more times! To prevent this, the instruction JCXZ (jump if CX is zero) may be used before the loop. Its syntax is
JCXZ destination_label
If CX contains 0, control transfers to the destination label. So a loop implemented as follows is bypassed if CX is 0:
JCXZ SKIP
TOP:
;body of the loop
LOOP TOP
SKIP:
A WHILE loop depends on a condition. In pseudocode,
WHILE condition DO
statements
END_WHILE
See Figure 6.6.
The condition is checked at the top of the loop. If true, the statements are executed; if false, the program goes on to whatever follows. It is possible that the condition will be false initially, in which case the loop body is not executed at all. The loop executes as long as the condition is true.
Example 6.9 Write some code to count the number of characters in an input line.
initialize count to 0
read a character
WHILE character <> carriage_return DO
count = count + 1
read a character
END_WHILE
The code is
MOV DX,0 ;DX counts characters
MOV AH,1 ;prepare to read
INT 21H ;character in AL
WHILE_:
CMP AL,0DH ;CR?
JE END_WHILE ;yes, exit
INC DX ;not CR, increment count
INT 21H ;read a character
JMP WHILE_ ;loop back
END_WHILE:
Note that because a WHILE loop checks the terminating condition at the top of the loop, you must make sure that any variables involved in the condition are initialized before the loop is entered. So you read a character before entering the loop, and read another one at the bottom. The label WHILE_: is used because WHILE is a reserved word.
Another conditional loop is the REPEAT LOOP. In pseudocode,
REPEAT
statements
UNTIL condition
See Figure 6.7.
In a REPEAT...UNTIL loop, the statements are executed, and then the condition is checked. If true, the loop terminates; if false, control branches to the top of the loop.
Example 6.10 Write some code to read characters until a blank is read.
REPEAT
read a character
UNTIL character is a blank
MOV AH,1 ;prepare to read
REPEAT:
INT 21H ;char in AL
;until
CMP AL,' ' ;a blank?
JNE REPEAT ;no, keep reading
In many situations where a conditional loop is needed, use of a WHILE loop or a REPEAT loop is a matter of personal preference. The advantage of a WHILE is that the loop can be bypassed if the terminating condition is initially false, whereas the statements in a REPEAT must be done at least once. However, the code for a REPEAT loop is likely to be a little shorter because there is only a conditional jump at the end, but a WHILE loop has two jumps: a conditional jump at the top and a JMP at the bottom.
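To make the comparison concrete, here is a sketch of Example 6.10 recast as a WHILE loop. It is an illustration rather than code from the text: it assumes the same INT 21h function 1 input used in the examples above, and the labels WHILE_ and END_WHILE are chosen only for this fragment.

```asm
        MOV  AH,1          ;prepare to read
        INT  21H           ;first char in AL, needed before the test
WHILE_:
        CMP  AL,' '        ;a blank?
        JE   END_WHILE     ;yes, exit (conditional jump at the top)
        INT  21H           ;no, read another character
        JMP  WHILE_        ;loop back (extra JMP at the bottom)
END_WHILE:
```

The REPEAT version needs only the JNE at the bottom, while this version carries both the JE and the JMP. For this particular task the two behave the same, because a character has to be read before the condition can be tested either way.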
To show how a program may be developed from high-level pseudocode to assembly code, let's solve the following problem.
Prompt the user to enter a line of text. On the next line, display the capital letter entered that comes first alphabetically and the one that comes last. If no capital letters are entered, display "No capital letters". The execution should look like this:
Type a line of text:
THE QUICK BROWN FOX JUMPED.
First capital = B Last capital = X
To solve this problem, we will use the method of top-down program design that you may have encountered in high-level language programming. In this method, the original problem is solved by solving a series of subproblems, each of which is easier to solve than the original problem. Each subproblem is in turn broken down further until we reach a level of subproblems that can be coded directly. The use of procedures (Chapter 8) may enhance this method.
1. Display the opening message.
2. Read and process a line of text.
3. Display the results.
Step 1 can be coded immediately:
MOV AH,9 ;display string function
LEA DX,PROMPT ;get opening message
INT 21H ;display it
The message will be stored in the data segment as
PROMPT DB 'Type a line of text:',0DH,0AH,'$'
We include a carriage return and line feed to move the cursor to the next line so the user can type a full line of text.
Step 2 does most of the work in the program. It takes input from the keyboard, and returns the first and last capital letters read (it should also indicate if no capitals were read). Here is a breakdown:
Read a character
WHILE character is not a carriage return DO
  IF character is a capital letter (*)
  THEN
    IF character precedes first capital
    THEN
      first capital = character
    END_IF
    IF character follows last capital
    THEN
      last capital = character
    END_IF
  END_IF
  Read a character
END_WHILE
Line (*) is actually an AND condition:
IF ('A' <= character) AND (character <= 'Z')
Step 2 can be coded as follows:
MOV AH,1 ;read char function
INT 21H ;char in AL
WHILE_:
;while character is not a carriage return do
CMP AL,0DH ;CR?
JE END_WHILE ;yes, exit
;if character is a capital letter
CMP AL,'A' ;char >= 'A'?
JNGE END_IF ;not a capital letter
CMP AL,'Z' ;char <= 'Z'?
JNLE END_IF ;not a capital letter
;then
;if character precedes first capital
CMP AL,FIRST ;char < FIRST?
JNL CHECK_LAST ;no, >=
;then first capital = character
MOV FIRST,AL ;FIRST = char
;end_if
CHECK_LAST:
;if character follows last capital
CMP AL,LAST ;char > LAST?
JNG END_IF ;no, <=
;then last capital = character
MOV LAST,AL ;LAST = char
;end_if
END_IF:
;read a character
INT 21H ;char in AL
JMP WHILE_ ;repeat loop
END_WHILE:
Variables FIRST and LAST must have values before the WHILE loop is executed the first time. They can be initialized in the data segment as follows:
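FIRST DB ']'
LAST DB '@'
The initial values ']' and '@' were chosen because ']' follows 'Z' in the ASCII sequence and '@' precedes 'A'. Thus the first capital entered will replace both of these values.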
With step 2 coded, we can proceed to the final step.
IF no capitals were typed,
THEN
display "No capitals"
ELSE
display first capital and last capital
END_IF
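This step will display one of two possible messages: NOCAP_MSG if no capitals are entered, and CAP_MSG if there are capitals. We can declare them in the data segment as follows:
NOCAP_MSG DB 0DH,0AH,'No capitals $'
CAP_MSG DB 0DH,0AH,'First capital = '
FIRST DB ']'
DB ' Last capital = '
LAST DB '@ $'
When CAP_MSG is displayed, it will display "First capital = ", then the value of FIRST, then " Last capital = ", then the value of LAST.
The program decides, by inspecting FIRST, whether any capitals were read. If FIRST contains its initial value ']', then no capitals were read.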
Step 3 may be coded as follows:
MOV AH,9 ; display string function
; if no capitals were typed
CMP FIRST,']' ;FIRST = ']'?
JNE CAPS ; no, display results
; then
LEA DX, NOCAP_MSG
JMP DISPLAY
CAPS:
LEA DX, CAP_MSG
DISPLAY:
INT 21H ; display message
; end_if
Here is the complete program:
TITLE PGM6_2: FIRST AND LAST CAPITALS
.MODEL SMALL
.STACK 100H
.DATA
PROMPT DB 'Type a line of text:',0DH,0AH,'$'
NOCAP_MSG DB 0DH,0AH,'No capitals $'
CAP_MSG DB 0DH,0AH,'First capital = '
FIRST DB ']'
DB ' Last capital = '
LAST DB '@ $'
.CODE
MAIN PROC
;initialize DS
MOV AX,@DATA
MOV DS,AX
;display opening message
MOV AH,9 ;display string function
LEA DX,PROMPT ;get opening message
INT 21H ;display it
;read and process a line of text
MOV AH,1 ;read char function
INT 21H ;char in AL
WHILE_:
;while character is not a carriage return do
CMP AL,0DH ;CR?
JE END_WHILE ;yes, exit
;if character is a capital letter
CMP AL,'A' ;char >= 'A'?
JNGE END_IF ;not a capital letter
CMP AL,'Z' ;char <= 'Z'?
JNLE END_IF ;not a capital letter
;then
;if character precedes first capital
CMP AL,FIRST ;char < first capital?
JNL CHECK_LAST ;no, >=
;then first capital = character
MOV FIRST,AL ;FIRST = char
;end_if
CHECK_LAST:
;if character follows last capital
CMP AL,LAST ;char > last capital?
JNG END_IF ;no, <=
;then last capital = character
MOV LAST,AL ;LAST = char
;end_if
END_IF:
;read a character
INT 21H ;char in AL
JMP WHILE_ ;repeat loop
END_WHILE:
;display results
MOV AH,9 ;display string function
;if no capitals were typed
CMP FIRST,']' ;FIRST = ']'?
JNE CAPS ;no, display results
;then
LEA DX,NOCAP_MSG ;no capitals
JMP DISPLAY
CAPS:
LEA DX,CAP_MSG ;capitals
DISPLAY:
INT 21H ;display message
;end_if
;dos exit
MOV AH,4CH
INT 21H
MAIN ENDP
END MAIN
The jump instructions may be divided into unconditional and conditional jumps. The conditional jumps may be classified as signed, unsigned, and single-flag jumps.
The conditional jumps operate on the settings of the status flags. The CMP (compare) instruction is often used to set the flags just before a jump instruction.
The destination label of a conditional jump must be no more than 126 bytes before or 127 bytes after the jump. A JMP can often be used to get around this restriction.
In an IF-THEN decision structure, if the test condition is true, then the true-branch statements are done; otherwise, the next statement in line is done.
In an IF-THEN-ELSE decision structure, if the test condition is true, then the true-branch statements are done; otherwise the false-branch statements are done. A JMP must follow the true-branch statements so that the false-branch will be bypassed.
In a CASE structure, branching is controlled by an expression; the branches correspond to the possible values of the expression.
A FOR loop is executed a known number of times. It may be implemented by the LOOP instruction. Before entering the loop, CX is initialized to the number of times to repeat the loop statements.
In a WHILE loop, the loop condition is checked at the top of the loop. The loop statements are repeated as long as the condition is true. If the condition is initially false, the loop statements are not done at all.
In a REPEAT loop, the loop condition is checked at the bottom of the loop. The statements are repeated until the condition is true. Because the condition is checked at the bottom of the loop, the statements are done at least once.
AND condition: A logical AND of two conditions
conditional jump instruction: A jump instruction whose execution depends on status flag settings
loop: A sequence of instructions that is repeated
OR condition: A logical OR of two conditions
signed jump: A conditional jump instruction used with signed numbers
single-flag jump: A conditional jump that operates on the setting of an individual status flag
top-down program design: Program development by breaking a large problem into a series of smaller problems
unconditional jump: An unconditional transfer of control
unsigned jump: A conditional jump instruction used with unsigned numbers
CMP
JCXZ
JE/JZ
JG/JNLE
JGE/JNL
JL/JNGE
JLE/JNG
JMP
JNC
JNE/JNZ
LOOP
- Write assembly code for each of the following decision structures.
a. IF AX < 0
THEN
put -1 in BX
END_IF
b. IF AL < 0
THEN
put FFh in AH
ELSE
put 0 in AH
END_IF
c. Suppose DL contains the ASCII code of a character.
IF (DL >= 'A') AND (DL <= 'Z')
THEN
display DL
END_IF
d. IF AX < BX
THEN
IF BX < CX
THEN
put 0 in AX
ELSE
put 0 in BX
END_IF
END_IF
e. IF (AX < BX) OR (BX < CX)
THEN
put 0 in DX
ELSE
put 1 in DX
END_IF
f. IF AX < BX
THEN
put 0 in AX
ELSE
IF BX < CX
THEN
put 0 in BX
ELSE
put 0 in CX
END_IF
END_IF
Read a character. If it's "A", then execute carriage return. If it's "B", then execute line feed. If it's any other character, then return to DOS.
Write a sequence of instructions to do each of the following:
a. Put the sum
b. Put the sum
Employ LOOP instructions to do the following:
a. put the sum of the first 50 terms of the arithmetic sequence 1, 5, 9, 13, ... in DX.
b. Read a character and display it 80 times on the next line.
c. Read a five-character password and overprint it by executing a carriage return and displaying five X's. You need not store the input characters anywhere.
The following algorithm may be used to carry out division of two nonnegative numbers by repeated subtraction:
initialize quotient to 0
WHILE dividend
Write a sequence of instructions to divide AX by BX, and put the quotient in CX.
- The following algorithm may be used to carry out multiplication of two positive numbers M and N by repeated addition:
initialize product to 0
REPEAT
add M to product
decrement N
UNTIL N = 0
Write a sequence of instructions to multiply AX by BX, and put the product in CX. You may ignore the possibility of overflow.
- It is possible to set up a count-controlled loop that will continue to execute as long as some condition is satisfied. The instructions
LOOPE label ;loop while equal
and
LOOPZ label ;loop while zero
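cause CX to be decremented, then if CX <> 0 and ZF = 1, control transfers to the instruction at the destination label; otherwise, the next instruction is done. Similarly, the instructions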
LOOPNE label ;loop while not equal
and
LOOPNZ label ;loop while not zero
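cause CX to be decremented, then if CX <> 0 and ZF = 0, control transfers to the instruction at the destination label; otherwise, the next instruction is done.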
a. Write instructions to read characters until either a nonblank character is typed, or 80 characters have been typed. Use LOOPE.
b. Write instructions to read characters until either a carriage return is typed or 80 characters have been typed. Use LOOPNE.
- Write a program to display a "?", read two capital letters, and display them on the next line in alphabetical order.
- Write a program to display the extended ASCII characters (ASCII codes 80h to FFh). Display 10 characters per line, separated by blanks. Stop after the extended characters have been displayed once.
- Write a program that will prompt the user to enter a hex digit character ("0"... "9" or "A"... "F"), display it on the next line in decimal, and ask the user if he or she wants to do it again. If the user types "y" or "Y", the program repeats; if the user types anything else, the program terminates. If the user enters an illegal character, prompt the user to try again.
Sample execution:
ENTER A HEX DIGIT: 9
IN DECIMAL IT IS 9
DO YOU WANT TO DO IT AGAIN? y
ENTER A HEX DIGIT: c
ILLEGAL CHARACTER - ENTER 0..9 OR A..F: C
IN DECIMAL IT IS 12
DO YOU WANT TO DO IT AGAIN? N
- Do programming exercise 10, except that if the user fails to enter a hex-digit character in three tries, display a message and terminate the program.
- (hard) Write a program that reads a string of capital letters, ending with a carriage return, and displays the longest sequence of consecutive alphabetically increasing capital letters read.
Sample execution:
ENTER A STRING OF CAPITAL LETTERS: FGAADEFGHC
THE LONGEST CONSECUTIVELY INCREASING STRING IS: DEFGH
7
In this chapter we discuss instructions that can be used to change the bit pattern in a byte or word. The ability to manipulate bits is generally absent in high-level languages (except C), and is an important reason for programming in assembly language.
In section 7.1, we introduce the logic instructions AND, OR, XOR, and NOT. They can be used to clear, set, and examine bits in a register or variable. We use these instructions to do some familiar tasks, such as converting a lowercase letter to upper case, and some new tasks, such as determining if a register contains an even or odd number.
Section 7.2 covers the shift instructions. Bits can be shifted left or right in a register or memory location; when a bit is shifted out, it goes into CF. Because a left shift doubles a number and a right shift halves it, these instructions give us a way to multiply and divide by powers of 2. In Chapter 9, we'll use the MUL and DIV instructions for doing more general multiplication and division; however, these latter instructions are much slower than the shift instructions.
In section 7.3, the rotate instructions are covered. They work like the shifts, except that when a bit is shifted out one end of an operand it is put back in the other end. These instructions can be used in situations where we want to examine and/or change bits or groups of bits.
In section 7.4, we use the logic, shift, and rotate instructions to do binary and hexadecimal I/O. The ability to read and write numbers lets us solve a great variety of problems.
As noted earlier, the ability to manipulate individual bits is one of the advantages of assembly language. We can change individual bits in the computer by using logic operations. The binary values of 0 and 1 are treated as false and true, respectively. Figure 7.1 shows the truth tables for the logic operators AND, OR, XOR (exclusive OR), and NOT.
When a logic operation is applied to 8- or 16-bit operands, the result is obtained by applying the logic operation at each bit position.
Example 7.1 Perform the following logic operations:
1. 10101010 AND 11110000
2. 10101010 OR 11110000
3. 10101010 XOR 11110000
4. NOT 10101010
Solutions:
1.     10101010
   AND 11110000
     = 10100000
2.     10101010
    OR 11110000
     = 11111010
3.     10101010
   XOR 11110000
     = 01011010
4. NOT 10101010
     = 01010101
The AND, OR, and XOR instructions perform the named logic operations. The formats are
AND destination,source
OR destination,source
XOR destination,source
The result of the operation is stored in the destination, which must be a register or memory location. The source may be a constant, register, or memory location. However, memory-to-memory operations are not allowed.
Effect on flags:
SF, ZF, PF reflect the result
AF is undefined
CF, OF = 0
One use of AND, OR, and XOR is to selectively modify the bits in the destination. To do this, we construct a source bit pattern known as a mask. The mask bits are chosen so that the corresponding destination bits are modified in the desired manner when the instruction is executed.
To choose the mask bits, we make use of the following properties of AND, OR, and XOR. From Figure 7.1, if b represents a bit (0 or 1)
b AND 1 = b    b OR 0 = b    b XOR 0 = b
b AND 0 = 0    b OR 1 = 1    b XOR 1 = ~b (complement of b)
From these, we may conclude that
1. The AND instruction can be used to clear specific destination bits while preserving the others. A 0 mask bit clears the corresponding destination bit; a 1 mask bit preserves the corresponding destination bit.
2. The OR instruction can be used to set specific destination bits while preserving the others. A 1 mask bit sets the corresponding destination bit; a 0 mask bit preserves the corresponding destination bit.
3. The XOR instruction can be used to complement specific destination bits while preserving the others. A 1 mask bit complements the corresponding destination bit; a 0 mask bit preserves the corresponding destination bit.
Example 7.2 Clear the sign bit of AL while leaving the other bits unchanged.
Solution: Use the AND instruction with 01111111b = 7Fh as the mask. Thus,
AND AL, 7Fh
Example 7.3 Set the most significant and least significant bits of AL while preserving the other bits.
Solution: Use the OR instruction with 10000001b = 81h as the mask. Thus,
OR AL, 81h
Example 7.4 Change the sign bit of DX.
Solution: Use the XOR instruction with a mask of 8000h. Thus,
XOR DX, 8000h
Note: to avoid typing errors, it's best to express the mask in hex rather than binary, especially if the mask would be 16 bits long.
The logic instructions are especially useful in the following frequently occurring tasks.
We've seen that when a program reads a character from the keyboard, AL gets the ASCII code of the character. This is also true of digit characters. For example, if the "5" key is pressed, AL gets 35h instead of 5. To get 5 in AL, we could do this:
SUB AL, 30h
Another method is to use the AND instruction to clear the high nibble (high four bits) of AL:
AND AL,0Fh
Because the codes of "0" to "9" are 30h to 39h, this method will convert any ASCII digit to a decimal value.
By using the logic instruction AND instead of SUB, we emphasize that we're modifying the bit pattern of AL. This is helpful in making the program more readable.
The reverse problem of converting a stored decimal digit to its ASCII code is left as an exercise.
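As a hint toward that exercise, one possible approach is sketched below. It is an assumption about how a solution might look, not code from the text, and it relies on AL holding a value from 0 to 9, so that the high nibble is zero and OR has the same effect as adding 30h.

```asm
        ;assumes AL contains a decimal digit value 0..9
        OR   AL,30h        ;set bits 4 and 5: 0..9 becomes '0'..'9'
        MOV  DL,AL         ;character to display
        MOV  AH,2          ;display character function
        INT  21H           ;display the digit
```

Using OR here mirrors the AND used for the forward conversion: the mask 30h supplies exactly the bits that AND AL,0Fh removed.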
The ASCII codes of "a" to "z" range from 61h to 7Ah; the codes of "A" to "Z" go from 41h to 5Ah. Thus for example, if DL contains the code of a lowercase letter, we could convert to upper case by executing
SUB DL, 20h
This method was used in Chapter 4. However, if we compare the binary codes of corresponding lowercase and uppercase letters
| Character | Code | Character | Code |
| a | 01100001 | A | 01000001 |
| b | 01100010 | B | 01000010 |
| z | 01111010 | Z | 01011010 |
it is apparent that to convert lower to upper case we need only clear bit 5. This can be done by using an AND instruction with the mask 11011111b, or 0DFh. So if the lowercase character to be converted is in DL, execute
AND DL,0DFh
The reverse problem of conversion from upper to lower case is left as an exercise.
Clearing a Register
We already know two ways to clear a register. For example, to clear AX we could execute
MOV AX,0
or
SUB AX,AX
Using the fact that 1 XOR 1 = 0 and 0 XOR 0 = 0, a third way is
XOR AX,AX
The machine code of the first method is three bytes, versus two bytes for the latter two methods, so the latter are more efficient. However, because of the prohibition on memory-to-memory operations, the first method must be used to clear a memory location.
Testing a Register for Zero
Because 1 OR 1 = 1 and 0 OR 0 = 0, it might seem like a waste of time to execute an instruction like
OR CX,CX
because it leaves the contents of CX unchanged. However, it affects ZF and SF, and in particular if CX contains 0 then ZF = 1. So it can be used as an alternative to
CMP CX,0
to test the contents of a register for zero, or to check the sign of the contents.
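A short sketch of how this test might be followed by jumps is given here. The labels IS_ZERO and IS_NEG are hypothetical names introduced for illustration; the fragment assumes it sits inside a larger program that defines them.

```asm
        OR   CX,CX         ;sets ZF and SF from CX; CX itself is unchanged
        JZ   IS_ZERO       ;ZF = 1: CX contains 0
        JS   IS_NEG        ;SF = 1: CX is negative under a signed interpretation
        ;execution falls through to here when CX is positive
```

JZ and JS are single-flag jumps, so they pair naturally with an instruction whose only purpose is to set the flags.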
7.1.2 NOT Instruction
The NOT instruction performs the one's complement operation on the destination. The format is
NOT destination
There is no effect on the status flags.
Example 7.5 Complement the bits in AX.
Solution:
NOT AX
The TEST instruction performs an AND operation of the destination with the source but does not change the destination contents. The purpose of the TEST instruction is to set the status flags. The format is
TEST destination, source
Effect on flags
SF, ZF, PF reflect the result
AF is undefined
CF, OF = 0
The TEST instruction can be used to examine individual bits in an operand. The mask should contain 1's in the bit positions to be tested and 0's elsewhere. Because 1 AND b = b and 0 AND b = 0, the result of
TEST destination, mask
will have 1's in the tested bit positions if and only if the destination has 1's in these positions; it will have 0's elsewhere. If destination has 0's in all the tested positions, the result will be 0 and so ZF = 1.
Example 7.6 Jump to label BELOW if AL contains an even number.
Solution: Even numbers have a 0 in bit 0. Thus, the mask is 00000001b = 1.
TEST AL,1 ;is AL even?
JZ BELOW ;yes, go to BELOW
The shift and rotate instructions shift the bits in the destination operand by one or more positions either to the left or right. For a shift instruction, the bits shifted out are lost; for a rotate instruction, bits shifted out from one end of the operand are put back into the other end. The instructions have two possible formats. For a single shift or rotate, the form is
Opcode destination, 1
For a shift or rotate of N positions, the form is
Opcode destination, CL
where CL contains N. In both cases, destination is an 8- or 16-bit register or memory location. Note that for Intel's more advanced processors, a shift or rotate instruction also allows the use of an 8-bit constant.
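For illustration only, the constant-count form on a later processor might look like the sketch below. It assumes a MASM-compatible assembler, where the .286 directive (normally placed near the top of the source file) enables the extended instructions; the 8086/8088 equivalent is shown in comments.

```asm
        .286               ;assembler directive: allow 80286 instructions
        SHL  AX,3          ;shift AX left three positions, count given as a constant

        ;equivalent sequence for the 8086/8088:
        ;MOV  CL,3         ;shift count must go in CL
        ;SHL  AX,CL
```

Code written with the constant form will not run on an 8086 or 8088, which is why the programs in this chapter use CL for any count other than 1.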
As we'll see presently, these instructions can be used to multiply and divide by powers of 2, and we will use them in programs for binary and hex I/O.
The SHL (shift left) instruction shifts the bits in the destination to the left. The format for a single shift is
SHL destination, 1
A 0 is shifted into the rightmost bit position and the msb is shifted into CF (Figure 7.2). If the shift count N is different from 1, the instruction takes the form
SHL destination,CL
where CL contains N. In this case, N single left shifts are made. The value of CL remains the same after the shift operation.
Effect on flags
SF, PF, ZF reflect the result
AF is undefined
CF = last bit shifted out
OF = 1 if result changes sign on last shift
Example 7.7 Suppose DH contains 8Ah and CL contains 3. What are the values of DH and of CF after the instruction SHL DH,CL is executed?
Solution: The binary value of DH is 10001010. After 3 left shifts, CF will contain 0. The new contents of DH may be obtained by erasing the leftmost three bits and adding three zero bits to the right end, thus 01010000b = 50h.
Consider the decimal number 235. If each digit is shifted left one position and a 0 attached to the right end, we get 2350; this is the same as multiplying 235 by ten.
In the same way, a left shift on a binary number multiplies it by 2. For example, suppose that AL contains 5 = 00000101b. A left shift gives 00001010b = 10d, thus doubling its value. Another left shift yields 00010100b = 20d, so it is doubled again.
Thus, the SHL instruction can be used to multiply an operand by powers of 2. However, to emphasize the arithmetic nature of the operation the opcode SAL (shift arithmetic left) is often used in instances where numeric multiplication is intended. Both instructions generate the same machine code.
Negative numbers can also be multiplied by powers of 2 by left shifts. For example, if AX is FFFFh (-1), then shifting three times will yield AX = FFF8h (-8).
When we treat left shifts as multiplication, overflow may occur. For a single left shift, CF and OF accurately indicate unsigned and signed overflow, respectively. However, the overflow flags are not reliable indicators for a multiple left shift. This is because a multiple shift is really a series of single shifts, and CF, OF only reflect the result of the last shift. For example, if BL contains 80h, CL contains 2 and we execute SHL BL,CL, then CF = OF = 0 even though both signed and unsigned overflow occurred.
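One way to keep the flags meaningful, sketched here as an assumption rather than a technique from the text, is to perform the shifts one at a time and test CF after each. The label OVERFLOW is a hypothetical handler defined elsewhere in the program.

```asm
        MOV  CX,3          ;three doublings = multiply by 8
TOP:
        SHL  AX,1          ;one doubling; CF = bit shifted out
        JC   OVERFLOW      ;unsigned overflow occurred on this step
        LOOP TOP           ;LOOP does not change the flags
        ;here AX = 8 * original value, with no unsigned overflow
```

For a signed interpretation, JO could be used in place of JC, since OF is reliable for each single shift.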
Example 7.8 Write some code to multiply the value of AX by 8. Assume that overflow will not occur.
Solution: To multiply by 8, we need to do three left shifts.
MOV CL,3 ;number of shifts to do
SAL AX,CL ;multiply by 8
The instruction SHR (shift right) performs right shifts on the destination operand. The format for a single shift is
SHR destination,1
A 0 is shifted into the msb position, and the rightmost bit is shifted into CF. See Figure 7.3. If the shift count N is different from 1, the instruction takes the form
SHR destination,CL
where CL contains N. In this case N single right shifts are made.
The effect on the flags is the same as for SHL.
Example 7.9 Suppose DH contains 8Ah and CL contains 2. What are the values of DH and CF after the instruction SHR DH,CL is executed?
Solution: The value of DH in binary is 10001010. After two right shifts, CF = 1. The new value of DH is obtained by erasing the rightmost two bits and adding two 0 bits to the left end, thus DH = 00100010b = 22h.
The SAR instruction (shift arithmetic right) operates like SHR, with one difference: the msb retains its original value. See Figure 7.4. The syntax is
SAR destination,1
and
SAR destination,CL
The effect on flags is the same as for SHR.
Because a left shift doubles the destination's value, it's reasonable to guess that a right shift might divide it by 2. This is correct for even numbers.
For odd numbers, a right shift halves it and rounds down to the nearest integer. For example, if BL contains 00000101b = 5, then after a right shift, BL will contain 00000010 = 2.
In doing division by right shifts, we need to make a distinction between signed and unsigned numbers. If an unsigned interpretation is being given, SHR should be used. For a signed interpretation, SAR must be used, because it preserves the sign.
Example 7.10 Use right shifts to divide the unsigned number 65143 by 4. Put the quotient in AX.
Solution: To divide by 4, two right shifts are needed. Since the dividend is unsigned, we use SHR. The code is
MOV AX, 65143 ; AX has number
MOV CL, 2 ; CL has number of right shifts
SHR AX, CL ; divide by 4
Example 7.11 If AL contains -15, give the decimal value of AL after SAR AL,1 is performed.
Solution: Execution of SAR AL,1 divides the number by 2 and rounds down. Dividing -15 by 2 yields -7.5, and after rounding down we get -8. In terms of the binary contents, we have -15 = 11110001b. After shifting, we have 11111000b = -8.
We've seen that multiplication and division by powers of 2 can be accomplished by left and right shifts. Multiplication by other numbers, such as 10d, can be done by a combination of shifting and addition (see Chapter 8).
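As a preview of that combination, the following sketch multiplies AX by 10 by computing 8 times the value plus 2 times the value. It is one possible arrangement, not necessarily the one used later in the book; it uses BX as a scratch register and assumes that overflow does not occur.

```asm
        ;AX = 10 * AX, computed as 8*AX + 2*AX
        SHL  AX,1          ;AX = 2 * original
        MOV  BX,AX         ;BX = 2 * original (saved for the final add)
        SHL  AX,1          ;AX = 4 * original
        SHL  AX,1          ;AX = 8 * original
        ADD  AX,BX         ;AX = 8*original + 2*original = 10 * original
```

For example, starting with AX = 7 the steps give 14, 28, 56, and finally 56 + 14 = 70.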
In Chapter 9, we cover the MUL and IMUL, DIV and IDIV instructions. They are not limited to multiplication and division by powers of 2, but are much slower than the shift instructions.
The instruction ROL (rotate left) shifts bits to the left. The msb is shifted into the rightmost bit. The CF also gets the bit shifted out of the msb. You can think of the destination bits forming a circle, with the least significant bit following the msb in the circle. See Figure 7.5. The syntax is
ROL destination,1
and
ROL destination,CL
The instruction ROR (rotate right) works just like ROL, except that the bits are rotated to the right. The rightmost bit is shifted into the msb, and also into the CF. See Figure 7.6. The syntax is
ROR destination,1
and
ROR destination,CL
In ROL and ROR, CF reflects the bit that is rotated out. The next example shows how this can be used to inspect the bits in a byte or word, without changing the contents.
Example 7.12 Use ROL to count the number of 1 bits in BX, without changing BX. Put the answer in AX.
XOR AX,AX ;AX counts bits
MOV CX,16 ;loop counter
TOP:
ROL BX,1 ;CF = bit rotated out
JNC NEXT ;0 bit
INC AX ;1 bit, increment total
NEXT:
LOOP TOP ;loop until done
In this example, we used JNC (Jump if No Carry), which causes a jump if CF = 0.
The instruction RCL (Rotate through Carry Left) shifts the bits of the destination to the left. The msb is shifted into the CF, and the previous value of CF is shifted into the rightmost bit. In other words, RCL works just like ROL, except that CF is part of the circle of bits being rotated. See Figure 7.7. The syntax is
RCL destination,1
and
RCL destination,CL
The instruction RCR (Rotate through Carry Right) works just like RCL, except that the bits are rotated to the right. See Figure 7.8. The syntax is
RCR destination, 1
and
RCR destination, CL
Example 7.13 Suppose DH contains 8Ah, CF = 1, and CL contains 3. What are the values of DH and CF after the instruction RCR DH,CL is executed?
                          CF   DH
initial values            1    10001010
after 1 right rotation    0    11000101
after 2 right rotations   1    01100010
after 3 right rotations   0    10110001 = B1h
Effect of the rotate instructions on the flags:
CF = last bit rotated out
OF = 1 if the result changes sign on the last rotation (defined only for single-bit rotates)
SF, ZF, PF, AF are not affected
As an application of the shift and rotate instructions, let's consider the problem of reversing the bit pattern in a byte or word. For example, if AL contains 11011100, we want to make it 00111011.
An easy way to do this is to use SHL to shift the bits out the left end of AL into CF, and then use RCR to move them into the left end of another register; for example, BL. If this is done eight times, BL will contain the reversed bit pattern and it can be copied back into AL. The code is
MOV CX,8 ;number of operations to do
REVERSE:
SHL AL,1 ;get a bit into CF
RCR BL,1 ;rotate it into BL
LOOP REVERSE ;loop until done
MOV AL,BL ;AL gets reversed pattern
One useful application of the shift and rotate instructions is in binary and hex I/O.
For binary input, we assume a program reads in a binary number from the keyboard, followed by a carriage return. The number actually is a character string of 0's and 1's. As each character is entered, we need to convert it to a bit value, and collect the bits in a register. The following algorithm reads a binary number from the keyboard and stores its value in BX.
Clear BX /* BX will hold binary value */
Input a character /* '0' or '1' */
WHILE character <> CR DO
Convert character to binary value
Left shift BX
Insert value into lsb of BX
Input a character
END WHILE
Demonstration (for input 110):
Clear BX
BX = 0000 0000 0000 0000
Input character '1', convert to 1
Left shift BX
BX = 0000 0000 0000 0000
Insert value into lsb
BX = 0000 0000 0000 0001
Input character '1', convert to 1
Left shift BX
BX = 0000 0000 0000 0010
Insert value into lsb
BX = 0000 0000 0000 0011
Input character '0', convert to 0
Left shift BX
BX = 0000 0000 0000 0110
Insert value into lsb
BX = 0000 0000 0000 0110
BX contains 110b.
The algorithm assumes (1) input characters are either "0", "1", or CR, and (2) at most 16 binary digits are input. As a new digit is input, the previous bits in BX must be shifted to the left to make room; then an OR operation can be used to insert the new bit into BX. The assembly instructions are
XOR BX,BX ;clear BX
MOV AH,1 ;input char function
INT 21H ;read a character
WHILE_:
CMP AL,0DH ;CR?
JE END_WHILE ;yes, done
AND AL,0FH ;no, convert to binary value
SHL BX,1 ;make room for new value
OR BL,AL ;put value into BX
INT 21H ;read a character
JMP WHILE_ ;loop back
END_WHILE:
Outputting the contents of BX in binary also involves the shift operation. Here we only give an algorithm; the assembly code is left to be done as an exercise.
FOR 16 times DO
Rotate left BX /* BX holds output value,
put msb into CF */
IF CF = 1
THEN
output '1'
ELSE
output '0'
END_IF
END_FOR
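For readers who want to check their own solution, one possible translation of this algorithm is sketched below. It is an illustration, not the book's answer: the labels TOP, ZERO, and DISPLAY are chosen for this fragment, and it assumes the INT 21h function 2 character display used earlier in the chapter.

```asm
        MOV  CX,16         ;16 bits to display
        MOV  AH,2          ;display character function
TOP:
        ROL  BX,1          ;msb goes into CF (and into bit 0)
        JNC  ZERO          ;CF = 0, output '0'
        MOV  DL,'1'        ;CF = 1, output '1'
        JMP  DISPLAY
ZERO:
        MOV  DL,'0'
DISPLAY:
        INT  21H           ;display the bit character
        LOOP TOP           ;repeat for all 16 bits
```

Because ROL returns each bit to the other end of the register, BX holds its original value again after the sixteenth rotation.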
Hex input consists of digits ("0" to "9") and letters ("A" to "F") followed by a carriage return. For simplicity, we assume that (1) only uppercase letters are used, and (2) the user inputs no more than four hex characters. The process of converting characters to binary values is more involved than it was for binary input, and BX must be shifted four times to make room for a hex value.
Clear BX /* BX will hold input value */
input hex character
WHILE character <> CR DO
convert character to binary value
left shift BX 4 times
insert value into lower 4 bits of BX
input a character
END WHILE
Demonstration (for input 6AB):
Clear BX
  BX = 0000 0000 0000 0000
Input '6', convert to 0110
Left shift BX 4 times
  BX = 0000 0000 0000 0000
Insert value into lower 4 bits of BX
  BX = 0000 0000 0000 0110
Input 'A', convert to Ah = 1010
Left shift BX 4 times
  BX = 0000 0000 0110 0000
Insert value into lower 4 bits of BX
  BX = 0000 0000 0110 1010
Input 'B', convert to 1011
Left shift BX 4 times
  BX = 0000 0110 1010 0000
Insert value into lower 4 bits of BX
  BX = 0000 0110 1010 1011
BX contains 06ABh.
Here is the code:
        XOR  BX,BX       ;clear BX
        MOV  CL,4        ;counter for 4 shifts
        MOV  AH,1        ;input character function
        INT  21H         ;input a character
WHILE_:
        CMP  AL,0DH      ;CR?
        JE   END_WHILE   ;yes, exit
;convert character to binary value
        CMP  AL,39H      ;a digit?
        JG   LETTER      ;no, a letter
;input is a digit
        AND  AL,0FH      ;convert digit to binary value
        JMP  SHIFT       ;go to insert in BX
LETTER:
        SUB  AL,37H      ;convert letter to binary value
SHIFT:
        SHL  BX,CL       ;make room for new value
;insert value into BX
        OR   BL,AL       ;put value into low 4 bits of BX
        INT  21H         ;input a character
        JMP  WHILE_      ;loop until CR
END_WHILE:
Note that the program does not check for valid input characters.
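The character conversion is the only part that differs from binary input, so it is worth checking in isolation. Here is a minimal Python sketch under the same assumptions as the text (uppercase letters only, at most four characters, no validation); the name `read_hex` is hypothetical.

```python
def read_hex(chars):
    """Model of the hex input loop: shift BX left 4 bits, then OR in the digit."""
    bx = 0
    for ch in chars:
        if ch == "\r":                 # carriage return ends the input
            break
        al = ord(ch)
        if al > 0x39:                  # above '9', so treat it as a letter
            al -= 0x37                 # 'A' (41h) -> 0Ah ... 'F' (46h) -> 0Fh
        else:
            al &= 0x0F                 # '0' (30h) -> 0 ... '9' (39h) -> 9
        bx = (bx << 4) & 0xFFFF        # SHL BX,CL with CL = 4
        bx |= al                       # OR BL,AL
    return bx


print(format(read_hex("6AB\r"), "04X"))   # 06AB
```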
BX contains 16 bits, which equal four hex digit values. To output the contents of BX, we start from the left and get hold of each digit, convert it to a hex character, and output it. The algorithm which follows is similar to that for binary output.
FOR 4 times DO
  Move BH to DL /* BX holds output value */
  Shift DL 4 times to the right
  IF DL < 10
  THEN
    convert to character in '0'..'9'
  ELSE
    convert to character in 'A'..'F'
  END_IF
  output character
  Rotate BX left 4 times
END_FOR
Demonstration (for BX = 4CA9h):
BX = 4CA9h = 0100 1100 1010 1001
Move BH to DL
  DL = 0100 1100
Shift DL 4 times to the right
  DL = 0000 0100
Convert to character and output
  DL = 0011 0100 = 34h = '4'
Rotate BX left 4 times
  BX = 1100 1010 1001 0100
Move BH to DL
  DL = 1100 1010
Shift DL 4 times to the right
  DL = 0000 1100
Convert to character and output
  DL = 0100 0011 = 43h = 'C'
Rotate BX left 4 times
  BX = 1010 1001 0100 1100
Move BH to DL
  DL = 1010 1001
Shift DL 4 times to the right
  DL = 0000 1010
Convert to character and output
  DL = 0100 0001 = 41h = 'A'
Rotate BX left 4 times
  BX = 1001 0100 1100 1010
Move BH to DL
  DL = 1001 0100
Shift DL 4 times to the right
  DL = 0000 1001
Convert to character and output
  DL = 0011 1001 = 39h = '9'
Rotate BX 4 times to the left
  BX = 0100 1100 1010 1001 = original contents
Coding the algorithm is left to be done as an exercise.
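As a way to check a hand trace of the hex output algorithm, the loop can be modelled in Python. This is only a sketch: the helper name `hex_output` is an assumption, and it returns the characters rather than displaying them.

```python
def hex_output(bx):
    """Model of the hex output loop: take the top nibble, convert, rotate left 4."""
    chars = []
    for _ in range(4):
        dl = (bx >> 8) & 0xFF                     # move BH to DL
        dl >>= 4                                  # shift DL right 4 times
        if dl < 10:
            dl += 0x30                            # 0..9   -> '0'..'9'
        else:
            dl += 0x37                            # 10..15 -> 'A'..'F'
        chars.append(chr(dl))                     # output character
        bx = ((bx << 4) & 0xFFFF) | (bx >> 12)    # rotate BX left 4 times
    return "".join(chars), bx


text, final_bx = hex_output(0x4CA9)
print(text)   # 4CA9
```

After four passes the rotations have moved every nibble once around the register, so BX ends with its original contents.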
- The five logic instructions are AND, OR, NOT, XOR, and TEST.
- The AND instruction can be used to clear individual bits in the destination.
- The OR instruction is useful in setting individual bits in the destination. It can also be used to test the destination for zero.
- The XOR instruction can be used to complement individual bits in the destination. It can also be used to zero out the destination.
- The NOT instruction performs the one's complement operation on the destination.
- The TEST instruction can be used to examine individual bits of the destination. For example, it can determine if the destination contains an even or odd number.
- SAL and SHL shift each destination bit left one place. The most significant bit goes into CF, and a 0 is shifted into the least significant bit.
- SHR shifts each destination bit right one place. The least significant bit goes into CF, and a 0 is shifted into the most significant bit.
- SAR operates like SHR, except that the value of the most significant bit is preserved.
- The shift instructions can be used to do multiplication and division by 2. SHL and SAL double the destination's value unless overflow occurs. SHR and SAR halve the destination's value if it is even; if odd, they halve the destination's value and round down to the nearest integer. SHR should be used for unsigned arithmetic, and SAR for signed arithmetic.
- ROL shifts each destination bit left one position; the most significant bit is rotated into the least significant bit. For ROR, each bit goes right one position, and the least significant bit is rotated into the most significant bit. For both instructions, CF gets the last bit rotated out.
- RCL and RCR operate like ROL and ROR, except that a bit rotated out goes into CF, and the value of CF rotates into the destination.
- Multiple shifts and rotates can be performed. CL must contain the number of times the shift or rotate is to be executed.
- The shift and rotate instructions are useful in doing binary and hex I/O.
Glossary
clear
complement
mask
set
AND
NOT
OR
RCL
- Perform the following logic operations.
a. 10101111 AND 10001011
b. 10110001 OR 01001001
c. 01111100 XOR 11011010
d. NOT 01011110
- Give a logic instruction to do each of the following.
a. Clear the even-numbered bits of AX, leaving the other bits unchanged.
b. Set the most and least significant bits of BL, leaving the other bits unchanged.
c. Complement the msb of DX, leaving the other bits unchanged.
d. Replace the value of the word variable WORD1 by its one's complement.
- Use the TEST instruction to do each of the following.
a. Set ZF if the contents of AX is zero.
b. Clear ZF if BX contains an odd number.
c. Set SF if DX contains a negative number.
d. Set ZF if DX contains a zero or positive number.
e. Set PF if BL contains an even number of 1 bits.
- Suppose AL contains 11001011b and CF = 1. Give the new contents of AL after each of the following instructions is executed. Assume the preceding initial conditions for each part of this question.
a. SHL AL,1
b. SHR AL,1
c. ROL AL,CL if CL contains 2
d. ROR AL,CL if CL contains 3
e. SAR AL,CL if CL contains 2
f. RCL AL,1
g. RCR AL,CL if CL contains 3
- Write one or more instructions to do each of the following. Assume overflow does not occur.
a. Double the value of byte variable BS.
b. Multiply the value of AL by 8.
c. Divide 32142 by 4 and put the quotient in AX.
d. Divide -2145 by 16 and put the quotient in BX.
- Write instructions to do each of the following:
a. Assuming AL has a value less than 10, convert it to a decimal character.
b. Assuming DL contains the ASCII code of an uppercase letter, convert it to lower case.
- Write instructions to do each of the following.
a. Multiply the value of BL by 10d. Assume overflow does not occur.
b. Suppose AL contains a positive number. Divide AL by 8, and put the remainder in AH. (Hint: use ROR.)
- Write a program that prompts the user to enter a character, and on subsequent lines prints its ASCII code in binary, and the number of 1 bits in its ASCII code.
Sample execution:
TYPE A CHARACTER: A
THE ASCII CODE OF A IN BINARY IS 01000001
THE NUMBER OF 1 BITS IS 2
- Write a program that prompts the user to enter a character and prints the ASCII code of the character in hex on the next line. Repeat this process until the user types a carriage return.
Sample execution:
TYPE A CHARACTER: Z
THE ASCII CODE OF Z IN HEX IS 5A.
TYPE A CHARACTER:
- Write a program that prompts the user to type a hex number of four hex digits or less, and outputs it in binary on the next line. If the user enters an illegal character, he or she should be prompted to begin again. Accept only uppercase letters.
Sample execution:
TYPE A HEX NUMBER (0 TO FFFF): 1a
ILLEGAL HEX DIGIT, TRY AGAIN: 1ABC
IN BINARY IT IS 0001101010111100
Your program may ignore any input beyond four characters.
- Write a program that prompts the user to type a binary number of 16 digits or less, and outputs it in hex on the next line. If the user enters an illegal character, he or she should be prompted to begin again.
Sample execution:
TYPE A BINARY NUMBER, UP TO 16 DIGITS: 11100001
IN HEX IT IS E1
Your program may ignore any input beyond 16 characters.
- Write a program that prompts the user to enter two binary numbers of up to 8 digits each, and prints their sum on the next line in binary. If the user enters an illegal character, he or she should be prompted to begin again. Each input ends with a carriage return.
Sample execution:
TYPE A BINARY NUMBER, UP TO 8 DIGITS: 11001010
TYPE A BINARY NUMBER, UP TO 8 DIGITS: 10011100
THE BINARY SUM IS 101100110
- Write a program that prompts the user to enter two unsigned hex numbers, 0 to FFFFh, and prints their sum in hex on the next line. If the user enters an illegal character, he or she should be prompted to begin again. Your program should be able to handle the possibility of unsigned overflow. Each input ends with a carriage return.
Sample execution:
TYPE A HEX NUMBER, 0 - FFFF: 21AB
TYPE A HEX NUMBER, 0 - FFFF: FE03
THE SUM IS 11FAE
- Write a program that prompts the user to enter a string of decimal digits, ending with a carriage return, and prints their sum in hex on the next line. If the user enters an illegal character, he or she should be prompted to begin again.
Sample execution:
ENTER A DECIMAL DIGIT STRING: 1299843
THE SUM OF THE DIGITS IN HEX IS 0024
The stack segment of a program is used for temporary storage of data and addresses. In this chapter we show how the stack can be manipulated, and how it is used to implement procedures.
In section 8.1, we introduce the PUSH and POP instructions that add and remove words from the stack. Because the last word to be added to the stack is the first to be removed, a stack can be used to reverse a list of data; this property is exploited in section 8.2.
Procedures are extremely important in high-level language programming, and the same is true in assembly language. Sections 8.3 and 8.4 discuss the essentials of assembly language procedures. At the machine level, we can see exactly how a procedure is called and how it returns to the calling program. In section 8.5, we present an example of a procedure that performs binary multiplication by bit shifting and addition. This example also gives us an excuse to learn a little more about the DEBUG program.
A stack is a one-dimensional data structure. Items are added and removed from one end of the structure; that is, it is processed in a "last-in, first-out" manner. The most recent addition to the stack is called the top of the stack. A familiar example is a stack of dishes; the last dish to go on the stack is the top one, and it's the only one that can be removed easily.
A program must set aside a block of memory to hold the stack. We have been doing this by declaring a stack segment; for example,
.STACK 100H
When the program is assembled and loaded in memory, SS will contain the segment number of the stack segment. For the preceding stack declaration, SP, the stack pointer, is initialized to 100h. This represents the empty stack position. When the stack is not empty, SP contains the offset address of the top of the stack.
To add a new word to the stack we PUSH it on. The syntax is
PUSH source
where source is a 16-bit register or memory word. For example,
PUSH AX
Execution of PUSH causes the following to happen:
- SP is decreased by 2.
- A copy of the source content is moved to the address specified by SS:SP. The source is unchanged.
The instruction PUSHF, which has no operands, pushes the contents of the FLAGS register onto the stack.
Initially, SP contains the offset address of the memory location immediately following the stack segment; the first PUSH decreases SP by 2, making it point to the last word in the stack segment. Because each PUSH decreases SP, the stack grows toward lower memory addresses.
Figure 8.1C After PUSH BX
To remove the top item from the stack, we POP it. The syntax is
POP destination
where destination is a 16-bit register (except IP) or memory word. For example,
POP BX
Executing POP causes this to happen:
- The content of SS:SP (the top of the stack) is moved to the destination.
- SP is increased by 2.
Figure 8.2 shows how POP works.
The instruction POPF pops the top of the stack into the FLAGS register. PUSH, PUSHF, and POP have no effect on the flags; POPF changes them only by loading the popped word into FLAGS.
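The way SP moves during PUSH and POP can be seen in a small model. The Python sketch below assumes a 100h-byte stack segment as declared by .STACK 100H; the class name `Stack8086` and its methods are hypothetical, and the model makes no attempt to handle stack overflow or underflow.

```python
class Stack8086:
    """Toy model of a stack segment addressed by SP (offsets only)."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)     # the stack segment
        self.sp = size                 # empty stack: SP is just past the segment

    def push(self, word):
        self.sp -= 2                                 # SP is decreased by 2
        self.mem[self.sp] = word & 0xFF              # low byte at SS:SP
        self.mem[self.sp + 1] = (word >> 8) & 0xFF   # high byte at SS:SP+1

    def pop(self):
        word = self.mem[self.sp] | (self.mem[self.sp + 1] << 8)
        self.sp += 2                                 # SP is increased by 2
        return word


s = Stack8086()
s.push(0x1234)                     # like PUSH AX with AX = 1234h
s.push(0x5678)                     # like PUSH BX with BX = 5678h
print(hex(s.sp))                   # 0xfc
print(hex(s.pop()), hex(s.pop()))  # 0x5678 0x1234
```

The words come back in the reverse of the order in which they were pushed, and SP returns to 100h once the stack is empty again.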
Note that PUSH and POP are word operations, so a byte instruction such as
Illegal: PUSH DL
is illegal. So is a push of immediate data, such as
Illegal: PUSH 2
Note: an immediate data push is legal on the 80186 and later processors. These processors are discussed in Chapter 20.
In addition to the user's program, the operating system uses the stack for its own purposes. For example, to implement the INT 21h functions, DOS saves any registers it uses on the stack and restores them when the interrupt routine is completed. This does not cause a problem for the user because any values DOS pushes onto the stack are popped off by DOS before it returns control to the user's program.
Because the stack behaves in a last-in, first-out manner, the order that items come off the stack is the reverse of the order they enter it. The following program uses this property to read a sequence of characters and display them in reverse order on the next line.
Display a '?'
Initialize count to 0
Read a character
WHILE character is not a carriage return DO
push character onto the stack
increment count
read a character
END WHILE;
Go to a new line
FOR count times DO
pop a character from the stack;
display it;
END FOR
Here is the program:
1: TITLE PGM8_1: REVERSE INPUT
2: .MODEL SMALL
3: .STACK 100H
4: .CODE
5: MAIN PROC
6: ;display user prompt
7:      MOV  AH,2        ;prepare to display
8:      MOV  DL,'?'      ;char to display
9:      INT  21H         ;display '?'
10: ;initialize character count
11:     XOR  CX,CX       ;count = 0
12: ;read a character
13:     MOV  AH,1        ;prepare to read
14:     INT  21H         ;read a char
15: ;while character is not a carriage return do
16: WHILE_:
17:     CMP  AL,0DH      ;CR?
18:     JE   END_WHILE   ;yes, exit loop
19: ;save character on the stack and increment count
20:     PUSH AX          ;push it on stack
21:     INC  CX          ;count = count + 1
22: ;read a character
23:     INT  21H         ;read a char
24:     JMP  WHILE_      ;loop back
25: END_WHILE:
26: ;go to a new line
27:     MOV  AH,2        ;display char fcn
28:     MOV  DL,0DH      ;CR
29:     INT  21H         ;execute
30:     MOV  DL,0AH      ;LF
31:     INT  21H         ;execute
32:     JCXZ EXIT        ;exit if no characters read
33: ;for count times do
34: TOP:
35: ;pop a character from the stack
36:     POP  DX          ;get a char from stack
37: ;display it
38:     INT  21H         ;display it
39:     LOOP TOP
40: ;end_for
41: EXIT:
42:     MOV  AH,4CH
43:     INT  21H
44: MAIN ENDP
45:     END  MAIN
Because the number of characters to be entered is unknown, the program uses CX to count them. CX controls the FOR loop that displays the characters in reverse order.
In lines 16-24, the program executes a WHILE loop that pushes characters on the stack and reads new ones, until a carriage return is entered. Even though the input characters are in AL, it's necessary to save all of AX on the stack, because the operand of PUSH must be a word.
When the program exits the WHILE loop (line 25), all the characters are on the stack, with the low byte of the top of the stack containing the last character to be entered. AL contains the ASCII code of the carriage return.
At line 32, the program checks to see if any characters were read. If not, CX contains 0 and the program jumps to the DOS exit. If any characters were read, the program enters a FOR loop that repeatedly pops the stack into DX (so that DL will get a character code), and displays a character.
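The push-then-pop structure of the program can be sketched in Python, with a list standing in for the stack. The function name `reverse_input` is hypothetical, and a string replaces the keyboard; the word pushed for each character mimics AX with AH = 1 (the input function number) and AL = the character code.

```python
def reverse_input(chars):
    """Model of PGM8_1: push each character, then pop and display count times."""
    stack = []
    count = 0                            # XOR CX,CX
    for ch in chars:
        if ch == "\r":                   # carriage return ends the WHILE loop
            break
        stack.append(0x0100 | ord(ch))   # PUSH AX: AH = 1, AL = character
        count += 1                       # INC CX
    shown = []
    for _ in range(count):               # the FOR loop is skipped when count = 0
        dx = stack.pop()                 # POP DX
        shown.append(chr(dx & 0xFF))     # only DL is displayed
    return "".join(shown)


print(reverse_input("this is a test\r"))   # tset a si siht
```

Only the low byte of each popped word is used, which is why it does not matter what AH held when the word was pushed.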
Sample executions:
In Chapter 6, we mentioned the idea of top-down program design. The idea is to take the original problem and decompose it into a series of subproblems that are easier to solve than the original problem. High-level languages usually employ procedures to solve these subproblems, and we can do the same thing in assembly language. Thus an assembly language program can be structured as a collection of procedures.
One of the procedures is the main procedure, and it contains the entry point to the program. To carry out a task, the main procedure calls one of the other procedures. It is also possible for these procedures to call each other, or for a procedure to call itself.
When one procedure calls another, control transfers to the called procedure and its instructions are executed; the called procedure usually returns control to the caller at the next instruction after the call statement (Figure 8.3). For high-level languages, the mechanism by which call and return are implemented is hidden from the programmer, but in assembly language we can see how it works (see section 8.4).
The syntax of procedure declaration is the following:
name PROC type
;body of the procedure
RET
name ENDP
Name is the user-defined name of the procedure. The optional operand type is NEAR or FAR (if type is omitted, NEAR is assumed). NEAR means that the statement that calls the procedure is in the same segment as the procedure itself; FAR means that the calling statement is in a different segment. In the following, we assume all procedures are NEAR; FAR procedures are discussed in Chapter 14.
Figure 8.3 Procedure Call and Return
The RET (return) instruction causes control to transfer back to the calling procedure. Every procedure (except the main procedure) should have a RET someplace; usually it's the last statement in the procedure.
A procedure must have a way to receive values from the procedure that calls it, and a way to return results. Unlike high- level language procedures, assembly language procedures do not have parameter lists, so it's up to the programmer to devise a way for procedures to communicate. For example, if there are only a few input and output values, they can be placed in registers. The general issue of procedure communication is discussed in Chapter 14.
In addition to the required procedure syntax, it's a good idea to document a procedure so that anyone reading the program listing will know what the procedure does, where it gets its input, and where it delivers its output. In this book, we generally document procedures with a comment block like this:
; (describe what the procedure does)
; input: (where it receives information from
;         the calling program)
; output: (where it delivers results to
;          the calling program)
; uses: (a list of procedures that it calls)
To invoke a procedure, the CALL instruction is used. There are two kinds of procedure calls, direct and indirect. The syntax of a direct procedure call is
CALL name
where name is the name of a procedure. The syntax of an indirect procedure call is
CALL address_expression
where address_expression specifies a register or memory location containing the address of a procedure.
Executing a CALL instruction causes the following to happen:
- The return address to the calling program is saved on the stack. This is the offset of the next instruction after the CALL statement. The segment:offset of this instruction is in CS:IP at the time the call is executed.
- IP gets the offset address of the first instruction of the procedure. This transfers control to the procedure.
See Figures 8.4A and 8.4B.
To return from a procedure, the instruction
RET pop_value
is executed. The integer argument pop_value is optional. For a NEAR procedure, execution of RET causes the stack to be popped into IP. If a pop_value N is specified, it is added to SP, and thus has the effect of removing N additional bytes from the stack. CS:IP now contains the segment:offset of the return address, and control returns to the calling program. See Figures 8.5A and 8.5B.
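The effect of a NEAR CALL and RET on IP and SP can be modelled in a few lines of Python. This is a sketch only: the `cpu` dictionary and the functions `call` and `ret` are hypothetical, and the offsets 0003h and 0007h are illustrative values.

```python
def call(cpu, target, return_offset):
    """Model of a NEAR CALL: push the return offset, then load IP."""
    cpu["sp"] -= 2                             # SP is decreased by 2
    cpu["stack"][cpu["sp"]] = return_offset    # return address saved at SS:SP
    cpu["ip"] = target                         # control transfers to the procedure


def ret(cpu, pop_value=0):
    """Model of a NEAR RET: pop IP, then discard pop_value additional bytes."""
    cpu["ip"] = cpu["stack"].pop(cpu["sp"])    # the stack is popped into IP
    cpu["sp"] += 2 + pop_value                 # 2 for the pop, plus N if given


cpu = {"ip": 0x0000, "sp": 0x0100, "stack": {}}
call(cpu, target=0x0007, return_offset=0x0003)
after_call = (cpu["ip"], cpu["sp"])
ret(cpu)
after_ret = (cpu["ip"], cpu["sp"])
print(after_call, after_ret)   # (7, 254) (3, 256)
```

With a pop_value of 4, the same RET would leave SP four bytes higher, which is how a procedure can discard words that the caller pushed before the call.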
As an example, we will write a procedure for finding the product of two positive integers A and B by addition and bit shifting. This is one way unsigned multiplication may be implemented on the computer (in Chapter 9 we introduce the multiplication instructions).
Product = 0
REPEAT
  IF lsb of B is 1 (Recall lsb = least significant bit)
  THEN
    Product = Product + A
  END_IF
  Shift left A
  Shift right B
UNTIL B = 0
For example, if A = 111b = 7 and B = 1101b = 13
Product = 0
Since lsb of B is 1, Product = 0 + 111b = 111b
Shift left A: A = 1110b
Shift right B: B = 110b
Since lsb of B is 0, Product is unchanged
Shift left A: A = 11100b
Shift right B: B = 11b
Since lsb of B is 1
Product = 111b + 11100b = 100011b
Shift left A: A = 111000b
Shift right B: B = 1
Since lsb of B is 1
Product = 100011b + 111000b = 1011011b
Shift left A: A = 1110000b
Shift right B: B = 0
Since B = 0
Return Product = 1011011b = 91d
Note that we get the same answer by performing the usual decimal multiplication process on the binary numbers:
        111b
     x 1101b
    --------
        111
       000
      111
     111
    --------
    1011011b
In the following program, the algorithm is coded as a procedure MULTIPLY. The main program has no input or output; we will use DEBUG for the I/O.
1: TITLE PGM8_2: MULTIPLICATION BY ADD AND SHIFT
2: .MODEL SMALL
3: .STACK 100H
4: .CODE
5: MAIN PROC
6: ;execute in DEBUG. Place A in AX and B in BX
7:      CALL MULTIPLY
8: ;DX will contain the product
9:      MOV  AH,4CH
10:     INT  21H
11: MAIN ENDP
12: MULTIPLY PROC
13: ;multiplies two nos. A and B by shifting and addition
14: ;input: AX = A, BX = B. Nos. in range 0 - FFh
15: ;output: DX = product
16:     PUSH AX
17:     PUSH BX
18:     XOR  DX,DX       ;product = 0
19: REPEAT:
20: ;if B is odd
21:     TEST BX,1        ;is B odd?
22:     JZ   END_IF      ;no, even
23: ;then
24:     ADD  DX,AX       ;prod = prod + A
25: END_IF:
26:     SHL  AX,1        ;shift left A
27:     SHR  BX,1        ;shift right B
28: ;until
29:     JNZ  REPEAT
30:     POP  BX
31:     POP  AX
32:     RET
33: MULTIPLY ENDP
34:     END  MAIN
Procedure MULTIPLY receives its input A and B through registers AX and BX, respectively. Values are placed in these registers by the user inside the DEBUG program; the product is returned in DX. In order to avoid overflow, A and B are restricted to range from 0 to FFh.
A procedure usually begins by saving all the registers it uses on the stack and ends by restoring these registers. This is done because the calling program may have data stored in registers, and the actions of the procedure could cause unwanted side effects if the registers are not preserved. Even though it's not really necessary in this program, we illustrate this practice by pushing AX and BX on the stack in lines 16 and 17, and restoring them in lines 30 and 31. The registers are popped off the stack in the reverse order that they were pushed on.
After clearing DX, which will hold the product, the procedure enters a REPEAT loop (lines 19-29). At line 21, the procedure checks BX's least significant bit. If the lsb of BX is 1, then AX is added to the product in DX; if the lsb of BX is 0, the procedure skips to line 26. Here AX is shifted left, and BX is shifted right; the loop continues until BX = 0. The procedure exits with the product in DX.
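The REPEAT loop of MULTIPLY can be mirrored in Python to check a trace by hand. This sketch assumes 16-bit registers and uses the hypothetical name `multiply`; it keeps the test at the bottom of the loop, as the procedure does.

```python
def multiply(a, b):
    """Model of MULTIPLY: add A to the product when lsb of B is 1, then shift."""
    product = 0                              # XOR DX,DX
    while True:                              # REPEAT
        if b & 1:                            # TEST BX,1
            product = (product + a) & 0xFFFF # ADD DX,AX
        a = (a << 1) & 0xFFFF                # SHL AX,1
        b >>= 1                              # SHR BX,1
        if b == 0:                           # UNTIL B = 0 (JNZ REPEAT)
            break
    return product


print(multiply(7, 13))   # 91
```

With both inputs limited to FFh, the largest product is FFh x FFh = FE01h, which still fits in 16 bits, so the mask never discards anything in that range.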
After assembling and linking the program, we take it into DEBUG (in the following, the user's response appears in boldface):
C> DEBUG PGM8_2.EXE
DEBUG responds with its command prompt "- ". To get a listing of the program, we use the U (unassemble) command.
- U
177F:0000 E80400 CALL 0007
177F:0003 B44C MOV AH,4C
177F:0005 CD21 INT 21
177F:0007 50 PUSH AX
177F:0008 53 PUSH BX
177F:0009 33D2 XOR DX,DX
177F:000B F7C30100 TEST BX,0001
177F:000F 7402 JZ 0013
177F:0011 03D0 ADD DX,AX
177F:0013 D1E0 SHL AX,1
177F:0015 D1EB SHR BX,1
177F:0017 75F2 JNZ 000B
177F:0019 5B POP BX
177F:001A 58 POP AX
177F:001B C3 RET
177F:001C E3D1 JCXZ FFEF
177F:001E E38B JCXZ FFAB
The U command causes DEBUG to interpret the contents of memory as machine language instructions. The display gives the segment:offset of each instruction, the machine code, and the assembly code. All numbers are expressed in hex. From the first statement, CALL 0007, we can see that procedure MAIN extends from 0000 to 0005; procedure MULTIPLY begins at 0007 and ends at 001B with RET. The instructions after this are garbage.
Before entering the data, let's look at the registers.
- R
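AX=0000 BX=0000 CX=001C DX=0000 SP=0100 BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=0000 NV UP EI PL NZ NA PO NC
177F:0000 E80400 CALL 0007
The initial value of SP, 0100h, reflects the 100h bytes we reserved with .STACK 100H. To look at the empty stack, we dump memory with the D command:
- DSS:F0 FF
1781:00F0 00 00 00 00 00 00 6F 17-A4 13 07 00 6F 17 00 00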
The command DSS:F0 FF means to display the memory bytes from SS:F0 to SS:FF. This is the last 16 bytes in the stack segment. The contents of each byte is displayed as two hex digits. Because the stack is empty, everything in this display is garbage.
Before executing the program, we need to place the numbers A and B in AX and BX, respectively. We will use A = 7 and B = 13 = Dh. To enter A, we use the R command:
- RAX
AX 0000
:7
The command RAX means that we want to change the content of AX. DEBUG displays the current value (0000), followed by a colon, and waits for us to enter the new value. Similarly we can change the initial value of B in BX:
- RBX
BX 0000
:D
Now let's look at the registers again.
- R
AX=0007 BX=000D CX=001C DX=0000 SP=0100 BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=0000 NV UP EI PL NZ NA PO NC
177F:0000 E80400 CALL 0007
We see that AX and BX now contain the initial values.
To see the effect of the first instruction, CALL 0007, we use the T (trace) command. It will execute a single instruction and display the registers.
We notice two changes in the registers: (1) IP now contains 0007, the starting offset of procedure MULTIPLY; and (2) because the CALL instruction pushes the return address to procedure MAIN on the stack, SP has decreased from 0100h to 00FEh. Here are the last 16 bytes of the stack segment again:
The return address is 0003, but is displayed as 03 00. This is because DEBUG displays the low byte of a word before the high byte.
The first three instructions of procedure MULTIPLY push AX and BX onto the stack, and clear DX. To see the effect, we use the G (go) command. The syntax is
G offset
It causes the program to execute instructions and stop at the specified offset. From the unassembled listing given earlier, we can see that the next instruction after XOR DX,DX is at offset 000Bh.
We see that the two PUSHes have caused SP to decrease by 4, from 00FEh to 00FAh. Now the stack looks like this:
The stack now contains three words; the values of BX (000D), AX (0007), and the return address (0003). These are shown as 0D 00 07 00 03 00.
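The byte order in the dump follows from the way the 8086 stores a word: low byte at the lower address. A short Python sketch using the standard `struct` module shows the same layout; the helper name `dump_words` is hypothetical.

```python
import struct


def dump_words(words):
    """Return the bytes of each 16-bit word in memory order (low byte first)."""
    raw = b"".join(struct.pack("<H", w) for w in words)   # "<H" = little-endian word
    return " ".join(f"{byte:02X}" for byte in raw)


# BX, AX, and the return address, from the top of the stack upward
print(dump_words([0x000D, 0x0007, 0x0003]))   # 0D 00 07 00 03 00
```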
Now let's watch the procedure in action. To do so, we will execute to the end of the REPEAT loop at offset 0017h:
- G17
AX=000E BX=0006 CX=001C DX=0007 SP=00FA BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=0017 NV UP EI PL NZ AC PE CY
177F:0017 75F2 JNZ 000B
Because the initial value of B in BX was 0Dh = 1101b, the lsb of BX is 1, so AX is added to the product in DX, giving 111b = 0007h. AX is shifted left, which doubles A to 14d = 000Eh, and BX is shifted right, which halves BX (and rounds down) to 0006h = 110b.
To get to the top of the loop, we'll use the T command again:
- T
AX=000E BX=0006 CX=001C DX=0007 SP=00FA BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=000B NV UP EI PL NZ AC PE CY
177F:000B F7C30100 TEST BX,0001
and execute again to the bottom:
- G17
AX=001C BX=0003 CX=001C DX=0007 SP=00FA BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=0017 NV UP EI PL NZ AC PE NC
177F:0017 75F2 JNZ 000B
Because BX = 0006h = 110b, the lsb of BX is 0, so the product in DX stays the same. AX is shifted left to 11100b = 1Ch and BX is shifted right to 11b = 3h.
After two more trips through the loop, the product is in DX. Watch AX, BX, and DX change:
- T
AX=001C BX=0003 CX=001C DX=0007 SP=00FA BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=000B NV UP EI PL NZ AC PE NC
177F:000B F7C30100 TEST BX,0001
- G17
AX=0038 BX=0001 CX=001C DX=0023 SP=00FA BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=0017 NV UP EI PL NZ AC PO CY
177F:0017 75F2 JNZ 000B
- T
AX=0038 BX=0001 CX=001C DX=0023 SP=00FA BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=000B NV UP EI PL NZ AC PO CY
177F:000B F7C30100 TEST BX,0001
- G17
AX=0070 BX=0000 CX=001C DX=005B SP=00FA BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=0017 NV UP EI PL ZR AC PE CY
177F:0017 75F2 JNZ 000B
The last right shift made BX = 0 and set ZF (displayed as ZR), so the loop ends.
To terminate the procedure, we trace through the JNZ and the two POP instructions:
- T
AX=0070 BX=0000 CX=001C DX=005B SP=00FA BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=0019 NV UP EI PL ZR AC PE CY
177F:0019 5B POP BX
- T
AX=0070 BX=000D CX=001C DX=005B SP=00FC BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=001A NV UP EI PL ZR AC PE CY
177F:001A 58 POP AX
- T
AX=0007 BX=000D CX=001C DX=005B SP=00FE BP=0000 SI=0000 DI=0000
DS=176F ES=176F SS=1781 CS=177F IP=001B NV UP EI PL ZR AC PE CY
177F:001B C3 RET
The two POPs have restored AX and BX to their original values. Let's look at the stack:
- DSS:F0 FF
1781:00F0 70 00 70 CO 07 00 00 00- 1B 00 7F 17 A4 13 03 00
The values 000D and 0007 are no longer in the display. This is not a result of the POP instruction; it's because DEBUG is also using the stack. Finally, we trace the RET:
RET causes IP to get 0003, the return address to MAIN. SP goes back to 100h, its original value. To finish executing the program, we just type G:
and we exit DEBUG by typing Q (quit).
- The stack is a temporary storage area used by both application programs and the operating system.
- The stack is a last-in, first-out data structure. SS:SP points to the top of the stack.
- The stack-altering instructions are PUSH, PUSHF, POP, and POPF. PUSH adds a new top word to the stack, and POP removes the top word. PUSHF saves the FLAGS register on the stack and POPF puts the stack top into the FLAGS register.
- SP decreases by 2 when PUSH or PUSHF is executed, and it increases by 2 when POP or POPF is executed. SP is initialized to the first word after the stack segment when the program is loaded.
- A procedure is a subprogram. Assembly language programs are typically broken into several procedures. One of the procedures is the main procedure, which contains the entry point to the program. Procedures may call other procedures, or themselves.
- There are two kinds of procedures, NEAR and FAR. A NEAR procedure is in the same code segment as the calling program, and a FAR procedure is in a different segment.
- The CALL instruction is used to invoke a procedure. For a NEAR procedure, execution of CALL causes the offset address of the next instruction in line after the CALL to be saved on the stack, and IP gets the offset of the first instruction in the procedure.
- Procedures end with a RET instruction. Its execution causes the stack to be popped into IP, and control returns to the calling program. In order for the return address to be accessible, the procedure must ensure that it is at the top of the stack when RET is executed.
- In assembly language, procedures often pass data through registers.
direct procedure call      A procedure call of form CALL name
FAR procedure              A procedure that can be called by procedures residing in any segment
indirect procedure call    A procedure call of form CALL addr_exp
NEAR procedure             A procedure that can only be called by another procedure residing in the same segment
top of the stack           The last word of data added to the stack
New Instructions
CALL POPF PUSHF
POP PUSH RET
- Suppose the stack segment is declared as follows:
.STACK 100h
a. What is the hex contents of SP when the program begins?
b. What is the maximum hex number of words that the stack may contain?
- Suppose that AX = 1234h, BX = 5678h, CX = 9ABCh, and SP = 100h. Give the contents of AX, BX, CX, and SP after executing the following instructions:
PUSH AX
PUSH BX
XCHG AX,CX
POP CX
PUSH AX
POP BX
- When the stack has completely filled the stack area, SP = 0. If another word is pushed onto the stack, what would happen to SP? What might happen to the program?
- Suppose a program contains the lines
CALL PROC1
MOV AX,BX
and (a) instruction MOV AX,BX is stored at 08FD:0203h, (b) PROC1 is a NEAR procedure that begins at 08FD:0300h, (c) SP = 010Ah.
What are the contents of IP and SP just after CALL PROC1 is executed? What word is on top of the stack?
- Suppose SP = 0200h and the top of the stack = 012Ah. What are the contents of IP and SP
a. after RET is executed, where RET appears in a NEAR procedure?
b. after RET 4 is executed, where RET appears in a NEAR procedure?
- Write some code to
a. place the top of the stack into AX, without changing the stack contents.
b. place the word that is below the stack top into CX, without changing the stack contents. You may use AX.
c. exchange the top two words on the stack. You may use AX and BX.
- Procedures are supposed to return the stack to the calling program in the same condition that they received it. However, it may be useful to have procedures that alter the stack. For example, suppose we would like to write a NEAR procedure SAVE_REGS that saves BX, CX, DX, SI, DI, BP, DS, and ES on the stack. After pushing these registers, the stack would look like this:
ES content
DX content
CX content
BX content
return_address (offset)
Now, unfortunately, SAVE_REGS can't return to the calling program, because the return address is not at the top of the stack.
a. Devise a way to implement a procedure SAVE_REGS that gets around this problem (you may use AX to do this).
b. Write a procedure RESTORE_REGS that restores the registers that SAVE_REGS has saved.
-
Write a program that lets the user type some text, consisting of words separated by blanks, ending with a carriage return, and displays the text in the same word order as entered, but with the letters in each word reversed. For example, "this is a test" becomes "siht si a tset". Hint: modify program PGM8_2.ASM in section 8.3.
-
A problem in elementary algebra is to decide if an expression containing several kinds of brackets, such as [, ], {, }, (, ), is correctly bracketed. This is the case if (a) there are the same number of left and right brackets of each kind, and (b) when a right bracket appears, the most recent preceding unmatched left bracket is of the same type.
Correct bracketing can be decided by using a stack. The expression is scanned left to right. When a left bracket is encountered, it is pushed onto the stack. When a right bracket is encountered, the stack is popped (if the stack is empty, there are too many right brackets) and the brackets are compared. If they are of the same type, the scanning continues. If there is a mismatch, the expression is incorrectly bracketed. At the end of the expression, if the stack is empty the expression is correctly bracketed. If the stack is not empty, there are too many left brackets.
Write a program that lets the user type in an algebraic expression, ending with a carriage return, that contains round (parentheses), square, and curly brackets. As the expression is being typed in, the program evaluates each character. If at any point the expression is incorrectly bracketed (too many right brackets or a mismatch between left and right brackets), the program tells the user to start over. After the carriage return is typed, if the expression is correct, the program displays "expression is correct." If not, the program displays "too many left brackets". In both cases, the program asks the user if he or she wants to continue. If the user types 'Y', the program runs again.
Your program does not need to store the input string, only check it for correctness.
Sample execution:
ENTER AN ALGEBRAIC EXPRESSION:
(a + b)) TOO MANY RIGHT BRACKETS. BEGIN AGAIN!
ENTER AN ALGEBRAIC EXPRESSION:
(a + {b + c} x d)
EXPRESSION IS CORRECT
TYPE Y IF YOU WANT TO CONTINUE:Y
ENTER AN ALGEBRAIC EXPRESSION:
[a + b x (c - d) - e) BRACKET MISMATCH. BEGIN AGAIN!
ENTER AN ALGEBRAIC EXPRESSION:
((a + [b - {c x (d - e)}] + f)
TOO MANY LEFT BRACKETS. BEGIN AGAIN!
ENTER AN ALGEBRAIC EXPRESSION:
I'VE HAD ENOUGH
EXPRESSION IS CORRECT
TYPE Y IF YOU WANT TO CONTINUE:N
- The following method can be used to generate random numbers in the range 1 to 32767.
Start with any number in this range.
Shift left once.
Replace bit 0 by the XOR of bits 14 and 15.
Clear bit 15.
Write the following procedures:
a. A procedure READ that lets the user enter a binary number and stores it in AX. You may use the code for binary input given in section 7.4.
b. A procedure RANDOM that receives a number in AX and returns a random number in AX.
c. A procedure WRITE that displays AX in binary. You may use the algorithm given in section 7.4.
Write a program that displays a '?', calls READ to read a binary number, and calls RANDOM and WRITE to compute and display 100 random numbers. The numbers should be displayed four per line, with four blanks separating the numbers.
In Chapter 7, we saw how to do multiplication and division by shifting the bits in a byte or word. Left and right shifts can be used for multiplying and dividing by powers of 2. In this chapter, we introduce instructions for multiplying and dividing any numbers.
The process of multiplication and division is different for signed and unsigned numbers, so there are different instructions for signed and unsigned multiplication and division. Also, these instructions have byte and word forms. Sections 9.1 through 9.4 cover the details.
One of the most useful applications of multiplication and division is to implement decimal input and output. In section 9.5, we write procedures to carry out these operations. This application greatly extends our program's I/O capability.
9.1 MUL and IMUL
In binary multiplication, signed and unsigned numbers must be treated differently. For example, suppose we want to multiply the eight-bit numbers 10000000 and 11111111. Interpreted as unsigned numbers, they represent 128 and 255, respectively. The product is 32,640 = 0111111110000000. However, taken as signed numbers, they represent -128 and -1, respectively, and the product is 128 = 0000000010000000.
Because signed and unsigned multiplication lead to different results, there are two multiplication instructions: MUL (multiply) for unsigned multiplication and IMUL (integer multiply) for signed multiplication. These instructions multiply bytes or words. If two bytes are multiplied, the product is a word (16 bits). If two words are multiplied, the product is a doubleword (32 bits). The syntax of these instructions is
MUL source and IMUL source
For byte multiplication, one number is contained in the source and the other is assumed to be in AL. The 16-bit product will be in AX. The source may be a byte register or memory byte, but not a constant.
For word multiplication, one number is contained in the source and the other is assumed to be in AX. The most significant 16 bits of the doubleword product will be in DX, and the least significant 16 bits will be in AX (we sometimes write this as DX:AX). The source may be a 16-bit register or memory word, but not a constant.
For multiplication of positive numbers (0 in the most significant bit), MUL and IMUL give the same result.
The effect of MUL/IMUL on the status flags is as follows:
SF, ZF, AF, PF: undefined
CF/OF:
After MUL, CF/OF
= 0 if the upper half of the result is zero.
= 1 otherwise.
After IMUL, CF/OF
= 0 if the upper half of the result is the sign extension of the lower half (this means that the bits of the upper half are the same as the sign bit of the lower half).
= 1 otherwise.
For both MUL and IMUL, CF/OF = 1 means that the product is too big to fit in the lower half of the destination (AL for byte multiplication, AX for word multiplication).
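The flag rule can be put to work directly. The sketch below, which is not a listing from the text, multiplies two unsigned words and uses CF to record whether the product fits in AX. The variable names WORD1, WORD2, and FITS and the label @DONE are assumptions made for illustration; the operand values are those of Example 9.4.

```asm
TITLE SKETCH: TESTING CF/OF AFTER MUL
.MODEL SMALL
.STACK 100H
.DATA
WORD1   DW 0100H
WORD2   DW 0FFFFH
FITS    DB ?            ;1 if product fits in AX, 0 if not
.CODE
MAIN PROC
      MOV  AX,@DATA
      MOV  DS,AX        ;initialize DS
      MOV  FITS,1       ;assume the product fits
      MOV  AX,WORD1
      MUL  WORD2        ;DX:AX = 0100h x FFFFh = 00FFFF00h
      JNC  @DONE        ;CF/OF = 0: DX = 0, AX holds whole product
      MOV  FITS,0       ;CF/OF = 1: significant bits are in DX
@DONE:
      MOV  AH,4CH
      INT  21H          ;DOS exit
MAIN ENDP
      END MAIN
```

With these operands DX = 00FFh after MUL, so CF/OF = 1 and FITS ends up 0. For signed operands, the same test would follow IMUL and could use JNO instead, since CF and OF are set together.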
To illustrate MUL and IMUL, we will do several examples. Because hex multiplication is usually difficult to do, we'll predict the product by converting the hex values of multiplier and multiplicand to decimal, doing decimal multiplication, and converting the product back to hex.
Example 9.1 Suppose AX contains 1 and BX contains FFFFh:
Instruction Decimal product Hex product DX AX CF/OF
MUL BX 65535 0000FFFF 0000 FFFF 0
IMUL BX -1 FFFFFFFF FFFF FFFF 0
For MUL, DX = 0, so CF/OF = 0.
For IMUL, the signed interpretation of BX is -1, and the product is also -1. In 32 bits, this is FFFFFFFFh. CF/OF = 0 because DX is the sign extension of AX.
Example 9.2 Suppose AX contains FFFFh and BX contains FFFFh:
Instruction Decimal product Hex product DX AX CF/OF
MUL BX 4294836225 FFFE0001 FFFE 0001 1
IMUL BX 1 00000001 0000 0001 0
For MUL, CF/OF = 1 because DX is not 0. This reflects the fact that the product FFFE0001h is too big to fit in AX.
For IMUL, AX and BX both contain -1, so the product is 1. DX has the sign extension of AX, so CF/OF = 0.
Example 9.3 Suppose AX contains 0FFFh:
Instruction Decimal product Hex product DX AX CF/OF
MUL AX 16769025 00FFE001 00FF E001 1
IMUL AX 16769025 00FFE001 00FF E001 1
Because the msb of AX is 0, both MUL and IMUL give the same product. Because the product is too big to fit in AX, CF/OF = 1.
Example 9.4 Suppose AX contains 0100h and CX contains FFFFh:
Instruction Decimal product Hex product DX AX CF/OF
MUL CX 16776960 00FFFF00 00FF FF00 1
IMUL CX -256 FFFFFF00 FFFF FF00 0
For MUL, the product 00FFFF00h is obtained by attaching two hex zeros to the source value FFFFh. Because the product is too big to fit in AX, CF/OF = 1.
For IMUL, AX contains 256 and CX contains -1, so the product is -256, which may be expressed as FF00h in 16 bits. DX has the sign extension of AX, so CF/OF = 0.
Example 9.5 Suppose AL contains 80h and BL contains FFh:
Instruction Decimal product Hex product AH AL CF/OF
MUL BL 32640 7F80 7F 80 1
IMUL BL 128 0080 00 80 1
For byte multiplication, the 16-bit product is contained in AX.
For MUL, the product is 7F80h. Because the high eight bits are not 0, CF/OF = 1.
For IMUL, we have a curious situation. 80h = -128, FFh = -1, so the product is 128 = 0080h. AH does not have the sign extension of AL, so CF/OF = 1. This reflects the fact that AL does not contain the correct answer in a signed sense, because the signed decimal interpretation of 80h is -128.
To get used to programming with MUL and IMUL, we'll show how some simple operations can be carried out with these instructions.
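Example 9.6 Translate the high-level language assignment statement A = 5 x A - 12 x B into assembly code. Let A and B be word variables, and suppose there is no overflow. Use IMUL for multiplication.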
MOV AX,5 ;AX = 5
IMUL A ;AX = 5 x A
MOV A,AX ;A = 5 x A
MOV AX,12 ;AX = 12
IMUL B ;AX = 12 x B
SUB A,AX ;A = 5 x A - 12 x B
Example 9.7 Write a procedure FACTORIAL that will compute N! for a positive integer N. The procedure should receive N in CX and return N! in AX. Suppose that overflow does not occur.
Solution: The definition of N! is
N! = 1 if N = 1
N! = N x (N - 1) x (N - 2) x ... x 1 if N > 1
Here is an algorithm:
product = 1
term = N
FOR N times DO
product = product x term
term = term - 1
ENDFOR
It can be coded as follows:
FACTORIAL PROC
;computes N!
;input: CX = N
;output: AX = N!
MOV AX,1 ;AX holds product
TOP:
MUL CX ;product = product x term
LOOP TOP
RET
FACTORIAL ENDP
Here CX is both loop counter and term; the LOOP instruction automatically decrements it on each iteration through the loop. We assume the product does not overflow 16 bits.
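To try the procedure, it can be dropped into a short driver and run under DEBUG. The driver below is a sketch, not a listing from the text; the choice N = 5 is an assumption for illustration. FACTORIAL is repeated so that the file assembles on its own.

```asm
TITLE SKETCH: CALLING FACTORIAL
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
      MOV  CX,5         ;N = 5
      CALL FACTORIAL    ;AX = 5! = 120 = 0078h
      MOV  AH,4CH       ;(this overwrites AH)
      INT  21H          ;DOS exit
MAIN ENDP

FACTORIAL PROC
;computes N!
;input: CX = N
;output: AX = N!
      MOV  AX,1         ;AX holds product
TOP:
      MUL  CX           ;product = product x term
      LOOP TOP
      RET
FACTORIAL ENDP
      END MAIN
```

Two things are worth noticing. Because MUL CX is a word multiplication, it also writes DX, so a caller that needs DX should save it first. And since 8! = 40320 fits in 16 bits but 9! = 362880 does not, the no-overflow assumption holds only for N up to 8.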
When division is performed, we obtain two results, the quotient and the remainder. As with multiplication, there are separate instructions for unsigned and signed division; DIV (divide) is used for unsigned division and IDIV (integer divide) for signed division. The syntax is
DIV divisor
and
IDIV divisor
These instructions divide 8 (or 16) bits into 16 (or 32) bits. The quotient and remainder have the same size as the divisor.
In the byte form, the divisor is an 8-bit register or memory byte. The 16-bit dividend is assumed to be in AX. After division, the 8-bit quotient is in AL and the 8-bit remainder is in AH. The divisor may not be a constant.
In the word form, the divisor is a 16-bit register or memory word. The 32-bit dividend is assumed to be in DX:AX. After division, the 16-bit quotient is in AX and the 16-bit remainder is in DX. The divisor may not be a constant.
For signed division, the remainder has the same sign as the dividend. If both dividend and divisor are positive, DIV and IDIV give the same result.
The effect of DIV/IDIV on the flags is that all status flags are undefined.
It is possible that the quotient will be too big to fit in the specified destination (AL or AX). This can happen if the divisor is much smaller than the dividend. When this happens, the program terminates (as shown later) and the system displays the message "Divide Overflow".
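For unsigned word division, divide overflow can be predicted before the DIV is executed: the quotient fits in AX exactly when DX is less than the divisor (a zero divisor always fails this test). The following is a sketch of such a guard, not a listing from the text. The label @SKIP is an assumption, and the operands are FFFFFFFBh divided by 2.

```asm
TITLE SKETCH: AVOIDING DIVIDE OVERFLOW
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
      MOV  DX,0FFFFH
      MOV  AX,0FFFBH    ;dividend DX:AX = FFFFFFFBh
      MOV  BX,2         ;divisor
      CMP  DX,BX        ;quotient fits in AX only if DX < BX
      JAE  @SKIP        ;DX >= BX (unsigned): DIV would overflow
      DIV  BX           ;safe: AX = quotient, DX = remainder
@SKIP:
      MOV  AH,4CH
      INT  21H          ;DOS exit
MAIN ENDP
      END MAIN
```

Here DX = FFFFh is not below BX = 2, so the DIV is skipped and the program exits normally instead of being terminated by the system.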
Example 9.8 Suppose DX contains 0000h, AX contains 0005h, and BX contains 0002h.
Dividing 5 by 2 yields a quotient of 2 and a remainder of 1. Because both dividend and divisor are positive, DIV and IDIV give the same results.
Example 9.9 Suppose DX contains 0000h, AX contains 0005h, and BX contains FFFEh.
Instruction Decimal quotient Decimal remainder AX DX
DIV BX 0 5 0000 0005
IDIV BX -2 1 FFFE 0001
For DIV, the dividend is 5 and the divisor is FFFEh = 65534; 5 divided by 65534 yields a quotient of 0 and a remainder of 5.
For IDIV, the dividend is 5 and the divisor is FFFEh = -2; 5 divided by -2 gives a quotient of -2 and a remainder of 1.
Example 9.10 Suppose DX contains FFFFh, AX contains FFFBh, and BX contains 0002h.
Instruction Decimal quotient Decimal remainder AX DX
IDIV BX -2 -1 FFFE FFFF
DIV BX DIVIDE OVERFLOW
For IDIV, DX:AX = FFFFFFFBh = -5 and BX = 2. -5 divided by 2 gives a quotient of -2 = FFFEh and a remainder of -1 = FFFFh.
For DIV, the dividend DX:AX = FFFFFFFBh = 4294967291 and the divisor = 2. The actual quotient is 2147483645 = 7FFFFFFDh. This is too big to fit in AX, so the computer prints DIVIDE OVERFLOW and the program terminates. This shows what can happen if the divisor is a lot smaller than the dividend.
Example 9.11 Suppose AX contains 00FBh and BL contains FFh.
Instruction Decimal quotient Decimal remainder AH AL
DIV BL 0 251 FB 00
IDIV BL DIVIDE OVERFLOW
For byte division, the dividend is in AX; the quotient is in AL and the remainder in AH.
For DIV, the dividend is 00FBh = 251 and the divisor is FFh = 255. Dividing 251 by 255 yields a quotient of 0 and a remainder of 251 = FBh.
For IDIV, the dividend is 00FBh = 251 and the divisor is FFh = -1. Dividing 251 by -1 yields a quotient of -251, which is too big to fit in AL, so the message DIVIDE OVERFLOW is printed.
In word division, the dividend is in DX:AX even if the actual dividend will fit in AX. In this case DX should be prepared as follows:
-
For DIV, DX should be cleared.
-
For IDIV, DX should be made the sign extension of AX. The instruction CWD (convert word to doubleword) will do the extension.
Example 9.12 Divide -1250 by 7:
MOV AX,-1250 ;AX gets dividend
CWD ;extend sign to DX
MOV BX,7 ;BX has divisor
IDIV BX ;AX gets quotient, DX has remainder
In byte division, the dividend is in AX. If the actual dividend is a byte, then AH should be prepared as follows:
-
For DIV, AH should be cleared.
-
For IDIV, AH should be the sign extension of AL. The instruction CBW (convert byte to word) will do the extension.
Example 9.13 Divide the signed value of the byte variable XBYTE by -7.
MOV AL,XBYTE ;AL has dividend
CBW ;extend sign to AH
MOV BL,-7 ;BL has divisor
IDIV BL ;AL has quotient, AH has remainder
There is no effect of CBW and CWD on the flags.
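The two ways of preparing AH can be compared side by side. The sketch below, which is not a listing from the text, divides the same byte by 7 twice, once as an unsigned value and once as a signed value. The initial value FBh for XBYTE is an assumption chosen so that the two interpretations differ.

```asm
TITLE SKETCH: PREPARING AH FOR DIV AND IDIV
.MODEL SMALL
.STACK 100H
.DATA
XBYTE   DB 0FBH         ;251 unsigned, -5 signed
.CODE
MAIN PROC
      MOV  AX,@DATA
      MOV  DS,AX        ;initialize DS
;unsigned division: clear AH
      MOV  AL,XBYTE
      XOR  AH,AH        ;AX = 00FBh = 251
      MOV  BL,7
      DIV  BL           ;AL = 35 (quotient), AH = 6 (remainder)
;signed division: extend sign of AL into AH
      MOV  AL,XBYTE
      CBW               ;AX = FFFBh = -5
      MOV  BL,7
      IDIV BL           ;AL = 0 (quotient), AH = FBh = -5 (remainder)
      MOV  AH,4CH
      INT  21H          ;DOS exit
MAIN ENDP
      END MAIN
```

In the signed case the remainder is -5, which has the same sign as the dividend.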
Even though the computer represents everything in binary, it's more convenient for the user to see input and output expressed in decimal. In this section, we write procedures for handling decimal I/O.
On input, if we type 21543, for example, then we are actually typing a character string, which must be converted internally to the binary equivalent of the decimal integer 21543. Conversely on output, the binary contents of a register or memory location must be converted to a character string representing a decimal integer before being printed.
We will write a procedure OUTDEC to print the contents of AX as a signed decimal integer. If AX >= 0, OUTDEC will print the contents in decimal; if AX < 0, OUTDEC will print a minus sign, replace AX by -AX (so that AX now contains a positive number), and print the contents in decimal. Thus in either case, the problem comes down to printing the decimal equivalent of a positive binary number. Here is the algorithm:
1. IF AX < 0 /* AX holds output value */
2. THEN
3. print a minus sign
4. replace AX by its two's complement
5. END_IF
6. Get the digits in AX's decimal representation
7. Convert these digits to characters and print them
To see what line 6 entails, suppose the content of AX, expressed in decimal, is 24618. To get the digits in the decimal representation, we can proceed as follows:
Divide 24618 by 10. Quotient = 2461, remainder = 8
Divide 2461 by 10. Quotient = 246, remainder = 1
Divide 246 by 10. Quotient = 24, remainder = 6
Divide 24 by 10. Quotient = 2, remainder = 4
Divide 2 by 10. Quotient = 0, remainder = 2
Thus, the digits we want appear as remainders after repeated division by 10. However, they appear in reverse order; to turn them around, we can save them on the stack. Here's how line 6 breaks down:
count = 0 /* will count decimal digits */
REPEAT
divide quotient by 10
push remainder on the stack
count = count + 1
UNTIL quotient = 0
where the initial value of quotient is the original contents of AX.
Once the digits are on the stack, all we have to do is pop them off, convert them to characters, and print them. Line 7 may be expressed as follows:
FOR count times DO
pop a digit from the stack
convert it to a character
output the character
END_FOR
Now we can code the procedure as follows:
1: OUTDEC PROC
2: ;prints AX as a signed decimal integer
3: ;input: AX
4: ;output: none
5: PUSH AX ; save registers
6: PUSH BX
7: PUSH CX
8: PUSH DX
9: ;if AX < 0
10: OR AX,AX ;AX < 0?
11: JGE @END_IF1 ;NO, >= 0
12: ;then
13: PUSH AX ; save number
14: MOV DL, '-' ; get '-'
15: MOV AH, 2 ; print char function
16: INT 21H ; print '-'
17: POP AX ; get AX back
18: NEG AX ; AX = - AX
19: @END_IF1:
20: ; get decimal digits
21: XOR CX, CX ; CX counts digits
22: MOV BX, 10D ; BX has divisor
23: @REPEAT1:
24: XOR DX, DX ; prepare high word of dividend
25: DIV BX ; AX = quotient, DX = remainder
26: PUSH DX ; save remainder on stack
27: INC CX ; count = count + 1
28: ; until
29: OR AX, AX ; quotient = 0?
30: JNE @REPEAT1 ; no, keep going
31: ; convert digits to characters and print
32: MOV AH, 2 ; print char function
33: ; for count times do
34: @PRINT_LOOP:
35: POP DX ; digit in DL
36: OR DL, 30H ; convert to character
37: INT 21H ; print digit
38: LOOP @PRINT_LOOP ; loop until done
39: ; end_for
40: POP DX ; restore registers
41: POP CX
42: POP BX
43: POP AX
44: RET
45: OUTDEC ENDP
After saving the registers, at line 10 the sign of AX is examined by ORing AX with itself. If AX >= 0, the program jumps to line 19; if AX < 0, a minus sign is printed and AX is replaced by its two's complement. In either case, at line 19, AX will contain a positive number.
At line 21, OUTDEC prepares for division. Because division by a constant is illegal, we must put the divisor 10 in a register.
The REPEAT loop in lines 23- 30 will get the digits and put them on the stack. Because we'll be doing unsigned division, DX is cleared. After division, the quotient will be in AX and the remainder in DX (actually it is in DL, because the remainder is between 0 and 9). At line 29, AX is tested for 0 by ORing it with itself; repeated division by 10 guarantees a zero quotient eventually.
The FOR loop in lines 34- 38 gets the digits from the stack and prints them. Before a digit is printed, it must first be converted to an ASCII character (line 36).
We can verify OUTDEC by placing it inside a short program and running the program inside DEBUG. To insert OUTDEC into the program without having to type it in, we use the INCLUDE pseudo-op. It has the form
INCLUDE filespec
where filespec identifies a file (with optional drive and path). For example, the file containing OUTDEC is PGM9_1.ASM. We could use
INCLUDE A:PGM9_1.ASM
When MASM encounters this line during assembly, it retrieves file PGM9_1.ASM from the disk in drive A and inserts it into the program at the position of the INCLUDE directive. This file is on the Student Data Disk that comes with this book.
Here is the testing program:
TITLE PGM9_2: DECIMAL OUTPUT
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
CALL OUTDEC
MOV AH,4CH
INT 21H ;DOS exit
MAIN ENDP
INCLUDE A:PGM9_1.ASM
END MAIN
To test the program, we'll enter DEBUG and run the program twice, first for AX = -25487 = 9C71h and then for AX = 654 = 28Eh:
C>DEBUG PGM9_2.EXE
-RAX
AX 0000
:9C71
-G
-25487 (first output)
Program terminated normally
-RAX
AX 9C71
:28E
-G
654 (second output)
Note that after the first run, DEBUG automatically resets IP to the beginning of the program.
To do decimal input, we need to convert a string of ASCII digits to the binary representation of a decimal integer. We will write a procedure INDEC to do this.
In procedure OUTDEC, to output the contents of AX in decimal we repeatedly divided AX by 10. For INDEC we need repeated multiplication by 10. The basic idea is the following:
total = 0
read an ASCII digit
REPEAT
convert character to a binary value
total = 10 x total + value
read a character
UNTIL character is a carriage return
For example, an input of 123 is processed as follows:
total = 0
read '1'
convert '1' to 1
total = 10 x 0 + 1 = 1
read '2'
convert '2' to 2
total = 10 x 1 + 2 = 12
read '3'
convert '3' to 3
total = 10 x 12 + 3 = 123
We will design INDEC so that it can handle signed decimal integers in the range -32768 to 32767. The program prints a question mark, and lets the user enter an optional sign, followed by a string of digits, followed by a carriage return. If the user enters a character outside the range "0" ... "9", the procedure goes to a new line and starts over. With these added requirements, the preceding algorithm becomes the following:
Print a question mark
total = 0
negative = false
Read a character
CASE character OF
'-': negative = true
read a character
'+': read a character
END_CASE
REPEAT
IF character is not between '0' and '9'
THEN
go to beginning
ELSE
convert character to a binary value
total = 10 x total + value
END_IF
read a character
UNTIL character is a carriage return
IF negative = true
THEN
total = -total
END_IF
Note: A jump like this is not really "structured programming." Sometimes it's necessary to violate structure rules for the sake of efficiency; for example, when error conditions occur.
1: INDEC PROC
2: ;reads a number in range - 32768 to 32767
3: ;input: none
4: ;output: AX = binary equivalent of number
5: PUSH BX ; save registers used
6: PUSH CX
7: PUSH DX
8: ;print prompt
9: @BEGIN:
10: MOV AH, 2
11: MOV DL, '?'
12: INT 21H ; print '?'
13: ;total = 0
14: XOR BX, BX ; BX holds total
15: ;negative = false
16: XOR CX, CX ; CX holds sign
17: ; read a character
18: MOV AH, 1
19: INT 21H ; character in AL
20: ; case character of
21: CMP AL, '-' ; minus sign?
22: JE @MINUS ; yes, set sign
23: CMP AL, '+' ; plus sign
24: JE @PLUS ; yes, get another character
25: JMP @REPEAT2 ; start processing characters
26: @MINUS:
27: MOV CX, 1 ; negative = true
28: @PLUS:
29: INT 21H ; read a character
30: ; end_case
31: @REPEAT2:
32: ; if character is between '0' and '9'
33: CMP AL, '0' ; character >= '0'?
34: JNGE @NOT_DIGIT ; illegal character
35: CMP AL, '9' ; character <= '9'?
36: JNLE @NOT_DIGIT ; no, illegal character
37: ; then convert character to a digit
38: AND AX, 000FH ; convert to digit
39: PUSH AX ; save on stack
40: ; total = total x 10 + digit
41: MOV AX, 10 ; get 10
42: MUL BX ; AX = total x 10
43: POP BX ; retrieve digit
44: ADD BX, AX ; total = total x 10 + digit
45: ; read a character
46: MOV AH, 1
47: INT 21H
48: CMP AL, 0DH ; carriage return?
49: JNE @REPEAT2 ; no, keep going
50: ; until CR
51: MOV AX, BX ; store number in AX
52: ; if negative
53: OR CX, CX ; negative number
54: JE @EXIT ; no, exit
55: ; then
56: NEG AX ; yes, negate
57: ; end_if
58: @EXIT:
59: POP DX ; restore registers
60: POP CX
61: POP BX
62: RET ; and return
63: ; here if illegal character entered
64: @NOT_DIGIT:
65: MOV AH, 2 ; move cursor to a new line
66: MOV DL, 0DH
67: INT 21H
68: MOV DL, 0AH
69: INT 21H
70: JMP @BEGIN ; go to beginning
71: INDEC ENDP
The procedure begins by saving the registers and printing a “?”. BX holds the total; in line 14, it is cleared.
CX is used to keep track of the sign; 0 means a positive number and 1 means negative. We initially assume the number is positive, so CX is cleared at line 16.
The first character is read at lines 18 and 19. It could be "+", "-", or a digit. If it’s a sign, CX is adjusted if necessary and another character is read (line 29). Presumably this next character will be a digit.
At line 31, INDEC enters the REPEAT loop, which processes the current character and reads another one, until a carriage return is typed.
At lines 33- 36, INDEC checks to see if the current character is in fact a digit. If not, the procedure jumps to label @NOT_DIGIT (line 64), moves the cursor to a new line, and jumps to @BEGIN. This means that the user can’t escape from the procedure without entering a legitimate number.
If the current character in AL is a decimal digit, it is converted to a binary value (line 38). Then the value is saved on the stack (line 39), because AX is used when the total is multiplied by 10.
In lines 41 and 42, the total in BX is multiplied by 10. The product will be in DX:AX; however, DX will contain 0 unless the number is out of range (more about this later). At line 43, the value saved is popped from the stack and 10 times total is added to it.
At line 51, INDEC exits the REPEAT loop with the number in BX. After moving it to AX, INDEC checks the sign in CX; if CX contains 1, AX is negated before the procedure exits.
We can test INDEC by creating a program that uses INDEC for input and OUTDEC for output.
Program Listing PGM9_4.ASM
TITLE PGM9_4: DECIMAL I/O
.MODEL SMALL
.STACK
.CODE
MAIN PROC
;input a number
CALL INDEC ;number in AX
PUSH AX ;save number
;move cursor to a new line
MOV AH,2
MOV DL,0DH
INT 21H
MOV DL,0AH
INT 21H
;output the number
POP AX ;retrieve number
CALL OUTDEC
;dos exit
MOV AH,4CH
INT 21H
MAIN ENDP
INCLUDE A:PGM9_1.ASM ;include OUTDEC
INCLUDE A:PGM9_3.ASM ;include INDEC
END MAIN
C>PGM9_4
?21345
21345
Overflow
Procedure INDEC can handle input that contains illegal characters, but it cannot handle input that is outside the range -32768 to 32767. We call this input overflow.
Overflow can occur in two places in INDEC: (1) when total is multiplied by 10, and (2) when a value is added to total. As an example of the first overflow, the user could enter 99999; overflow occurs when the total = 9999 is multiplied by 10. As an example of the second overflow, if the user types 32769, then when the total = 32760, overflow occurs when 9 is added. The algorithm can be made to perform overflow checks as follows:
Print a question mark
total = 0
negative = false
Read a character
CASE character OF
'-': negative = true
read a character
'+': read a character
END_CASE
REPEAT
IF character is not between '0' and '9'
THEN
go to beginning
ELSE
convert character to a value
total ...
The implementation of this algorithm is left to the student as an exercise.
The multiplication instructions are MUL for unsigned multiplication and IMUL for signed multiplication.
For byte multiplication, AL holds one number, and the other is in an 8-bit register or memory byte. For word multiplication, AX holds one number, and the other is in a 16-bit register or memory word.
For byte multiplication, the 16-bit product is in AX. For word multiplication, the 32-bit product is in DX:AX.
The division instructions are DIV for unsigned division and IDIV for signed division.
The divisor may be a memory or register, byte or word. For division by a byte, the dividend is in AX; for division by a word, the dividend is in DX:AX.
After byte division, AL has the quotient and AH the remainder. After word division, AX has the quotient and DX the remainder.
For signed word division, if AX contains the dividend, then CWD can be used to extend the sign into DX. Similarly, for byte division, CBW extends the sign of AL into AH. For unsigned word division, if AX contains the dividend, then DX should be cleared. For unsigned byte division, if AL contains the dividend then AH should be cleared.
Multiply and divide instructions are useful in doing decimal I/O.
The INCLUDE pseudo-op provides a way to insert text from an external file into a program.
CBW DIV IMUL
CWD IDIV MUL
INCLUDE
- If it is a legal instruction, give the values of DX, AX, and CF/OF after each of the following instructions is executed.
a. MUL BX, if AX contains 0008h and BX contains 0003h
b. MUL BX, if AX contains 00FFh and BX contains 1000h
c. IMUL CX, if AX contains 0005h and CX contains FFFFh
d. IMUL WORD1, if AX contains 8000h and WORD1 contains FFFFh
e. MUL 10h, if AX contains FFE0h
- Give the new values of AX and CF/OF for each of the following instructions.
a. MUL BL, if AL contains ABh and BL contains 10h
b. IMUL BL, if AL contains ABh and BL contains 10h
c. MUL AH, if AX contains 01ABh
d. IMUL BYTE1, if AL contains 02h and BYTE1 contains FBh
- Give the new values of AX and DX for each of the following instructions, or tell if overflow occurs
a. DIV BX, if DX contains 0000h, AX contains 0007h, and BX contains 0002h
b. DIV BX, if DX contains 0000h, AX contains FFEh, and BX contains 0010h
c. DIV BX, if DX contains FFFFh, AX contains FFFCh, and BX contains 0003h
- Give the new values of AL and AH for each of the following instructions, or tell if overflow occurs.
a. DIV BL, if AX contains 000Dh and BL contains 03h
b. DIV BL, if AX contains FFFBh and BL contains FEh
c. DIV BL, if AX contains 00FEh and BL contains 10h
d. DIV BL, if AX contains FFE0h and BL contains 02h
- Give the value of DX after executing CWD if AX contains
a. 7E02h
b. 8ABCh
c. 1ABCh
- Give the value of AX after executing CBW if AL contains
a. F0h
b. 5Fh
c. 80h
- Write assembly code for each of the following high-level language assignment statements. Suppose that A, B, and C are word variables and all products will fit in 16 bits. Use IMUL for multiplication. It's not necessary to preserve the contents of variables A, B, and C.
a. A = 5 x A - 7
b. B = (A - B) x (B + 10)
c. A = 6 - 9 x A
d. IF A^2 + B^2 = C^2 /* where ^ denotes exponentiation */
THEN set CF
ELSE clear CF
END_IF
Note: Some of the following exercises ask you to use INDEC and/or OUTDEC for I/O. These procedures are on the student disk and can be inserted into your program by using the INCLUDE pseudo-op (see section 9.5). Be sure not to use the same labels as these procedures, or you’ll get a duplicate label assembly error (this should be easy, because all the labels in INDEC and OUTDEC begin with “@”).
-
Modify procedure INDEC so that it will check for overflow.
-
Write a program that lets the user enter time in seconds, up to 65535, and outputs the time as hours, minutes, and seconds. Use INDEC and OUTDEC to do the I/O.
-
Write a program to take a number of cents C, 0 <= C <= 99, and express C as half-dollars, quarters, dimes, nickels, and pennies. Use INDEC to enter C.
-
Write a program to let the user enter a fraction of the form M/N (M < N), and the program prints the expansion to N decimal places, according to the following algorithm:
1. Print ".".
Execute the following steps N times:
2. Divide 10 x M by N, getting quotient Q and remainder R.
3. Print Q.
4. Replace M by R and go to step 2.
Use INDEC to read M and N.
-
Write a program to find the greatest common divisor (GCD) of two integers M and N, according to the following algorithm:
1. Divide M by N, getting quotient Q and remainder R.
2. If R = 0, stop. N is the GCD of M and N.
3. If R <> 0, replace M by N, N by R, and repeat step 1.
Use INDEC to enter M and N.
Overview
In some applications, it is necessary to treat a collection of values as a group. For example, we might need to read a set of test scores and print the median score. To do so, we would first have to store the scores in ascending order (this could be done as the scores are entered, or they could be sorted after they are all in memory). The advantage of using an array to store the data is that a single name can be given to the whole structure, and an element can be accessed by providing an index.
In section 10.1 we show how one-dimensional arrays are declared in assembly language. To access the elements, in section 10.2 we introduce new ways of expressing operands: the register indirect, based, and indexed addressing modes. In section 10.3, we use these addressing modes to sort an array.
A two-dimensional array is a one-dimensional array whose elements are also one-dimensional arrays (an array of arrays). In section 10.4, we show how they are stored. These arrays have two indexes, and are most easily manipulated by the based indexed addressing mode of section 10.5. Section 10.6 provides a simple application.
Section 10.7 introduces the XLAT (translate) instruction. This instruction is useful when we want to do data conversion; we use it to encode and decode a secret message.
A one-dimensional array is an ordered list of elements, all of the same type. By "ordered," we mean that there is a first element, second element, third element, and so on. In mathematics, if A is an array, the elements are usually denoted by A[1], A[2], A[3], and so on. Figure 10.1 shows a one-dimensional array A with six elements.
Figure 10.1 A One-Dimensional Array A
In Chapter 4, we used the DB and DW pseudo-ops to declare byte and word arrays; for example, a five-character string named MSG,
MSG DB 'abcde'
or a word array W of six integers, initialized to 10,20,30,40,50,60:
W DW 10,20,30,40,50,60
The address of the array variable is called the base address of the array. If the offset address assigned to W is 0200h, the array looks like this in memory:
| Offset address | Symbolic address | Decimal content |
|---|---|---|
| 0200h | W | 10 |
| 0202h | W+2h | 20 |
| 0204h | W+4h | 30 |
| 0206h | W+6h | 40 |
| 0208h | W+8h | 50 |
| 020Ah | W+Ah | 60 |
It is possible to define arrays whose elements share a common initial value by using the DUP (duplicate) operator. It has this form:
repeat_count DUP (value)
This operator causes value to be repeated the number of times specified by repeat_count. For example,
GAMMA DW 100 DUP (0)
sets up an array of 100 words, with each entry initialized to 0. Similarly,
DELTA DB 212 DUP (?)
creates an array of 212 uninitialized bytes. DUPs may be nested. For example,
LINE DB 5,4,3 DUP (2,3 DUP (0),1)
which is equivalent to
LINE DB 5,4,2,0,0,0,1,2,0,0,0,1,2,0,0,0,1
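The expansion of a nested DUP can be checked with a small model. The sketch below is Python rather than assembler, and the helper `dup` is a hypothetical name standing in for the DUP operator. It assumes the nested declaration is `5,4,3 DUP (2,3 DUP (0),1)`, which is what the expanded form of LINE implies.

```python
def dup(count, *values):
    """Model of `count DUP (values)`: repeat the value list `count` times.

    A nested dup() call returns a list, which is flattened in place.
    """
    flat = []
    for v in values:
        if isinstance(v, list):
            flat.extend(v)
        else:
            flat.append(v)
    return flat * count

# LINE DB 5,4,3 DUP (2,3 DUP (0),1)
line = [5, 4] + dup(3, 2, dup(3, 0), 1)
print(line)
```

The printed list can be compared element by element with the expanded LINE declaration.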
The address of an array element may be specified by adding a constant to the base address. Suppose A is an array and S denotes the number of bytes in an element (S = 1 for a byte array, S = 2 for a word array). The position of the elements in array A can be determined as follows:
| Position | Location |
|---|---|
| 1 | A |
| 2 | A + 1 x S |
| 3 | A + 2 x S |
| . . . | . . . |
| N | A + (N - 1) x S |
Example 10.1: Exchange the 10th and 25th elements in a word array W.
Solution: W[10] is located at address W + 9 × 2 = W + 18 and W[25] is at W + 24 × 2 = W + 48; so we can do the exchange as follows:
MOV AX,W+18 ;AX has W[10]
XCHG W+48,AX ;AX has W[25]
MOV W+18,AX ;complete exchange
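The offsets used in Example 10.1 follow directly from the rule that element N lies (N - 1) x S bytes past the base address. A minimal Python sketch of that arithmetic follows; the function name `element_offset` and the base offset 0200h are illustrative assumptions, not part of the example.

```python
def element_offset(base, n, size):
    """Offset of the n-th element (1-based) of an array that starts at `base`."""
    return base + (n - 1) * size

W = 0x0200  # assumed offset of W, borrowed from the earlier memory picture

print(hex(element_offset(W, 10, 2)))  # address of W[10], i.e. W + 18
print(hex(element_offset(W, 25, 2)))  # address of W[25], i.e. W + 48
```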
In many applications, we need to perform some operation on each element of an array. For example, suppose array A is a 10-element array, and we want to add the elements. In a high-level language, we could do it like this:
sum = 0
N = 1
REPEAT
sum = sum + A[N]
N = N + 1
UNTIL N > 10
To code this in assembly language, we need a way to move from one array element to the next one. In the next section, we'll see how to accomplish this by indirect addressing.
The way an operand is specified is known as its addressing mode. The addressing modes we have used so far are (1) register mode, which means that an operand is a register; (2) immediate mode, when an operand is a constant; and (3) direct mode, when an operand is a variable. For example
MOV AX, 0 (Destination AX is register mode, source 0 is immediate mode.)
ADD ALPHA, AX (Destination ALPHA is direct mode, source AX is register mode.)
There are four additional addressing modes for the 8086: (1) Register Indirect, (2) Based, (3) Indexed, and (4) Based Indexed. These modes are used to address memory operands indirectly. In this section, we discuss the first three of these modes; they are useful in one-dimensional array processing. Based indexed mode can be used with two-dimensional arrays; it is covered in section 10.5.
In register indirect mode, the offset address of the operand is contained in a register. We say that the register acts as a pointer to the memory location. The operand format is
[register]
The register is BX, SI, DI, or BP. For BX, SI, or DI, the operand's segment number is contained in DS. For BP, SS has the segment number.
For example, suppose that SI contains 0100h, and the word at 0100h contains 1234h. To execute
MOV AX, [SI]
the CPU (1) examines SI and obtains the offset address 0100h, (2) uses the address DS:0100h to obtain the value 1234h, and (3) moves 1234h to AX. This is not the same as
MOV AX, SI
which simply moves the value of SI, namely 0100h, into AX.
Example 10.2 Suppose that
BX contains 1000h
SI contains 2000h
DI contains 3000h
Offset 1000h contains 1BACh
Offset 2000h contains 20FEh
Offset 3000h contains 031Dh
where the above offsets are in the data segment addressed by DS. Tell which of the following instructions are legal. If legal, give the source offset address and the result or number moved.
a. MOV BX, [BX]
b. MOV CX, [SI]
c. MOV BX, [AX]
d. ADD [SI], [DI]
e. INC [DI]
Solution:
| | Source offset | Result |
|---|---|---|
| a. | 1000h | 1BACh |
| b. | 2000h | 20FEh |
| c. | illegal source register | |
| d. | illegal memory-memory addition | |
| e. | 3000h | 031Eh |
Now let's return to the problem of adding the elements of an array.
Example 10.3 Write some code to sum in AX the elements of the 10-element array W defined by
W DW 10,20,30,40,50,60,70,80,90,100
Solution: The idea is to set a pointer to the base of the array, and let it move up the array, summing elements as it goes.
XOR AX,AX ;AX holds sum
LEA SI,W ;SI points to array W
MOV CX,10 ;CX has number of elements
ADDNOS:
ADD AX,[SI] ;sum = sum + element
ADD SI,2 ;move pointer to the next element
LOOP ADDNOS ;loop until done
Here we must add 2 to SI on each trip through the loop because W is a word array (recall from Chapter 4 that LEA moves the source offset address into the destination).
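To see why the pointer must advance by 2, it can help to model the data segment as raw bytes. The following Python sketch mirrors the loop of Example 10.3; it assumes W sits at offset 0 of the segment, and the variable names `ax` and `si` merely imitate the registers.

```python
import struct

# W DW 10,20,...,100 laid out as little-endian 16-bit words.
data_segment = struct.pack("<10H", *range(10, 101, 10))

ax = 0                          # XOR AX,AX
si = 0                          # LEA SI,W  (W assumed to be at offset 0)
for _ in range(10):             # MOV CX,10 ... LOOP ADDNOS
    (word,) = struct.unpack_from("<H", data_segment, si)
    ax = (ax + word) & 0xFFFF   # ADD AX,[SI], wrapping at 16 bits
    si += 2                     # ADD SI,2: each word occupies two bytes
print(ax)
```

Advancing `si` by 1 instead would read words that straddle two neighboring elements.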
The next example shows how register indirect mode can be used in array processing.
Example 10.4 Write a procedure REVERSE that will reverse an array of N words. This means that the Nth word becomes the first, the (N-1)st word becomes the second, and so on, and the first word becomes the Nth word. The procedure is entered with SI pointing to the array, and BX has the number of words N.
Solution: The idea is to exchange the 1st and Nth words, the 2nd and (N-1)st words, and so on. The number of exchanges will be N/2 (rounded down to the nearest integer if N is odd). Recall from section 10.1 that the Nth element in a word array A has address A + 2 x (N - 1).
REVERSE PROC
;reverses a word array
;input: SI = offset of array
; BX = number of elements
;output: reversed array
PUSH AX ;save registers
PUSH BX
PUSH CX
PUSH SI
PUSH DI
;make DI point to nth word
MOV DI,SI ;DI pts to 1st word
MOV CX,BX ;CX = n
DEC BX ;BX = n-1
SHL BX,1 ;BX = 2 x (n-1)
ADD DI,BX ;DI pts to nth word
SHR CX,1 ;CX = n/2 = no. of swaps to do
JCXZ REV_DONE ;guard (not in the original listing): no swaps if n < 2
;swap elements
XCHG_LOOP:
MOV AX,[SI] ;get an elt in lower half of array
XCHG AX,[DI] ;insert in upper half
MOV [SI],AX ;complete exchange
ADD SI,2 ;move ptr
SUB DI,2 ;move ptr
LOOP XCHG_LOOP ;loop until done
REV_DONE:
POP DI ;restore registers
POP SI
POP CX
POP BX
POP AX
RET
REVERSE ENDP
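The two-pointer idea behind REVERSE can be restated in a few lines of Python. This is only a sketch of the algorithm, not a translation of the procedure: the indices `si` and `di` count elements rather than bytes, so they move by 1 where the assembly code moves by 2.

```python
def reverse_words(words):
    """Reverse a list in place by swapping from both ends toward the middle."""
    si, di = 0, len(words) - 1          # SI -> 1st word, DI -> nth word
    for _ in range(len(words) // 2):    # n/2 swaps, rounded down
        words[si], words[di] = words[di], words[si]
        si += 1                         # ADD SI,2 in the byte-addressed version
        di -= 1                         # SUB DI,2
    return words

print(reverse_words([10, 20, 30, 40, 50]))
```

When the list has an odd number of elements, the middle element is never touched, which is why the swap count is rounded down.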
10.2.2
Based and Indexed Addressing Modes
In these modes, the operand's offset address is obtained by adding a number called a displacement to the contents of a register. Displacement may be any of the following:
the offset address of a variable
a constant (positive or negative)
the offset address of a variable plus or minus a constant
If A is a variable, examples of displacements are:
A (offset address of a variable)
-2 (constant)
A + 4 (offset address of a variable plus a constant)
The syntax of an operand is any of the following equivalent expressions:
[register + displacement]
[displacement + register]
[register] + displacement
displacement + [register]
displacement[register]
The register must be BX, BP, SI, or DI. If BX, SI, or DI is used, DS contains the segment number of the operand's address. If BP is used, SS has the segment number. The addressing mode is called based if BX (base register) or BP (base pointer) is used; it is called indexed if SI (source index) or DI (destination index) is used.
For example, suppose W is a word array, and BX contains 4. In the instruction
MOV AX, W[BX]
the displacement is the offset address of variable W. The instruction moves the element at address W + 4 to AX. This is the third element in the array. The instruction could also have been written in any of these forms:
MOV AX, [W+BX]
MOV AX, [BX+W]
MOV AX, W+[BX]
MOV AX, [BX]+W
As another example, suppose SI contains the address of a word array W. In the instruction
MOV AX, [SI+2]
the displacement is 2. The instruction moves the contents of W + 2 to AX. This is the second element in the array. The instruction could also have been written in any of these forms:
MOV AX, [2+SI]
MOV AX, 2+[SI]
MOV AX, [SI]+2
MOV AX, 2[SI]
Example 10.5 Rework example 10.3 by using based mode.
Solution: The idea is to clear base register BX, then add 2 to it on each trip through the summing loop.
XOR AX, AX ; AX holds sum
XOR BX, BX ; clear base register
MOV CX, 10 ; CX has number of elements
ADDNOS:
ADD AX, W[BX] ; sum = sum + element
ADD BX, 2 ; index next element
LOOP ADDNOS ; loop until done
Example 10.6 Suppose that ALPHA is declared as
ALPHA DW 0123h, 0456h, 0789h, 0ABCDh
in the segment addressed by DS. Suppose also that
BX contains 2
SI contains 4
DI contains 1
Offset 0002 contains 1084h
Offset 0004 contains 2BACh
Tell which of the following instructions are legal. If legal, give the source offset address and the number moved.
a. MOV AX, [ALPHA+BX]
b. MOV BX, [BX+2]
c. MOV CX, ALPHA[SI]
d. MOV AX, -2[SI]
e. MOV BX, [ALPHA+3+DI]
f. MOV AX, [BX]2
g. ADD BX, [ALPHA+AX]
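Solution:
| | Source offset | Number moved |
|---|---|---|
| a. | ALPHA+2 | 0456h |
| b. | 2+2 = 4 | 2BACh |
| c. | ALPHA+4 | 0789h |
| d. | -2+4 = 2 | 1084h |
| e. | ALPHA+3+1 = ALPHA+4 | 0789h |
| f. | Illegal form of source operand | |
| g. | Illegal source register | |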
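The answers to Example 10.6 can be reproduced by computing each effective offset as register contents plus displacement. The Python sketch below is a model only: the offset chosen for ALPHA (0100h) is a hypothetical value, since the example does not say where ALPHA is located, and `load` is an illustrative helper name.

```python
ALPHA = 0x0100  # hypothetical offset of ALPHA; not given in the example

# Word contents of the data segment, keyed by offset.
memory = {0x0002: 0x1084, 0x0004: 0x2BAC}
for i, word in enumerate([0x0123, 0x0456, 0x0789, 0xABCD]):
    memory[ALPHA + 2 * i] = word

regs = {"BX": 2, "SI": 4, "DI": 1}

def load(reg, displacement=0):
    """Return (offset, word) for a based or indexed operand [reg + displacement]."""
    if reg not in ("BX", "BP", "SI", "DI"):
        raise ValueError(f"{reg} is not a base or index register")
    offset = (regs[reg] + displacement) & 0xFFFF
    return offset, memory[offset]

print(load("BX", ALPHA))      # a. MOV AX,[ALPHA+BX]
print(load("BX", 2))          # b. MOV BX,[BX+2]
print(load("SI", ALPHA))      # c. MOV CX,ALPHA[SI]
print(load("SI", -2))         # d. MOV AX,-2[SI]
print(load("DI", ALPHA + 3))  # e. MOV BX,[ALPHA+3+DI]
```

Calling `load("AX", ALPHA)` raises an error, which corresponds to case g: AX cannot serve as a base or index register.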
The next two examples illustrate array processing by based and indexed modes.
Example 10.7 Replace each lowercase letter in the following string by its uppercase equivalent. Use indexed addressing mode.
MSG DB 'this is a message'
Solution:
MOV CX,17 ;no. of chars in string
XOR SI,SI ;SI indexes a char
TOP:
CMP MSG[SI],' ' ;blank?
JE NEXT ;yes, skip over
AND MSG[SI],0DFh ;no, convert to upper case
NEXT:
INC SI ;index next byte
LOOP TOP ;loop until done
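The mask 0DFh works because uppercase and lowercase ASCII letters differ only in bit 5 (value 20h). A short Python sketch of the same loop follows, with `si` standing in for the index register.

```python
msg = bytearray(b"this is a message")

for si in range(len(msg)):       # SI indexes a character; 17 characters in all
    if msg[si] != ord(" "):      # CMP MSG[SI],' ' / JE NEXT
        msg[si] &= 0xDF          # AND MSG[SI],0DFh clears bit 5

print(msg.decode("ascii"))
```

The blank must be skipped because its code is 20h, and 20h AND 0DFh is 0, which is not a blank.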
The PTR Operator and the LABEL Pseudo-op
You saw in Chapter 4 that the operands of an instruction must be of the same type; for example, both bytes or both words. If one operand is a constant, the assembler attempts to infer the type from the other operand. For example, the assembler treats the instruction
MOV AX,1
as a word instruction, because AX is a 16-bit register. Similarly, it treats
MOV BH,5
as a byte instruction. However, it can't assemble
MOV [BX],1 ;illegal
because it can't tell whether the destination is the byte pointed to by BX or the word pointed to by BX. If you want the destination to be a byte, you can say,
MOV BYTE PTR [BX], 1
and if you want the destination to be a word, you say
MOV WORD PTR [BX], 1
Example 10.8 In the string of example 10.7, replace the character "t" by "T".
Solution 1: Using register indirect mode,
LEA SI,MSG ;SI points to MSG
MOV BYTE PTR [SI],'T' ;replace 't' by 'T'
Solution 2: Using indexed mode,
XOR SI,SI ;clear SI
MOV MSG[SI],'T' ;replace 't' by 'T'
Here it is not necessary to use the PTR operator, because MSG is a byte variable.
In general, the PTR operator can be used to override the declared type of an address expression. The syntax is
type PTR address_expression
where the type is BYTE, WORD, or DWORD (doubleword), and the address expression has been typed as DB, DW, or DD.
For example, suppose you have the following declaration:
DOLLARS DB 1Ah
CENTS DB 52h
and you'd like to move the contents of DOLLARS to AL and CENTS to AH with a single MOV instruction. Now
MOV AX,DOLLARS ;illegal
is illegal because the destination is a word and the source has been typed as a byte variable. But you can override the type declaration with WORD PTR as
MOV AX,WORD PTR DOLLARS ;AL = dollars, AH = cents
and the instruction will move 521Ah to AX.
Actually, there is another way to get around the problem of type conflict in the preceding example. Using the LABEL pseudo-op, we could declare
MONEY LABEL WORD
DOLLARS DB 1Ah
CENTS DB 52h
This declaration types MONEY as a word variable, and the components DOLLARS and CENTS as byte variables, with MONEY and DOLLARS being assigned the same address by the assembler. The instruction
MOV AX,MONEY ;AL = dollars, AH = cents
is now legal. So are the following instructions, which have the same effect:
MOV AL,DOLLARS
MOV AH,CENTS
Example 10.9 Suppose the following data are declared:
.DATA
A DW 1234h
B LABEL BYTE
DW 5678h
C LABEL WORD
C1 DB 9Ah
C2 DB 0BCh
Tell whether the following instructions are legal, and if so, give the number moved.
a. MOV AX, B
b. MOV AH, B
c. MOV CX, C
d. MOV BX, WORD PTR B
e. MOV DL, BYTE PTR C
f. MOV AX, WORD PTR C1
Solution:
a. illegal, type conflict
b. legal, 78h
c. legal, 0BC9Ah
d. legal, 5678h
e. legal, 9Ah
f. legal, 0BC9Ah
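PTR and LABEL do not change what is stored; they only change how many bytes an instruction reads at a given address. The Python sketch below lays out the data of Example 10.9 as bytes and reads it through byte and word views. It assumes A is at offset 0 of the segment; `byte_at` and `word_at` are illustrative helper names.

```python
import struct

# A DW 1234h, then DW 5678h (labelled B), then C1 DB 9Ah and C2 DB 0BCh (labelled C).
# The 8086 stores the low byte of a word first.
segment = struct.pack("<HHBB", 0x1234, 0x5678, 0x9A, 0xBC)
A, B, C = 0, 2, 4   # offsets; LABEL reserves no storage, so B and C add no bytes

def byte_at(offset):
    """BYTE PTR view: one byte at the offset."""
    return segment[offset]

def word_at(offset):
    """WORD PTR view: two bytes at the offset, low byte first."""
    return struct.unpack_from("<H", segment, offset)[0]

print(hex(byte_at(B)))   # MOV AH,B
print(hex(word_at(C)))   # MOV CX,C
print(hex(word_at(B)))   # MOV BX,WORD PTR B
print(hex(byte_at(C)))   # byte view of C
```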
In register indirect mode, the pointer register BX, SI, or DI specifies an offset address relative to DS. It is also possible to specify an offset relative to one of the other segment registers. The form of an operand is
segment_register:[pointer_register]
For example,
MOV AX, ES:[SI]
If SI contains 0100h, the source address in this instruction is ES:0100h. You might want to do this in a program with two data segments, where ES contains the segment number of the second data segment.
Segment overrides can also be used with based and indexed modes.
We mentioned earlier that when BP specifies an offset in register indirect mode, SS supplies the segment number. This means that BP may be used to access items on the stack.
Example 10.10 Move the top three words on the stack into AX, BX, and CX without changing the stack.
MOV BP,SP ;BP points to stacktop
MOV AX,[BP] ;move stacktop to AX
MOV BX,[BP+2] ;move second word to BX
MOV CX,[BP+4] ;move third word to CX
A primary use of BP is to pass values to a procedure (see Chapter 14).
It is much easier to locate an item in an array if the array has been sorted. There are dozens of sorting methods; the method we will discuss here is called selectsort. It is one of the simplest sorting methods.
To sort an array A of N elements, we proceed as follows:
Pass 1. Find the largest element among A[1] . . . A[N]. Swap it and A[N]. Because this puts the largest element in position N, we need only sort A[1] . . . A[N-1] to finish.
Pass 2. Find the largest element among A[1] . . . A[N-1]. Swap it and A[N-1]. This places the next-to-largest element in its proper position.
Pass N-1. Find the largest element among A[1], A[2]. Swap it and A[2]. At this point A[2] . . . A[N] are in their proper positions, so A[1] is as well, and the array is sorted.
For example, suppose the array A consists of the following integers:
| Position | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| initial data | 21 | 5 | 16 | 40 | 7 |
| pass 1 | 21 | 5 | 16 | 7 | 40 |
| pass 2 | 7 | 5 | 16 | 21 | 40 |
| pass 3 | 7 | 5 | 16 | 21 | 40 |
| pass 4 | 5 | 7 | 16 | 21 | 40 |
i = N
FOR N - 1 times DO
Find the position k of the largest element among A[1] . . . A[i]
(*) Swap A[k] and A[i]
i = i - 1
END_FOR
Step (*) will be handled by a procedure SWAP. The code for the procedures is the following (we'll suppose the array to be sorted is a byte array):
1: SELECT PROC
2: ; sorts a byte array by the selectsort method
3: ; input: SI = array offset address
4: ; BX = number of elements
5: ; output: SI = offset of sorted array
6: ; uses: SWAP
7: PUSH BX
8: PUSH CX
9: PUSH DX
10: PUSH SI
11: DEC BX ; N = N- 1
12: JE END_SORT ;exit if 1-elt array
13: MOV DX, SI ; save array offset
14: ; for N- 1 times do
15: SORT_LOOP:
16: MOV SI, DX ; SI pts to array
17: MOV CX, BX ; no. of comparisons to make
18: MOV DI, SI ; DI pts to largest element
19: MOV AL, [DI] ; AL has largest element
20: ; locate biggest of remaining elts
21: FIND_BIG:
22: INC SI ; SI pts to next element
23: CMP [SI], AL ; is new element > largest?
24: JNG NEXT ; no, go on
25: MOV DI, SI ; yes, move DI
26: MOV AL, [DI] ; AL has largest element
27: NEXT:
28: LOOP FIND_BIG ; loop until done
29: ; swap biggest elt with last elt
30: CALL SWAP ; swap with last elt
31: DEC BX ; N = N- 1
32: JNE SORT_LOOP ; repeat if N <> 0
33: END_SORT:
34: POP SI
35: POP DX
36: POP CX
37: POP BX
38: RET
39: SELECT ENDP
40: SWAP PROC
41: ; swaps two array elements
42: ; input: SI = one element
43: ; DI = other element
44: ;output: exchanged elements
45: PUSH AX ; save AX
46: MOV AL, [SI] ; get A[i]
47: XCHG AL, [DI] ; place in A[k]
48: MOV [SI], AL ; put A[k] in A[i]
49: POP AX ; restore AX
50: RET
51: SWAP ENDP
Procedure SELECT is entered with the array offset address in SI, and the number of elements N in BX. If the array has only one element, the procedure exits at once (lines 11-12).
In the general case, the procedure enters the main processing loop (lines 15-32). Each pass through this loop places the largest of the remaining unsorted elements in its proper place.
In lines 21-28, a loop is entered to find the largest of the remaining unsorted elements; the loop is exited with DI pointing to the largest element and SI pointing to the last element in the array. At line 30, procedure SWAP is called to exchange the elements pointed to by SI and DI.
The procedures can be tested by inserting them in a testing program.
TITLE PGM10_3: TEST SELECT
.MODEL SMALL
.STACK 100H
.DATA
A DB 5,2,1,3,4
.CODE
MAIN PROC
MOV AX,@DATA
MOV DS,AX
LEA SI,A
MOV BX,5
CALL SELECT
MOV AH,4CH
INT 21H
MAIN ENDP
;select goes here
END MAIN
After assembling and linking, we enter DEBUG and execute down to the procedure call (the addresses in the following demonstration were determined in a previous DEBUG session):
-GC
AX=1000 BX=0005 CX=0049 DX=0000 SP=0100 BP=0000 SI=0004 DI=0000
DS=1000 ES=0FF9 SS=100E CS=1009 IP=000C NV UP EI PL NZ NA PO NC
1009:000C E80400 CALL 0013
Before calling the procedure, let's look at the unsorted array:
The data appear in the order 5, 2, 1, 3, 4. Now let's execute SELECT:
and look at the array again:
It is now in ascending order.
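The passes of selectsort can also be traced in a high-level language. The following Python sketch implements the algorithm given earlier and records the array after each pass; it uses 0-based indices, and the function name `select_sort` is an illustrative choice.

```python
def select_sort(a):
    """Sort list `a` in place by selectsort; return a copy of it after each pass."""
    passes = []
    for i in range(len(a) - 1, 0, -1):   # last unsorted position: N-1 down to 1
        k = 0                            # position of the largest element so far
        for j in range(1, i + 1):
            if a[j] > a[k]:              # move only on strictly greater, like JNG NEXT
                k = j
        a[k], a[i] = a[i], a[k]          # step (*): swap A[k] and A[i]
        passes.append(list(a))
    return passes

for row in select_sort([21, 5, 16, 40, 7]):
    print(row)
```

The four printed rows can be compared with passes 1 to 4 of the five-element example above.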
A two-dimensional array is an array of arrays; that is, a one-dimensional array whose elements are one-dimensional arrays. We can picture the elements as being arranged in rows and columns. Figure 10.2 shows a two-dimensional array B with three rows and four columns (a 3 x 4 array).
Figure 10.2 A Two-Dimensional Array B
Because memory is one-dimensional, the elements of a two-dimensional array must be stored sequentially. There are two commonly used ways: In row-major order, the row 1 elements are stored, followed by the row 2 elements, then the row 3 elements, and so on. In column-major order, the elements of the first column are stored, followed by the second column, third column, and so on. For example, suppose array B has 10, 20, 30, and 40 in the first row, 50, 60, 70, and 80 in the second row, and 90, 100, 110, and 120 in the third row. It could be stored in row-major order as follows:
B DW 10,20,30,40
DW 50,60,70,80
DW 90,100,110,120
or in column-major order as follows:
B DW 10,50,90
DW 20,60,100
DW 30,70,110
DW 40,80,120
Most high-level language compilers store two-dimensional arrays in row-major order. In assembly language, we can do it either way. If the elements of a row are to be processed together sequentially, then row-major order is better, because the next element in a row is the next memory location. Conversely, column-major order is better if the elements of a column are to be processed together.
Suppose an M x N array A is stored in row-major order, with elements of size S bytes. To find the location of A[i, j], we proceed in two steps:
1. Find where row i begins.
2. Find the location of the jth element in that row.
Here is the first step. Row 1 begins at location A. Because there are N elements in each row, each of size S bytes, Row 2 begins at location A + N x S, Row 3 begins at location A + 2 x N x S, and in general, Row i begins at location A + (i - 1) x N x S.
Now for the second step. We know from our discussion of one-dimensional arrays that the jth element in a row is stored (j - 1) x S bytes from the beginning of the row.
Adding the results of steps 1 and 2, we get the final result:
If A is an M x N array, with element size S bytes, stored in row-major order, then
(1) A[i, j] has address A + ((i - 1) x N + (j - 1)) x S
There is a similar expression for column-major ordered arrays:
If A is an M x N array, with element size S bytes, stored in column-major order, then
(2) A[i, j] has address A + ((i - 1) + (j - 1) x M) x S
Example 10.12 Suppose A is an M x N word array stored in row-major order.
1. Where does row i begin?
2. Where does column j begin?
3. How many bytes are there between elements in a column?
Solution:
1. Row i begins at A[i, 1]; by formula (1) its address is $A + (i - 1) \times N \times 2$.
2. Column j begins at A[1, j]; by formula (1) the address is $A + (j - 1) \times 2$.
3. Because there are $N$ columns, there are $2 \times N$ bytes between elements in any given column.
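Formulas (1) and (2) can be checked against the 3 x 4 array B declared earlier in both storage orders. The Python sketch below does so; the function names are illustrative, the element size is 2 because B is a word array, and the base address is taken as 0 so that each address divided by the size indexes the stored list.

```python
def row_major(base, i, j, n_cols, size):
    """Formula (1): address of A[i, j] (1-based) in row-major order."""
    return base + ((i - 1) * n_cols + (j - 1)) * size

def column_major(base, i, j, n_rows, size):
    """Formula (2): address of A[i, j] (1-based) in column-major order."""
    return base + ((i - 1) + (j - 1) * n_rows) * size

M, N, S = 3, 4, 2
b_rows = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]   # row-major storage
b_cols = [10, 50, 90, 20, 60, 100, 30, 70, 110, 40, 80, 120]   # column-major storage

i, j = 2, 3
print(b_rows[row_major(0, i, j, N, S) // S])      # B[2, 3] read from row-major storage
print(b_cols[column_major(0, i, j, M, S) // S])   # B[2, 3] read from column-major storage
```

Both lookups find the same element, which is the point of having one formula per storage order.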
In based indexed mode, the offset address of the operand is the sum of
1. the contents of a base register (BX or BP)
2. the contents of an index register (SI or DI)
3. optionally, a variable's offset address
4. optionally, a constant (positive or negative)
If BX is used, DS contains the segment number of the operand's address; if BP is used, SS has the segment number. The operand may be written several ways; four of them are
1. variable[base_register][index_register]
2. [base_register + index_register + variable + constant]
3. variable[base_register + index_register + constant]
4. constant[base_register + index_register + variable]
The order of terms within these brackets is arbitrary.
For example, suppose W is a word variable, BX contains 2, and SI contains 4. The instruction
MOV AX, W[BX][SI]
moves the contents of W + 6 to AX. This instruction could also have been written
MOV AX, [W+BX+SI]
or
MOV AX, W[BX+SI]
Based indexed mode is especially useful for processing two-dimensional arrays, as the following example shows.
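Example 10.13 Suppose A is a 5 x 7 word array stored in row-major order. Write some code to (1) clear row 3, (2) clear column 4. Use based indexed mode.
Solution:
1. From example 10.12, we know that in an M x N word array A, row i begins at A + (i - 1) x N x 2. Thus in a 5 x 7 array, row 3 begins at A + (3 - 1) x 7 x 2 = A + 28. So we can clear row 3 as follows:
MOV BX,28 ;BX indexes row 3
XOR SI,SI ;SI will index columns
MOV CX,7 ;number of elements in a row
CLEAR:
MOV A[BX][SI],0 ;clear A[3, j]
ADD SI,2 ;go to next column
LOOP CLEAR ;loop until done
2. Again from example 10.12, column j begins at $A + (j - 1) \times 2$ in an M x N word array, so column 4 begins at A + (4 - 1) x 2 = A + 6. Since A is a 7-column word array stored in row-major order, to get to the next element in column 4 we need to add 7 x 2 = 14. We can clear column 4 as follows:
MOV SI,6 ;SI will index column 4
XOR BX,BX ;BX will index rows
MOV CX,5 ;number of elements in a column
CLEAR:
MOV A[BX][SI],0 ;clear A[i, 4]
ADD BX,14 ;go to next row
LOOP CLEAR ;loop until done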
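The byte offsets in this example (28 for the start of row 3, 6 for the start of column 4, and a step of 14 between rows) can be verified on a byte-level model of a 5 x 7 word array. In the Python sketch below, the array is filled with FFh bytes first so that cleared words stand out; `clear_word` is an illustrative helper playing the role of `MOV A[BX][SI],0`.

```python
ROWS, COLS, SIZE = 5, 7, 2
a = bytearray(b"\xff" * (ROWS * COLS * SIZE))   # array A, initially all FFh

def clear_word(bx, si):
    """Zero the word at A + BX + SI."""
    a[bx + si: bx + si + 2] = b"\x00\x00"

# (1) clear row 3: BX fixed at the row start, SI steps across the columns
bx = (3 - 1) * COLS * SIZE                      # 28
for si in range(0, COLS * SIZE, SIZE):          # 0, 2, ..., 12
    clear_word(bx, si)

# (2) clear column 4: SI fixed at the column start, BX steps down the rows
si = (4 - 1) * SIZE                             # 6
for bx in range(0, ROWS * COLS * SIZE, COLS * SIZE):   # 0, 14, ..., 56
    clear_word(bx, si)

print(a.count(0))   # bytes cleared: 7 words + 5 words - 1 shared word
```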
10.6
An Application: Averaging Test Scores
Suppose a class of five students is given four exams. The results are recorded as follows:
| | Test 1 | Test 2 | Test 3 | Test 4 |
|---|---|---|---|---|
| MARY ALLEN | 67 | 45 | 98 | 33 |
| SCOTT BAYLIS | 70 | 56 | 87 | 44 |
| GEORGE FRANK | 82 | 72 | 89 | 40 |
| BETH HARRIS | 80 | 67 | 95 | 50 |
| SAM WONG | 78 | 76 | 92 | 60 |
We will write a program to find the class average on each exam. To do this, we sum the entries in each column and divide by 5.
1. j = 4
2. REPEAT
3. sum the scores in column j
4. divide sum by 5 to get the average in column j
5. j = j - 1
6. UNTIL j = 0
We choose to start summing in column 4 because it makes the code a little shorter. Step 3 may be broken down further as follows:
sum[j] = 0
i = 1
FOR 5 times DO
sum[j] = sum[j] + score[i, j]
i = i + 1
END_FOR
0: TITLE PGM10_4: CLASS AVERAGE
1: .MODEL SMALL
2: .STACK 100H
3: .DATA
4: FIVE DW 5
5: SCORES DW 67,45,98,33 ;Mary Allen
6: DW 70,56,87,44 ;Scott Baylis
7: DW 82,72,89,40 ;George Frank
8: DW 80,67,95,50 ;Beth Harris
9: DW 78,76,92,60 ;Sam Wong
10: AVG DW 5 DUP (0)
11: .CODE
12: MAIN PROC
13: MOV AX,@DATA
14: MOV DS,AX ;initialize DS
15: ;j = 4
16: MOV SI,6 ;col index, initially col 4
17: REPEAT:
18: MOV CX,5 ;no. of rows
19: XOR BX,BX ;row index, initially row 1
20: XOR AX,AX ;col_sum, initially 0
21: ;sum scores in column j
22: FOR:
23: ADD AX,SCORES[BX+SI] ;col_sum = col_sum + score
24: ADD BX,8 ;index next row
25: LOOP FOR ;keep adding scores
26: ;endfor
27: ;compute average in column j
28: XOR DX,DX ;clear high part of divnd
29: DIV FIVE ;AX = average
30: MOV AVG[SI],AX ;store in array
31: SUB SI,2 ;go to next column
32: ;until j = 0
33: JNL REPEAT ;unless SI < 0
34: ;dos exit
35: MOV AH,4CH
36: INT 21H
37: MAIN ENDP
38: END MAIN
The test scores are stored in a two-dimensional array (lines 5-9). In lines 22-25, a column is summed in AX. In lines 28-30, this total is divided by 5 to compute the column average, which is stored in the array AVG.
Rows and columns of array SCORES are indexed by BX and SI, respectively. We choose to begin summing column 4; this column begins in SCORES+6, so SI is initialized to 6 (line 16). After a column is summed, SI is decreased by 2; the loop ends when SI becomes negative, that is, after column 1 (SI = 0) has been processed.
The execution of the program may be seen in DEBUG. We execute down to the DOS exit, then dump the array AVG (the addresses in this demonstration were determined in a previous DEBUG session).
-G29
AX=4C4B BX=0028 CX=0000 DX=0002 SP=0100 BP=0000 SI=FFFE DI=0000
DS=100B ES=0FF9 SS=100F CS=1009 IP=0029 NV UP EI NG NZ AC PO CY
1009:0029 CD21 INT 21
-D36 3D
100B:0030 4B 00-3F 00 5C 00 2D 00
The averages are 004Bh, 003Fh, 005Ch, and 002Dh, or, in decimal, 75, 63, 92, and 45.
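The averages in the dump can be cross-checked with a few lines of Python. This sketch keeps the scores in one flat row-major list, as the SCORES array does, and uses integer division because DIV leaves only the quotient in AX. The column loop runs from column 4 down to column 1, like the program.

```python
scores = [67, 45, 98, 33,    # Mary Allen
          70, 56, 87, 44,    # Scott Baylis
          82, 72, 89, 40,    # George Frank
          80, 67, 95, 50,    # Beth Harris
          78, 76, 92, 60]    # Sam Wong

avg = [0] * 4
for col in range(3, -1, -1):                                   # j = 4 down to 1
    col_sum = sum(scores[row * 4 + col] for row in range(5))   # stride of one row
    avg[col] = col_sum // 5                                    # quotient only
print(avg)
print([hex(v) for v in avg])
```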
10.7
The XLAT Instruction
In some applications, it is necessary to translate data from one form to another. For example, the IBM PC uses ASCII codes for characters, but IBM mainframes use EBCDIC (Extended Binary Coded Decimal Interchange Code). To translate a character string encoded in ASCII to EBCDIC, a program must replace the ASCII code of each character in the string with the corresponding EBCDIC code.
The instruction XLAT (translate) is a no-operand instruction that can be used to convert a byte value into another value that comes from a table. The byte to be converted must be in AL, and BX has the offset address of the conversion table. The instruction (1) adds the contents of AL to the address in BX to produce an address within the table, and (2) replaces the contents of AL by the value found at that address.
For example, suppose the contents of AL are in the range 0 to Fh and we want to replace it by the ASCII code of its hex equivalent; for example, 6h by 036h = "6", Bh by 042h = "B". The conversion table is
TABLE DB 030h,031h,032h,033h,034h,035h,036h,037h,038h,039h
DB 041h,042h,043h,044h,045h,046h
For instance, to convert 0Ch to "C", we do the following:
MOV AL,0Ch ;number to convert
LEA BX,TABLE ;BX has table offset
XLAT ;AL has 'C'
Here XLAT computes address TABLE + Ch = TABLE + 12, and replaces the contents of AL by the number stored there, namely 043h = "C".
In this example, if AL contained a value not in the range 0 to 15, XLAT would translate it to some garbage value.
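In effect, XLAT is a table lookup indexed by AL. A minimal Python model of the hex-digit example follows; `xlat` is an illustrative function, not an emulation of the instruction's segment handling.

```python
TABLE = b"0123456789ABCDEF"   # the same 16 bytes as the DB table: 30h-39h, 41h-46h

def xlat(al, table):
    """Model of XLAT: return the byte found at table offset AL."""
    return table[al]

al = xlat(0x0C, TABLE)        # MOV AL,0Ch / LEA BX,TABLE / XLAT
print(hex(al), chr(al))
```

One difference from the real instruction: for an index beyond the table, Python raises an error, whereas XLAT simply reads whatever byte lies at that address.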
The following program prompts the user to type a message, encodes it in unrecognizable form, prints the coded message, translates it back, and prints the translation. Sample output:
ENTER A MESSAGE:
GATHER YOUR FORCES AND ATTACK AT DAWN (input)
ZXKBGM WULM HUMPGN XJO XKKXPD XK OXSJ (encoded)
GATHER YOUR FORCES AND ATTACK AT DAWN (translated)
Print prompt.
Read and encode message
Go to a new line
Print encoded message
Go to a new line
Translate and print message
0: TITLE PGM10_5: SECRET MESSAGE
1: .MODEL SMALL
2: .STACK 100H
3: .DATA
4: ;alphabet ABCDEFGHIJKLMNOPQRSTUVWXYZ
5: CODE_KEY DB 65 DUP (' '),'XQPOGHZBCADEIJUVFMNKLRSTWY'
6: DB 37 DUP (' ')
7: DECODE_KEY DB 65 DUP (' '),'JHIKLQEFMNTURSDCBVWXOPYAZG'
8: DB 37 DUP (' ')
9: CODED DB 80 DUP ('$')
10: PROMPT DB 'ENTER A MESSAGE:',0DH,0AH,'$'
11: CRLF DB 0DH,0AH,'$'
12: .CODE
13: MAIN PROC
14: MOV AX,@DATA ;initialize DS
15: MOV DS,AX
16: ;print input prompt
17: MOV AH,9 ;print string fcn
18: LEA DX,PROMPT ;DX pts to prompt
19: INT 21H ;print message
20: ;read and encode message
21: MOV AH,1 ;read char fcn
22: LEA BX,CODE_KEY ;BX pts to code key
23: LEA DI,CODED ;DI pts to coded message
24: WHILE_:
25: INT 21H ;read a char into AL
26: CMP AL,0DH ;carriage return?
27: JE ENDWHILE ;yes, go to print coded message
28: XLAT ;no, encode char
29: MOV [DI],AL ;store in coded message
30: INC DI ;move pointer
31: JMP WHILE_ ;process next char
32: ENDWHILE:
33: ;go to a new line
34: MOV AH,9
35: LEA DX,CRLF
36: INT 21H ;new line
37: ;print encoded message
38: LEA DX,CODED ;DX pts to coded
39: INT 21H ;print coded message
40: ;go to a new line
41: LEA DX,CRLF
42: INT 21H ;new line
43: ;decode message and print it
44: MOV AH,2 ;print char fcn
45: LEA BX,DECODE_KEY ;BX pts to decode key
46: LEA SI,CODED ;SI pts to coded message
47: WHILE1:
48: MOV AL,[SI] ;get a character from message
49: CMP AL,'$' ;end of message?
50: JE ENDWHILE1 ;yes, exit
51: XLAT ;no, decode character
52: MOV DL,AL ;put in DL
53: INT 21H ;print translated char
54: INC SI ;move ptr
55: JMP WHILE1 ;process next char
56: ENDWHILE1:
57: MOV AH,4CH
58: INT 21H ;dos exit
59: MAIN ENDP
60: END MAIN
Three arrays are declared in the data segment:
1. CODE_KEY is used to encode English text.
2. CODED holds the encoded message; it is initialized to a string of dollar signs so that it may be printed with INT 21h, function 9.
3. DECODE_KEY is used to translate the encoded text back to English.
Line 4 is a comment line containing the alphabet, which makes it easier to see how characters are encoded and decoded.
In lines 24-32, characters are read and encoded until a carriage return is typed. AL receives the ASCII code of each input character; XLAT adds it to address CODE_KEY in BX to produce an address within the CODE_KEY table.
CODE_KEY is set up as follows: 65 blanks, followed by the letters to which A to Z will be encoded, followed by 37 more blanks for a total of 128 bytes (128 bytes are needed, because the standard ASCII characters range from 0 to 127). Suppose, for example, an "A" is typed. The ASCII code of "A" is 65; XLAT computes address CODE_KEY+65, picks up the value of that byte, which is "X", and stores it in AL. At line 29, this value is moved into byte array CODED. Similarly, "B" is translated into "Q", "C" into "P", . . . , "Z" into "Y" (the encoding table was constructed arbitrarily). Characters other than capital letters (including the blank character) have ASCII codes in the ranges 0 to 64 or 91 to 127, and are translated into blanks. In lines 38-39, the encoded message is printed.
DECODE_KEY also begins with 65 blanks and ends with 37 blanks. The positions of the letters in this array may be deduced as follows. First, lay down the alphabet (line 4). Now since "A" was coded into "X", the letter at position "X" in the decoding sequence should be "A". Similarly, because "B" was coded into "Q", there should be a "B" at position "Q", and so on.
In lines 47-56, the encoded message is translated. After placing the addresses of DECODE_KEY and CODED in BX and SI, respectively, the program moves a byte of the coded message into AL. If it's a dollar sign, the message has been translated and the program exits. If not, XLAT adds AL to address DECODE_KEY to produce an address within the decoding table, and puts the character found there into AL. At line 52, the character is moved to DL so that it can be printed with INT 21h, function 2.
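The rule for building the decoding sequence (put each plaintext letter at the position of its code letter) is mechanical, so it can be checked by program. The Python sketch below derives the 26 decoding letters from the 26 encoding letters, builds both 128-byte tables, and runs the sample message through them. The names `make_table`, `CODE`, and `DECODE` are illustrative.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CODE = "XQPOGHZBCADEIJUVFMNKLRSTWY"     # letters that A to Z are encoded into

def make_table(letters):
    """128-byte lookup table: 65 blanks, the 26 letters, 37 blanks."""
    return (" " * 65 + letters + " " * 37).encode("ascii")

# The letter at position CODE[i] of the decoding sequence is ALPHABET[i].
decode_letters = [""] * 26
for plain, coded_letter in zip(ALPHABET, CODE):
    decode_letters[ord(coded_letter) - ord("A")] = plain
DECODE = "".join(decode_letters)

code_key, decode_key = make_table(CODE), make_table(DECODE)
message = b"GATHER YOUR FORCES AND ATTACK AT DAWN"
coded = bytes(code_key[c] for c in message)      # one table lookup per character
decoded = bytes(decode_key[c] for c in coded)

print(DECODE)
print(coded.decode("ascii"))
print(decoded.decode("ascii"))
```

Because the encoding letters are a permutation of the alphabet, every position in the decoding sequence is filled exactly once, and decoding restores the original capital letters and blanks.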
- A one-dimensional array is an ordered list of elements of the same type. The DB and DW pseudo-ops are used to declare byte and word arrays.
- An array element can be located by adding a constant to the base address.
- The way that an operand is specified is its addressing mode. The addressing modes are register, immediate, direct, register indirect, based, indexed, and based indexed.
- In register indirect mode, an operand has the form [register], where register is BX, SI, DI, or BP. The operand's offset is contained in the register. For BP, the operand's segment number is in SS; for the other registers, the segment number is in DS.
- In based or indexed mode, an operand has the form [register + displacement]. Register is BX, BP, SI, or DI. The operand's offset is obtained by adding the displacement to the contents of the register. For BX, SI, or DI, the segment number is in DS; for BP, the segment number is in SS.
- The operators BYTE PTR and WORD PTR in front of an operand may be used to override the operand's declared type.
- The LABEL pseudo-op may be used to assign a type to a variable.
- A two-dimensional array is a one-dimensional array whose elements are one-dimensional arrays. Two-dimensional arrays may be stored row by row (row-major order), or column by column (column-major order).
- In based indexed mode, the offset address of the operand is the sum of (1) BX or BP; (2) SI or DI; (3) optionally, a memory offset address; (4) optionally, a constant. One (of several) possible forms is [base_register + index_register + memory_location + constant]. DS has the segment number if BX is used; if BP is used, SS has the segment number.
- Based indexed mode may be used to process two-dimensional arrays.
- The XLAT instruction can be used to convert a byte value into another value that comes from a table. AL contains the value to be converted and BX the address of the table. The instruction adds AL to the offset contained in BX to produce a table address. The contents of AL are replaced by the value found at that address.
Glossary
addressing mode             The way the operand is specified
base address of an array    The address of the array variable
based addressing mode       An indirect addressing mode in which the contents of BX or BP are added to a displacement to form an operand's offset address
column-major order          Column by column
direct mode                 The operand is a variable
displacement                In based or indexed mode, a number added to the contents of a register to produce an operand's offset address
immediate mode              The operand is constant
indexed addressing mode     An indirect addressing mode in which the contents of SI or DI are added to a displacement to form an operand's offset address
one-dimensional array       An ordered list of elements of the same type
pointer                     A register that contains an offset address of an operand
register mode               The operand is a register
row-major order             Row by row
two-dimensional array       A one-dimensional array whose elements are one-dimensional arrays
New Instructions
XLAT
New Pseudo-Ops
DUP    LABEL    PTR
Exercises
- Suppose
AX contains 0500h
BX contains 1000h
SI contains 1500h
DI contains 2000h
offset 1000h contains 0100h
offset 1500h contains 0150h
offset 2000h contains 0200h
offset 3000h contains 0400h
offset 4000h contains 0300h
and BETA is a word variable whose offset address is 1000h
For each of the following instructions, if it is legal, give the source offset address or register and the result stored in the destination.
a. MOV DI, SI
b. MOV DI, [DI]
c. ADD AX, [SI]
d. SUB BX, [DI]
e. LEA BX, BETA [BX]
f. ADD [SI], [DI]
g. ADD BH, [BL]
h. ADD AH, [SI]
i. MOV AX, [BX + DI + BETA]
- Given the following declarations
A DW 1,2,3
B DB 4,5,6
C LABEL WORD
MSG DB 'ABC'
and suppose that BX contains the offset address of C. Tell which of the following instructions are legal. If so, give the number moved.
a. MOV AH, BYTE PTR A
b. MOV AX, WORD PTR B
c. MOV AX, C
d. MOV AX, MSG
e. MOV AH, BYTE PTR C
- Use BP and based mode to do the following stack operations.
(You may use other registers as well, but don't use PUSH or POP.)
a. Replace the contents of the top two words on the stack by zeros.
b. Copy a stack of five words into a word array ST_ARR, so that ST_ARR contains the stack top, ST_ARR + 2 contains the next word on the stack, and so on.
- Write instructions to carry out each of the following operations on a word array A of 10 elements or a byte array B of 15 elements.
a. Move A[i+1] to position i, i = 1 ... 9, and move A[1] to position 10.
b. Count in DX the number of zero entries in array A.
c. Suppose byte array B contains a character string. Search B for the first occurrence of the letter "E". If found, make SI point to its location; if not found, set CF.
- Write a procedure FIND_IJ that returns the offset address of the element in row i and column j in a two-dimensional M×N word array A stored in row-major order. The procedure receives i in AX, j in BX, N in CX, and the offset of A in DX. It returns the offset address of the element in DX. Note: you may ignore the possibility of overflow.
- To sort an array A of N elements by the bubblesort method, we proceed as follows:
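Pass 1. For j = 2 ... N, if A[j] < A[j - 1] then swap A[j] and A[j - 1]. This will place the largest element in position N.
Pass 2. For j = 2 ... N - 1, if A[j] < A[j - 1] then swap A[j] and A[j - 1]. This will place the second largest element in position N - 1.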
Pass N - 1. If A[2] < A[1], then swap A[2] and A[1]. At this point the array is sorted.
initial data 7 5 3 9 1
pass 1 5 3 7 1 9
pass 2 3 5 1 7 9
pass 3 3 1 5 7 9
pass 4 1 3 5 7 9
Write a procedure BUBBLE to sort a byte array by the bubblesort algorithm. The procedure receives the offset address of the array in SI and the number of elements in BX. Write a program that lets the user type a list of single-digit numbers, with one blank between numbers, calls BUBBLE to sort them, and prints the sorted list on the next line. For example,
Your program should be able to handle an array with only one element.
- Suppose the class records in the example of section 10.4.3 are stored as follows
CLASS   DB 'MARY ALLEN  ',67,45,98,33
        DB 'SCOTT BAYLIS',70,56,87,44
        DB 'GEORGE FRANK',82,72,89,40
        DB 'SAM WONG    ',78,76,92,60
Each name occupies 12 bytes. Write a program to print the name of each student and his or her average (truncated to an integer) for the four exams.
- Write a program that starts with an initially undefined byte array of maximum size 100, and lets the user insert single characters into the array in such a way that the array is always sorted in ascending order. The program should print a question mark, let the user enter a character, and display the array with the new character inserted. Input ends when the user hits the ESC key. Duplicate characters should be ignored.
Sample execution:
?A
A
?D
AD
?B
ABD
?a
ABDa
?D
ABDa
?
- Write a program that uses XLAT to (a) read a line of text, and (b) print it on the next line with all small letters converted to capitals. The input line may contain any characters: small letters, capital letters, digit characters, punctuation, and so on.
- Write a procedure PRINTHEX that uses XLAT to display the content of BX as four hex digits. Test it in a program that lets the user type a four-digit hex integer, stores it in BX using the hex input algorithm of section 7.4, and calls PRINTHEX to print it on the next line.
In this chapter we consider a special group of instructions called the string instructions. In 8086 assembly language, a memory string or string is simply a byte or word array. Thus, string instructions are designed for array processing.
Here are examples of operations that can be performed with the string instructions:
Copy a string into another string.
Search a string for a particular byte or word.
Store characters in a string.
Compare strings of characters alphabetically.
The tasks carried out by the string instructions can be performed by using the register indirect addressing mode we studied in Chapter 10; however, the string instructions have some built-in advantages. For example, they provide automatic updating of pointer registers and allow memory-memory operations.
11.1 The Direction Flag
In Chapter 5, we saw that the FLAGS register contains six status flags and three control flags. We know that the status flags reflect the result of an operation that the processor has done. The control flags are used to control the processor's operations.
One of the control flags is the direction flag (DF). Its purpose is to determine the direction in which string operations will proceed. These operations are implemented by the two index registers SI and DI. Suppose, for example, that the following string has been declared:
STRING1 DB 'ABCDE'
And this string is stored in memory starting at offset 0200h:
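Offset address   Content   ASCII character
0200h            041h      A
0201h            042h      B
0202h            043h      C
0203h            044h      D
0204h            045h      E
If DF = 0, SI and DI proceed in the direction of increasing memory addresses: from left to right across the string. Conversely, if DF = 1, SI and DI proceed in the direction of decreasing memory addresses: from right to left.
In the DEBUG display, DF = 0 is symbolized by UP, and DF = 1 by DN.
To make DF = 0, use the CLD instruction
CLD ;clear direction flag
To make DF = 1, use the STD instruction
STD ;set direction flag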
CLD and STD have no effect on the other flags.
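Since CLD and STD change only DF, a procedure that needs a particular direction can save the caller's setting and put it back afterward. The following is a minimal sketch of that idea using PUSHF and POPF; the procedure name BACKWARD is hypothetical, and the body of the procedure is left as a placeholder.

```asm
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
        CLD                 ;caller works left to right (DF = 0)
        CALL BACKWARD       ;procedure temporarily sets DF = 1
        ;DF is 0 again here
        MOV AH,4CH
        INT 21H             ;DOS exit
MAIN ENDP
;hypothetical procedure that must process a string right to left
BACKWARD PROC NEAR
        PUSHF               ;save FLAGS, including the caller's DF
        STD                 ;DF = 1: SI and DI will move toward lower addresses
        ;string instructions that run right to left would go here
        POPF                ;restore FLAGS, so DF is whatever the caller had
        RET
BACKWARD ENDP
        END MAIN
```

Restoring the whole FLAGS register is a design choice that suits procedures returning no status in the flags; a procedure that reports a result in CF or ZF would have to restore DF some other way.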
11.2 Moving a String
Suppose we have defined two strings as follows:
.DATA
STRING1 DB 'HELLO'
STRING2 DB 5 DUP (?)
and we would like to move the contents of STRING1 (the source string) into STRING2 (the destination string). This operation is needed for many string operations, such as duplicating a string or concatenating strings (attaching one string to the end of another string).
The MOVSB instruction
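MOVSB ;move string byte
copies the contents of the byte addressed by DS:SI to the byte addressed by ES:DI. The contents of the source byte are unchanged. After the byte has been moved, both SI and DI are automatically incremented if DF = 0, or decremented if DF = 1. For example, to move the first two bytes of STRING1 to STRING2, we execute the following instructions: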
MOV AX, @DATA
MOV DS, AX ;initialize DS
MOV ES, AX ;and ES
LEA SI, STRING1 ;SI points to source string
LEA DI, STRING2 ;DI points to destination string
CLD ;clear DF
MOVSB ;move first byte
MOVSB ;and second byte
See Figure 11.1.
MOVSB is the first instruction we have seen that permits a memory-memory operation. It is also the first instruction that involves the ES register.
The REP Prefix
MOVSB moves only a single byte from the source string to the destination string. To move the entire string, first initialize CX to the number N of bytes in the source string and execute
REP MOVSB
The REP prefix causes MOVSB to be executed N times. After each MOVSB, CX is decremented until it becomes 0. For example, to copy STRING1 of the preceding section into STRING2, we execute
CLD
LEA SI, STRING1
LEA DI, STRING2
MOV CX, 5 ; no. of chars in STRING1
REP MOVSB
Example 11.1 Write instructions to copy STRING1 of the preceding section into STRING2 in reverse order.
Solution: The idea is to get SI pointing to the end of STRING1, DI to the beginning of STRING2, then move characters as SI travels to the left across STRING1.
LEA SI, STRING1 + 4 ; SI pts to end of STRING1
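LEA DI, STRING2 ; DI pts to beginning of STRING2
STD ; right to left processing
MOV CX, 5
MOVE:
MOVSB ; move a byte
ADD DI, 2
LOOP MOVE
Here it is necessary to add 2 to DI after each MOVSB. Because DF = 1, MOVSB automatically decrements both SI and DI, but we want DI to move to the right; adding 2 cancels the decrement and advances DI by one byte.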
There is a word form of MOVSB. It is
MOVSW ; move string word
MOVSW moves a word from the source string to the destination string. Like MOVSB, it expects DS:SI to point to a source string word, and ES:DI to point to a destination string word. After a string word has been moved, both SI and DI are increased by 2 if DF = 0, or decreased by 2 if DF = 1.
MOVSB and MOVSW have no effect on the flags.
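The REP prefix can be attached to MOVSW in the same way as to MOVSB. The sketch below copies a five-word array; the array names SRC_ARR and DST_ARR are hypothetical. Note that CX counts words, not bytes.

```asm
.MODEL SMALL
.STACK 100H
.DATA
SRC_ARR DW  10,20,30,40,50  ;hypothetical source word array
DST_ARR DW  5 DUP (?)       ;hypothetical destination array
.CODE
MAIN PROC
        MOV AX,@DATA
        MOV DS,AX           ;DS:SI addresses the source
        MOV ES,AX           ;ES:DI addresses the destination
        LEA SI,SRC_ARR      ;SI pts to first source word
        LEA DI,DST_ARR      ;DI pts to first destination word
        CLD                 ;left to right: SI and DI go up by 2 per word
        MOV CX,5            ;number of words to move
        REP MOVSW           ;copy 5 words (10 bytes)
        MOV AH,4CH
        INT 21H             ;DOS exit
MAIN ENDP
        END MAIN
```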
Example 11.2 For the following array,
ARR DW 10, 20, 40, 50, 60, ?
write instructions to insert 30 between 20 and 40. (Assume DS and ES have been initialized to the data segment.)
Solution: The idea is to move 40, 50, and 60 forward one position in the array, then insert 30.
STD ;right to left processing
LEA SI,ARR+8h ;SI pts to 60
LEA DI,ARR+0Ah ;DI pts to ?
MOV CX,3 ;3 elts to move
REP MOVSW ;move 40,50,60
MOV WORD PTR [DI],30 ;insert 30
Note: the PTR operator was introduced in section 10.2.3.
11.3 Store String
The STOSB instruction
STOSB ; store string byte
moves the contents of the AL register to the byte addressed by ES:DI. DI is incremented if DF = 0 or decremented if DF = 1. Similarly, the STOSW instruction
STOSW ; store string word
moves the contents of AX to the word at address ES:DI and updates DI by 2, according to the direction flag setting.
STOSB and STOSW have no effect on the flags.
As an example of STOSB, the following instructions will store two "A"s in STRING1:
MOV AX,@DATA
MOV ES,AX ;initialize ES
LEA DI,STRING1 ;DI points to STRING1
CLD ;process to the right
MOV AL,'A' ;AL has character to store
STOSB ;store an 'A'
STOSB ;store another one
See Figure 11.2.
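STOSB may also be combined with the REP prefix of section 11.2 to fill a whole array with one value. As a minimal sketch, the program below fills an 80-byte buffer with blanks; the name BUFFER is hypothetical.

```asm
.MODEL SMALL
.STACK 100H
.DATA
BUFFER  DB  80 DUP (?)      ;hypothetical buffer to be filled
.CODE
MAIN PROC
        MOV AX,@DATA
        MOV ES,AX           ;STOSB stores through ES:DI
        LEA DI,BUFFER       ;DI pts to start of buffer
        CLD                 ;process to the right
        MOV AL,' '          ;AL has the fill character
        MOV CX,80           ;number of bytes to store
        REP STOSB           ;store 80 blanks
        MOV AH,4CH
        INT 21H             ;DOS exit
MAIN ENDP
        END MAIN
```

Only ES needs to be initialized here, because STOSB has no memory source operand.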
INT 21h, function 1 reads a character from the keyboard into AL. By repeatedly executing this interrupt with STOSB, we can read and store a character string. In addition, the characters may be processed before storing them.
The following procedure READ_STR reads and stores characters in a string, until a carriage return is typed. The procedure is entered with the string offset address in DI. It returns the string offset in DI, and the number of characters entered in BX. If the user makes a typing mistake and hits the backspace key, the previous character is removed from the string.
This procedure is similar to DOS INT 21h, function 0Ah (see exercise 11.11).
chars_read = 0
read a char
WHILE char is not a carriage return DO
  IF char is a backspace
    THEN
      chars_read = chars_read - 1
      remove previous char from string
    ELSE
      store char in string
      chars_read = chars_read + 1
  END_IF
  read a char
END_WHILE
1: READ_STR PROC NEAR
2: ; Reads and stores a string
3: ; input: DI offset of string
4: ; output: DI offset of string
5: ; BX number of characters read
6: PUSH AX
7: PUSH DI
8: CLD ; process from left
9: XOR BX, BX ; no. of chars read
10: MOV AH, 1 ; input char function
11: INT 21H ; read a char into AL
12: WHILE1:
13: CMP AL,0DH ;CR?
14: JE END_WHILE1 ; yes, exit
15: ; if char is backspace
16: CMP AL,8H ; backspace?
17: JNE ELSE1 ; no, store in string
18: ; then
19: DEC DI ; yes, move string ptr back
20: DEC BX ; decrement char counter
21: JMP READ ;and go to read another char
22: ELSE1:
23: STOSB ;store char in string
24: INC BX ;increment char count
25: READ:
26: INT 21H ;read a char into AL
27: JMP WHILE1 ;and continue loop
28: END_WHILE1:
29: POP DI
30: POP AX
31: RET
32: READ_STR ENDP
At line 23, the procedure uses STOSB to store input characters in the string. STOSB automatically increments DI; at line 24, the character count in BX is incremented.
The procedure takes into account the possibility of typing errors. If the user hits the backspace key, then at line 19 the procedure decrements DI and BX. The backspace itself is not stored. When the next legitimate character is read, it replaces the wrong one in the string. Note: if the last characters typed before the carriage return are backspaces, the wrong characters will remain in the string, but the count of legitimate characters in BX will be correct.
We use READ_STR for string input in the following sections.
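Because READ_STR returns the string offset in DI and the character count in BX, the end of the input is at offset DI + BX. One possible use, sketched below, is to store a '$' there so that the string can be echoed with INT 21h, function 9. The buffer name INBUF is hypothetical, and the sketch assumes the user types at most 80 characters, since READ_STR itself does no length check.

```asm
.MODEL SMALL
.STACK 100H
.DATA
INBUF   DB  81 DUP (?)      ;hypothetical buffer: 80 chars plus '$'
CRLF    DB  0DH,0AH,'$'
.CODE
MAIN PROC
        MOV AX,@DATA
        MOV DS,AX
        MOV ES,AX           ;READ_STR stores through ES:DI
        LEA DI,INBUF        ;DI pts to buffer
        CALL READ_STR       ;DI = offset of INBUF, BX = no. of chars
        MOV BYTE PTR [BX+DI],'$' ;mark the end for function 9
        LEA DX,CRLF
        MOV AH,9
        INT 21H             ;go to a new line
        MOV DX,DI           ;DX = offset of INBUF
        MOV AH,9
        INT 21H             ;display the string just read
        MOV AH,4CH
        INT 21H             ;DOS exit
MAIN ENDP
;READ_STR goes here
        END MAIN
```

If the typed text itself contains a dollar sign, function 9 stops there, so this approach suits only input without that character.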
11.4 Load String
The instruction
LODSB ;load string byte
moves the byte addressed by DS:SI into AL. SI is then incremented if DF = 0 or decremented if DF = 1. The word form is
LODSW ;load string word
It moves the word addressed by DS:SI into AX; SI is increased by 2 if DF = 0 or decreased by 2 if DF = 1.
LODSB can be used to examine the characters of a string, as shown later.
LODSB and LODSW have no effect on the flags.
To illustrate LODSB, suppose STRING1 is defined as
STRING1 DB 'ABC'
The following code successively loads the first and second bytes of STRING1 into AL
MOV AX, @DATA
MOV DS, AX ;initialize DS
LEA SI, STRING1 ;SI points to STRING1
CLD ;process left to right
LODSB ;load first byte into AL
LODSB ;load second byte into AL
See Figure 11.3.
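LODSB and STOSB are often paired: LODSB fetches a character, the program examines or changes it, and STOSB writes it back. The following sketch converts the small letters of a string to capitals in place. The string MIXED and its length 12 are hypothetical values chosen for illustration.

```asm
.MODEL SMALL
.STACK 100H
.DATA
MIXED   DB  'Hello, World'  ;hypothetical 12-character string
.CODE
MAIN PROC
        MOV AX,@DATA
        MOV DS,AX           ;LODSB reads through DS:SI
        MOV ES,AX           ;STOSB writes through ES:DI
        LEA SI,MIXED        ;SI reads the string
        MOV DI,SI           ;DI writes back over the same bytes
        CLD                 ;process left to right
        MOV CX,12           ;number of characters
NEXT_CHAR:
        LODSB               ;AL = next char, SI advances
        CMP AL,'a'
        JB  PUT_CHAR        ;below 'a': not a small letter
        CMP AL,'z'
        JA  PUT_CHAR        ;above 'z': not a small letter
        SUB AL,20H          ;convert small letter to capital
PUT_CHAR:
        STOSB               ;write char back, DI advances
        LOOP NEXT_CHAR
        MOV AH,4CH
        INT 21H             ;DOS exit
MAIN ENDP
        END MAIN
```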
The following procedure DISP_STR displays the string pointed to by SI, with the number of characters in BX. It can be used to display all or part of a string.
FOR count times DO /* count = no. of characters to display */
load a string character into AL
move it to DL
output character
END_FOR
DISP_STR PROC NEAR
;displays a string
;input: SI = offset of string
; BX = no. of chars. to display
;output: none
PUSH AX
PUSH BX
PUSH CX
PUSH DX
PUSH SI
MOV CX,BX ;no. of chars
JCXZ P_EXIT ;exit if none
CLD ;process left to right
MOV AH,2 ;prepare to print
TOP:
LODSB ;char in AL
MOV DL,AL ;move it to DL
INT 21H ;print char
LOOP TOP ;loop until done
P_EXIT:
POP SI
POP DX
POP CX
POP BX
POP AX
RET
DISP_STR ENDP
To demonstrate READ_STR and DISP_STR, we'll write a program that reads a string (up to 80 characters) and displays the first 10 characters on the next line.
Program Listing PGM11_3.ASM
TITLE PGM11_3: TEST READ_STR and DISP_STR
.MODEL SMALL
.STACK
.DATA
STRING DB 80 DUP (0)
CRLF DB 0DH,0AH,'$'
.CODE
MAIN PROC
MOV AX,@DATA
MOV DS,AX
MOV ES,AX
;read a string
LEA DI,STRING ;DI pts to string
CALL READ_STR ;BX = no. of chars read
;go to a new line
LEA DX,CRLF
MOV AH,9
INT 21H
;print string
LEA SI,STRING ;SI pts to string
MOV BX,10 ;display 10 chars
CALL DISP_STR
;dos exit
MOV AH,4CH
INT 21H
MAIN ENDP
;READ_STR goes here
;DISP_STR goes here
END MAIN
Sample execution:
C>PGM11_3
THIS PROGRAM TESTS TWO PROCEDURES
THIS PROGR
11.5 Scan String
The instruction
SCASB ; scan string byte
can be used to examine a string for a target byte. The target byte is contained in AL. SCASB subtracts the string byte pointed to by ES:DI from the contents of AL and uses the result to set the flags. The result is not stored. Afterward, DI is incremented if DF = 0 or decremented if DF = 1.
The word form is
SCASW ; scan string word
In this case, the target word is in AX. SCASW subtracts the word addressed by ES:DI from AX and sets the flags. DI is increased by 2 if DF = 0 or decreased by 2 if DF = 1.
All the status flags are affected by SCASB and SCASW.
For example, if the string
STRING1 DB 'ABC'
is defined, then these instructions examine the first two bytes of STRING1, looking for "B":
MOV AX, @DATA
MOV ES, AX ; initialize ES
CLD ; left to right processing
LEA DI, STRING1 ; DI pts to STRING1
MOV AL, 'B' ; target character
SCASB ; scan first byte
SCASB ; scan second byte
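See Figure 11.4. Note that when the target "B" was found, ZF = 1 and, because SCASB automatically increments DI, DI points to the byte after the target, not to the target itself.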
In looking for a target byte in a string, the string is traversed until the byte is found or the string ends. If CX is initialized to the number of bytes in the string,
REPNE SCASB ; repeat SCASB while not equal (to target)
will repeatedly subtract each string byte from AL, update DI, and decrement CX until there is a zero result (the target is found) or CX = 0 (the string ends). Note: REPNZ (repeat while not zero) generates the same machine code as REPNE.
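The sketch below shows one way to use REPNE SCASB to locate a character and then make DI point to it. It searches a five-byte string for "D" and displays "Y" if the character is found or "N" if it is not. The string contents, the target, and the labels NOT_FOUND and DISPLAY are hypothetical.

```asm
.MODEL SMALL
.STACK 100H
.DATA
STRING1 DB  'ABCDE'         ;string to be searched
.CODE
MAIN PROC
        MOV AX,@DATA
        MOV DS,AX
        MOV ES,AX           ;SCASB scans through ES:DI
        LEA DI,STRING1      ;DI pts to STRING1
        CLD                 ;left to right processing
        MOV AL,'D'          ;target character
        MOV CX,5            ;number of bytes in STRING1
        REPNE SCASB         ;scan until target found or CX = 0
        JNE NOT_FOUND       ;ZF = 0 means no byte matched
        DEC DI              ;DI stopped one past the match; back up to it
        MOV DL,'Y'
        JMP DISPLAY
NOT_FOUND:
        MOV DL,'N'
DISPLAY:
        MOV AH,2
        INT 21H             ;display 'Y' or 'N'
        MOV AH,4CH
        INT 21H             ;DOS exit
MAIN ENDP
        END MAIN
```

The test after the scan uses ZF rather than CX, because CX is also 0 when the match occurs at the last byte of the string.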
As an example, let's write a program to count the number of vowels and consonants in a string.
Initialize vowel_count and consonant_count to 0.
Read and store a string.
REPEAT
  Load a string character.
  IF it's a vowel
    THEN
      increment vowel_count
    ELSE IF it's a consonant
      THEN increment consonant_count
  END_IF
UNTIL end of string.
Display no. of vowels.
Display no. of consonants.
We'll use procedure READ_STR (section 11.3) to read the string. It returns with: DI pointing to the string and BX containing the number of characters read. To display the number of vowels and consonants in the string, we'll use procedure OUTDEC of Chapter 9. It displays the contents of AX as a signed decimal integer. For simplicity, we'll suppose the input is in upper case.
0: TITLE PGM 11_4: COUNT VOWELS AND CONSONANTS
1: .MODEL SMALL
2: .STACK 100H
3: .DATA
4: STRING DB 80 DUP (0)
5: VOWELS DB 'AEIOU'
6: CONSONANTS DB 'BCDFGHJKLMNPQRSTVWXYZ'
7: OUT1 DB 0DH, 0AH, 'vowels = $'
8: OUT2 DB ', consonants = $'
9: VOWELCT DW 0
10: CONSCT DW 0
11: .CODE
12: MAIN PROC
13: MOV AX, @DATA
14: MOV DS, AX ; initialize DS
15: MOV ES, AX ; and ES
16: LEA DI, STRING ; DI pts to string
17: CALL READ_STR ; BX = no. of chars read
18: MOV SI, DI ; SI pts to string
19: CLD ; left to right processing
20: REPEAT:
21: ; load a string character
22: LODSB ; char in AL
23: ; if it's a vowel
24: LEA DI, VOWELS ; DI pts to vowels
25: MOV CX, 5 ; 5 vowels
26: REPNE SCASB ; is char a vowel?
27: JNE CK_CONST ; no other char
28: ; then increment vowel count
29: INC VOWELCT
30: JMP UNTIL
31: ; else if it's a consonant
32: CK_CONST:
33: LEA DI, CONSONANTS ; DI pts to consonants
34: MOV CX, 21 ; 21 consonants
35: REPNE SCASB ; is char a consonant?
36: JNE UNTIL ; no.
37: ; then increment consonant count
38: INC CONSCT
39: UNTIL:
40: DEC BX ; BX has no. chars left in str
41: JNE REPEAT ; loop if chars left
42: ; output no. of vowels
43: MOV AH, 9 ; prepare to print
44: LEA DX, OUT1 ; get vowel message
45: INT 21H ; print it
46: MOV AX, VOWELCT ; get vowel count
47: CALL OUTDEC ; print it
48: ; output no. of consonants
49: MOV AH, 9 ; prepare to print
50: LEA DX, OUT2 ; get consonant message
51: INT 21H ; print it
52: MOV AX, CONSCT ; get consonant count
53: CALL OUTDEC ; print it
54: ; dos exit
55: MOV AH, 4CH
56: INT 21H
57: MAIN ENDP
58: ; READ_STR goes here
59: ; OUTDEC goes here
60: END MAIN
Because the program uses both LODSB, which loads the byte in DS:SI, and SCASB, which scans the byte in ES:DI, both DS and ES must be initialized. BX is used as a loop counter and is set to the number of bytes in the string (CX is used elsewhere in the program).
Line 22. LODSB puts a string character in AL and advances SI to the next one.
Line 26. To see if the character in AL is a vowel, the program scans the string VOWELS by executing REPNE SCASB. This instruction subtracts each byte of VOWELS from AL and sets the flags. The instruction returns ZF = 1 if a byte of VOWELS matches AL (the character is a vowel); if no byte matches, it ends with ZF = 0 and CX = 0.
Line 35. If the target was not a vowel, the program scans the string CONSONANTS, in exactly the same way it scanned VOWELS.
Sample execution:
C>PGM11_4
A,E,I,O,U ARE VOWELS.
vowels = 9, consonants = 5
11.6 Compare String
The instruction
CMPSB ; compare string byte
subtracts the byte with address ES:DI from the byte with address DS:SI, and sets the flags. The result is not stored. Afterward, both SI and DI are incremented if DF = 0, or decremented if DF = 1.
The word version of CMPSB is
CMPSW ; compare string word
It subtracts the word with address ES:DI from the word whose address is DS:SI, and sets the flags. If DF = 0, SI and DI are increased by 2; if DF = 1, they are decreased by 2.
All the status flags are affected by CMPSB and CMPSW.
For example, suppose
.DATA
STRING1 DB
STRING2 DB
The following instructions compare the first two bytes of the preceding strings:
MOV AX,@DATA
MOV DS,AX ;initialize DS
MOV ES,AX ;and ES
CLD ;left to right processing
LEA SI,STRING1 ;SI pts to STRING1
LEA DI,STRING2 ;DI pts to STRING2
CMPSB ;compare first bytes
CMPSB ;compare second bytes
See Figure 11.5.
String comparison may be done by attaching the prefix REPE (repeat while equal) or REPZ (repeat while zero) to CMPSB or CMPSW. CX is initialized to the number of bytes in the shorter string, then
REPE CMPSB ; compare string bytes while equal
REPE CMPSW ; compare string words while equal
repeatedly executes CMPSB or CMPSW and decrements CX until (1) there is a mismatch between corresponding string bytes or words, or (2) CX = 0. The flags are set according to the result of the last comparison.
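When REPE CMPSB stops on a mismatch, SI and DI have already moved one byte past the bytes that differ. The sketch below uses this fact to display the first character of one string that differs from the other, or "=" if the strings are identical. The strings STR_A and STR_B and the labels are hypothetical.

```asm
.MODEL SMALL
.STACK 100H
.DATA
STR_A   DB  'ABCDE'         ;hypothetical first string
STR_B   DB  'ABXDE'         ;hypothetical second string, same length
.CODE
MAIN PROC
        MOV AX,@DATA
        MOV DS,AX           ;DS:SI addresses STR_A
        MOV ES,AX           ;ES:DI addresses STR_B
        LEA SI,STR_A
        LEA DI,STR_B
        CLD                 ;left to right processing
        MOV CX,5            ;length of the strings
        REPE CMPSB          ;compare while bytes are equal
        JE  ALL_SAME        ;ZF = 1: all 5 bytes matched
        DEC SI              ;back up to the byte that differed
        MOV DL,[SI]         ;DL = mismatching byte of STR_A ('C')
        JMP DISPLAY
ALL_SAME:
        MOV DL,'='
DISPLAY:
        MOV AH,2
        INT 21H             ;display the result
        MOV AH,4CH
        INT 21H             ;DOS exit
MAIN ENDP
        END MAIN
```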
CMPSB may be used to compare two character strings to see which comes first alphabetically, or if they are identical, or if one string is a substring of the other (this means that one string is contained within the other as a sequence of consecutive characters).
As an example, suppose STR1 and STR2 are strings of length 10. The following instructions put 0 in AX if the strings are identical, put 1 in AX if STR1 comes first alphabetically, or put 2 in AX if STR2 comes first alphabetically (assume DS and ES are initialized).
MOV CX,10 ;length of strings
LEA SI,STR1 ;SI points to STR1
LEA DI,STR2 ;DI points to STR2
CLD ;left to right processing
REPE CMPSB ;compare string bytes
JL STR1_FIRST ;STR1 precedes STR2
JG STR2_FIRST ;STR2 precedes STR1
;here if strings are identical
MOV AX,0 ;put 0 in AX
JMP EXIT ;and exit
;here if STR1 precedes STR2
STR1_FIRST:
MOV AX,1 ;put 1 in AX
JMP EXIT ;and exit
;here if STR2 precedes STR1
STR2_FIRST:
MOV AX,2 ;put 2 in AX
EXIT:
There are several ways to determine whether one string is a substring of another. The following way is probably the simplest. Suppose we declare
SUB1 DB 'ABC'
SUB2 DB 'CAB'
MAINST DB 'ABABCA'
and we want to see whether SUB1 and SUB2 are substrings of MAINST.
Let's begin with SUB1. We can compare corresponding characters in the strings
SUB1    A B C
MAINST  A B A B C A
Because there is a mismatch at the third comparison, we backtrack and try to match SUB1 with the part of MAINST from position MAINST+1 on:
SUB1      A B C
MAINST  A B A B C A
There is a mismatch immediately, so we begin again at position MAINST+2:
SUB1        A B C
MAINST  A B A B C A
This time we are successful; SUB1 is a substring of MAINST.
Now let's try with SUB2. The search proceeds as before until we reach position MAINST+3:
SUB2          C A B
MAINST  A B A B C A
There is a mismatch, and there is no need to proceed further, for if we did we would be trying to match the three characters of SUB2 with the two remaining characters "CA" of MAINST. Thus SUB2 is not a substring of MAINST.
Actually, we could have predicted the last place to search. It is
STOP = MAINST + length of MAINST - length of SUB2
Here is an algorithm and a program that searches a main string MAINST for a substring SUBST.
Prompt user to enter SUBST
Read SUBST
Prompt user to enter MAINST
Read MAINST
IF (length of MAINST is 0) OR (length of SUBST is 0) OR (SUBST is longer than MAINST)
  THEN
    SUBST is not a substring of MAINST
  ELSE
    compute STOP
    START = offset of MAINST
    REPEAT
      compare corresponding characters in MAINST
      (from START on) and SUBST
      IF all characters match
        THEN
          SUBST found in MAINST
        ELSE
          START = START + 1
      END_IF
    UNTIL (SUBST found in MAINST)
      OR (START > STOP)
END_IF
Display results
After reading SUBST and MAINST, and verifying that neither string is null and SUBST is not longer than MAINST, in lines 44-50 the program computes STOP (the place in MAINST to stop searching), and initializes START (the place to start searching) to the beginning of MAINST.
1: TITLE PGM11_5: SUBSTRING DEMONSTRATION
2: .MODEL SMALL
3: .STACK 100H
4: .DATA
5: MSG1 DB 'ENTER SUBST',0DH,0AH,'$'
6: MSG2 DB 0DH,0AH,'ENTER MAINST',0DH,0AH,'$'
7: MAINST DB 80 DUP (0)
8: SUBST DB 80 DUP (0)
9: STOP DW ? ;last place to begin search
10: START DW ? ;place to resume search
11: SUB_LEN DW ? ;substring length
12: YESMSG DB 0DH,0AH,'SUBST IS A SUBSTRING OF MAINST$'
13: NOMSG DB 0DH,0AH,'SUBST IS NOT A SUBSTRING OF MAINST$'
14: .CODE
15: MAIN PROC
16: MOV AX,@DATA
17: MOV DS,AX
18: MOV ES,AX
19: ;prompt for SUBST
20: MOV AH,9 ;print string fcn
21: LEA DX,MSG1 ;substring prompt
22: INT 21H ;prompt for SUBST
23: ;read SUBST
24: LEA DI,SUBST
25: CALL READ_STR ;BX has SUBST length
26: MOV SUB_LEN,BX ;save in SUB_LEN
27: ;prompt for MAINST
28: LEA DX,MSG2 ;main string prompt
29: INT 21H ;prompt for MAINST
30: ;read MAINST
31: LEA DI,MAINST
32: CALL READ_STR ;BX has MAINST length
33: ;see if string null or SUBST longer than MAINST
34: OR BX,BX ;MAINST null?
35: JE NO ;yes, SUBST not a substring
36: CMP SUB_LEN,0 ;SUBST null?
37: JE NO ;yes, SUBST not a substring
38: CMP SUB_LEN,BX ;substring > main string?
39: JG NO ;yes, SUBST not a substring
40: ;see if SUBST is a substring of MAINST
41: LEA SI,SUBST ;SI pts to SUBST
42: LEA DI,MAINST ;DI pts to MAINST
43: CLD ;left to right processing
44: ;compute STOP
45: MOV STOP,DI ;STOP has MAINST address
46: ADD STOP,BX ;add MAINST length
47: MOV CX,SUB_LEN
48: SUB STOP,CX ;subtract SUBST length
49: ;initialize start
50: MOV START,DI ;place to start search
51: REPEAT:
52: ;compare characters
53: MOV CX,SUB_LEN ;length of substring
54: MOV DI,START ;reset DI
55: LEA SI,SUBST ;reset SI
56: REPE CMPSB ;compare characters
57: JE YES ;SUBST found
58: ;substring not found yet
59: INC START ;update START
60: ;see if start <= stop
61: MOV AX, START
62: CMP AX, STOP ;START <= STOP?
63: JNLE NO ;no, exit
64: JMP REPEAT ;keep going
65: ;display results
66: YES:
67: LEA DX, YESMSG
68: JMP DISPLAY
69: NO:
70: LEA DX, NOMSG
71: DISPLAY:
72: MOV AH, 9
73: INT 21H ;display results
74: ;DOS exit
75: MOV AH, 4CH
76: INT 21H
77: MAIN ENDP
78: ;READ_STR goes here
79: END MAIN
At line 51, the program enters a REPEAT loop where the characters of SUBST are compared with the part of MAINST from START on. In lines 53-56, CX is set to the length of SUBST, SI is pointed to SUBST, DI is pointed to START, and corresponding characters are compared with REPE CMPSB. If ZF = 1, then the match is successful and the program jumps to line 66 where the message "SUBST is a substring of MAINST" is displayed. If ZF = 0, there was a mismatch between characters and START is incremented at line 59. The search continues until SUBST matches part of MAINST or START > STOP; in the latter case, the message "SUBST is not a substring of MAINST" is displayed.
Sample executions:
C>PGM11_5
ENTER SUBST
ABC
ENTER MAINST
XYZABABC
SUBST IS A SUBSTRING OF MAINST
C>PGM11_5
ENTER SUBST
ABD
ENTER MAINST
ABACADACD
SUBST IS NOT A SUBSTRING OF MAINST
Let us summarize the byte and word forms of the string instructions:
Instruction       Destination   Source     Byte form   Word form
Move string       ES:DI         DS:SI      MOVSB       MOVSW
Compare string*   ES:DI         DS:SI      CMPSB       CMPSW
Store string      ES:DI         AL or AX   STOSB       STOSW
Load string       AL or AX      DS:SI      LODSB       LODSW
Scan string*      ES:DI         AL or AX   SCASB       SCASW
* Result not stored.
The operands of these instructions are implicit; that is, they are not part of the instructions themselves. However, there are forms of the string instructions in which the operands appear explicitly. They are as follows:
General form                              Example
MOVS destination_string, source_string    MOVSB
CMPS destination_string, source_string    CMPSB
STOS destination_string                   STOS STRING2
LODS source_string                        LODS STRING1
SCAS destination_string                   SCAS STRING2
When the assembler encounters one of these general forms, it checks to see if (1) the source string is in the segment addressed by DS and the destination string is in the segment addressed by ES, and (2) in the case of MOVS and CMPS, if the strings are of the same type; that is, both byte strings or word strings. If so, then the instruction is coded as either a byte form, such as MOVSB, or a word form, such as MOVSW, to match the data declaration of the string. For example, suppose that DS and ES address the following segment:
.DATA
STRING1 DB 'ABCDE'
STRING2 DB 'EFGH'
STRING3 DB 'IJKL'
STRING4 DB 'MNOP'
STRING5 DW 1,2,3,4,5
STRING6 DW 7,8,9
Then the following pairs of instructions are equivalent
MOVS STRING2,STRING1    MOVSB
MOVS STRING6,STRING5    MOVSW
LODS STRING4            LODSB
LODS STRING5            LODSW
SCAS STRING1            SCASB
STOS STRING6            STOSW
It is important to note that if the general forms are used, it is still necessary to make DS:SI and ES:DI point to the source and destination strings, respectively.
There are advantages and disadvantages in using the general forms of the string instructions. An advantage is that because the operands appear as part of the code, program documentation is improved. A disadvantage is that only by checking the data definitions is it possible to tell whether a general string instruction is a byte form or a word form. In fact, the operands specified in a general string instruction may not be the actual operands used when the instruction is executed! For example, consider the following code:
LEA SI, STRING1 ;SI PTS TO STRING1
LEA DI, STRING2 ;DI PTS TO STRING2
MOVS STRING4, STRING3
Even though the specified source and destination operands are STRING3 and STRING4, respectively, when MOVS is executed the first byte of STRING1 is moved to the first byte of STRING2. This is because the assembler translates MOVS STRING4, STRING3 into the machine code for MOVSB, and SI and DI are pointing to the first bytes of STRING1 and STRING2, respectively.
-
The string instructions are a special group of array-processing instructions.
-
The setting of the direction flag (DF) determines the direction that string operations will proceed. If DF = 0, they proceed left to right across a string; if DF = 1, they proceed right to left. CLD makes DF = 0 and STD makes it 1.
-
MOVS B moves the string byte pointed to by DS:SI into the byte pointed to by ES:DI, and SI and DI to be updated according to DF. MOVS W is the word form. These instructions may be used with the prefix REP, which causes the instruction to be repeated CX times.
-
REPE and REPNE are conditional prefixes that may be used with string instructions. REPE causes the string instruction that follows to be repeated CX times as long as ZF = 1. REPNE causes the following string instruction to be repeated CX times as long as ZF = 0. REPZ and REPNZ are alternate names for REPE and REPNE, respectively.
-
STOSB moves AL to the byte addressed by ES:DI, and updates DI according to DF. STOSW is the word form. STOSB may be used to read a character string into an array.
-
LODSB moves the byte addressed by DS:SI into AL, and updates SI according to DF. LODSW is the word form. LODSB may be used to examine the contents of a character string.
-
SCASB subtracts the byte pointed to by ES:DI from AL and uses the result to set the flags. The result is not stored, and DI is updated according to DF. SCASW is the word form; it subtracts the word pointed to by ES:DI from AX, sets the flags, and updates DI. The result is not stored. These instructions may be used to scan a string for a target byte or word in AL or AX.
-
CMPSB subtracts the byte pointed to by ES:DI from the byte pointed to by DS:SI, sets the flags, and updates both SI and DI according to DF. The result is not stored. The word form is CMPSW. These instructions may be used to compare character strings alphabetically, to see if two strings are identical, or if one string is a substring of another.
The string instructions have general forms in which the operands are explicit. The assembler uses the operands only to decide whether to code the instructions in byte or word form.
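The register updates described in this summary can be traced with a small model. The following Python sketch is only an illustration under stated assumptions: memory is one flat byte array (segment registers are ignored), registers are plain integers that do not wrap at 16 bits, and the function names are hypothetical.

```python
# Toy model of MOVSB, REP MOVSB, and REPNE SCASB.
# Assumptions: flat memory, no segments, no 16-bit wraparound.

def movsb(mem, si, di, df=0):
    """Copy one byte from mem[si] to mem[di]; return the updated (si, di)."""
    mem[di] = mem[si]
    step = -1 if df else 1          # DF = 0: addresses increase; DF = 1: decrease
    return si + step, di + step

def rep_movsb(mem, si, di, cx, df=0):
    """Repeat MOVSB CX times, as the REP prefix does; return (si, di, cx)."""
    while cx != 0:
        si, di = movsb(mem, si, di, df)
        cx -= 1
    return si, di, cx

def repne_scasb(mem, di, cx, al, df=0):
    """Scan for the byte in AL; stop when it is found (ZF = 1) or CX reaches 0.

    ZF is modeled as 0 when CX is 0 on entry; that is a simplification.
    """
    zf = 0
    step = -1 if df else 1
    while cx != 0:
        zf = 1 if mem[di] == al else 0   # the comparison sets ZF
        di += step                       # DI is updated even on a match
        cx -= 1
        if zf:
            break
    return di, cx, zf

mem = bytearray(b"HELLO" + bytes(5))     # source at 0..4, destination at 5..9
si, di, cx = rep_movsb(mem, 0, 5, 5)     # forward copy with DF = 0
print(mem[5:10].decode(), si, di, cx)
print(repne_scasb(mem, 0, 5, ord("L")))
```

Note that after a successful scan DI has already moved past the matching byte, which is why programs that use SCASB usually adjust DI before using it as the address of the match.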
(memory) string A byte or word array
CLD LODSW SCASW CMPS MOVS STD CMPSB MOVSB STOS CMPSW MOVSW STOSB LODS SCAS STOSW LODSB SCASB
REP REPNE REPZ
REPE REPNZ
- Suppose
SI contains 100h Byte 100h contains 10h
DI contains 200h Byte 101h contains 15h
AX contains 4142h Byte 200h contains 20h
DF = 0 Byte 201h contains 25h
Give the source, destination, and value moved for each of the following instructions. Also give the new contents of SI and DI.
a. MOVSB
b. MOVSW
c. STOSB
d. STOSW
e. LODSB
f. LODSW
- Suppose the following declarations have been made:
STRING1 DB "FGHIJ"
STRING2 DB "ABCDE"
DB 5 DUP (?)
Write instructions to move STRING1 to the end of STRING2, producing the string "ABCDEFGHIJ".
- Write instructions to exchange STRING1 and STRING2 in exercise 2. You may use the five bytes after STRING2 for temporary storage.
- An ASCIIZ string is a string that ends with a 0 byte; for example,
STR DB 'THIS IS AN ASCIIZ STRING',0
Write a procedure LENGTH that receives the address of an ASCIIZ string in DX, and returns its length in CX.
- Use the addressing modes of Chapter 10 to write instructions equivalent to each of the following string instructions. Assume where necessary that SI already has the offset address of the source string, DI has the offset address of the destination string, and DF = 0. You may use AL for temporary storage. For SCASB and CMPSB the flags should reflect the result of the comparison.
a. MOVSB b. STOSB c. LODSB d. SCASB e. CMPSB
- Suppose the following string has been declared:
STRING DB 'TH*S* G**S* AR* B*ASTS'
Write instructions that will cause each "*" to be replaced by "E".
- Suppose the following string has been declared:
STRING1 Ls I I S A T 3 STRING2 DE 12 DUP (?)
Write some code that will cause STRING1 to be copied into STRING2 with the blank characters removed.
- A palindrome is a character string that reads the same forward or backward. In deciding if a string is a palindrome, we ignore blanks, punctuation, and letter case. For example "Madam, I'm Adam" or "A man, a plan, a canal, Panama!"
Write a program that (a) lets the user input a string, (b) prints it forward and backward without punctuation and blanks on successive lines, and (c) decides whether it is a palindrome and prints the conclusion.
- In spreadsheet applications, it is useful to display numbers right-justified in fixed fields. For example, these numbers are right-justified in a field of 10 characters:
1345 2342545 56
Write a program to read ten numbers of up to 10 digits each, and display them as above.
- A character string STRING1 precedes another string STRING2 alphabetically if (a) the first character of STRING1 comes before the first character of STRING2 alphabetically, or (b) the first N - 1 characters of the strings are identical, but the Nth character of STRING1 precedes the Nth character of STRING2, or (c) STRING1 matches the beginning of STRING2, but STRING2 is longer.
Write a program that lets the user enter two character strings on separate lines, and decides which string comes first alphabetically, or if the strings are identical.
- INT 21h, function 0Ah, can be used to read a character string. The first byte of the array to hold the string (the string buffer) must be initialized to the maximum number of characters expected. After execution of INT 21h, the second byte contains the actual number of characters read. Input ends with a carriage return, which is stored but not included in the character count. If the user enters more than the expected number of characters, the computer beeps.
Write a program that prints a "?", reads a character string of up to 20 characters using INT 21h, function 0Ah, and prints the string on the next line. Set up the string buffer like this:
STRING LABEL BYTE
MAX_LEN DB 20 ;maximum no. of chars expected
ACT_LEN DB ? ;actual no. of chars read
CHARS DB 21 DUP (?) ;20 bytes for string, extra byte for carriage return
- Write a procedure INSERT that will insert a string STRING1 into a string STRING2 at a specified point.
Input
SI offset address of STRING1
DI offset address of STRING2
BX length of STRING1
CX length of STRING2
AX offset address at which to insert STRING1
Output
DI offset address of new string
BX length of new string
The procedure may assume that neither string has 0 length, and that the address in AX is within STRING2.
Write a program that inputs two strings STRING1 and STRING2, a nonnegative decimal integer N,
- Write a procedure DELETE that will remove N bytes from a string at a specified point and close the gap.
Input
DI offset address of string
BX length of string
CX number of bytes N to be removed
SI offset address within string at which to remove bytes
Output
DI offset address of new string
BX length of new string
The procedure may assume that the string has nonzero length, the number of bytes to be removed is not greater than the length of the string, and that the address in SI is within the string.
Write a program that reads a string STRING, a decimal integer S that represents a position in STRING, a decimal integer N that represents the number of bytes to be removed (both integers between 0 and 80), calls DELETE to remove N bytes at position S, and prints the resulting string. You may assume 0 ≤ N ≤ L - S, where L = length of STRING.
Part Two
One of the most interesting and useful applications of assembly language is in controlling the monitor display. In this chapter, we program such operations as moving the cursor, scrolling windows on the screen, and displaying characters with various attributes. We also show how to program the keyboard, so that if the user presses a key, a screen control function is performed; for example, we'll show how to make the arrow keys operate.
The display on the screen is determined by data stored in memory. The chapter begins with a discussion of how the display is generated and how it can be controlled by altering the display memory directly. Next, we'll show how to do screen operations by using BIOS function calls. These functions can also be used to detect keys being pressed; as a demonstration, we'll write a simple screen editor.
A computer monitor operates on the same principle as a TV set. An electron gun is used to shoot a stream of electrons at a phosphor screen, creating a bright spot. Lines are generated by sweeping the stream across the screen; dots are created by turning the beam on and off as it moves.
A raster of lines is created by starting the beam at the top left corner, sweeping it to the right, then turning it off and repositioning it at the beginning of the next line. This process is repeated until the last line has been traced, at which point the beam is repositioned at the top left corner and the process is repeated.
There are two kinds of monitors: monochrome and color. A monochrome monitor uses a single electron beam and the screen shows only one color, typically amber or green. By varying the intensity of the electron beam, dots of different brightness can be created; this is called a gray scale.
For a color monitor, the screen is coated with three kinds of phosphors capable of displaying the three primary colors of red, green, and blue. Three electron beams are used in writing dots on the screen; each one is used to display a different color. Varying the intensity of the electron beams produces different intensities of red, green, and blue dots. Because the red, green, and blue dots are very close together, the human eye detects a single homogeneous color spot. This is what makes the monitor show different colors.
The display on the monitor is controlled by a circuit in the computer called a video adapter. This circuit, which is usually on an add-in card, has two basic units: a display memory (also called a video buffer) and a video controller.
The display memory stores the information to be displayed. It can be accessed by both the CPU and the video controller. The memory address starts at segment A000h and above, depending on the particular video adapter.
The video controller reads the display memory and generates appropriate video signals for the monitor. For color display, the adapter can either generate three separate signals for red, green, and blue, or can generate a composite output when the three signals are combined. A composite monitor uses the composite output, and an RGB monitor uses the separate signals. The composite output contains a color burst signal, and when this signal is turned off, the monitor displays in black and white.
We commonly see both text and picture images displayed on the monitor. The computer has different techniques and memory requirements for displaying text and picture graphics, so the adapters have two display modes: text and graphics. In text mode, the screen is divided into columns and rows, typically 80 columns by 25 rows, and a character is displayed at each screen position. In graphics mode, the screen is again divided into columns and rows, and each screen position is called a pixel. A picture can be displayed by specifying the color of each pixel on the screen. In this chapter we concentrate on text mode; graphics mode is covered in Chapter 16.
Table 12.1 Video Adapters
| Mnemonic | Stands For |
| MDA | Monochrome Display Adapter |
| CGA | Color Graphics Adapter |
| EGA | Enhanced Graphics Adapter |
| MCGA | Multi-color Graphics Array |
| VGA | Video Graphics Array |
Let's take a closer look at character generation in text mode. A character on the screen is created from a dot array called a character cell. The adapter uses a character generator circuit to create the dot patterns. The number of dots in a cell depends on the resolution of the adapter, which refers to the number of dots it can generate on the screen. The monitor also has its own resolution, and it is important that the monitor be compatible with the video adapter.
Table 12.1 lists the video adapters for the IBM PC. They differ in resolution and the number of colors that can be displayed.
IBM introduced two adapters with the original PC, the MDA (Monochrome Display Adapter) and CGA (Color Graphics Adapter). The MDA can only display text and was intended for business software, such as word processors and spread sheets, which at that time did not use graphics. It has good resolution, with each character cell being
In 1984 IBM introduced the EGA (Enhanced Graphics Adapter), which has good resolution and color graphics. The character cell is
In 1987 IBM introduced the PS/2 models, which are equipped with the VGA (Video Graphics Array) and MCGA (Multi-color Graphics Array) adapters. These adapters have better resolution and can display more colors in graphics mode than EGA. The character cell is
Depending on the kind of adapter present, a program can select text or graphics modes. Each mode is identified by a mode number; Table 12.2 lists the text modes for the different kinds of adapters.
Table 12.2 Video Adapter Text Modes
| Mode Number | Description | Adapters |
| 0 | 40 x 25 16-color text (color burst off) | CGA,EGA,MCGA,VGA |
| 1 | 40 x 25 16-color text | CGA,EGA,MCGA,VGA |
| 2 | 80 x 25 16-color text (color burst off) | CGA,EGA,MCGA,VGA |
| 3 | 80 x 25 16-color text | CGA,EGA,MCGA,VGA |
| 7 | 80 x 25 monochrome text | MDA,EGA,VGA |
Note: For modes 0 and 2, the color burst signal is turned off for composite monitors; RGB monitors will display 16 colors.
As discussed earlier, the screen in text mode is usually divided into 80 columns by 25 rows. However, a 40-column by 25-row display is also possible for the color graphics adapters.
A position on the screen may be located by giving its (column, row) coordinates. The upper left corner has coordinate (0,0); for a
The character displayed at a screen position is specified by the contents of a word in the display memory. The low byte of the word contains the character's ASCII code; the high byte contains its attribute, which tells how the character will be displayed (its color, whether it is blinking, underlined, and so on). Actually, all 256 byte combinations have display characters (see Appendix A). Attributes are discussed later.
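The relationship between a screen position and its word in display memory can be made concrete with a little arithmetic. The Python sketch below rests on assumptions not spelled out in this passage: an 80-column text mode, a page that begins at offset 0 of the video segment, and positions stored row by row with two bytes each. The function names are hypothetical.

```python
COLUMNS = 80   # assumed 80 x 25 text mode

def cell_offset(column, row, columns=COLUMNS):
    """Byte offset, within a display page, of the word for (column, row)."""
    return 2 * (row * columns + column)

def make_cell(ascii_code, attribute):
    """Pack a character and its attribute into one display-memory word."""
    return (attribute << 8) | ascii_code    # high byte = attribute, low byte = ASCII

print(cell_offset(0, 0))             # upper left corner
print(cell_offset(79, 24))           # lower right corner
print(hex(make_cell(0x41, 0x14)))    # "A" with attribute 14h
```

Under these assumptions the last word of an 80 x 25 page lies at byte offset 3998, so one page occupies 4000 bytes.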
For the MDA, the display memory can hold one screenful of data. The graphics adapters, however, can store several screens of text data. This is because graphics display requires more memory, so the memory unit in a graphics adapter is bigger. To fully use the display memory, a graphics adapter divides its display memory into display pages. One page can hold the data for one screen. The pages are numbered, starting with 0; the number of pages available depends on the adapter and the mode selected. If more than one page is available, the program can display one page while updating another one.
Table 12.4 shows the number of display pages for the MDA, CGA, EGA, and VGA in text mode. In the
Table 12.3 Some 80 x 25 Screen Positions
| Position | Column (Decimal) | Row (Decimal) | Column (Hex) | Row (Hex) |
| Upper left corner | 0 | 0 | 0 | 0 |
| Lower left corner | 0 | 24 | 0 | 18 |
| Upper right corner | 79 | 0 | 4F | 0 |
| Lower right corner | 79 | 24 | 4F | 18 |
| Center of the screen | 39 | 12 | 27 | C |
Table 12.4 Number of Text Mode Display Pages
| Modes | CGA | EGA | VGA |
| 0-1 | 8 | 8 | 8 |
| 2-3 | 4 | 8 | 8 |
| 7 | NA | 8 | 8 |
The Active Display Page
The active display page is the page currently being displayed: for
In a display page, the high byte of the word that specifies a display character is called the attribute byte. It describes the color and intensity of the character, the background color, and whether the character is blinking and/or underlined.
The attribute byte for 16-color text display (modes 0-3) has the format shown in Figure 12.1. A 1 in a bit position selects an attribute characteristic. Bits 0-2 specify the color of the character (foreground color) and bits 4-6 give the color of the background at the character's position. For example, to display a red character on a blue background, the attribute byte should be 0001 0100 = 14h.
By adding red, blue, and green, other colors can be created. On the additive color wheel (Figure 12.2), a complementary color can be produced by adding adjacent primary colors; for example, magenta is the sum of red and blue. To display a magenta character on a cyan background, the attribute is 0011 0101 = 35h.
If the intensity bit (bit 3) is 1, the foreground color is lightened. If the blinking bit (bit 7) is 1, the character turns on and off. Table 12.5 shows the possible colors in 16-color display. All the colors can be used for the color of the character; the background can use only the basic colors.
For monochrome display, the possible colors are white and black. For white, the RGB bits are all 1; for black, they are all 0. Normal video is a white character on a black background; the attribute byte is 0000 0111 = 7h. Reverse video is a black character on a white background, so the attribute is 0111 0000 = 70h.
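The bit layout of the attribute byte lends itself to a small helper. The following Python sketch assembles an attribute from its fields, using the bit positions given above (bits 0-2 foreground, bit 3 intensity, bits 4-6 background, bit 7 blinking) and the RGB codes of Table 12.5. The function and constant names are hypothetical.

```python
# Color codes are the three RGB bits read as a number (see Table 12.5).
BLACK, BLUE, GREEN, CYAN, RED, MAGENTA, BROWN, WHITE = range(8)

def attribute(foreground, background, intense=False, blink=False):
    """Build a text-mode attribute byte from its four fields."""
    if not (0 <= foreground <= 7 and 0 <= background <= 7):
        raise ValueError("colors must be basic colors in the range 0-7")
    return (int(blink) << 7) | (background << 4) | (int(intense) << 3) | foreground

print(hex(attribute(RED, BLUE)))       # red character on a blue background
print(hex(attribute(WHITE, BLACK)))    # normal video
print(hex(attribute(BLACK, WHITE)))    # reverse video
```

Keeping intensity and blinking as separate flags mirrors the table: the foreground may use any of the 16 colors, while the background is limited to the 8 basic ones.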
Figure 12.2 Additive Color Wheel
As with color display, the intensity bit can be used to brighten a white character and the blinking bit can turn it on and off. For the monochrome adapter only, two attributes give an underlined character. They are 01h for normal underline and 09h for bright underline. Table 12.6 lists the possible monochrome attributes.
Table 12.5 Sixteen-Color Text Display
| | I R G B | Color |
| Basic Colors | 0 0 0 0 | black |
| | 0 0 0 1 | blue |
| | 0 0 1 0 | green |
| | 0 0 1 1 | cyan |
| | 0 1 0 0 | red |
| | 0 1 0 1 | magenta |
| | 0 1 1 0 | brown |
| | 0 1 1 1 | white |
| Bright Colors | 1 0 0 0 | gray |
| | 1 0 0 1 | light blue |
| | 1 0 1 0 | light green |
| | 1 0 1 1 | light cyan |
| | 1 1 0 0 | light red |
| | 1 1 0 1 | light magenta |
| | 1 1 1 0 | yellow |
| | 1 1 1 1 | intense white |
Table 12.6 Monochrome Attributes
| Attribute Byte (Binary) | Attribute Byte (Hex) | Result |
| 0000 0000 | 00 | black on black |
| 0000 0111 | 07 | normal (white on black) |
| 0000 0001 | 01 | normal underline |
| 0000 1111 | 0F | bright (intense white on black) |
| 0000 1001 | 09 | bright underline |
| 0111 0000 | 70 | reverse video (black on white) |
| 1000 0111 | 87 | normal blinking |
| 1000 1111 | 8F | bright blinking |
| 1111 1111 | FF | bright blinking |
| 1111 0000 | F0 | reverse video blinking |
To display a character with attribute at any screen position, it is only necessary to store the character and attribute at the corresponding word in the active display page. The following program fills the color screen with red "A"s on a blue background.
1: TITLE PGM12_1: SCREEN DISPLAY_1
2: .MODEL SMALL
3: .STACK 100H
4: .CODE
5: MAIN PROC
6: ;set DS to active display page
7: MOV AX,0B800h ;color active display page
8: MOV DS,AX
9: MOV CX,2000 ;80 x 25 = 2000 words
10: MOV DI,0 ;initialize DI
11: ;fill active display page
12: FILL_BUF:
13: MOV WORD PTR [DI],1441h ;red A on blue
14: ADD DI,2 ;go to next word
15: LOOP FILL_BUF ;loop until done
16: ;dos exit
17: MOV AH,4CH
18: INT 21H
19: MAIN ENDP
20: END MAIN
To display a red "A" on a blue background at a screen position, the corresponding active display page word should contain 14h in the high byte and 41h in the low byte.
The program begins by initializing DS to the video buffer segment, which is B800h for a color adapter. Loop counter CX is set to 2000—the number of words in the active display page—and DI is initialized to 0. At line 13, the program enters a loop that moves 1441h into each word of the video buffer.
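To see what the fill loop leaves in memory, the same steps can be replayed on an ordinary byte array. This Python sketch is only a stand-in: the array `page` plays the role of the display page at segment B800h, and the variable names are chosen for illustration. It relies on the fact that the 8086 stores the low byte of a word at the lower address.

```python
import struct

PAGE_WORDS = 2000                     # 80 x 25 screen positions
page = bytearray(2 * PAGE_WORDS)      # stand-in for the active display page

di = 0
for _ in range(PAGE_WORDS):           # plays the role of CX and LOOP
    struct.pack_into("<H", page, di, 0x1441)   # store the word 1441h at [DI]
    di += 2                           # go to next word

print(hex(page[0]), hex(page[1]), di)
```

The even-numbered bytes end up holding the character code 41h and the odd-numbered bytes the attribute 14h, which matches the low-byte and high-byte roles described above.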
After the program is run, the screen positions retain the same attributes unless another program changes them or the computer is reset.
INT 10H
Even though we can display data by moving them directly into the active display page, this is a very tedious way to control the screen.
Instead we use the BIOS video screen routine which is invoked by the INT 10h instruction; a video function is selected by putting a function number in the AH register.
In the following, we discuss the most important INT 10h functions used in text mode and give examples of their use. The INT 10h functions used in graphics mode are discussed in Chapter 16. Appendix C has a more complete list.
INT 10h, Function 0:
Select Display Mode
Input: AH = 0
AL = mode number (see Table 12.2)
Output: none
Example 12.1 Set the CGA adapter for 80 x 25 color text display.
XOR AH,AH ;select display mode function
MOV AL,3 ;80 x 25 color text mode
INT 10H ;select mode
When BIOS sets the display mode, it also clears the screen.
INT 10h, Function 1:
Change Cursor Size
Input: AH = 1
CH = starting scan line
CL = ending scan line
Output: none
In text mode, the cursor is displayed as a small dot array at a screen position (in graphics mode, there is no cursor). For the MDA and EGA, the dot array has 14 rows (0-13) and for the CGA, there are 8 rows (0-7). Normally only rows 6 and 7 are lit for the CGA cursor, and rows 11 and 12 for the MDA and EGA cursor. To change the cursor size, put the starting and ending numbers of the rows to be lit in CH and CL, respectively.
Example 12.2 Make the cursor as large as possible for the MDA.
MOV AH,1 ;cursor size function
MOV CH,0 ;starting row
MOV CL,13 ;ending row
INT 10H ;change cursor size
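INT 10h, Function 2:
Move Cursor
Input: AH = 2
DH = new cursor row (0-24)
DL = new cursor column; 0-79 for 80 x 25 display, 0-39 for 40 x 25 display
BH = page number
Output: none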
This function lets the program move the cursor anywhere on the screen. The page doesn't have to be the one currently being displayed.
Example 12.3 Move the cursor to the center of the 80 x 25 screen on page 0.
Solution: The center of the 80 x 25 screen is column 39, row 12.
MOV AH,2 ;move cursor function
XOR BH,BH ;page 0
MOV DX,0C27h ;row = 12, column = 39
INT 10H ;move cursor
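INT 10h, Function 3:
Get Cursor Position and Size
Input: AH = 3
BH = page number
Output: DH = cursor row
DL = cursor column
CH = cursor starting scan line
CL = cursor ending scan line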
For some applications, such as moving the cursor up one row, we need to know its current location.
Example 12.4 Move the cursor up one row if not at the top of the screen on page 0.
MOV AH,3
XOR BH,BH ;page 0
INT 10H ;DH = row, DL = column
OR DH,DH ;cursor at top of screen?
JZ EXIT ;yes, exit
MOV AH,2 ;move cursor function
DEC DH ;row = row - 1
INT 10H ;move cursor
EXIT:
INT 10h, Function 5:
Select Active Display Page
Input: AH = 5
AL = active display page
0-7 for modes 0, 1
0-3 for CGA modes 2, 3
0-7 for EGA, MCGA, VGA modes 2, 3
0-7 for EGA, VGA mode 7
Output: none
This function selects the page to be displayed.
Example 12.5 Select page 1 for the CGA.
MOV AH,5 ;select active display page function
MOV AL,1 ;page 1
INT 10H ;select page
INT 10h, Function 6:
Scroll the Screen or a Window Up
Input: AH = 6
AL = number of lines to scroll (AL = 0 means scroll the whole screen or window)
BH = attribute for blank lines
CH,CL = row, column for upper left corner of window
DH,DL = row, column for lower right corner of window
Output: none
Scrolling the screen up one line means moving each display line up one row, and bringing in a blank line at the bottom. The previous top row disappears from the screen.
The whole screen, or any rectangular area (window) may be scrolled. AL contains the number of lines to be scrolled. If
Example 12.6 Clear the screen to black for the
MOV AH, 6 ; scroll up function
XOR AL, AL ; clear whole screen
XOR CX, CX ; upper left corner is (0, 0)
MOV DX, 184Fh ; lower right corner is (4Fh, 18h)
MOV BH, 7 ; normal video attribute
INT 10H ; clear screen
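The effect of the scroll-up function on a window can be modeled on a small grid. In this Python sketch the screen is a list of rows, each cell is a (character, attribute) pair, and `scroll_up` is a hypothetical helper that mimics the description above: lines move up inside the window, blank lines with the given attribute come in at the bottom, and a line count of 0 blanks the whole window. It is an illustration, not the BIOS routine itself.

```python
def scroll_up(screen, lines, top, left, bottom, right, blank=(" ", 0x07)):
    """Scroll the window up by `lines` rows in place; 0 means clear the window."""
    if lines == 0:
        lines = bottom - top + 1
    for row in range(top, bottom + 1):
        source = row + lines                     # row whose contents move up to `row`
        for col in range(left, right + 1):
            if source <= bottom:
                screen[row][col] = screen[source][col]
            else:
                screen[row][col] = blank         # blank line brought in at the bottom
    return screen

# A 4 x 4 screen whose rows hold "a", "b", "c", "d"; scroll a 3 x 2 window by one line.
screen = [[(chr(ord("a") + r), 0x07)] * 4 for r in range(4)]
scroll_up(screen, 1, top=1, left=1, bottom=3, right=2)
for line in screen:
    print("".join(ch for ch, _ in line))
```

Cells outside the window are untouched, which is what makes the function usable for scrolling a rectangular region without disturbing the rest of the display.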
INT 10h, Function 7:
Scroll the Screen or a Window Down
Input: AH = 7
AL = number of lines to scroll (AL = 0 means scroll the whole screen or window)
BH = attribute for blank lines
CH, CL = row, column for upper left corner of window
DH, DL = row, column for lower right corner of window
Output: none
If the screen or window is scrolled down one line, each line moves down one row, a blank line is brought in at the top, and the bottom row disappears.
INT 10h, Function 8:
Read Character at the Cursor
Input: AH = 8
BH = page number
Output: AH = attribute of character
AL = ASCII code of character
In some applications, we need to know the character at the cursor position. BH contains a page number, which doesn't have to be the one being displayed. After execution, AL contains the ASCII code of the character, and AH contains its attribute. We'll see an example that uses this function in a moment. Let's first look at a function that writes a character.
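INT 10h, Function 9:
Display Character at the Cursor with Any Attribute
Input: AH = 9
BH = page number
AL = ASCII code of character
CX = number of times to write character
BL = attribute of character
Output: none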
With function 9, the programmer can specify an attribute for the character. CX contains the number of times to display the character, starting at the cursor position.
Unlike INT 21h, function 2, the cursor doesn't advance after the character is displayed. Also, if AL contains the ASCII code of a control character, a control function is not performed; instead, a display symbol is shown.
The following example shows how functions 8 and 9 can be used together to change the attribute of a character.
Example 12.7 Change the attribute of the character under the cursor to reverse video for monochrome display.
MOV AH,8 ;read character
XOR BH,BH ;on page 0
INT 10H ;character in AL, attribute in AH
MOV AH,9 ;display character with attribute
MOV CX,1 ;display 1 character
MOV BL,70H ;reverse video attribute
INT 10H ;display character
INT 10h, Function Ah:
Display Character at the Cursor with Current Attribute
Input: AH = 0Ah
AL = ASCII code of character
BH = page number
CX = number of times to write character
Output: none
This function is like function 9, except that the attribute byte is not changed, so the character is displayed with the current attribute.
INT 10h, Function Eh:
Display Character and Advance Cursor
Input:
AH = 0Eh
AL = ASCII code of character
BH = page number
BL = foreground color (graphics mode only)
Output: none
This function displays the character in AL and advances the cursor to the next position in the row, or if at the end of a row, it sends it to the beginning of the next row. If the cursor is in the lower right corner, the screen is scrolled up and the cursor is set to the beginning of the last row. This is the BIOS function used by INT 21h, function 2, to display a character. The control characters bell (07h), backspace (08h), line feed (0Ah), and carriage return (0Dh) cause control functions to be performed.
INT 10h, Function Fh: Get Video Mode
Input: AH = 0Fh
Output: AH = number of screen columns
AL = display mode (see Table 12.2)
BH = active display page
This function can be used with function 5 to switch between pages being displayed.
Example 12.8 Change the display page from page 0 to page 1, or from page 1 to page 0.
MOV AH, 0FH ;get video mode
INT 10H ;BH = active page
MOV AL, BH ;move to AL
XOR AL, 1 ;complement bit 0
MOV AH, 5 ;select active page
INT 10H ;select new page
12.3.4 A Comprehensive Example
To demonstrate several of the INT 10h functions, we write a program to do the following:
- Set the display to mode 3 (80 × 25 16-color text).
- Clear a window with upper left corner at column 26, row 8, and lower right corner at column 52, row 16, to red.
- Move the cursor to column 39, row 12.
- Print a blinking, cyan "A" at the cursor position.
If you have a color adapter and monitor, you can see the output by running the program in program listing PGM12_2. ASM.
TITLE PGM12_2: SCREEN DISPLAY_2
;red screen with blinking cyan "A" in middle of screen
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
;set video mode
MOV AH,0 ;select mode function
MOV AL,3 ;80x25 color text
INT 10H ;select mode
;clear window to red
MOV AH,6 ;scroll up function
MOV CX,081Ah ;upper left corner (1Ah,08h)
MOV DX,1034h ;lower right corner (34h,10h)
MOV BH,43H ;cyan chars on red background
MOV AL,0 ;scroll all lines
INT 10H ;clear window
;move cursor
MOV AH,2 ;move cursor function
MOV DX,0C27h ;center of screen
XOR BH,BH ;page 0
INT 10H ;move cursor
;display character with attribute
MOV AH,09 ;display character function
MOV BH,0 ;page 0
MOV BL,0C3H ;blinking cyan char, red back
MOV CX,1 ;display one character
MOV AL,'A' ;character is 'A'
INT 10H ;display character
;dos exit
MOV AH,4CH
INT 21H
MAIN ENDP
END MAIN
12.4 The Keyboard
There are several keyboards in use for the IBM PC. The original keyboard has 83 keys. Now, more computers use the enhanced keyboard with 101 keys. In general, we can group the keys into three categories:
- ASCII keys; that is, keys that correspond to ASCII display and control characters. These include letters, digits, punctuation, arithmetic and other special symbols; and the control keys Esc (escape), Enter (carriage return), Backspace, and Tab.
- Shift keys: left and right shifts, Caps Lock, Ctrl, Alt, Num Lock, and Scroll Lock. These keys are usually used in combination with other keys.
- Function keys: F1-F10 (F1-F12 for the enhanced keyboard), the arrow keys, Home, PgUp, PgDn, End, Ins, and Del. We call them function keys because they are used in programs to perform special functions.
Each key on the keyboard is assigned a unique number called a scan code; when a key is pressed, the keyboard circuit sends the corresponding scan code to the computer. Scan code values start with 1. Table 12.7 shows the scan codes of shift and function keys. A complete list of scan codes for the 101-key keyboard may be found in Appendix H.
You may wonder how the computer detects a combination of keys, such as the Ctrl-Alt-Del combination that resets the computer. There must be a way for the computer to know that a key has been pressed, but not yet released.
To indicate a key's release, the keyboard circuit sends another code called a break code, derived from the key's scan code by changing the msb to 1 (the scan code itself is also known as a make code). For example, the make code for the Esc key is 01h and its break code is 81h.
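The make/break relationship is a one-bit change, which the following Python sketch makes explicit. The helper names are hypothetical, and `held_keys` only illustrates that the code stream carries enough information to tell which keys are down; it is not a description of what the BIOS actually stores.

```python
def break_code(make_code):
    """Break code = make code with the most significant bit set."""
    return make_code | 0x80

def decode(code):
    """Split a code from the keyboard into (make code, 'press' or 'release')."""
    return code & 0x7F, "release" if code & 0x80 else "press"

def held_keys(codes):
    """Replay a stream of make and break codes; return the keys still down."""
    held = set()
    for code in codes:
        make, event = decode(code)
        if event == "press":
            held.add(make)
        else:
            held.discard(make)
    return held

print(hex(break_code(0x01)))                            # Esc released
print(held_keys([0x1D, 0x53]) == {0x1D, 0x53})          # Ctrl and Del both down
```

In the second call the codes 1Dh and 53h are the Ctrl and Del scan codes from Table 12.7; since no break code follows either one, both keys are reported as still held.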
The computer does not store information on every key that is pressed and not yet released; it only does so for the function key Ins, and the shift keys. This information is saved as individual bits called keyboard flags stored in the byte at 0040:0017. A program can call a BIOS routine to investigate these flags.
Table 12.7 Scan Codes for Shift and Function Keys
| Hex | Decimal | Key |
| 1D | 29 | Ctrl |
| 2A | 42 | left Shift |
| 38 | 56 | Alt |
| 3A | 58 | Caps Lock |
| 3B-44 | 59-68 | F1-F10 |
| 45 | 69 | Num Lock |
| 46 | 70 | Scroll Lock |
| 47 | 71 | Home |
| 48 | 72 | Up arrow |
| 49 | 73 | PgUp |
| 4B | 75 | Left arrow |
| 4C | 76 | Keypad 5 |
| 4D | 77 | Right arrow |
| 4F | 79 | End |
| 50 | 80 | Down arrow |
| 51 | 81 | PgDn |
| 52 | 82 | Ins |
| 53 | 83 | Del |
So that keystrokes are not lost when the user types ahead of a program, the computer uses a 15-word block of memory called the keyboard buffer to store keys that have been typed but not yet read by the program. Each keystroke is stored as a word, with the high byte containing the key's scan code, and the low byte containing its ASCII code if it's an ASCII key, or 0 if it's a function key. A shift key is not stored in the buffer. When a left or right shift, Ctrl, or Alt key is down, some keys will cause a combination key scan code to be placed in the keyboard buffer (see Appendix H).
The contents of the buffer are released when a program requests key inputs. The key values are passed onto the program in the same order that they come in; that is, the keyboard buffer is a queue. If a key input is requested and the buffer is empty, the system waits until a key is pressed. If the buffer is full and the user presses a key, the computer sounds a tone.
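The queue behavior just described can be sketched in a few lines of Python. This is a simplified model with hypothetical names: it takes the 15-word capacity from the text, returns False where the real machine would sound a tone, and returns None on an empty buffer where the real system would wait for a key.

```python
from collections import deque

class KeyboardBuffer:
    CAPACITY = 15                       # size of the buffer, in words, per the text

    def __init__(self):
        self.queue = deque()

    def store(self, scan_code, ascii_code=0):
        """Add a keystroke word; return False if the buffer is full."""
        if len(self.queue) >= self.CAPACITY:
            return False                # the computer would sound a tone here
        self.queue.append((scan_code << 8) | ascii_code)
        return True

    def read(self):
        """Remove the oldest keystroke; return (scan code, ASCII code) or None."""
        if not self.queue:
            return None                 # the real system waits for a key instead
        word = self.queue.popleft()
        return word >> 8, word & 0xFF

buf = KeyboardBuffer()
buf.store(0x01, 0x1B)                   # Esc: scan code 01h, ASCII code 1Bh
buf.store(0x3B)                         # F1: function key, so the low byte is 0
print(buf.read(), buf.read(), buf.read())
```

Keys come out in the order they went in, and a function key is recognizable on the reading side because its ASCII byte is 0.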
To summarize the preceding discussion, let's see what happens when you press a key that is read by the currently executing program:
- The keyboard sends a request (interrupt 9) to the computer.
- The interrupt 9 service routine obtains the scan code from the keyboard I/O port and stores it in a word in the keyboard buffer (high byte = scan code, low byte = ASCII code for an ASCII key, 0 for a function key).
- The current program may use INT 21h, function 1, to read the ASCII code. This also causes the ASCII code to be displayed (echoed) to the screen.
In the next section, we'll show how a program can process keyboard inputs using INT 16h. To get both the scan code and ASCII code, a program may access the keyboard buffer directly or use the BIOS routine INT 16h.
BIOS INT 16h provides keyboard services. As with INT 10h, a program can request a service by placing the function number in AH before calling INT 16h. In what follows, we use only function 0.
INT 16h, Function 0: Read Keystroke
Input: AH = 0
Output: AL = ASCII code if an ASCII key is pressed, 0 for a function key
AH = scan code of key
This function transfers the first available key value in the keyboard buffer into AX. If the buffer is empty, the computer waits for the user to press a key. ASCII keys are not echoed to the screen.
The function provides a way for the program to decide if a function key is pressed. If
Example 12.9 Move the cursor to the upper left corner if the F1 key is pressed, to the lower right corner if any other function key is pressed. If a character key is pressed, do nothing.
MOV AH,0 ;read keystroke function
INT 16H ;AL = ASCII code or 0,
;AH = scan code
OR AL,AL ;AL = 0 (function key) ?
JNE EXIT ;no, character key
CMP AH,3BH ;scan code for F1 ?
JE F1 ;yes, go to move cursor
;other function key
MOV DX,184FH ;lower right corner
JMP EXECUTE ;go to move cursor
F1:
XOR DX,DX ;upper left corner
EXECUTE:
MOV AH,2 ;move cursor function
XOR BH,BH ;page 0
INT 10H ;move cursor
EXIT:
To show how the function keys may be programmed, here is a program that does some of the things that a basic word processor does. It first clears the screen and puts the cursor in the upper left corner, then lets the user type text on the screen, operate some of the function keys, and finally exits when the Esc key is pressed.
Clear the screen
Move the cursor to the upper left corner
Get a keystroke
WHILE key is not the Esc key DO
IF function key
THEN
perform function
ELSE /* key must be a character key */
display character
END IF
Get a keystroke
END WHILE
The Esc key can be detected by checking for an ASCII code of 1Bh. To demonstrate how the function keys can be programmed, a procedure DO_FUNCTION is written to program the arrow keys. They operate as follows:
Up arrow. Causes the cursor to move up one row unless it's at the top of the screen. If so, the screen scrolls down one line.
Down arrow. Causes the cursor to move down one row unless it's at the bottom of the screen. If so, the screen scrolls up one line.
Right arrow. Causes the cursor to move right one column, unless it's at the right margin. If so, it moves to the beginning of the next row. But if it's in the lower right corner, the screen scrolls up one line.
Left arrow. Causes the cursor to move left one column, unless it's at the left margin. If so, it moves to the end of the previous row. But if it's in the upper left corner, the screen scrolls down one line.
Get cursor position;
Examine scan code of last key pressed;
CASE scan code OF
up arrow:
IF cursor is at the top of the screen /* row 0 */
THEN
scroll screen down
ELSE
move cursor up one row
END_IF
down arrow:
IF cursor is at the bottom of the screen /* row 24 */
THEN
scroll screen up
ELSE
move cursor down
END_IF
left arrow:
IF cursor is not at beginning of a row /* column 0 */
THEN
move cursor to the left
ELSE /* cursor is at beginning of a row */
IF cursor is in row 0 /* position (0,0) */
THEN
scroll screen down
ELSE
move cursor to the end of previous row
END_IF
END_IF
right arrow:
IF cursor is not at end of a row /* column 79 */
THEN
move cursor to the right
ELSE /* cursor is at end of a row */
IF cursor is in last row /* row 24 */
THEN
scroll screen up
ELSE
move cursor to the beginning of next row
END_IF
END_IF
END_CASE
Here is the program:
0: TITLE PGM12_3: SCREEN EDITOR
1: .MODEL SMALL
2: .STACK 100H
3: .CODE
4: MAIN PROC
5: ;set video mode and clear screen
6: MOV AH,0 ;set mode function
7: MOV AL,3 ;80 x 25 color text
8: INT 10H ;set mode
9: ;move cursor to upper left corner
10: MOV AH,2 ;move cursor function
11: XOR DX,DX ;position (0,0)
12: MOV BH,0 ;page 0
13: INT 10H ;move cursor
14: ;get keystroke
15: MOV AH,0 ;keyboard input function
16: INT 16H ;AH=scan code,AL=ASCII code
17: WHILE_:
18: CMP AL,1BH ;ESC (exit character)?
19: JE END_WHILE ;yes, exit
20: ;if function key
21: CMP AL,0 ;AL = 0?
22: JNE ELSE_ ;no, character key
23: ;then
24: CALL DO_FUNCTION ;execute function
25: JMP NEXT_KEY ;get next keystroke
26: ELSE_: ;display character
27: MOV AH,2 ;display character func
28: MOV DL,AL ;get character
29: INT 21H ;display character
30: NEXT_KEY:
31: MOV AH,0 ;get keystroke function
32: INT 16H ;AH=scan code,AL=ASCII code
33: JMP WHILE_
34: END_WHILE:
35: ;dos exit
36: MOV AH,4CH
37: INT 21H
38: MAIN ENDP
39:
40: DO_FUNCTION PROC
41: ; operates the arrow keys
42: ; input: AH = scan code
43: ; output: none
44: PUSH BX
45: PUSH CX
46: PUSH DX
47: PUSH AX ;save scan code
48: ;locate cursor
49: MOV AH,3 ;get cursor position
50: MOV BH,0 ;on page 0
51: INT 10H ;DH = row, DL = col
52: POP AX ;retrieve scan code
53: ;case scan code of
54: CMP AH,72 ;up arrow?
55: JE CURSOR_UP ;yes, execute
56: CMP AH,75 ;left arrow?
57: JE CURSOR_LEFT ;yes, execute
58: CMP AH,77 ;right arrow?
59: JE CURSOR_RIGHT ;yes, execute
60: CMP AH,80 ;down arrow?
61: JE CURSOR_DOWN ;yes, execute
62: JMP EXIT ;other function key
63: CURSOR_UP:
64: CMP DH,0 ;row 0?
65: JE SCROLL_DOWN ;yes, scroll down
66: DEC DH ;no, row = row - 1
67: JMP EXECUTE ;go to execute
68: CURSOR_DOWN:
69: CMP DH,24 ;last row?
70: JE SCROLL_UP ;yes, scroll up
71: INC DH ;no, row = row + 1
72: JMP EXECUTE ;go to execute
73: CURSOR_LEFT:
74: CMP DL,0 ;column 0?
75: JNE GO_LEFT ;no, move to left
76: CMP DH,0 ;row 0?
77: JE SCROLL_DOWN ;yes, scroll down
78: DEC DH ;row = row - 1
79: MOV DL,79 ;last column
80: JMP EXECUTE ;go to execute
81: CURSOR_RIGHT:
82: CMP DL,79 ;last column?
83: JNE GO_RIGHT ;no, move to right
84: CMP DH,24 ;last row?
85: JE SCROLL_UP ;yes, scroll up
86: INC DH ;row = row + 1
87: MOV DL,0 ;col = 0
88: JMP EXECUTE ;go to execute
89: GO_LEFT:
90: DEC DL ;col = col - 1
91: JMP EXECUTE ;go to execute
92: GO_RIGHT:
93: INC DL ;col = col + 1
94: JMP EXECUTE ;go to execute
95: SCROLL_DOWN:
96: MOV AL,1 ;scroll 1 line
97: XOR CX,CX ;upper left corner = (0,0)
98: MOV DH,24 ;last row
99: MOV DL,79 ;last column
100: MOV BH,07 ;normal video attribute
101: MOV AH,7 ;scroll down function
102: INT 10H ;scroll down 1 line
103: JMP EXIT ;exit procedure
104: SCROLL_UP:
105: MOV AL,1
106: XOR CX,CX ;upper left corner = (0,0)
107: MOV DX,184FH ;lower rt corner (4Fh,18h)
108: MOV BH,07 ;normal video attribute
109: MOV AH,6 ;scroll up function
110: INT 10H ;scroll up
111: JMP EXIT ;exit procedure
112: EXECUTE:
113: MOV AH,2 ;cursor move function
114: INT 10H ;move cursor
115: EXIT:
116: POP DX
117: POP CX
118: POP BX
119: RET
120: DO_FUNCTION ENDP
121: END MAIN
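The program begins by setting the video mode to 3 (80 x 25 color text), which also clears the screen (lines 6-8). It then moves the cursor to the upper left corner (lines 10-13) and reads keystrokes in a WHILE loop until the Esc key is pressed (lines 15-34).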
Procedure DO_FUNCTION is entered with the scan code of the last keystroke in AH. This is saved on the stack (line 47), while the procedure determines the cursor position (lines 49-51). After restoring the scan code to AH (line 52), the procedure checks to see if it is the scan code of one of the arrow keys (lines 54-61). If not, the procedure terminates.
If AH contains the scan code of an arrow key, the procedure jumps to a block of code where the appropriate cursor move is executed. DH and DL contain the row and column of the cursor location, respectively.
If the cursor is not at the edge of the screen (row 0 or 24, column 0 or 79), DH and DL are updated. To move up, the row number in DH is decremented; to move down, it is incremented. To move left, the column number in DL is decremented; to move right, it is incremented. After updating DH and DL, the procedure jumps to line 112, where INT 10h, function 2, does the actual cursor move.
For the up arrow key, if the cursor is in row 0 the procedure at line 65 jumps to code block SCROLL_DOWN, which scrolls the screen down one line. Similarly, for the down arrow key, if the cursor is in row 24 the procedure at line 70 jumps to code block SCROLL_UP, where the screen is scrolled up one line.
For the left arrow key, if the cursor is in the upper left corner (0,0) the procedure jumps to SCROLL_DOWN (line 77). If it's at the left margin and not row 0, we want to move to the end of the previous row. To do this, the row number in DH is decremented, DL gets 79, and the procedure jumps to line 112 to do the cursor move.
Similarly for the right arrow key, if the cursor is in the lower right corner the procedure jumps to SCROLL_UP (line 85). If it's at the right margin and not row 24, we want to move to the beginning of the next row. To do this, the row number in DH is incremented, DL gets 0, and the procedure jumps to line 112 to do the cursor move.
The program can be run by assembling and linking file PGM12_3.ASM. As you play with it, its shortcomings become apparent. For example, text scrolled off the screen is lost. It is possible to type over text, but not to insert or delete text.
- A video adapter contains memory and a video controller, which translates data into an image on the screen. The adapters are the MDA, CGA, EGA, MCGA, and VGA. They differ in resolution and the number of colors they can display.
- There are two kinds of display modes: text mode and graphics mode. In text mode, a character is displayed at each screen position; in graphics mode, a pixel is displayed.
- In text mode, a screen position is specified by its (column, row) coordinates. A character and its attribute can be displayed at each position.
- In $80 \times 25$ text mode, the memory on the video adapter is divided into 4-KB blocks called display pages. The number of pages available depends on the kind of adapter. The screen can display one page at a time; the page being displayed is called the active display page.
- The display at each screen position is specified by a word in the active display page. The low byte of the word gives the ASCII code of the character and the high byte its attribute.
- The attribute byte specifies the foreground (color of the character) and background at each screen position. Other attributes are blinking and underline (MDA only).
- For monochrome display, the foreground and background colors are white (RGB bits all 1's) or black (RGB bits all 0's). Normal video attribute is 07h; reverse video is 70h.
- The BIOS INT 10h routine performs screen processing. A number placed in AH identifies the screen function.
- INT 16h, function 0, is a BIOS function for reading keystrokes. For a character key, AH gets the scan code and AL the ASCII code. For a function key, AH gets the scan code and AL = 0.
- A program can use INT 16h and INT 10h to program the function keys for controlling the screen display.
active display page: The display page currently being shown on the screen
attribute: A number that specifies how a character will be displayed
attribute byte: The high byte of the word that specifies a display character; it contains the character's attribute
break code: Number used to indicate when a key is released, obtained by putting 1 in the msb of a key's scan code
CGA: Color Graphics Adapter
character cell: Dot array used to form a character on the screen
display memory: Memory unit of a video adapter
display page: The portion of display memory that holds one screenful of data
EGA: Enhanced Graphics Adapter
function keys: Keys that don't correspond to ASCII characters or shifts
graphics mode: Display mode that can show pictures
gray scale: Different levels of brightness in monochrome display
keyboard buffer: A 15-word block of memory used to hold keystrokes
make code: Same as scan code
MCGA: Multi-color Graphics Array
MDA: Monochrome Display Adapter
mode number: A number used to select a text or graphics display mode
normal video: White character on a black background
resolution: The number of dots a video adapter can display
reverse video: Black character on a white background
scan codes: Numbers used to identify a key
text mode: Display mode in which only characters are shown
VGA: Video Graphics Array
video adapter: Circuit that controls monitor display
video buffer: The memory that stores data to be displayed on the monitor; same as display memory
video controller: Control unit of a video adapter
- To demonstrate the video buffer, enter DEBUG and do the following:
a. If your machine has a monochrome adapter, use the R command to put B000h in DS; if it has a color adapter, put B800h in DS.
b. We can now enter data directly into the video buffer, and see the results on the screen. To do so, use the E command to enter data, starting at offset 0. For example, to display a blinking reverse video A in the upper left corner of the screen, put 41h in byte 0 and F0h in byte 1. Now enter different character:attribute values in words 2, 4, and so on, and watch the changes on the top row of the screen.
- Write some code to do the following (assume $80 \times 25$ monochrome display, page 0). Each part of this exercise is independent.
a. Move the cursor to the lower right corner of the screen.
b. Locate the cursor and move it to the end of the current row.
c. Locate the cursor and move it to the top of the screen in the current column.
d. Move the cursor to the left one position if not at the beginning of a row.
e. Clear the row the cursor is in to white.
f. Scroll the column the cursor is in down one line (normal video).
g. Display five blinking reverse video "A"s, starting in the upper left corner of the screen.
- Assuming $80 \times 25$ color display, write some code to turn the color of each capital letter character in row 0 to red and the local background to brown. Other characters should retain their previous foreground and background colors. Assume page 0.
- Write a program to
a. Clear the screen, make the cursor as large as possible, and move it to the upper left corner.
b. Program the following function keys:
Home key: Cursor moves to the upper left corner.
End key: Cursor moves to the lower left corner.
PgUp key: Cursor moves to the upper right corner.
PgDn key: Cursor moves to the lower right corner.
Esc key: Program terminates.
Any other key: Nothing happens.
- Write a program to
a. Clear the screen to black, move the cursor to the upper left corner.
b. Let the user type his or her name.
c. Clear the input line, and display the name vertically in column 40, starting at the top of the screen. Use
- Write a program that does the following:
a. Clear the screen, move the cursor to row 12, column 0.
b. If the user types a character, the character is displayed at the cursor position. Cursor does not advance.
c. Program the following function keys:
Right Arrow: The program moves cursor and character to the right one position, unless it is at the right margin. A blank appears at the cursor's previous position.
Left Arrow: The program moves cursor and character to the left one position, unless it is at the left margin. A blank appears at the cursor's previous position.
Escape: The program terminates.
Other function keys: Nothing happens.
- Write a one-line screen editor that does the following:
a. Clear screen, and position cursor at the beginning of row 12.
b. Let the user type text. Cursor advances after each character is displayed unless cursor is at the right margin.
c. Left arrow moves cursor left except at left margin; right arrow moves cursor right except at right margin. Other arrow keys do not operate.
d. Ins key makes the cursor and each character to the right of the cursor (in the cursor's row) move right one position. A blank appears at the cursor's previous position. The last character in the cursor's row is pushed off the screen.
e. Del key causes each character to the right of the cursor (in the cursor's row) to move left one position, and a blank is brought in at the right.
f. Esc key terminates the program.
Overview
In previous chapters we have shown how programming may be simplified by using procedures. In this chapter, we discuss a program structure called a macro, which is similar to a procedure.
As with procedures, a macro name represents a group of instructions. Whenever the instructions are needed in the program, the name is used. However, the way procedures and macros operate is different. A procedure is called at execution time; control transfers to the procedure and returns after executing its statements. A macro is invoked at assembly time. The assembler copies the macro's statements into the program at the position of the invocation. When the program executes, there is no transfer of control.
Macros are especially useful for carrying out tasks that occur frequently. For example, we can write macros to initialize the DS and ES registers, print a character string, terminate a program, and so on. We can also write macros to eliminate restrictions in existing instructions; for example, the operand of MUL can't be a constant, but we can write a multiplication macro that doesn't have this restriction.
A macro is a block of text that has been given a name. When MASM encounters the name during assembly, it inserts the block into the program. The text may consist of instructions, pseudo-ops, comments, or references to other macros.
The syntax of macro definition is
macro_name MACRO d1,d2,...,dn
statements
ENDM
Here macro_name is the user-supplied name for the macro. The pseudo-ops MACRO and ENDM indicate the beginning and end of the macro definition; d1, d2, ... dn is an optional list of dummy arguments used by the macro.
One use of macros is to create new instructions. For example, we know that the operands of MOV can't both be word variables, but we can get around this restriction by defining a macro to move a word into a word.
Example 13.1 Define a macro to move a word into a word.
MOVW MACRO WORD1,WORD2
PUSH WORD2
POP WORD1
ENDM
Here the name of the macro is MOVW. WORD1 and WORD2 are the dummy arguments.
To use a macro in a program, we invoke it. The syntax is
macro_name a1,a2,...,an
where a1, a2, ... an is a list of actual arguments. When MASM encounters the macro name, it expands the macro; that is, it copies the macro statements into the program at the position of the invocation, just as if the user had typed them in. As it copies the statements, MASM replaces each dummy argument di by the corresponding actual argument ai and creates the machine code for any instructions.
A macro definition must come before its invocation in a program listing. To ensure this sequence, macro definitions are usually placed at the beginning of a program. It is also possible to create a library of macros to be used by any program, and we do this later in the chapter.
Example 13.2 Invoke the macro MOVW to move B to A, where A and B are word variables.
Solution: MOVW A, B
To expand this macro, MASM would copy the macro statements into the program at the position of the call, replacing each occurrence of WORD1 by A, and WORD2 by B. The result is
PUSH B
POP A
In expanding a macro, the assembler simply substitutes the character strings defining the actual arguments for the corresponding dummy ones. For example, the following calls to the MOVW macro:
MOVW A,DX and MOVW A+2,B
cause the assembler to insert this code into the program:
PUSH DX
POP A
and
PUSH B
POP A+2
There are often restrictions on the arguments for a macro. For example, the arguments in the MOVW macro must be memory words or 16-bit registers. The macro invocation
MOVW AX,1ABCh
generates the code
PUSH 1ABCh
POP AX
and because an immediate data push is illegal (for the 8086/8088), this results in an assembly error. One way to guard against this situation is to put a comment in the macro; for example,
MOVW MACRO WORD1,WORD2
;arguments must be memory words or 16-bit registers
PUSH WORD2
POP WORD1
ENDM
Good programming practice requires that a procedure should restore the registers it uses, unless they contain output values. The same is usually true for macros. As an example, the following macro exchanges two memory words. Because it uses AX to perform the exchange, this register is restored.
EXCH MACRO WORD1,WORD2
PUSH AX
MOV AX,WORD1
XCHG AX,WORD2
MOV WORD1,AX
POP AX
ENDM
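The overview noted that a macro can also lift the restriction that the operand of MUL can't be a constant. The following is a minimal sketch of such a macro, not a definitive version; the name MULC is hypothetical, and the choice of BX as the scratch register is an assumption made for illustration.

```asm
;sketch: multiply AX by a constant (MULC is an illustrative name)
MULC    MACRO   N
;input:  AX = multiplicand, N = constant multiplier
;output: DX:AX = unsigned product
;BX is used as a scratch register and is restored
        PUSH    BX
        MOV     BX,N            ;load the constant into a register
        MUL     BX              ;DX:AX = AX * BX
        POP     BX              ;POP does not change the flags set by MUL
        ENDM
```

For example, the invocation `MULC 10` would expand to the four instructions with N replaced by 10. Because AX and DX hold the output, they are not saved.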
The .LST file is one of the files that can be generated when a program is assembled (see Appendix D). It shows assembly code and the corresponding machine code, addresses of variables, and other information about the program. The .LST file also shows how macros are expanded. To demonstrate this, the following program contains the MOVW macro and two invocations:
TITLE PGM13_1: MACRO DEMO
.MODEL SMALL
MOVW MACRO WORD1,WORD2
PUSH WORD2
POP WORD1
ENDM
.STACK 100H
.DATA
A DW 1,2
B DW 3
.CODE
MAIN PROC
MOV AX, @DATA
MOV DS, AX
MOVW A, DX
MOVW A+2, B
; dos exit
MOV AH, 4CH
INT 21H
MAIN ENDP
END MAIN
Figure 13.1 shows file PGM13_1.LST. In this file, MASM prints the macro invocations, followed by their expansions (shown in boldface). The digit 1 that appears on each line of the expansions means these macros were invoked at the "top level"; that is, by the program itself. We will show later that a macro may invoke another macro.
Microsoft (R) Macro Assembler Version 5.10
1/18/92 00:03:08
PGM13_1: MACRO DEMO
Page 1- 1
TITLE PGM13_1: MACRO DEMO
.MODEL SMALL
MOVW MACRO WORD1, WORD2
PUSH WORD2
POP WORD1
ENDM
.STACK 100H
.DATA
0000 0001 0002 A DW 1,2
0004 0003 B DW 3
.CODE
0000 MAIN PROC
0000 B8 ---- R MOV AX, @DATA
0003 8E D8 MOV DS, AX
MOVW A, DX
0005 52 1 PUSH DX
0006 8F 06 0000 R 1 POP A
MOVW A+2, B
000A FF 36 0004 R 1 PUSH B
000E 8F 06 0002 R 1 POP A+2
; dos exit
0012 B4 4C MOV AH, 4CH
0014 CD 21 INT 21H
0016 MAIN ENDP
END MAIN
Microsoft (R) Macro Assembler Version 5.10
1/18/92 00:03:08
PGM13_1: MACRO DEMO
Macros:
Symbols- 1
N a m e Lines
MOVW 2
Segments and Groups:
N a m e Length Align Combine Class
DGROUP GROUP
_DATA 0006 WORD PUBLIC 'DATA'
STACK 0100 PARA STACK 'STACK'
_TEXT 0016 WORD PUBLIC 'CODE'
Symbols:
N a m e Type Value Attr
A L WORD 0000 DATA
B L WORD 0004 DATA
MAIN N PROC 0000 _TEXT
Length = 0016
@CODE TEXT _TEXT
@CODESIZE TEXT 0
@CPU TEXT 0101h
@DATASIZE TEXT 0
@FILENAME TEXT PGM14_1
@VERSION TEXT 510
21 Source Lines
25 Total Lines
21 Symbols
47930 + 4220033 Bytes symbol space free
0 Warning Errors
0 Severe Errors
Three assembler directives govern how macro expansions appear in the .LST file. These directives pertain to the macros that follow them in the program.
- After .SALL (suppress all), the assembly code in a macro expansion is not listed. You might want to use this option for large macros, or if there are a lot of macro invocations.
- After .XALL, only those source lines that generate code or data are listed. For example, comment lines are not listed. This is the default option.
- After .LALL (list all), all source lines are listed, except those beginning with a double semicolon (;;).
These directives do not affect the machine code generated in the macro invocations, only the way the macro expansions appear in the .LST file.
Example 13.3 Suppose the MOVW macro is rewritten as follows:
MOVW MACRO WORD1,WORD2
;moves source to destination
;;uses the stack
PUSH WORD2
POP WORD1
ENDM
Show how the following macro invocations would appear in a .LST file:
.XALL
MOVW DS,CS
.LALL
MOVW P,Q
.SALL
MOVW AX,[SI]
Solution:
.XALL
MOVW DS,CS
PUSH CS
POP DS
.LALL
MOVW P,Q
;moves source to destination
PUSH Q
POP P
.SALL
MOVW AX,[SI]
If MASM finds an error during macro expansion, it indicates an error at the point of the macro invocation; however, it's more likely that the problem is within the macro itself. To find where the mistake really is, you need to inspect the macro expansion in the .LST file. The .LST file is especially helpful if you have a macro that invokes other macros (see discussion later).
13.2 Local Labels
A macro with a loop or decision structure contains one or more labels. If such a macro is invoked more than once in a program, a duplicate label appears, resulting in an assembly error. This problem can be avoided by using local labels in the macro. To declare them, we use the LOCAL pseudo-op, whose syntax is
LOCAL list_of_labels
where list_of_labels is a list of labels, separated by commas. Every time the macro is expanded, MASM assigns different symbols to the labels in the list. The LOCAL directive must appear on the next line after the MACRO statement; not even a comment can precede it.
Example 13.4 Write a macro to place the largest of two words in AX.
GET_BIG MACRO WORD1,WORD2
LOCAL EXIT
MOV AX,WORD1
CMP AX,WORD2
JG EXIT
MOV AX,WORD2
EXIT:
ENDM
Now suppose that FIRST, SECOND, and THIRD are word variables. A macro invocation of the form
GET_BIG FIRST, SECOND
expands as follows:
MOV AX, FIRST
CMP AX, SECOND
JG ??0000
MOV AX, SECOND
??0000:
A later call of the form
GET_BIG SECOND, THIRD
expands to this code:
MOV AX, SECOND
CMP AX, THIRD
JG ??0001
MOV AX, THIRD
??0001:
Subsequent invocations of this macro or of other macros with local labels cause MASM to insert labels ??0002, ??0003, and so on into the program. These labels are unique and not likely to conflict with ones the user would choose.
13.3 Macros That Invoke Other Macros
A macro may invoke another macro. Suppose, for example, we have two macros that save and restore three registers:
SAVE_REGS MACRO R1,R2,R3
PUSH R1
PUSH R2
PUSH R3
ENDM
RESTORE_REGS MACRO S1,S2,S3
POP S1
POP S2
POP S3
ENDM
These macros are invoked by the macro in the following example.
Example 13.5 Write a macro to copy a string. Use the SAVE_REGS and RESTORE_REGS macros.
COPY MACRO SOURCE,DESTINATION,LENGTH
SAVE_REGS CX,SI,DI
LEA SI,SOURCE
LEA DI,DESTINATION
CLD
MOV CX,LENGTH
REP MOVSB
RESTORE_REGS DI,SI,CX
ENDM
If MASM encounters the macro invocation
COPY STRING1,STRING2,15
it will copy the following code into the program:
PUSH CX
PUSH SI
PUSH DI
LEA SI,STRING1
LEA DI,STRING2
CLD
MOV CX,15
REP MOVSB
POP DI
POP SI
POP CX
Note: A macro may invoke itself; such macros are called recursive macros. They are not discussed in this book.
The macros that a program invokes may be contained in a separate file. This makes it possible to create a library file of useful macros. For example, suppose the file's name is MACROS, on a disk in drive A. When MASM encounters the pseudo-op
INCLUDE A:MACROS
in a program, it copies all the macro definitions from the file MACROS into the program at the position of the INCLUDE statement (note: the INCLUDE directive was discussed in section 9.5). The INCLUDE statement may appear anywhere in the program, as long as it precedes the invocations of its macros.
If a macro library is included in a program, all its macro definitions will appear in the .LST file, even if they're not invoked in the program. To prevent this, we can insert the following:
IF1
INCLUDE A:MACROS
ENDIF
Here, IF1 and ENDIF are pseudo-ops. The IF1 directive causes the assembler to access the MACROS file during the first assembly pass, when macros are expanded, but not during the second pass, when the .LST file is created. Note: Other conditional pseudo-ops are discussed in section 13.7.
The following are examples of macros that are useful to have in a macro library file.
Example 13.6 Write a macro to return to DOS.
Solution:
DOS_RTN MACRO
MOV AH,4CH
INT 21H
ENDM
The macro invocation is
DOS_RTN
Example 13.7 Write a macro to execute a carriage return and line feed.
Solution:
NEW_LINE MACRO
MOV AH,2
MOV DL,0DH
INT 21H
MOV DL,0AH
INT 21H
ENDM
The macro invocation is
NEW_LINE
The next example is one of the more interesting macros.
Example 13.8 Write a macro to display a character string. The string is the macro parameter.
Solution:
DISP_STR MACRO STRING
LOCAL START,MSG
;save registers
PUSH AX
PUSH DX
PUSH DS
JMP START
MSG DB STRING,'$'
START:
MOV AX,CS
MOV DS,AX ;set DS to code seg.
MOV AH,9
LEA DX,MSG
INT 21H
;restore registers
POP DS
POP DX
POP AX
ENDM
A sample invocation is
DISP_STR 'this is string'
When this macro is invoked, the string parameter replaces the dummy parameter STRING. Because the string is being stored in the code segment, CS must be moved to DS; this takes two instructions, because a direct move between segment registers is forbidden.
The preceding macros have been placed in file MACROS on the student disk. They are used in the following program, which displays a message, goes to a new line, and displays another message.
TITLE PGM13_2: MACRO DEMO
.MODEL SMALL
.STACK 100H
IF1
INCLUDE MACROS
ENDIF
.CODE
MAIN PROC
DISP_STR 'this is the first line'
NEW_LINE
DISP_STR 'and this is the second line'
DOS_RTN
MAIN ENDP
END MAIN
Sample execution:
C>PGM13_2
this is the first line
and this is the second line
The macro expansions are shown in file PGM13_2.LST (Figure 13.2). To save space, the machine code has been edited out.
TITLE PGM13_2: MACRO DEMO
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
DISP_STR 'this is the first line'
1 PUSH AX
1 PUSH DX
1 PUSH DS
1 JMP ??0000
1 ??0001 DB 'this is the first line','$'
1 ??0000:
1 MOV AX,CS
1 MOV DS,AX ;set DS to code segment
1 MOV AH,9
1 LEA DX,??0001
1 INT 21H
1 POP DS
1 POP DX
1 POP AX
NEW_LINE
1 MOV AH,2
1 MOV DL,0DH
1 INT 21H
1 MOV DL,0AH
1 INT 21H
DISP_STR 'and this is the second line'
1 PUSH AX
1 PUSH DX
1 PUSH DS
1 JMP ??0002
1 ??0003 DB 'and this is the second line','$'
1 ??0002:
1 MOV AX,CS
1 MOV DS,AX ;set DS to code segment
1 MOV AH,9
1 LEA DX,??0003
1 INT 21H
1 POP DS
1 POP DX
1 POP AX
DOS_RTN
1 MOV AH,4CH
1 INT 21H
MAIN ENDP
END MAIN
The REPT macro can be used to repeat a block of statements. Its syntax is
REPT expression
statements
ENDM
When the assembler encounters this macro, the statements are repeated the number of times given by the value of the expression. A REPT macro may be invoked by placing it in the program at the point that the macro's statements are to be repeated. For example, to declare a word array A of five zeros, the following can appear in the data segment:
A LABEL WORD
REPT 5
DW 0
ENDM
Note: The LABEL pseudo-op was discussed in section 10.2.3. MASM expands this as follows:
A DW 0
DW 0
DW 0
DW 0
DW 0
Of course, this example is trivial because we can just write
A DW 5 DUP (0)
Another way to invoke a REPT macro is to place it in an ordinary macro, and invoke that macro.
Example 13.9 Write a macro to initialize a block of memory to the first N integers.
BLOCK MACRO N
K = 1
REPT N
DW K
K = K + 1
ENDM
ENDM
Note: In this macro, we used the = pseudo-op rather than EQU, because a constant defined with = can be redefined, and K must change on each repetition.
To define a word array A and initialize it to the first 100 integers, we can place the following statements in the data segment:
A LABEL WORD
BLOCK 100
Invocation of the BLOCK macro initializes K to 1 and the statements inside the REPT are assembled 100 times. The first time, DW 1 is generated and K is increased to 2; the second time, DW 2 is generated and K becomes 3, ... the 100th time, DW 100 is generated and K = 101. The final result is equivalent to
A DW 1
DW 2
...
DW 100
Example 13.10 Write a macro to initialize an n-word array to 1!, 2!, ..., n! and show how to invoke it.
FACTORIALS MACRO N
M = 1
FAC = 1
REPT N
DW FAC
M = M+1
FAC = M*FAC
ENDM
ENDM
To define a word array B of the first eight factorials, the data segment can contain
B LABEL WORD
FACTORIALS 8
Because 8! = 40320 is the largest factorial that will fit in a 16-bit word, it doesn't make sense to invoke this macro for larger values of N. The expansion is
B DW 1
DW 2
DW 6
DW 24
DW 120
DW 720
DW 5040
DW 40320
Another repetition macro is IRP (indefinite repeat). It has the form
IRP d,<a1,a2,...,an>
statements
ENDM
Note: The angle brackets in the above definition are part of the syntax.
When it is expanded, this macro causes the statements to be assembled n times; on the ith expansion, each occurrence of parameter d is replaced by ai.
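As with REPT, an IRP block can be placed directly in the program. The following is a small sketch, assuming it appears in the data segment; the array name POWERS and the dummy parameter X are hypothetical names chosen for illustration.

```asm
;sketch: define a word array holding 1, 2, 4, 8
;POWERS and X are illustrative names
POWERS  LABEL   WORD
        IRP     X,<1,2,4,8>
        DW      X               ;X is replaced by each list item in turn
        ENDM
```

The statement inside the block is assembled four times, once for each item in the list, so the result is equivalent to DW 1, DW 2, DW 4, and DW 8 on successive lines.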
Example 13.11 Write macros to save and restore an arbitrary number of registers.
SAVE_REGS MACRO REGS
IRP D,<REGS>
PUSH D
ENDM
ENDM
RESTORE_REGS MACRO REGS
IRP D,<REGS>
POP D
ENDM
ENDM
To save AX,BX,CX,DX, we can write
SAVE_REGS <AX,BX,CX,DX>
It has the following expansion:
PUSH AX
PUSH BX
PUSH CX
PUSH DX
To restore these registers, write,
RESTORE_REGS <DX,CX,BX,AX>
To use the macro structures introduced so far, we write a macro HEX_OUT that displays the contents of a word as four hex digits. The hex output algorithm, discussed in Chapter 7, is the following:
1: FOR 4 times DO
2: Move BH to DL
3: Shift DL 4 times to the right
4: IF DL < 10
5: THEN
6: convert contents of DL to a character in '0'..'9'
7: ELSE
8: convert contents of DL to a character in 'A'..'F'
9: END_IF
10: output character
11: Rotate BX left 4 times
12: END_FOR
The following listing contains the macro HEX_OUT and a program to test it. HEX_OUT invokes four other macros: (1) SAVE_REGS and (2) RESTORE_REGS from example 13.11; (3) CONVERT_TO_CHAR, which converts the contents of a byte to a hex digit character (lines 4-9 in the algorithm); and (4) DISP_CHAR, which displays a character (line 10 in the algorithm).
0: TITLE PGM13_3: HEX OUTPUT MACRO DEMO
1: .MODEL SMALL
2:
3: SAVE_REGS MACRO REGS
4: IRP D,<REGS>
5: PUSH D
6: ENDM
7: ENDM
8:
9: RESTORE_REGS MACRO REGS
10: IRP D,<REGS>
11: POP D
12: ENDM
13: ENDM
14:
15: CONVERT_TO_CHAR MACRO BYT
16: LOCAL ELSE_,EXIT
17: ;converts contents of BYT to a hex digit char
18: ;if
19: CMP BYT,9 ;contents <= 9?
20: JNLE ELSE_ ;no, >= Ah
21: ;then
22: OR BYT,30H ;convert to digit char
23: JMP EXIT
24: ELSE_:
25: ADD BYT,37H ;convert to digit char
26: EXIT:
27: ENDM
28:
29: DISP_CHAR MACRO BYT
30: ;displays contents of BYT
31: PUSH AX
32: MOV AH,2
33: MOV DL,BYT
34: INT 21H
35: POP AX
36: ENDM
37:
38: HEX_OUT MACRO WRD
39: ;displays contents of WRD as 4 hex digits
40: SAVE_REGS <BX,CX,DX>
41: MOV BX,WRD
42: MOV CL,4 ;shift and rotate count
43: REPT 4
44: MOV DL,BH
45: SHR DL,CL ;shift right 4 times
46: CONVERT_TO_CHAR DL ;convert DL to digit char
47: DISP_CHAR DL ;display DL
48: ROL BX,CL ;rotate left 4 times
49: ENDM
50: RESTORE_REGS <DX,CX,BX>
51: ENDM
52:
53: .STACK
54: .CODE
55: ;program to test above macros
56: MAIN PROC
57: MOV AX,1AF4h ;test data
58: HEX_OUT AX ;display in hex
59: MOV AH,4CH ;dos exit
60: INT 21H
61: MAIN ENDP
62: END MAIN
Sample execution:
C>PGM13_3
1AF4
To code the FOR loop in the hex output algorithm, HEX_OUT uses a REPT ... ENDM (lines 43-49). This was done mostly for illustrative purposes. It makes the machine code of the expanded macro longer, because the loop body is assembled four times, but it has the advantage of freeing CL for use as a shift and rotate counter.
At line 46, macro CONVERT_TO_CHAR is invoked to transform the contents of DL to a hex digit character. This macro has two local labels, declared at line 16. At line 47, macro DISP_CHAR is invoked to display the contents of DL.
13.7 Conditionals
Conditional pseudo-ops may be used to assemble certain statements and exclude others. They may be used anywhere in an assembly language program, but are most often used inside macros. The basic forms are
Conditional
    statements
ENDIF

and

Conditional
    statements1
ELSE
    statements2
ENDIF
In the first form, if Conditional evaluates to true, the statements are assembled; if not, nothing is assembled. In the second form, if Conditional is true, then statements1 are assembled; if not, statements2 are assembled (ELSE and ENDIF are pseudo-ops).
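The two forms can be sketched with the IF pseudo-op, which tests a constant expression. The symbol DEBUG and the values chosen here are hypothetical; the fragment is meant to sit inside a code segment.

```asm
DEBUG   EQU     1               ;hypothetical switch: nonzero = include trace code

;first form: these statements are assembled only if DEBUG is nonzero
        IF      DEBUG
        MOV     AH,2            ;display character function
        MOV     DL,'*'          ;trace marker
        INT     21H
        ENDIF

;second form: exactly one of the two MOVs is assembled
        IF      DEBUG
        MOV     CX,1            ;statements1: short run while debugging
        ELSE
        MOV     CX,100          ;statements2: full run otherwise
        ENDIF
```

Changing the EQU to 0 and reassembling removes the trace code from the program entirely; no test is made at run time.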
Table 13.1 gives the forms of the most useful conditional pseudo-ops and what is required for them to be evaluated as true.
In section 13.4, we used the conditional IF1 to include a macro library in a program. The next examples show how some of the other conditionals may be used.
Form                  TRUE IF
IF exp                Constant expression exp is nonzero.
IFE exp               Exp is zero.
IFB <arg>             Argument arg is missing (blank). Angle brackets are required.
IFNB <arg>            Arg is not missing (not blank).
IFDEF sym             Symbol sym is defined in the program (or declared as EXTRN).
IFNDEF sym            Sym is not defined or EXTRN.
IFIDN <str1>,<str2>   Strings str1 and str2 are identical. Angle brackets are required.
IFDIF <str1>,<str2>   Strings str1 and str2 are not identical.
IF1                   Assembler is making the first assembly pass.
IF2                   Assembler is making the second assembly pass.
Note: The EXTRN directive is discussed in Chapter 14.
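Example 13.12 Write a macro to define a block of memory words with N entries, consisting of the first K integers followed by N - K zero words.
BLOCK MACRO N,K
I = 1
REPT N
IF K+1-I
DW I
I = I+1
ELSE
DW 0
ENDIF
ENDM
ENDM
If this macro is invoked to define an array
A LABEL WORD
BLOCK 10,5
the expansion initializes I to 1 and assembles the statements inside the REPT 10 times. After five passes, DW 1 ... DW 5 have been generated and I is 6, so K+1-I is 0; DW 0 is generated on each of the remaining five passes. The result is equivalent to
A DW 1,2,3,4,5,0,0,0,0,0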
Recall from Chapter 11, exercise 9, that INT 21h, function 0AH, stores a string that the user types in the byte array whose offset address is contained in DX. The first byte of the array must contain the maximum number of characters expected. DOS fills in the next byte with the actual number of characters read.
Example 13.13 Write a macro READ MACRO BUF,LEN that either uses INT 21h, function 0AH, to read a string into the byte array BUF of length LEN (if both arguments are present), or uses INT 21h, function 1, to read a single character into AL (if both arguments are missing).
READ MACRO BUF,LEN
;BUF = string buffer address
;LEN = max no. of chars to read
IFNB <BUF>
IFNB <LEN>
MOV AH,0AH ;read string fcn
LEA DX,BUF ;DX has string addr
MOV BUF,LEN ;1st byte has array size
INT 21H ;read string
ENDIF
ELSE
MOV AH,1 ;read char fcn
INT 21H ;read char
ENDIF
ENDM
If the preceding macro is invoked by the statement
READ MSG, 10
then since both arguments are present, the code
MOV AH, 0AH
LEA DX, MSG
MOV MSG, 10
INT 21H
is assembled. String MSG must be a declared array of at least 13 bytes (1 byte for the maximum number of characters expected, 1 byte for the actual number of characters read, 10 bytes for the characters, and 1 byte for a carriage return). If the macro invocation is
READ
then since both arguments are blank, the code following ELSE, namely
MOV AH, 1
INT 21h
is assembled. If this macro is improperly invoked with only one argument—for example, READ MSG—then no code is assembled.
Because macros may be called in a variety of situations, it's possible they may be invoked incorrectly. The .ERR directive provides a way for the assembler to tell the user about this. If MASM encounters this directive, it displays the message "forced error", which indicates a fatal assembly error.
Example 13.14 Write a program containing a macro to display a character. The macro should produce an assembly error if its parameter is omitted.
Solution:
Program Listing PGM13_4.ASM
TITLE PGM13_4: .ERR DEMO
.MODEL SMALL
.STACK 100H
DISP_CHAR MACRO CHAR
IFNB <CHAR>
MOV AH,2
MOV DL,CHAR
INT 21H
ELSE
.ERR
ENDIF
ENDM
.CODE
MAIN PROC
DISP_CHAR 'A' ;legal call
DISP_CHAR ;illegal call
MOV AH,4CH
INT 21H
MAIN ENDP
END MAIN
C>MASM PGM13_4;
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
PGM13_4.ASM(15): error A2089: Forced error
50050 + 418663 Bytes symbol space free
0 Warning Errors
1 Severe Errors
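The same idea can be applied to the READ macro of example 13.13, which silently assembles nothing when it is given only one argument. The following is a sketch only; the name READ_CHK is hypothetical, and the macro simply adds .ERR to the two one-argument cases.

```asm
READ_CHK MACRO BUF,LEN
;hypothetical variant of READ that rejects one-argument calls
        IFNB    <BUF>
          IFNB  <LEN>
        MOV     AH,0AH          ;read string fcn
        LEA     DX,BUF          ;DX has string addr
        MOV     BUF,LEN         ;1st byte has array size
        INT     21H             ;read string
          ELSE
        .ERR                    ;BUF present but LEN missing
          ENDIF
        ELSE
          IFNB  <LEN>
        .ERR                    ;LEN present but BUF missing
          ELSE
        MOV     AH,1            ;read char fcn
        INT     21H             ;read char
          ENDIF
        ENDIF
        ENDM
```

With this version, READ_CHK MSG,10 and READ_CHK assemble the same code as before, while READ_CHK MSG should stop the assembly with a forced error instead of producing an empty expansion.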
Macros and procedures are alike in the sense that both are written to carry out tasks for a program, but it can sometimes be difficult for a programmer to decide which structure is best in a given situation. Here are some considerations:
A program containing macros usually takes longer to assemble than a similar program containing procedures, because it takes time to expand the macros. This is especially true if library macros are involved.
The code generated by a macro expansion generally executes faster than a procedure call, because the latter involves saving the return address, transferring control to the procedure, passing data into the procedure, and returning from the procedure.
A program with macros is generally larger than a similar program with procedures, because each macro invocation causes a separate code block to be copied into the program. However, a procedure is coded only once.
Macros are especially suitable for small, frequently occurring tasks. Liberal use of such macros can result in source code that resembles high-level language. However, big jobs are usually best handled by procedures, because big macros generate large amounts of code if they are called very often.
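To make the trade-off concrete, here is one small task, moving the cursor to the start of the next line, sketched both ways. The names NEW_LINE and NEW_LINE_P are hypothetical, and the procedure is assumed to be placed in the code segment.

```asm
;macro version: the whole body is copied in at every invocation
NEW_LINE MACRO
        PUSH    AX
        PUSH    DX
        MOV     AH,2            ;display character function
        MOV     DL,0DH          ;carriage return
        INT     21H
        MOV     DL,0AH          ;line feed
        INT     21H
        POP     DX
        POP     AX
        ENDM

;procedure version: the body is assembled once
NEW_LINE_P PROC
        PUSH    AX
        PUSH    DX
        MOV     AH,2            ;display character function
        MOV     DL,0DH          ;carriage return
        INT     21H
        MOV     DL,0AH          ;line feed
        INT     21H
        POP     DX
        POP     AX
        RET
NEW_LINE_P ENDP
```

Writing NEW_LINE ten times places ten copies of the body in the program, with no call overhead. Writing CALL NEW_LINE_P ten times adds only a 3-byte near CALL each time, but every use pays for the CALL and the RET at run time.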
A macro is a named block of text. It may consist of instructions, pseudo- ops, or references to other macros.
A macro is invoked at assembly time. To expand a macro, MASM copies the macro text into the program at the position of the invocation, just as if the user had typed it in. If the macro has a dummy parameter list, actual parameters replace the dummy ones. MASM replaces any instructions by machine language code.
An important use of macros is to create new instructions.
Macro expansions may be viewed in a program's .LST file. Three assembler directives govern how the expansion will appear. After .SALL, the macro expansion is not listed. After .XALL, only those source lines that generate code are listed. After .LALL, all source lines are listed, except comments that are preceded by ;;.
Local labels may be used within a macro. Each time the macro is invoked, a different label is generated. This gets around the problem of having duplicate labels resulting from several macro invocations.
A macro may invoke another macro, or itself.
A library file of macros can be created. Its macros may be used in a program if the INCLUDE pseudo-op is used.
The REPT macro may be used to repeat a block of statements. It has a single argument that specifies the number of times to repeat the statements. It can be placed in the program at the point the statements are to be repeated, or enclosed in another macro. The REPT macro has no name field.
The IRP macro may be used to repeat statements an arbitrary number of times.
By using conditional pseudo-ops within macros, MASM can be made to assemble certain statements and exclude others.
The .ERR directive provides a way to inform the user that a macro is being incorrectly called.
Macros and procedures each have advantages. Programs with macros usually take longer to assemble, and they generate more machine code, but execute faster. Small tasks are often best handled by macros, and procedures are better for large tasks.
conditional pseudo-ops
Pseudo-ops used to assemble certain statements and exclude others
expand (a macro)
When MASM encounters a macro name in a program, it replaces the macro name by its body
invoke (a macro)
Use the macro name in a program
local label
A label defined with the LOCAL pseudo-op inside a macro. Each time the macro is invoked, MASM generates a different numerical label when the local label is encountered
New Pseudo-Ops
conditional pseudo-ops (see Table 13.1), ELSE, ENDIF, ENDM, .ERR, IRP, LOCAL, MACRO, REPT
- Write the following macros. All registers used by the macros should be restored, except those that return results.
a. MUL_N MACRO N, which puts the signed 32-bit product of AX and the number N in DX and AX.
b. DIV_N MACRO N, which divides the number in AX by the number N and puts the signed 16-bit quotient in AX. You may assume that N is not 0.
c. MOD MACRO M,N, which returns in AX the remainder after M is divided by N. Note that M and N may be 16-bit words, registers, or constants. You may assume that N is not 0.
d. POWER MACRO N, which takes the number in AX and raises it to the power of N, where N is a positive number. The result should be stored in AX. If the result is too big to fit in 16 bits, the macro should set CF/OF.
- Write a macro C_TO_F, which takes an argument C (which represents a centigrade temperature), and converts it to Fahrenheit temperature F according to the formula F = (9/5 × C) + 32. To do the multiplication by 9 and division by 5, your macro should invoke the MUL_N and DIV_N macros of exercises 1(a) and 1(b). The result, truncated to an integer, is returned in AX. If overflow occurs on multiplication, CF/OF should be set.
- Write a macro GCD MACRO M,N that computes the greatest common divisor of arguments M and N. Euclid's algorithm for computing the GCD of M and N is
WHILE N ≠ 0 DO
M = M MOD N
Swap M and N
END WHILE
RETURN M
Your macro should invoke the MOD macro of exercise 1(c).
- Macros are especially useful in graphics applications. Write the following macros:
a. A macro MOV_CURSOR MACRO R,C that moves the cursor to row R and column C.
b. A macro DISP_CHAR MACRO CHAR,ATTR that displays character CHAR with attribute ATTR once at the cursor position.
c. A macro CLEAR_WINDOW MACRO R1,C1,R2,C2,COLOR that clears a window with upper left corner at (C1,R1), lower right corner at (C2,R2), and attribute COLOR.
d. A macro DRAW_BOX MACRO R1,C1,R2,C2 that draws a box outline with upper left corner at (C1,R1), and lower right corner at (C2,R2). Use extended ASCII characters for the corners and sides.
- Use a REPT to write the following macros:
a. A macro ALT MACRO N, where N is a positive even integer, that initializes a block of N memory bytes to alternating 0's and 1's, beginning with 0. Show how the macro would be invoked to initialize a 100-byte array BYT.
b. A macro ARITH MACRO B,I,N, where B, I, and N are positive integers, that initializes a block of memory words to the following arithmetic progression: B, B + I, B + 2 × I . . .
c. A macro POWERS_OF_TWO MACRO N, where N is a nonnegative integer, that may be used to initialize a block of N memory words to 1, 2, 4, 8, 16, . . ., 2^(N-1). Show how the macro is invoked to initialize a 10-word array W.
d. A macro BIN MACRO N, K, where N and K are nonnegative integers, that will move the binomial coefficient
- State what code, if any, would be assembled in the following macro:
MAC1 MACRO M
IF M-1
MOV AX,M
M = M-1
IFE M
MOV BX,M
ENDIF
ENDIF
ENDM
a. For the macro invocation MAC1 1?
b. For the macro invocation MAC1 2?
- State what code, if any, would be assembled in the following macro:
MAC2 MACRO M, K REPT M MOV AX,M
a. For the macro invocation MAC2 5,1?
b. For the macro invocation MAC2 2,2?
- The Fibonacci sequence is 1, 1, 2, 3, 5, 8, 13, 21, 34 . . . Write a macro FIB MACRO N whose invocation will cause the instruction MOV AX,FN to be assembled, where FN is the Nth Fibonacci number. For example, the call FIB 8 would cause the instruction MOV AX,21 to be assembled.
Here is an iterative algorithm for producing the Nth Fibonacci number:
IF N = 1 THEN FN = 1
ELSE
LO = 0
HI = 1
REPEAT N- 1 TIMES
X = LO
LO = HI
HI = X + LO
FN = HI
Until now, all our programs have consisted of a code segment, a stack segment, and perhaps a data segment. If there were other procedures besides the main procedure, they were placed in the code segment after the main procedure. In this chapter, you will see that programs can be constructed in other ways.
In section 14.1, we discuss the .COM program format in which code, data, and stack fit into a single segment. .COM programs have a simple structure and don't take up as much disk space as .EXE programs, so system programs are often written in this format.
Section 14.2 shows how procedures can be placed in different modules, assembled separately, and linked into a single program. In this way they can be written and tested separately. The modules containing these procedures may have their own code and data segments; when the modules are linked, the code segments can be combined, as can the data segments.
Section 14.3 covers the full segment definitions. They provide complete control over the ordering, combination, and placement of program segments.
Section 14.4 provides more information about the simplified segment definitions that we have been using throughout the book.
The procedures we've written so far have generally passed data values through registers. Section 14.5 shows other ways for procedures to communicate.
In this section we discuss a program format in which the code, data, and stack segments coincide. This type of program is also known as a .COM program, because .COM is the extension given to the run file. As you will see, the primary advantages of .COM programs are their simple structure and the fact that they take up relatively little disk space. The disadvantages are in flexibility and limited size, because everything (code, data, and stack) must fit into the single segment.
A problem with .COM programs is where to place the data, if any, because they are in the same segment as the code. They can be put at the end of the program, but this requires use of the full segment declarations (section 14.3). We choose to place the data at the beginning of the program. Here is the form of a .COM program:
0: TITLE
1: .MODEL SMALL
2: .CODE
3: ORG 100H
4: START:
5: JMP MAIN
6: ;data goes here
7: MAIN PROC
8: ;instructions go here
9: ;dos exit
10: MOV AH,4CH
11: INT 21H
12: MAIN ENDP
13: ;other procedures go here
14: END START
Let's look at the differences between this format and the format we've been using up till now (.EXE program format). First, there is only one segment, defined by .CODE. Because the first statement must be an instruction, the program begins with a JMP around the data. The label START indicates the entry point to the program; this label is also the operand of the END in line 14. The reason for the ORG 100h directive is explained as follows.
In Chapter 4 we mentioned that when an .EXE program is loaded in memory, it is preceded by a 100h-byte information area called the program segment prefix (PSP). The same is true for .COM programs, and for them, the PSP occupies the first 100h bytes of the segment.
The ORG 100h directive assigns 100h to the location counter, which keeps track of the relative location of the statement currently being assembled. Ordinarily, the location counter is set to 0 at the beginning of a segment. ORG 100h makes it start at 100h instead.
Now suppose a .COM program has some data. Without the ORG 100h, the assembler would assign addresses to variables relative to the beginning of the segment; this would incorrectly place them in the PSP. With the ORG 100h, variables are correctly assigned addresses relative to the beginning of the program, which starts 100h bytes after the beginning of the segment.
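The effect of ORG 100h on variable addresses can be sketched as follows. The variable COUNT and its value are hypothetical, and the offsets in the comments assume that the assembler (MASM 5.1 here) reserves 3 bytes for the forward JMP; a different assembler or version could place COUNT elsewhere.

```asm
        .MODEL  SMALL
        .CODE
        ORG     100H            ;location counter starts at 0100h
START:
        JMP     MAIN            ;offset 0100h
COUNT   DB      5               ;offset 0103h if the JMP occupies 3 bytes
MAIN    PROC
        MOV     DL,COUNT        ;refers to offset 0103h, just past the JMP
        OR      DL,30H          ;convert 5 to the character '5'
        MOV     AH,2            ;display character function
        INT     21H
        MOV     AH,4CH          ;dos exit
        INT     21H
MAIN    ENDP
        END     START
```

Without the ORG, COUNT would be assigned offset 0003h, which lies inside the PSP once the program is loaded, so the MOV would read the wrong byte.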
In a .COM program, the stack is in the same segment as the code and data. Unlike an .EXE program, the programmer does not have to define a stack area. When the program is loaded, SP is initialized to FFFEh, the last word in the segment. Because the stack grows toward the beginning of memory, there is little danger that the stack will interfere with the code, unless the stack gets very large or there is a lot of code. Figure 14.1 shows how a .COM program looks after it has been loaded in memory, if defined with the preceding format.
Figure 14.1 A .COM Program in Memory
As an example, let's rewrite PGM4_2.ASM in .COM format. The program just displays HELLO! on the screen. To aid in the comparison, PGM4_2.ASM is reproduced here and renumbered PGM14_1.ASM.
TITLE PGM14_1: HELLO
.MODEL SMALL
.STACK 100H
.DATA
MSG DB 'HELLO!$'
.CODE
MAIN PROC
;initialize DS
MOV AX,@DATA
MOV DS,AX ;initialize DS
;display message
LEA DX,MSG ;get message
MOV AH,9 ;display string function
INT 21H ;display message
; return to DOS
MOV AH, 4CH
INT 21H
MAIN ENDP
END MAIN
Now here is the program written in .COM format.
TITLE PGM14_2: COM DEMO
.MODEL SMALL
.CODE
ORG 100H
START:
JMP MAIN
MSG DB 'HELLO!$'
MAIN PROC
LEA DX,MSG
MOV AH,9
INT 21H
MOV AH,4CH ;dos exit
INT 21H
MAIN ENDP
END START
Note that because there is only one segment, the instructions
MOV AX,@DATA
MOV DS,AX
which are required for an .EXE program that has data, are not needed in a .COM program.
The assemble and link steps are the same as before:
A>C:MASM PGM14_2;
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
418713 Bytes symbol space free
0 Warning Errors
0 Severe Errors
A>C:LINK PGM14_2;
Microsoft (R) Overlay Linker Version 3.64
Copyright (C) Microsoft Corp 1983- 1988. All rights reserved.
LINK : warning L4021: no stack segment
This warning may be ignored since a .COM program doesn't have a separate stack segment.
For a .COM program, the .EXE file that is produced by the LINK program is not the run file. It must be converted to .COM file format by running the DOS utility program EXE2BIN:
A>C:EXE2BIN PGM14_2 PGM14_2.COM
The first argument to EXE2BIN is PGM14_2; the default extension is .EXE. The second argument, PGM14_2.COM, is the output file name. The .EXE file that was created in the preceding steps is no longer needed and should be erased before running the program. To execute it, we type
A>PGM14_2
HELLO!
A>
As mentioned before, a primary advantage of .COM programs is their small size. The size of PGM14_1.EXE is 801 bytes vs. 22 bytes for PGM14_2.COM. The main reason for the size discrepancy is that an .EXE file has a 512-byte header block, which contains information about the size of the executable code, where it is to be located in memory, and other data. Another reason is that an .EXE program contains a separate stack segment.
For large programs with many procedures, it is convenient to put procedures in separate files. There are two primary reasons for doing this:
- The procedures can be coded, assembled, and tested separately, possibly by different programmers.
- When procedures are assembled separately, they can use the same names for variables and/or statement labels. This is because the assembler allows a name to be local to a file and it will not conflict with the same name in a different file.
A separately assembled procedure must be contained in an assembly module. This is an .ASM file consisting of at least one segment definition. The assembler takes an assembly module and produces an .OBJ file called an object module. The linker then combines object modules into an .EXE file that can be executed.
In section 8.3, we noted that the syntax of procedure declaration is
name PROC type
where type is NEAR or FAR (the default is NEAR). A procedure is NEAR if the statement that calls it is in the same segment as the procedure itself; a procedure is FAR if it is called from a different segment.
Because a FAR procedure is in a different segment from its calling statement, the CALL instruction causes first CS and then IP to be saved on the stack; then CS:IP gets the segment:offset of the procedure. To return, RET pops the stack twice to restore the original CS:IP.
You'll see in a moment that a procedure can be NEAR, even if it's assembled separately. A procedure must be typed as FAR if it's impossible for the calling statement and the procedure to fit into a single memory segment, or if the procedure will be called from a high-level language.
When assembling a module, the assembler must be informed of names which are used in the module but are defined in other modules; otherwise these names will be flagged as undeclared. This is done by the EXTRN pseudo-op, whose syntax is
EXTRN external_name_list
Here, external_name_list is a list of arguments of the form name:type where name is an external name, and type is one of the following: NEAR, FAR, WORD, BYTE, or DWORD. For externally declared procedures, type would be NEAR or FAR. The types WORD, BYTE, and DWORD are used for variables.
For example, to inform MASM of the existence of a NEAR procedure PROC1 and a FAR procedure PROC2 that are defined in separate modules, we would say,
EXTRN PROC1:NEAR, PROC2:FAR
Now suppose MASM encounters the statement
CALL PROC1
MASM knows from the EXTRN list that PROC1 is in another assembly module, and allocates an undefined address to PROC1. The address is filled in when the modules are linked.
The EXTRN pseudo-op may appear anywhere in the program, as long as it precedes the first reference to any of the names in the external name list. We will place it at the beginning of the program.
A procedure or variable must be declared with the PUBLIC pseudo-op if it is to be used in a different module. The syntax is
PUBLIC name_list
where name_list is a list of procedure and variable names that may be referred to in a different module. The PUBLIC pseudo-op can appear anywhere in a module, but we will usually place it near the beginning of the module.
The LINK program combines object modules into a single executable machine language program. It tries to match names that are declared in EXTRN directives with PUBLIC declarations in the other modules. It combines code and data segments in different modules according to the segment declarations of these segments (see section 14.3). With the relative positions of instructions and data known, it is able to fill in the addresses left undefined by MASM.
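As a minimal sketch of how the two pseudo-ops pair up, consider two modules that share both a procedure and a variable. The file names MOD_A.ASM and MOD_B.ASM and the names TOTAL and ADD_ONE are hypothetical. The EXTRN for the variable is placed inside .DATA here so that the assembler associates it with the data segment; that placement is a choice made for this sketch.

```asm
;---- file MOD_A.ASM (hypothetical): defines TOTAL, uses ADD_ONE ----
        PUBLIC  TOTAL
        EXTRN   ADD_ONE:NEAR
        .MODEL  SMALL
        .STACK  100H
        .DATA
TOTAL   DW      0
        .CODE
MAIN    PROC
        MOV     AX,@DATA
        MOV     DS,AX           ;initialize DS
        CALL    ADD_ONE         ;address filled in by LINK
        MOV     AH,4CH          ;dos exit
        INT     21H
MAIN    ENDP
        END     MAIN

;---- file MOD_B.ASM (hypothetical): defines ADD_ONE, uses TOTAL ----
        PUBLIC  ADD_ONE
        .MODEL  SMALL
        .DATA
        EXTRN   TOTAL:WORD      ;defined in MOD_A
        .CODE
ADD_ONE PROC    NEAR
        INC     TOTAL           ;address filled in by LINK
        RET
ADD_ONE ENDP
        END
```

Each module is assembled on its own, and a command such as LINK MOD_A + MOD_B; would then match each EXTRN name with the PUBLIC declaration in the other module.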
As an example, we will rewrite PGM4_3.ASM, which displays a prompt, reads a lowercase letter, and converts it to upper case.
There are two assembly modules. The first module contains the main procedure; it displays a message, lets the user enter the lowercase letter, and calls a procedure CONVERT, which converts the letter to uppercase and displays it with another message. CONVERT is defined in another module.
0: TITLE PGM14_3: CASE CONVERSION
1: EXTRN CONVERT:NEAR
2: .MODEL SMALL
3: .STACK 100H
4: .DATA
5: MSG DB 'ENTER A LOWERCASE LETTER:$'
6: .CODE
7: MAIN PROC
8: MOV AX,@DATA
9: MOV DS,AX ;initialize DS
10: MOV AH,9 ;display string fcn
11: LEA DX,MSG ;get MSG
12: INT 21H ;display it
13: MOV AH,1 ;read char fcn
14: INT 21H ;input char
15: CALL CONVERT ;convert to uppercase
16: MOV AH,4CH
17: INT 21H ;DOS exit
18: MAIN ENDP
19: END MAIN
The first module consists of stack, data, and code segments. After initializing DS at lines 8 and 9, the program prints the message "ENTER A LOWERCASE LETTER:" and calls procedure CONVERT. The existence of CONVERT as a procedure in another module is made known to the assembler by the EXTRN directive in line 1. The first module ends with an END directive in line 19, with the entry point MAIN to the program.
0: TITLE PGM14_3A: CONVERT
1: PUBLIC CONVERT
2: .MODEL SMALL
3: .DATA
4: MSG DB 0DH,0AH,'IN UPPERCASE IT IS '
5: CHAR DB -20H,'$'
6: .CODE
7: CONVERT PROC NEAR
8: ;converts char in AL to uppercase
9: PUSH BX
10: PUSH DX
11: ADD CHAR,AL ;convert to uppercase
12: MOV AH,9 ;display string fcn
13: LEA DX,MSG ;get MSG
14: INT 21H ;display it
15: POP DX ;restore registers
16: POP BX
17: RET
18: CONVERT ENDP
19: END
The module containing CONVERT has its own data and code segments. When the modules are linked, the code segments from the two modules are combined into a single code segment; similarly, the data segments are combined into a single segment (you'll see the reason for this in section 14.3).
At line 1, CONVERT is declared PUBLIC, enabling it to be called from the first module. At line 7, procedure CONVERT is declared as type NEAR because the code segments of the two modules are combined. Because the data segments are also combined, it's not necessary to initialize DS in the second module; this was done in the first module. The module ends with an END directive; unlike the first module, the END has no operand.
After saving the registers used, CONVERT begins at line 11 by adding the lowercase letter in AL to the -20h stored in byte variable CHAR. This converts the letter to upper case (assuming a lowercase letter was entered). At lines 12-14, the procedure outputs the final message. Note that the name MSG is used in both modules.
Now let's assemble and link the modules. MASM and LINK will be in drive C, and the source files in drive A. A is the logged drive.
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
49984 + 390317 Bytes symbol space free
0 Warning Errors
0 Severe Errors
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
49976 + 390325 Bytes symbol space free
0 Warning Errors
0 Severe Errors
C>LIB MYLIB
Microsoft (R) Library Manager Version 3.10
Copyright (C) Microsoft Corp 1983-1988. All rights reserved.
Operations:
List file: MYLIB
When LIB asks for a list file, we reply MYLIB. This creates a listing file MYLIB, which looks like this:
C>TYPE MYLIB
CONVERT....pgm14_3a
pgm14_3a Offset: 00000010H Code and data size: 29H
CONVERT
The listing shows the object module names and the procedures they contain. In this case, the only object module in the library is PGM14_3A.OBJ, and it contains only procedure CONVERT.
For more information about the LIB utility, consult the Microsoft CodeView and Utilities manual.
14.3 Full Segment Definitions
The simplified segment definitions that we have been using up till now are adequate for most purposes. In this section we consider the full segment definitions. The primary reasons for using them are as follows:
- Full segment definitions must be used for versions of MASM earlier than version 5.0.
- With the full segment definitions, the programmer can control how segments are ordered, combined with each other, and aligned relative to each other in memory.
The full form of the segment directive is
name SEGMENT align combine class
The operands align, combine, and class are optional types, and are discussed in the next section. To end a segment, we say
name ENDS
For example, we could define a data segment called D_SEG as follows:
D_SEG SEGMENT
;data goes here
D_SEG ENDS
Now let's look at the segment operands.
The align type of a segment declaration determines how the starting address of the segment is selected when the program is loaded in memory. Table 14.1 gives the options.
The significance of a segment's align type may be illustrated by the following example. Let SEG1 and SEG2 be segments declared like this:
SEG1 SEGMENT PARA
DB 11H DUP (1)
SEG1 ENDS
SEG2 SEGMENT PARA
DB 10H DUP (2)
SEG2 ENDS
Suppose these segments are loaded sequentially, with SEG1 being assigned segment number 1010h. The 11h bytes of SEG1 will extend from 1010:0000h to 1010:0010h. Now, because SEG2 has a PARA align type, it begins at the next available paragraph boundary, which is at 1012:0000 = 1010:0020. Here is a DEBUG display of memory:
We see there is a gap of 0Fh bytes (1010:0011h through 1010:001Fh) between the end of SEG1 and the start of SEG2.
PARA Segment begins at the next available paragraph (least significant hex digit of physical address is 0).
BYTE Segment begins at the next available byte.
WORD Segment begins at the next available word (least significant bit of physical address is 0).
PAGE Segment begins at the next available page (two least significant hex digits of physical address are 0).
Now suppose the segments are declared as follows:
SEG1 SEGMENT PARA
DB 11H DUP (1)
SEG1 ENDS
SEG2 SEGMENT BYTE
DB 10H DUP (2)
SEG2 ENDS
where SEG2 is given a BYTE align type. If these segments are loaded sequentially, memory will look like this:
1010:0000 01 01 01 01 01 01 01 01-01 01 01 01 01 01 01 01
1010:0010 01 02 02 02 02 02 02 02-02 02 02 02 02 02 02 02
1010:0020 02 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
The segments have been combined into a single memory segment with no wasted space.
If a program contains segments of the same name, the combine type tells how they are to be combined when the program is loaded in memory. Table 14.2 gives the most frequently used choices.
The assembler indicates an error if a stack segment does not have a STACK combine type. For other segments, if the combine type is omitted, the program segment is loaded into its own memory segment.
A frequent use of the PUBLIC combine type is to combine code segments with the same name from different modules into a single code segment. This means that all procedures can be typed as NEAR. Similarly, PUBLIC data segments can be combined into a single data segment. The advantage is that DS needs only to be initialized once, and does not need to be modified to access any of the data. This is what happened in PGM14_3 when data segments were combined.
PUBLIC Segments with the same name are concatenated (placed one after the other) to form a single, continuous memory block.
COMMON Segments with the same name begin at the same place in memory; that is, they are overlaid.
STACK Has the same effect as PUBLIC, except that all offset addresses of instructions and data in the segment are relative to the SS register. SP is initialized to the end of the segment.
AT paragraph Indicates that the segment should begin at the specified paragraph.
Data segments in different modules can be given the same name and a COMMON combine type so that variables in one module can share the same memory locations as variables in the other module. To show how COMMON works, suppose we declare
D_SEG SEGMENT COMMON
A DB 11H DUP (1)
D_SEG ENDS
in FIRST.ASM, and
D_SEG SEGMENT COMMON
B DB 10H DUP (2)
D_SEG ENDS
in SECOND.ASM. If the modules are assembled and linked as follows:
C>LINK FIRST + SECOND;
then they will be overlaid in memory and variables A and B will be assigned the same address. The size of the common data segment will be that of the larger segment (11h bytes). However, the values of the bytes will be those that appear in SECOND, because it is the last module mentioned on the LINK command line. Memory will look like this:
02 02 02 02 02 02 02 02-02 02 02 02 02 02 02 02
The class type of a segment declaration determines the order in which segments are loaded in memory. Class type declarations must be enclosed in single quotes.
If two or more segments have the same class, they are loaded in memory one after the other. If classes are not specified in segment declarations, segments are loaded in the order they appear in the source listing.
For example, suppose we declare
C_SEG SEGMENT 'CODE'
;main procedure goes here
C_SEG ENDS
in module FIRST.ASM, and
C1_SEG SEGMENT 'CODE'
;another procedure goes here
C1_SEG ENDS
in module SECOND.ASM and these are the only segments of class 'CODE'. When the modules are assembled and linked, with FIRST named before SECOND on the LINK command line, C1_SEG will follow C_SEG in memory. However, there may be a gap between the segments; to eliminate it, C1_SEG could be given a BYTE align type.
Form of an .EXE Program with Full Segment Definitions
The form of an .EXE program with the full segment definitions is a little different from the way it is with simplified segment definitions. Here is the standard format:
S_SEG SEGMENT STACK
DB 100H DUP (?)
S_SEG ENDS
D_SEG SEGMENT
;data goes here
D_SEG ENDS
C_SEG SEGMENT
ASSUME CS:C_SEG, SS:S_SEG, DS:D_SEG
MAIN PROC
;initialize DS
MOV AX,D_SEG
MOV DS,AX
;other instructions
;dos exit
MOV AH,4CH
INT 21H
MAIN ENDP
;other procedures can go here
C_SEG ENDS
END MAIN
The segment names in this form are arbitrary. The ASSUME directive is unfamiliar, so we need to explain its role here.
When a program is assembled, MASM needs to be told which segments are the code, data, and stack; the purpose of the ASSUME directive is to associate the CS, SS, DS, and possibly ES registers with the appropriate segment. With the simplified segment directives, the segment registers are automatically associated with the correct segments, so no ASSUME is needed. However, for programs with data we still need to move the data segment number into DS at run time because, as we noted in Chapter 4, DS initially contains the segment number of the PSP.
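The difference between telling the assembler about a segment register and actually loading it can be sketched with a program that also uses ES. The segment E_SEG and the variables SRC and DEST are hypothetical names chosen for this illustration.

```asm
S_SEG   SEGMENT STACK
        DB      100H DUP (?)
S_SEG   ENDS
D_SEG   SEGMENT
SRC     DB      'A'
D_SEG   ENDS
E_SEG   SEGMENT
DEST    DB      ?
E_SEG   ENDS
C_SEG   SEGMENT
        ASSUME  CS:C_SEG,SS:S_SEG,DS:D_SEG,ES:E_SEG
MAIN    PROC
;ASSUME only informs the assembler; the registers are loaded here
        MOV     AX,D_SEG
        MOV     DS,AX
        MOV     AX,E_SEG
        MOV     ES,AX
        MOV     AL,SRC          ;SRC is in D_SEG, reached through DS
        MOV     DEST,AL         ;DEST is in E_SEG; assembler uses ES
;dos exit
        MOV     AH,4CH
        INT     21H
MAIN    ENDP
C_SEG   ENDS
        END     MAIN
```

Because of the ASSUME, the assembler knows that DEST must be addressed through ES and generates the segment override itself. If the two MOVs that load ES were left out, the program would still assemble, but the store would go to whatever segment ES happened to contain.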
14.3.2 Using the Full Segment Definitions
To show how the full segment definitions work, we'll use them to rewrite PGM14_3.ASM and PGM14_3A.ASM. We will do this two ways: in the first version, we'll use the default operands of the segment directives.
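Program Listing PGM14_4.ASM: First Module
0: TITLE PGM14_4: CASE CONVERSION
1: EXTRN CONVERT:FAR
2: S_SEG SEGMENT STACK
3: DB 100 DUP (0)
4: S_SEG ENDS
5: D_SEG SEGMENT
6: MSG DB 'ENTER A LOWERCASE LETTER: $'
7: D_SEG ENDS
8: C_SEG SEGMENT
9: ASSUME CS:C_SEG,DS:D_SEG,SS:S_SEG
10: MAIN PROC
11: MOV AX,D_SEG
12: MOV DS,AX ;initialize DS
13: MOV AH,9 ;display string fcn
14: LEA DX,MSG ;get MSG
15: INT 21H ;display it
16: MOV AH,1 ;read char fcn
17: INT 21H ;input char
18: CALL CONVERT ;convert to uppercase
19: MOV AH,4CH
20: INT 21H ;dos exit
21: MAIN ENDP
22: C_SEG ENDS
23: END MAIN
Program Listing PGM14_4A.ASM: Second Module
24: TITLE PGM14_4A: CONVERT
25: PUBLIC CONVERT
26: D_SEG SEGMENT
27: MSG DB 0DH,0AH,'IN UPPERCASE IT IS '
28: CHAR DB -20H,'$'
29: D_SEG ENDS
30: C_SEG SEGMENT
31: ASSUME CS:C_SEG,DS:D_SEG
32: CONVERT PROC FAR
33: ;converts char in AL to uppercase
34: PUSH DS ;save DS
35: PUSH DX ;and DX
36: MOV DX,D_SEG ;reset DS
37: MOV DS,DX ;to local data segment
38: ADD CHAR,AL ;convert to uppercase
39: MOV AH,9 ;display string fcn
40: LEA DX,MSG ;get MSG
41: INT 21H ;display it
42: POP DX ;restore DX
43: POP DS ;and DS
44: RET
45: CONVERT ENDP
46: C_SEG ENDS
47: END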
- We chose the same name C_SEG for the code segments in both modules, but because they don't have combine type PUBLIC, they will occupy separate memory segments when the modules are assembled and linked. This means procedure CONVERT must be typed as FAR (lines 1, 32).
- Because the data segments are also not PUBLIC, they occupy separate memory segments. This means procedure CONVERT needs to change DS in order to access the data in the second module (lines 36, 37). We use DX (instead of AX) to move the segment number into DS, because CONVERT receives its input in AL.
After assembling and linking the modules, let's look at the .MAP file (Figure 14.3). The segments appear in the order they appear in the source listings. Because the segments were defined with the default (PARA) align type, there are gaps between them.
Now let's rewrite the preceding modules to take full advantage of the SEGMENT directives. Here are the requirements:
- The code segments from the two programs are combined into a single segment, as are the data segments.
- Gaps between segments are eliminated.
- The order of the segments in the final program is: stack, data, code.
TITLE PGM14_5: CASE CONVERSION
EXTRN CONVERT:NEAR
S_SEG SEGMENT STACK
DB 100 DUP (0)
S_SEG ENDS
D_SEG SEGMENT BYTE PUBLIC 'DATA'
MSG DB 'ENTER A LOWERCASE LETTER: $'
D_SEG ENDS
C_SEG SEGMENT BYTE PUBLIC 'CODE'
ASSUME CS:C_SEG,DS:D_SEG,SS:S_SEG
MAIN PROC
MOV AX,D_SEG
MOV DS,AX ;initialize DS
MOV AH,9 ;display string fcn
LEA DX,MSG ;get MSG
INT 21H ;display it
MOV AH,1 ;read char fcn
INT 21H ;input char
CALL CONVERT ;convert to uppercase
MOV AH,4CH
INT 21H ;dos exit
MAIN ENDP
C_SEG ENDS
END MAIN
TITLE PGM14_5A: CONVERT
PUBLIC CONVERT
D_SEG SEGMENT BYTE PUBLIC 'DATA'
MSG1 DB 0DH,0AH,'IN UPPERCASE IT IS '
CHAR DB -20H,'$'
D_SEG ENDS
C_SEG SEGMENT BYTE PUBLIC 'CODE'
ASSUME CS:C_SEG,DS:D_SEG
CONVERT PROC NEAR
;converts char in AL to uppercase
PUSH DX
ADD CHAR,AL ;convert to uppercase
MOV AH,9 ;display string fcn
LEA DX,MSG1 ;get MSG1
INT 21H ;display it
POP DX
RET
CONVERT ENDP
C_SEG ENDS
END
As before, we assemble and link the modules. Figure 14.4 shows the .MAP file. It shows that the data and code segments of the two modules have been combined into single segments with no gaps between them. Here's how the SEGMENT operands were used:
- By using the same names for code and data segments in the two modules, and using a PUBLIC combine type, we formed a program consisting of only three segments. Also, gaps were eliminated by using a BYTE align type. Because the PUBLIC combine type causes segments with the same name to be concatenated, the use of class types 'CODE' and 'DATA' is actually redundant.
- Because the data for both modules now form a single segment, it wasn't necessary to reset DS in procedure CONVERT, and CONVERT doesn't need to save and restore DS. This is the primary reason for combining data segments.
- Because there is now only one code segment, we can give CONVERT a NEAR attribute.
Now that we have seen the full segment definitions, we can say more about the features of the simplified segment directives that we have been using throughout the book.
First, as we saw in section 4.7.1, a memory model must be specified when the simplified segment definitions are used. The choice of memory model depends on how many code and data segments there are. The syntax is
.MODEL memory_model
where memory_model is one of the choices listed in Table 14.3. Unless there is a lot of code or data, the SMALL model is adequate for most assembly language programs.
Second, for the SMALL model, Table 14.4 gives the simplified segments, their default names, and their align, combine, and class types. In addition to the .CODE, .DATA, and .STACK segments we have been using, uninitialized data can be declared in a separate .DATA? segment, and data that won't be changed by the program may be placed in a .CONST segment. For example,
.MODEL SMALL
.STACK 100H
.DATA
X DW 5
.DATA?
Y DW ?
.CONST
MSG DB 'HELLO$'
.CODE
MAIN PROC
MAIN ENDP
END MAIN
Here the usual initializing statements
MOV AX,@DATA
MOV DS,AX
allow the program access to the .DATA, .DATA?, and .CONST segments. This is because LINK actually combines these program segments into a single memory segment.
Table 14.3 Memory Models
Model     Description
SMALL     Code in one segment; data in one segment
MEDIUM    Code in more than one segment; data in one segment
COMPACT   Code in one segment; data in more than one segment
LARGE     Code in more than one segment; data in more than one segment; no array larger than 64 KB
HUGE      Code in more than one segment; data in more than one segment; arrays may be larger than 64 KB
Table 14.4 SMALL Model Segments
Segment   Default Name   Align   Combine   Class
.CODE     _TEXT          WORD    PUBLIC    'CODE'
.DATA     _DATA          WORD    PUBLIC    'DATA'
.DATA?    _BSS           WORD    PUBLIC    'BSS'
.STACK    STACK          PARA    STACK     'STACK'
.CONST    CONST          WORD    PUBLIC    'CONST'
Third, for the SMALL model, when .CODE is used to define code segments in separately assembled modules, these segments have the same default name (_TEXT) and a PUBLIC combine type. Thus when the modules are linked, the code segments combine into a single code segment; likewise, segments defined with .DATA combine into a single data segment. We saw a demonstration of this in PGM14_3.
In section 8.3, we briefly discussed the problem of passing data between procedures. Because assembly language procedures do not have associated parameter lists, as do high-level language procedures, it is up to the programmer to devise strategies for passing data between them. So far, we have been passing data to procedures through registers.
14.5.1 Global Variables
We have used the EXTRN and PUBLIC directives to show how a procedure defined in one module can be called from another. We can also use these directives to have variables defined in one module and referred to in another. Following high-level language practice, these variables are called global variables. An advantage of using global variables is that procedures need not use additional instructions to move data between themselves.
As an example, the following program prints a user prompt, reads two decimal digits whose sum is less than 10, and prints them and their sum on the next line. This problem was exercise 4.7.
0: TITLE PGM14_6: ADD DIGITS
1: EXTRN ADDNOS:NEAR
2: PUBLIC DIGIT1, DIGIT2, SUM
3: .MODEL SMALL
4: .STACK 100H
5: .DATA
6: MSG DB 'ENTER TWO DIGITS: $'
7: MSG1 DB 0DH,0AH,'THE SUM OF '
8: DIGIT1 DB ?
9: DB ' AND '
10: DIGIT2 DB ?
11: DB ' IS '
12: SUM DB -30H,'$'
13: .CODE
14: MAIN PROC
15: ;initialize DS
16: MOV AX,@DATA
17: MOV DS,AX ;initialize DS
18: ;prompt user
19: MOV AH,9 ;display string fcn
20: LEA DX,MSG ;get prompt
21: INT 21H ;display it
22: ;read two digits
23: MOV AH,1 ;input char fcn
24: INT 21H ;char in AL
25: MOV DIGIT1,AL ;store in DIGIT1
26: INT 21H ;char in AL
27: MOV DIGIT2,AL ;store in DIGIT2
28: ;add the digits
29: CALL ADDNOS ;add nos
30: ;display results
31: LEA DX,MSG1
32: MOV AH,9
33: INT 21H ;output result
34: MOV AH,4CH
35: INT 21H ;dos exit
36: MAIN ENDP
37: END MAIN
The digits and their sum are contained in variables DIGIT1, DIGIT2, and SUM, declared in the first module. In line 2, they are declared PUBLIC so that external procedure ADDNOS can have access to them.
0: TITLE PGM14_6A: ADDNOS
1: EXTRN DIGIT1:BYTE, DIGIT2:BYTE, SUM:BYTE
2: PUBLIC ADDNOS
3: .MODEL SMALL
4: .CODE
5: ADDNOS PROC NEAR
6: ;adds two digits
7: ;input: byte variables DIGIT1, DIGIT2 in PGM14_6
8: ;output: byte variable SUM in PGM14_6
9: PUSH AX
10: MOV AL,DIGIT1
11: ADD AL,DIGIT2
12: ADD SUM,AL
13: POP AX
14: RET
15: ADDNOS ENDP
16: END
DIGIT1, DIGIT2, and SUM appear in the second module's EXTRN list, line 1. The procedure adds them (actually, it adds the ASCII codes of the digit characters), then adds the sum to the -30h that has been stored in variable SUM. This puts the ASCII code of the sum in SUM.
A second method for passing data to a procedure is to send the address of the data. This method is known as call by reference; it is particularly useful when dealing with arrays. Call by reference is different from call by value, in which the actual data values are passed to the called procedure. Both methods can be used in the same procedure; for example, the selectsort procedure discussed in section 10.3 receives the address of the array to be sorted in SI (call by reference), and the number of elements in the array in BX (call by value).
Here is the program to add two digits using call by reference.
0: TITLE PGM14_7: ADD DIGITS
1: EXTRN ADDNOS:NEAR
2: .MODEL SMALL
3: .STACK 100H
4: .DATA
5: MSG DB 'ENTER TWO DIGITS: $'
6: MSG1 DB 0DH,0AH,'THE SUM OF '
7: DIGIT1 DB ?
8: DB ' AND '
9: DIGIT2 DB ?
10: DB ' IS '
11: SUM DB -30H,'$'
12: .CODE
13: MAIN PROC
14: ;initialize DS
15: MOV AX,@DATA
16: MOV DS,AX ;initialize DS
17: ;display prompt
18: MOV AH,9 ;display string function
19: LEA DX,MSG ;get prompt
20: INT 21H ;display it
21: ;read two digits
22: MOV AH,1 ;input char function
23: INT 21H ;char in AL
24: MOV DIGIT1,AL ;store in DIGIT1
25: INT 21H ;char in AL
26: MOV DIGIT2,AL ;store in DIGIT2
27: ;add them
28: LEA SI,DIGIT1 ;SI has offset of DIGIT1
29: LEA DI,DIGIT2 ;DI has offset of DIGIT2
30: LEA BX,SUM ;BX has offset of SUM
31: CALL ADDNOS ;add nos
32: ;display results
33: MOV AH,9 ;display string fcn
34: LEA DX,MSG1 ;DX has message
35: INT 21H ;output result
36: ;dos exit
37: MOV AH,4CH
38: INT 21H
39: MAIN ENDP
40: END MAIN
At lines 28-30, the addresses of DIGIT1, DIGIT2, and SUM are passed to procedure ADDNOS in pointer registers SI, DI, and BX.
0: TITLE PGM14_7A: ADDNOS
1: PUBLIC ADDNOS
2: .MODEL SMALL
3: .CODE
4: ADDNOS PROC NEAR
5: ;adds two digits
6: ;input: SI = offset of DIGIT1
7: ; DI = offset of DIGIT2
8: ; BX = offset of SUM
9: ;output: [BX] = sum
10: PUSH AX
11: MOV AL,[SI]
12: ADD AL,[DI] ;AL has DIGIT1 + DIGIT2
13: ADD [BX],AL ;add to SUM
14: POP AX
15: RET
16: ADDNOS ENDP
17: END
In lines 11 and 12, ADDNOS uses indirect addressing to place the sum of digits in AL. In line 13, indirect addressing is used to add the sum to the -30h in variable SUM.
14.5.3 Using the Stack
Instead of using registers, a procedure can place data values and addresses on the stack before calling another procedure. The called procedure then uses BP and indirect addressing to access the data (recall from section 10.2.1 that if BP is used in register indirect mode, SS has the operand's segment number). This method is used by high-level languages to pass data to assembly language procedures; we use it in Chapter 17 to implement recursive procedures (procedures that call themselves).
Because the CALL instruction causes the return address to be placed on top of the stack, the called procedure begins by saving BP on the stack (PUSH BP); then it moves SP to BP (MOV BP,SP), which makes BP point to the top of the stack. From the top down, the stack now holds the saved BP, the return address, and the data pushed by the calling procedure.
Now BP may be used with indirect addressing to access the data (we use BP because SP can't be used in indirect addressing). To return to the calling procedure, BP is popped off the stack and a RET N is executed, where N is the number of data bytes that the calling procedure pushed onto the stack. This restores IP (and CS, if the procedure is FAR) and removes N more bytes from the stack, leaving it in its original condition.
Here is the program to add two digits using this method:
0: TITLE PGM14_8: ADD DIGITS
1: EXTRN ADDNOS:NEAR
2: .MODEL SMALL
3: .STACK 100H
4: .DATA
5: MSG DB 'ENTER TWO DIGITS: $'
6: MSG1 DB 0DH,0AH,'THE SUM OF '
7: DIGIT1 DB ?
8: DB ' AND '
9: DIGIT2 DB ?
10: DB ' IS '
11: SUM DB -30H,'$'
12: .CODE
13: MAIN PROC
14: ;initialize DS
15: MOV AX,@DATA
16: MOV DS,AX ;initialize DS
17: ;display prompt
18: MOV AH,9 ;display string function
19: LEA DX,MSG ;get prompt
20: INT 21H ;display it
21: ;read two digits
22: MOV AH,1 ;input char function
23: INT 21H ;char in AL
24: MOV DIGIT1,AL ;store in DIGIT1
25: PUSH AX ;save on stack
26: INT 21H ;char in AL
27: MOV DIGIT2,AL ;store in DIGIT2
28: PUSH AX ;save on stack
29: ;add the digits
30: CALL ADDNOS ;AX has sum
31: ADD SUM,AL ;store sum
32: ;display results
33: MOV AH,9 ;display string fcn
34: LEA DX,MSG1 ;DX has message
35: INT 21H ;output result
36: ;dos exit
37: MOV AH,4CH
38: INT 21H
39: MAIN ENDP
40: END MAIN
At lines 24-28, the two digits are read, stored, and pushed onto the stack (because PUSH requires a word operand, we have to push AX). At line 30, ADDNOS is called to add the digits; it returns with the sum in AL, and this is added to the -30h in SUM.
0: TITLE PGM14_8A: ADDNOS
1: PUBLIC ADDNOS
2: .MODEL SMALL
3: .CODE
4: ADDNOS PROC NEAR
5: ;adds two digits
6: ;stack on entry: ret. addr (top), digit2, digit1
7: ;output: AX = sum
8: PUSH BP ;save BP
9: MOV BP,SP ;BP pts to stack top
10: MOV AX,[BP+6] ;AL has digit1
11: ADD AX,[BP+4] ;AL has sum
12: POP BP ;restore BP
13: RET 4 ;restore stack, exit
14: ADDNOS ENDP
15: END
After line 9 is executed, the stack holds, from the top down: the saved BP at [BP], the return address at [BP+2], DIGIT2 at [BP+4], and DIGIT1 at [BP+6]. DIGIT1 and DIGIT2 are in the low bytes of the words on the stack. After adding them, the procedure pops BP and executes RET 4, which removes the two data words from the stack.
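The same technique carries over to a FAR procedure, with one adjustment. The following is a minimal sketch under stated assumptions, not one of the book's listings: ADDNOS_F is a hypothetical FAR version of ADDNOS, and its caller is assumed to declare it with EXTRN ADDNOS_F:FAR and to push DIGIT1 and then DIGIT2 as before. Because a FAR call pushes CS as well as IP, the return address occupies four bytes, so the data words sit two bytes deeper in the stack.

```asm
        PUBLIC  ADDNOS_F
        .MODEL  SMALL
        .CODE
ADDNOS_F PROC   FAR             ;hypothetical FAR version of ADDNOS
;adds two digits
;stack on entry: ret. offset (top), ret. segment, digit2, digit1
;output: AL = sum of the low bytes of the two data words
        PUSH    BP              ;save BP
        MOV     BP,SP           ;[BP] = old BP, [BP+2] = IP, [BP+4] = CS
        MOV     AX,[BP+8]       ;AL has digit1 (pushed first)
        ADD     AX,[BP+6]       ;AL has digit1 + digit2
        POP     BP              ;restore BP
        RET     4               ;far return, then discard 4 data bytes
ADDNOS_F ENDP
        END
```

The operand of RET is still 4, because it counts only the data bytes pushed by the caller; the size of the return address is accounted for by the FAR attribute of the procedure.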
- In a .COM format program, stack, data, and code all fit into a single segment. A .COM program takes up much less disk space than a comparable .EXE program, but the fact that code, data, and stack must all fit into a single segment limits its versatility.
- There are two kinds of procedures, NEAR and FAR. A NEAR procedure is in the same code segment as the calling procedure, and a FAR procedure is in a different segment. When a FAR procedure is called, both CS and IP are saved on the stack.
- The EXTRN pseudo-op is used to inform the assembler of the existence of procedures and variables that are defined in another assembly module.
- A procedure must be contained in an assembly module, which consists of at least one segment definition. MASM translates an assembly module into a machine language object (.OBJ) module.
- The PUBLIC pseudo-op is used to inform the assembler that certain names in a module may be referred to in another module.
- The LINK program combines object modules into an executable machine language program. It matches EXTRN declarations in object modules with PUBLIC declarations in other object modules.
- The LIB program can be used to create and maintain a file of object modules.
- The SEGMENT directive may have align, combine, and class types.
- The align type determines how the segment's starting address will be selected when the program is loaded in memory.
- The combine type determines how segments of the same name are to be combined in memory.
- If two or more segments have the same class, they are loaded sequentially in memory.
- Procedures in different modules can communicate through global variables. Other methods are call by value and call by reference; the calling procedure can implement these methods by placing data values and addresses in registers, or by pushing them onto the stack.
assembly module
An .ASM file consisting of at least one segment definition
call by reference
Communication with a procedure by passing it the addresses of variables containing the data the procedure needs
call by value
Communication with a procedure by passing the procedure the actual data values it needs
.COM program
A program in which the code, data, and stack segments coincide
global variable
A variable that is declared as PUBLIC, so it can be accessed by statements in other program modules
object module
The .OBJ file that MASM creates by assembling an assembly module
New Pseudo-Ops
ASSUME EXTRN PUBLIC
.CONST ORG SEGMENT
.DATA?
- Suppose a program contains the lines
CALL PROC1
MOV AX,BX
and (a) instruction MOV AX,BX is stored at 08FD:0200h, (b) PROC1 is a FAR procedure that begins at 1000:0200h, and (c) SP = 010Ah. What are the contents of CS, IP, and SP just after CALL PROC1 is executed? What word is on top of the stack?
- Suppose SP = 00FAh, CS = 1000h, top of stack = 0200h, next word on the stack = 08FDh. What are the contents of CS, IP, and SP
a. after RET is executed, where RET appears in a NEAR procedure?
b. after RET is executed, where RET appears in a FAR procedure?
c. after RET 4 is executed, where RET appears in a NEAR procedure?
- Consider a program that does the following:
The main procedure MAIN displays the message "INSIDE MAIN PROGRAM", calls procedure PROC1, and exits to DOS.
PROC1 displays the message "INSIDE PROC1" on a new line, calls procedure PROC2, and returns to MAIN.
PROC2 displays the message "INSIDE PROC2" on a new line and returns to PROC1.
Write this program in the following ways:
a. As a .COM program.
b. As an .EXE program in which PROC1 and PROC2 are NEAR procedures contained in separately assembled modules. Each procedure's module contains the message that the procedure displays.
c. As an .EXE program in which PROC1 and PROC2 are FAR procedures contained in separately assembled modules. Each procedure's module contains the message that the procedure displays.
d. As an .EXE program in which the three messages are contained in MAIN's module and declared PUBLIC there. The other procedures are NEAR procedures contained in separately assembled modules. These procedures refer to the appropriate messages via an EXTRN directive.
e. As an .EXE program in which the three messages are contained in MAIN's module. PROC1 and PROC2 are separately assembled NEAR procedures. Before calling PROC1, MAIN places the addresses of the messages "INSIDE PROC1" and "INSIDE PROC2" in SI and DI, respectively.
f. As an .EXE program in which the three messages are contained in MAIN's module. PROC1 and PROC2 are separately assembled NEAR procedures. Before calling PROC1, MAIN pushes the addresses of the messages "INSIDE PROC2" and "INSIDE PROC1" onto the stack.
- The position of a substring within a string is the number of bytes from the beginning of the string to the start of the substring.
Write a separately assembled NEAR procedure FIND_SUBST that receives the offset addresses of the first string in SI and the second string in DI and determines whether the second string is a substring of the first; if so, FIND_SUBST returns its position in AX. If the second string is not a substring of the first string, the procedure returns a negative number in AX.
Write a program to test FIND_SUBST; the testing program reads the strings, calls FIND_SUBST, and displays the result. This problem is a variation of PGM11_5.ASM.
In previous chapters, we used the INT (interrupt) instruction to call system routines. In this chapter, we discuss different kinds of interrupts and take a closer look at the operation of the INT instruction. In sections 15.2 and 15.3, we discuss the services provided by various BIOS (basic input/output system) and DOS interrupt routines.
To demonstrate the use of interrupts, we will write a program that displays the current time on the screen. There are three versions: the first version simply displays the time and then terminates, the second version shows the time updated every second, and the third version is a memory resident program that can be called up when other programs are running.
The notion of interrupt originally was conceived to allow hardware devices to interrupt the operation of the CPU. For example, whenever a key is pressed, the 8086 must be notified to read a key code into the keyboard buffer. The general hardware interrupt goes like this: (1) a hardware device that needs service sends an interrupt request signal to the processor; (2) the 8086 suspends the current task it is executing and transfers control to an interrupt routine; (3) the interrupt routine services the hardware device by performing some I/O operation; and (4) control is transferred back to the original executing task at the point where it was suspended.
Several questions need to be answered: How does the 8086 find out that a device is signaling? How does it know which interrupt routine to execute? How does it resume the previous task?
Because an interrupt signal may come at any time, the 8086 checks for the signal after executing each instruction. On detecting the interrupt signal, the 8086 acknowledges it by sending an interrupt acknowledge signal. The interrupting device responds by sending an eight-bit number on the data bus, called an interrupt number. Each device uses a different interrupt number to identify its own service routine. The process of sending control signals back and forth is called handshaking; it is needed to identify the interrupting device. We say that a type N interrupt occurs when a device uses an interrupt number N to interrupt the 8086.
The transfer to an interrupt routine is similar to a procedure call. Before transferring control to the interrupt routine, the 8086 first saves the address of the next instruction on the stack; this is the return address. The 8086 also saves the FLAGS register on the stack; this ensures that the status of the suspended task will be restored. It is the responsibility of the interrupt routine to restore any registers it uses.
Before we talk about how the 8086 uses the interrupt number to locate the interrupt routine, let's look at the other kinds of interrupts.
Software interrupts are used by programs to request system services. A software interrupt occurs when a program calls an interrupt routine using the INT instruction. The format of the INT instruction is
INT interrupt_number
The 8086 treats this interrupt number in the same way as the interrupt number generated by a hardware device. We have already given a number of examples of doing I/O with INT 21h.
There is a third kind of interrupt, called a processor exception. A processor exception occurs when a condition arises inside the processor, such as divide overflow, that requires special handling. Each condition corresponds to a unique interrupt type. For example, divide overflow is type 0, so when overflow occurs in a divide instruction the 8086 automatically executes interrupt 0 to handle the overflow condition.
Next we take on the address calculation for interrupt routines.
The interrupt numbers for the 8086 processor are unsigned byte values. Thus, it is possible to specify a total of 256 types of interrupts. Not every interrupt number has a corresponding interrupt routine. The computer manufacturer provides some hardware device service routines in ROM; these are the BIOS interrupt routines. The high-level system interrupt routines, like INT 21h, are part of DOS and are loaded into memory when the machine is started. Some additional interrupt numbers are reserved by IBM for future use; the remaining numbers are available for the user. See Table 15.1.
The 8086 does not generate the interrupt routine's address directly from the interrupt number. Doing so would mean that a particular interrupt routine must be placed in exactly the same location in every computer, an impossible task given the number of computer models and updated versions of the routines. Instead, the 8086 uses the interrupt number to calculate the address of a memory location that contains the actual address of the interrupt routine. This means that the routine may appear anywhere, so long as its address, called an interrupt vector, is stored in a predefined memory location.
Table 15.1 Interrupt Types
Interrupt types 0-1Fh:    BIOS interrupts
Interrupt types 20h-3Fh:  DOS interrupts
Interrupt types 40h-7Fh:  reserved
Interrupt types 80h-F0h:  ROM BASIC
Interrupt types F1h-FFh:  not used
All interrupt vectors are placed in an interrupt vector table, which occupies the first 1 KB of memory. Each interrupt vector is given as segment:offset and occupies four bytes; the first four bytes of memory contain interrupt vector 0. See Figure 15.1.
To find the vector for an interrupt routine, we simply multiply the interrupt number by 4. This gives the memory location containing the offset of the routine; the segment address of the routine is in the next word. For example, take interrupt 9, the keyboard interrupt routine: the offset address is stored in location 9 x 4 = 36 = 00024h, and the segment address is stored in the next word, at location 00026h.
Figure 15.1 Interrupt Vector Table
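The address calculation can be tried out by reading a vector straight from the table. The following is a minimal sketch, not one of the book's listings; VEC_OFF and VEC_SEG are hypothetical variable names. It points ES at segment 0000h, where the table begins, and fetches the two words of vector 9 from locations 0024h and 0026h.

```asm
        .MODEL  SMALL
        .STACK  100H
        .DATA
VEC_OFF DW      ?               ;hypothetical variables for the result
VEC_SEG DW      ?
        .CODE
MAIN    PROC
        MOV     AX,@DATA
        MOV     DS,AX           ;initialize DS
        XOR     AX,AX
        MOV     ES,AX           ;ES = 0000h, segment of the vector table
        MOV     BX,9*4          ;BX = 0024h, location of vector 9
        MOV     AX,ES:[BX]      ;offset of the interrupt 9 routine
        MOV     DX,ES:[BX+2]    ;segment of the interrupt 9 routine
        MOV     VEC_OFF,AX      ;store the vector as offset,
        MOV     VEC_SEG,DX      ;then segment
        MOV     AH,4CH
        INT     21H             ;dos exit
MAIN    ENDP
        END     MAIN
```

DOS also offers function 35h of INT 21h, which returns the vector for the interrupt number in AL in ES:BX; reading the table directly is shown here only to make the multiply-by-4 rule concrete.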
Let's see how the 8086 executes an INT instruction. First, it saves the flags by pushing the contents of the FLAGS register onto the stack. Then it clears the control flags IF (interrupt flag) and TF (trap flag); the reason for this action is explained later. Next it saves the current address by pushing CS and IP on the stack. Finally, it uses the interrupt number to get the interrupt vector from memory and transfers control to the interrupt routine by loading CS:IP with the interrupt vector. The 8086 transfers to a hardware interrupt routine or processor exception routine in a similar fashion.
On completion, an interrupt routine executes an IRET (interrupt return) instruction that restores the IP, CS, and FLAGS registers.
The control flags IF and TF play an important role in the interrupt process. When TF is set, the 8086 generates a processor exception, interrupt type 1. This interrupt is used by DEBUG in executing the T (trace) command. To trace an instruction, DEBUG first sets TF and then transfers control to the instruction to be traced. After the instruction is executed, the processor generates an interrupt type 1 because TF is set. DEBUG uses its own interrupt 1 routine to gain control of the processor.
The IF is used to control hardware interrupts. When IF is set, hardware devices may interrupt the 8086. External interrupts may be disabled (masked out) by clearing IF. Actually, there is a hardware interrupt, called NMI (nonmaskable interrupt), that cannot be masked out.
Both TF and IF are cleared by the processor before transferring to an interrupt routine so that the routine will not be interrupted. Of course, an interrupt routine can change the flags to enable interrupts to occur during its execution.
As indicated in Table 15.1, interrupt types 0 to 1Fh are known as BIOS interrupts. This is because most of these service routines are BIOS routines residing in the ROM segment F000h.
Interrupt types 0-7 are reserved by Intel, with types 0-4 being predefined. IBM uses type 5 for print screen. Types 6 and 7 are not used.
Interrupt 0—Divide Overflow A type 0 interrupt is generated when a DIV or IDIV operation produces an overflow. The interrupt 0 routine displays the message "DIVIDE OVERFLOW" and returns control to DOS.
Interrupt 1—Single Step As discussed in the last section, a type 1 interrupt is generated when the TF is set.
Interrupt 2—Nonmaskable Interrupt Interrupt 2 is the hardware interrupt that cannot be masked out by clearing the IF. The IBM PC uses this interrupt to signal memory and I/O parity errors that indicate bad chips.
Interrupt 3—Breakpoint The INT 3 instruction is the only single-byte interrupt instruction (opcode CCh); other interrupt instructions are two-byte instructions. It is possible to insert an INT 3 instruction anywhere in a program by replacing an existing opcode. The DEBUG program uses this feature to set up breakpoints for the G (go) command.
Interrupt 4—Overflow A type 4 interrupt is generated by the instruction INTO (interrupt if overflow) when OF is set. Programmers may write their own interrupt routine to handle unexpected overflows.
Interrupt 5—Print Screen The BIOS interrupt 5 routine sends the video screen information to the printer. An INT 5 instruction is generated by the keyboard interrupt routine (interrupt type 9) when the PrtSc (print screen) key is pressed.
The 8086 has only one terminal for hardware interrupt signals. To allow more devices to interrupt the 8086, IBM uses an interrupt controller, the Intel 8259 chip, which can interface up to eight devices. Interrupt types 8-Fh are generated by hardware devices connected to the 8259. The original version of the PC uses only interrupts 8, 9, and Eh.
Interrupt 8—Timer The IBM PC contains a timer circuit that generates an interrupt once every 54.92 milliseconds (about 18.2 times per second). The BIOS interrupt 8 routine services the timer circuit. It uses the timer signals (ticks) to keep track of the time of day.
Interrupt 9—Keyboard This interrupt is generated by the keyboard each time a key is pressed or released. The BIOS interrupt 9 routine reads a scan code and stores it in the keyboard buffer.
Interrupt Eh—Diskette Error The BIOS interrupt Eh routine handles diskette errors.
The interrupt routines 10h-1Fh can be called by application programs to perform various I/O operations and status checking.
Interrupt 10h—Video The BIOS interrupt 10h routine is the video driver. Details are covered in Chapters 12 and 16.
Interrupt 11h—Equipment Check The BIOS interrupt 11h routine returns the equipment configuration of the particular PC. The return code is placed in AX. Table 15.2 gives the interpretation of the bits returned in AX.
Interrupt 12h—Memory Size The BIOS interrupt 12h routine returns in AX the amount of conventional memory a computer has. Conventional memory refers to memory circuits with addresses below 640 KB. The unit for the return value is kilobytes.
Table 15.2 Equipment Check
Bits in AX : Meaning
15-14  number of printers installed
13     = 1 if internal modem installed
12     = 1 if game adapter installed
11-9   number of RS-232 (serial) ports installed
8      not used
7-6    number of floppy disk drives (if bit 0 = 1)
         00 means 1
         01 means 2
         10 means 3
         11 means 4
5-4    initial video mode
         00 not used
         01 means 40 x 25 color
         10 means 80 x 25 color
         11 means 80 x 25 monochrome
3-2    system board RAM size (for original PC)
         00 means 16 KB
         01 means 32 KB
         10 means 48 KB
         11 means 64 KB
1      = 1 if math coprocessor installed
0      = 1 if floppy disk drive installed
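To show how the fields of the equipment word can be picked apart, here is a minimal sketch, not one of the book's listings; PRINTERS and DRIVES are hypothetical variable names. It isolates bits 15-14 (number of printers) with a shift, and it decodes bits 7-6 (number of floppy drives) only when bit 0 says a drive is installed.

```asm
        .MODEL  SMALL
        .STACK  100H
        .DATA
PRINTERS DB     ?               ;hypothetical result variables
DRIVES  DB      0
        .CODE
MAIN    PROC
        MOV     AX,@DATA
        MOV     DS,AX           ;initialize DS
        INT     11H             ;AX = equipment word
        MOV     BX,AX           ;keep a copy
        MOV     CL,14
        SHR     AX,CL           ;bits 15-14 now in bits 1-0
        MOV     PRINTERS,AL     ;number of printers
        TEST    BX,1            ;bit 0 = 1 if a floppy drive is installed
        JZ      DONE            ;no drives, DRIVES stays 0
        MOV     AX,BX
        MOV     CL,6
        SHR     AX,CL           ;bits 7-6 now in bits 1-0
        AND     AL,3            ;isolate the two-bit field
        INC     AL              ;00 means 1 drive, ..., 11 means 4
        MOV     DRIVES,AL
DONE:   MOV     AH,4CH
        INT     21H             ;dos exit
MAIN    ENDP
        END     MAIN
```

The shift counts go through CL because the 8086 accepts only 1 or CL as a shift count.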
Example 15.1 Suppose a computer has 512 KB conventional memory. What will be returned in AX if the instruction INT 12H is executed?
Solution: The memory size is returned in kilobytes. Because 512 = 200h, AX = 0200h.
Interrupt 13h—Disk I/O The BIOS interrupt 13h routine is the disk driver; it allows application programs to do disk I/O.
Interrupt 14h—Communications The BIOS interrupt 14h routine is the communications driver that interacts with the serial ports.
Interrupt 15h—Cassette This interrupt was used by the original PC for cassette interface and by the PC AT and PS/2 models for various system services.
Interrupt 16h—Keyboard I/O The BIOS interrupt 16h routine is the keyboard driver. Keyboard operations are found in Chapter 12.
Interrupt 17h—Printer I/O The BIOS interrupt 17h routine is the printer driver. The routine supports three functions: 0- 2. Function 0 writes a character to the printer; input values are
Bits in AH : Meaning
7 = 1 printer not busy
6 = 1 print acknowledge
5 = 1 out of paper
4 = 1 printer selected
3 = 1 I/O error
2 not used
1 not used
0 = 1 printer timed- out
printer number. Function 2 gets printer status, input values are AH = 2, DX - printer number. For all functions, the status is returned in AH. Table 15.3 shows the meaning of the bits returned in AH.
Example 15.2 Write instructions to print a ().
Solution: We use function 0 to do the printing. Because printers contain buffers for data, the 0 will not be printed until a carriage return or line feed character is sent. Thus,
MOV AH,0 ;function 0, print char
MOV AL,'0' ;char 0
MOV DX,0 ;printer 0
INT 17H ;AH contains return code
MOV AH,0 ;function 0, print char
MOV AL,0AH ;line feed
INT 17H
Interrupt 18h—BASIC The BIOS interrupt 18h routine transfers control to ROM BASIC.
Interrupt 19h—Bootstrap The BIOS interrupt 19h routine reboots the system.
Interrupt 1Ah—Time of Day The BIOS interrupt 1Ah routine allows a program to get and set the timer tick count, and in the case of PC AT and PS/2 models, it allows programs to get and set the time and date for the clock circuit chip.
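As a small sketch of reading the tick count: function 0 of INT 1Ah returns the count in CX:DX (CX holds the high word) and sets AL to a nonzero value if the count has passed midnight since it was last read. The word variables TICKS_HI and TICKS_LO are hypothetical names, and DS is assumed to address the segment that contains them.

```asm
;data (hypothetical variables, assumed to be in the data segment)
TICKS_HI DW   ?                ;high word of tick count
TICKS_LO DW   ?                ;low word of tick count

;code
        MOV   AH,0             ;function 0, read timer tick count
        INT   1AH              ;CX:DX = tick count, AL <> 0 if midnight passed
        MOV   TICKS_HI,CX      ;save high word
        MOV   TICKS_LO,DX      ;save low word
```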
Interrupt 1Bh—Ctrl-Break This interrupt is called by the INT 9 routine when the Ctrl-Break key is pressed. The BIOS interrupt 1Bh routine contains only an IRET instruction. Users may write their own routine to handle the Ctrl-Break key.
Interrupt 1Ch—Timer Tick INT 1Ch is called by the INT 8 routine each time the timer circuit interrupts. The BIOS interrupt 1Ch routine contains only an IRET instruction. Users may write their own service routine to perform timing operations. In section 15.5, we use it to update the displayed time.
Interrupts 1Dh—1Fh These interrupt vectors point to data instead of instructions. The interrupt 1Dh, 1Eh, and 1Fh vectors point to video initialization parameters, diskette parameters, and video graphics characters, respectively.
The interrupt types 20h-3Fh are serviced by DOS routines that provide high-level service to hardware as well as system resources such as files and directories. The most useful is INT 21h, which provides many functions for doing keyboard, video, and file operations.
Interrupt 20h—Program Terminate Interrupt 20h can be used by a program to return control to DOS. But because CS must be set to the program segment prefix before using INT 20h, it is more convenient to exit a program with INT 21h, function 4Ch.
Interrupt 21h—Function Request The number of functions varies with the DOS version. DOS 1.x has functions 0-2Eh, DOS 2.x added new functions 2Fh-57h, and DOS 3.x added new functions 58h-5Fh. These functions may be classified as character I/O, file access, memory management, disk access, networking, and miscellaneous. More information is found in Appendix C.
Interrupts 22h—26h Interrupt routines 22h-26h handle Ctrl-Break, critical errors, and direct disk access.
Interrupt 27h—Terminate but Stay Resident Interrupt 27h allows programs to stay in memory after termination. We demonstrate this interrupt in section 15.6.
As an example of using interrupt routines, we now write a program that displays the current time. There are three versions, each getting more complex. In this section, we show the first version, which simply displays the current time in hours, minutes, and seconds. In section 15.5, we write the second version, which shows the time updated every second; and in section 15.6 we write the third version, which is a memory resident program that can display the time while other programs are running.
When the computer is powered up, the current time can be entered by the user or supplied by a real-time clock circuit that is battery powered. This time value is kept in memory and updated by a timer circuit using interrupt 8. A program can call the DOS interrupt 21h, function 2Ch, to access the time.
INT 21h, Function 2Ch:
Time of Day
Input: AH = 2Ch
Output: CH = hours (0-23),
CL = minutes (0-59),
DH = seconds (0-59),
DL = 1/100 seconds (0-99).
Our time display program has the following steps: (1) obtain the current time, (2) convert the hours, minutes, and seconds into ASCII digits (we ignore the fractions of a second), and (3) display the ASCII digits.
The program is organized into a MAIN procedure in program listing PGM15_1.ASM and two procedures GET_TIME and CONVERT in program listing PGM15_1A.ASM.
A time buffer, TIME_BUF, is initialized with the message of 00:00:00.
The procedure MAIN first calls GET_TIME to store the current time in the time buffer. Then it uses INT 21h, function 9, to print out the string in the time buffer.
The procedure GET_TIME calls INT 21h function 2Ch to get the time, then calls CONVERT to convert the hours, minutes, and seconds into ASCII characters. The first step in procedure CONVERT is to divide the input number in AL by 10; this will put the ten's digit value in AL and unit's digit value in AH (note that the input value is less than 60). The second step is to convert the digits into ASCII.
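To make the two steps concrete, the following trace follows the CONVERT instructions for an input of 37, a value chosen arbitrarily for illustration. BX is assumed to point to the time buffer, as in GET_TIME.

```asm
        MOV   AL,37            ;input number, 37 = 25h
        MOV   AH,0             ;AX = 0025h
        MOV   DL,10
        DIV   DL               ;AL = 3 (quotient), AH = 7 (remainder)
        OR    AX,3030H         ;AL = 33h = '3', AH = 37h = '7'
        MOV   [BX],AX          ;AL goes to [BX], AH to [BX+1]: buffer gets '3','7'
```

Because a word is stored low byte first, placing the ten's digit in AL puts the two digits into the buffer in reading order.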
TITLE PGM15_1: TIME_DISPLAY_VER_1
;program that displays the current time
EXTRN GET_TIME:NEAR
.MODEL SMALL
.STACK 100H
.DATA
TIME_BUF DB '00:00:00$' ;time buffer hr:min:sec
.CODE
MAIN PROC
MOV AX,@DATA
MOV DS,AX ;initialize DS
;get and display time
LEA BX,TIME_BUF ;BX points to TIME_BUF
CALL GET_TIME ;put current time in TIME_BUF
LEA DX,TIME_BUF ;DX points to TIME_BUF
MOV AH,09H ;display time
INT 21H
;exit
MOV AH,4CH ;return
INT 21H ;to DOS
MAIN ENDP
END MAIN
TITLE PGM15_1A: GET AND CONVERT TIME TO ASCII
PUBLIC GET_TIME
.MODEL SMALL
.CODE
GET_TIME PROC
;get time of day and store ASCII digits in time buffer
;input: BX = address of time buffer
MOV AH,2CH ;get time
INT 21h ;CH = hr, CL = min, DH = sec
;convert hours into ASCII and store
MOV AL,CH ;hour
CALL CONVERT ;convert to ASCII
MOV [BX],AX ;store
;convert minutes into ASCII and store
MOV AL,CL ;minute
CALL CONVERT ;convert to ASCII
MOV [BX+3],AX ;store
;convert seconds into ASCII and store
MOV AL,DH ;second
CALL CONVERT
MOV [BX+6],AX
RET
GET_TIME ENDP
CONVERT PROC
;converts byte number (0- 59) into ASCII digits
;input: AL = number
;output:AX = ASCII digits, AL = high digit, AH = low digit
MOV AH,0 ;clear AH
MOV DL,10 ;divide AX by 10
DIV DL ;AH has remainder, AL has quotient
OR AX,3030H ;convert to ASCII, AH has low digit
RET ;AL has high digit
CONVERT ENDP
END
The program displays the time and terminates.
15.5 User Interrupt Procedures
To make the time display program more interesting, let us write a second version that displays the time and updates it every second.
One way to continuously update the time is to execute a loop that keeps obtaining the time via INT 21h, function 2Ch and displaying it. The problem here is to find a way to terminate the program.
Instead of pursuing this approach, we will write a routine for interrupt 1Ch. As mentioned earlier, this interrupt is generated by the INT 8 routine, which is activated by a timer circuit about 18.2 times a second. When our interrupt routine is called, it will get the time and display it.
Our program will have a MAIN procedure that sets up the interrupt routine; when a key is pressed, it will deactivate the interrupt routine and terminate.
To set up an interrupt routine, we need to (1) save the current interrupt vector, (2) place the vector of the user procedure in the interrupt vector table, and (3) restore the previous vector before terminating the program.
We use the INT 21h, function 35h, to get the old vector and function 25h to set up the new interrupt vector.
INT 21h, Function 25h:
Set Interrupt Vector
;store interrupt vector into vector table
Input: AH = 25h
AL = Interrupt number
DS:DX = interrupt vector
Output: none
INT 21h, Function 35h: Get Interrupt Vector
;obtain interrupt vector from vector table
Input: AH = 35h
AL = Interrupt number
Output: ES:BX = interrupt vector
The procedure SETUP_INT in program listing PGM15_2A.ASM saves an old interrupt vector and sets up a new vector. It gets the interrupt number in AL, a buffer to save the old vector at DS:DI, and a buffer containing the new interrupt vector at DS:SI. By reversing the two buffers, SETUP_INT can also be used to restore the old vector.
TITLE PGM15_2A: SET INTERRUPT VECTOR
PUBLIC SETUP_INT
.MODEL SMALL
.CODE
SETUP_INT PROC
;saves old vector and sets up new vector
;input: AL = interrupt number
; DI = address of buffer for old vector
; SI = address of buffer containing new vector
;save old interrupt vector
MOV AH,35H ;function 35h, get vector
INT 21H ;ES:BX = vector
MOV [DI],BX ;save offset
MOV [DI+2],ES ;save segment
;setup new vector
MOV DX,[SI] ;DX has offset
PUSH DS ;save DS
MOV DS,[SI+2] ;DS has segment number
MOV AH,25H ;function 25h, set vector
INT 21H
POP DS ;restore DS
RET
SETUP_INT ENDP
END
Each display of the current time by INT 21h, function 9, will advance the cursor. If a new time is then displayed, it appears at a different screen position. So, to view the time updated at the same screen position we must restore the cursor to its original position before we display the time. This is achieved by first determining the current cursor position; then, after each print string operation, we move the cursor back.
We use the INT 10h, functions 3 and 2, to save the original cursor position and to move the cursor to its original position after each print string operation.
INT 10h, Function 2:
Move Cursor
Input: AH = 2
DH = row
DL = column
BH = page number
Output: none
INT 10h, Function 3:
Get Cursor Position
Input: AH = 3
BH = page number
Output: DH = row
DL = column
When an interrupt procedure is activated, it cannot assume that the DS register contains the program's data segment address. Thus, if it uses any variables it must first reset the DS register. The DS register should be restored before ending the interrupt routine with IRET.
Program listing PGM15_2.ASM contains the MAIN procedure and the interrupt procedure TIME_INT. The steps in the MAIN procedure are (1) save the current cursor position, (2) set up the interrupt vector for TIME_INT, (3) wait for a key input, and (4) restore the old interrupt vector and terminate.
To do step 2, we use the pseudo-ops OFFSET and SEG to obtain the offset and segment of procedure TIME_INT; the vector is then stored in the buffer NEW_VEC. The procedure SETUP_INT is called to set up the vector for interrupt type 1Ch, timer tick. Interrupt 16h, function 0, is used for step 3, key input. Procedure SETUP_INT is again used in step 4; this time SI points to the old vector and DI points to the vector for TIME_INT.
The steps in the procedure TIME_INT are (1) set DS, (2) get new time, (3) display time, (4) restore cursor position, and (5) restore DS.
The program operates like this: After setting up the cursor and interrupt vectors, the MAIN procedure just waits for a keystroke. In the meantime, the interrupt procedure, TIME_INT, keeps updating the time whenever the timer circuit ticks. After a key is hit, the old interrupt vector is restored and the program terminates.
TITLE PGM15_2: DISPLAY_TIME_VER_2
;program that displays the current time
;and updates the time 18.2 times a second
EXTRN GET_TIME:NEAR,SETUP_INT:NEAR
.MODEL SMALL
.STACK 100H
.DATA
TIME_BUF DB '00:00:00$' ;time buffer hr:min:sec
CURSOR_POS DW ? ;cursor position (row:col)
NEW_VEC DW ?,? ;new interrupt vector
OLD_VEC DW ?,? ;old interrupt vector
.CODE
MAIN PROC
MOV AX,@DATA
MOV DS,AX ;initialize DS
;save cursor position
MOV AH,3 ;function 3, get cursor position
MOV BH,0 ;page 0
INT 10H ;DH = row, DL = col
MOV CURSOR_POS,DX ;save it
;set up interrupt procedure by
;placing segment:offset of TIME_INT in NEW_VEC
MOV NEW_VEC,OFFSET TIME_INT ;offset
MOV NEW_VEC+2,SEG TIME_INT ;segment
LEA DI,OLD_VEC ;DI points to vector buffer
LEA SI,NEW_VEC ;SI points to new vector
MOV AL,1CH ;timer interrupt
CALL SETUP_INT ;setup new interrupt vector
;read keyboard
MOV AH,0
INT 16H
;restore old interrupt vector
LEA DI,NEW_VEC ;DI points to vector buffer
LEA SI,OLD_VEC ;SI points to old vector
MOV AL,1CH ;timer interrupt
CALL SETUP_INT ;restore old vector
MOV AH,4CH
INT 21H
MAIN ENDP
TIME_INT PROC
;interrupt procedure
;activated by the timer
PUSH DS ;save current DS
MOV AX,@DATA ;set DS to the data segment
MOV DS,AX
;get new time
LEA BX,TIME_BUF ;BX points to time buffer
CALL GET_TIME ;store time in buffer
;display time
LEA DX,TIME_BUF ;DX points to TIME_BUF
MOV AH,09H ;display string
INT 21H
;restore cursor position
MOV AH,2 ;function 2, move cursor
MOV BH,0 ;page 0
MOV DX,CURSOR_POS ;saved position
INT 10H
POP DS ;restore DS
IRET
TIME_INT ENDP
END MAIN
The LINK command should include the modules PGM15_2, PGM15_1A, and PGM15_2A.
15.6 Memory Resident Program
We will write the third version of DISPLAY_TIME as a TSR (terminate and stay resident) program. Normally, when a program terminates, the memory occupied by the program is used by DOS to load other programs. However, when a TSR program terminates, the memory occupied is not released. Thus, a TSR program is also called a memory resident program.
To return to DOS, a TSR program is terminated by using either INT 27h or INT 21h, function 31h. Our program uses INT 27h.
INT 27h:
Terminate and Stay Resident
Input: DS:DX = address of byte beyond the part that is
to remain resident
Output: none
We write our program as a .COM program because to use interrupt 27h, we need to determine how many bytes are to remain memory resident. The structure of a .COM program makes this easy because there is only one program segment. Another reason for using a .COM program is the size consideration. As we saw in Chapter 14, a .COM program is smaller in size than its .EXE counterpart. So, to save space, TSR programs are often written as .COM programs.
Once terminated, a TSR program is not active. It must be activated by some external activity, such as a certain key combination or by the timer. The advantage of a TSR program is that it may be activated while some other program is running. Our program will become active when the Ctrl and right shift keys are pressed.
To keep the program small, it will not update the time. We leave it as an exercise for the reader to write a TSR program that updates the time every second.
The program has two parts, an initialization part that sets up the interrupt vector, and the interrupt routine itself. The procedure INITIALIZE initializes the interrupt vector 9 (keyboard interrupt) with the address of the interrupt procedure MAIN and then calls INT 27h to terminate. The address passed to INT 27h is the beginning address of the INITIALIZE procedure; this is possible because the instructions are no longer needed. The procedure INITIALIZE is shown in program listing PGM15_3A.ASM.
TITLE PGM15_3A: SET UP TSR PROGRAM
EXTRN MAIN:NEAR,SETUP_INT:NEAR
EXTRN NEW_VEC:WORD,OLD_VEC:DWORD
PUBLIC INITIALIZE
C_SEG SEGMENT PUBLIC
ASSUME CS:C_SEG
INITIALIZE PROC
;set up interrupt vector
MOV NEW_VEC,OFFSET MAIN ;store address
MOV NEW_VEC+2,CS ;segment
LEA DI,OLD_VEC ;DI points to vector buffer
LEA SI,NEW_VEC ;SI points to new vector
MOV AL,09H ;keyboard interrupt
CALL SETUP_INT ;set interrupt vector
;exit to DOS
LEA DX,INITIALIZE
INT 27H
INITIALIZE ENDP
C_SEG ENDS
END
There are a number of ways for the interrupt routine to detect a particular key combination. The simplest way is to detect the control and shift keys by checking the keyboard flags. When activated by a keystroke, the interrupt routine calls the old keyboard interrupt routine to handle the key input. To detect the control and shift keys, a program can examine the keyboard flags at the BIOS data area 0000:0417h or use INT 16h, function 2.
INT 16h, Function 2: Get Keyboard Flags
Input: AH = 2
Output: AL = keyboard flags
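As an alternative to INT 16h, function 2, the flags byte can be read directly from the BIOS data area at 0000:0417h. The following sketch tests for the Ctrl and right shift combination; bit 0 of the flags byte is the right shift key and bit 2 is the Ctrl key, the same bits the TSR program tests. The label NOT_BOTH is a hypothetical name used only for the illustration.

```asm
        PUSH  DS               ;save current DS
        XOR   AX,AX
        MOV   DS,AX            ;DS = 0000h, segment of the BIOS data area
        MOV   AL,DS:[0417H]    ;AL = keyboard flags
        POP   DS               ;restore DS
        AND   AL,00000101B     ;keep bit 2 (Ctrl) and bit 0 (right shift)
        CMP   AL,00000101B     ;both keys down?
        JNE   NOT_BOTH         ;no
        ;both keys are down: process the key combination here
NOT_BOTH:                      ;hypothetical label
```

Masking with AND before the comparison lets both keys be checked with a single CMP.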
We will use the Ctrl and right shift key combination to activate and deactivate the clock display. When activated, the current time will be displayed in the upper right-hand corner. We must first save the screen data so that when the clock display is deactivated the screen can be restored.
The procedure SET_CURSOR sets the cursor at row 0 and the column given in DL. The procedure SAVE_SCREEN copies the screen data into a buffer called SS_BUF, and the procedure RESTORE_SCREEN moves the data back to the screen buffer. All three procedures are shown in program listing PGM15_3B.
TITLE PGM15_3B: SAVE SCREEN AND CURSOR
EXTRN SS_BUF:BYTE
PUBLIC SAVE_SCREEN,RESTORE_SCREEN,SET_CURSOR
C_SEG SEGMENT PUBLIC
ASSUME CS:C_SEG
SAVE_SCREEN PROC
;saves 8 characters from upper right hand corner of
;screen
LEA DI,SS_BUF ;screen buffer
MOV CX,8 ;repeat 8 times
MOV DL,72 ;column 72
CLD ;clear DF for string operation
SS_LOOP:
CALL SET_CURSOR ;setup cursor at row 0, col DL
MOV AH,08H ;read char on screen
INT 10H ;AH = attribute, AL = character
STOSW ;stores char and attribute
INC DL ;next col
LOOP SS_LOOP
RET
SAVE_SCREEN ENDP
RESTORE_SCREEN PROC
;restores saved screen
LEA SI,SS_BUF ;SI points to buffer
MOV DI,8 ;repeat 8 times
MOV DL,72 ;column 72
MOV CX,1 ;write 1 char at a time
RS_LOOP:
CALL SET_CURSOR ;move cursor
LODSW ;AL = char, AH = attribute
MOV BL,AH ;attribute to BL
MOV AH,09H ;function 9, write char and attribute
MOV BH,0 ;page 0
INT 10H
INC DL ;next char position
DEC DI ;more characters?
JG RS_LOOP ;yes, repeat
RET
RESTORE_SCREEN ENDP
SET_CURSOR PROC
;sets cursor at row 0, column DL
;input DL = column number
MOV AH,02 ;function 2, set cursor
MOV BH,0 ;page 0
MOV DH,0 ;row 0
INT 10H
RET
SET_CURSOR ENDP
C_SEG ENDS
END
We are now ready to write the interrupt routine. To determine whether to activate or deactivate the time display, we use the variable ON_FLAG, which is set to 1 when the time is being displayed. Procedure MAIN is the interrupt procedure.
The steps in procedure MAIN are (1) save all registers used and set up the DS and ES registers, (2) call the old keyboard interrupt routine to handle the key input, (3) check to see if both Ctrl and right shift keys are down; if not, then exit, (4) test ON_FLAG to determine status, and if ON_FLAG is 1 then restore screen and exit, (5) save current cursor position and also the display screen info, and (6) get time, display time, then exit.
In step 1, to set up the registers DS and ES we use CS. It might be tempting to use the value C_SEG instead; however, segment values cannot be used in a .COM program. In step 2, we need to push the FLAGS register so that the procedure call simulates an interrupt call. In step 6, we used the BIOS interrupt 10h instead of the DOS interrupt 21h, function 9, to display the time because, from experience, INT 21h, function 9, tends to be unreliable in a TSR program.
TITLE PGM15_3: TIME_DISPLAY_VER_3
;memory resident program that shows current time of day
;called by Ctrl-rt shift key combination
EXTRN INITIALIZE:NEAR,SAVE_SCREEN:NEAR
EXTRN RESTORE_SCREEN:NEAR,SET_CURSOR:NEAR
EXTRN GET_TIME:NEAR
PUBLIC MAIN
PUBLIC NEW_VEC,OLD_VEC,SS_BUF
C_SEG SEGMENT PUBLIC
ASSUME CS:C_SEG,DS:C_SEG,SS:C_SEG
ORG 100H
START: JMP INITIALIZE
;
SS_BUF DB 16 DUP(?) ;save screen buffer
TIME_BUF DB '00:00:00$' ;time buffer hr:min:sec
CURSOR_POS DW ? ;cursor position
ON_FLAG DB 0 ;1 = interrupt procedure running
NEW_VEC DW ?,? ;contains new vector
OLD_VEC DD ? ;contains old vector
;
MAIN PROC
;interrupt procedure
;save registers
PUSH DS
PUSH ES
PUSH AX
PUSH BX
PUSH CX
PUSH DX
PUSH SI
PUSH DI
;
MOV AX,CS ;set DS
MOV DS,AX
MOV ES,AX
;call old keyboard interrupt procedure
PUSHF ;save FLAGS
CALL OLD_VEC
;get keyboard flags
MOV AX,CS ;reset DS
MOV DS,AX
MOV ES,AX ;and ES to current segment
MOV AH,02 ;function 2, keyboard flags
INT 16H ;AL has flag bits
TEST AL,1 ;right shift?
JE I_DONE ;no, exit
TEST AL,100B ;Ctrl?
JE I_DONE ;no, exit
;yes, process
CMP ON_FLAG,1 ;procedure active?
JE RESTORE ;yes, deactivate
MOV ON_FLAG,1 ;no, activate
;save cursor position and screen info
MOV AH,03 ;get cursor position
MOV BH,0 ;page 0
INT 10H ;DH = row, DL = col
MOV CURSOR_POS,DX ;save it
CALL SAVE_SCREEN ;save time display screen
;position cursor to upper right corner
MOV DL,72 ;column 72
CALL SET_CURSOR ;position cursor in row 0, col 72
LEA BX,TIME_BUF
CALL GET_TIME ;get current time
;display time
LEA SI,TIME_BUF
MOV CX,8 ;8 chars
MOV BH,0 ;page 0
MOV AH,0EH ;write char
M1: LODSB ;char in AL
INT 10H ;cursor is moved to next col
LOOP M1 ;loop back if more chars
JMP RES_CURSOR
RESTORE:
;restore screen
MOV ON_FLAG,0 ;clear flag
CALL RESTORE_SCREEN
;restore saved cursor position
RES_CURSOR:
MOV AH,02 ;set cursor
MOV BH,0
MOV DX,CURSOR_POS
INT 10H
;restore registers
I_DONE:
POP DI
POP SI
POP DX
POP CX
POP BX
POP AX
POP ES
POP DS
IRET ;interrupt return
MAIN ENDP
;
C_SEG ENDS
END START ;starting instruction
Because the program has been written as a .COM program, we need to rewrite the file containing the GET_TIME procedure with full segment directives. The file PGM15_3C.ASM contains GET_TIME, CONVERT, and SETUP_INT.
Program Listing PGM15_3C.ASM
TITLE PGM15_3C: GET AND CONVERT TIME TO ASCII
PUBLIC GET_TIME,SETUP_INT
C_SEG SEGMENT PUBLIC
ASSUME CS:C_SEG
;
GET_TIME PROC
;get time of day and store ASCII digits in time buffer
;input: BX = address of time buffer
MOV AH,2CH ;get time
INT 21H ;CH = hr, CL = min, DH = sec
;convert hours into ASCII and store
MOV AL,CH ;hour
CALL CONVERT ;convert to ASCII
MOV [BX],AX ;store
;convert minutes into ASCII and store
MOV AL,CL ;minute
CALL CONVERT ;convert to ASCII
MOV [BX+3],AX ;store
;convert seconds into ASCII and store
MOV AL,DH ;second
CALL CONVERT ;convert to ASCII
MOV [BX+6],AX ;store
RET
GET_TIME ENDP
;
CONVERT PROC
;converts byte number (0-59) into ASCII digits
;input: AL = number
;output: AX = ASCII digits, AL = high digit, AH = low
;digit
MOV AH,0 ;clear AH
MOV DL,10 ;divide AX by 10
DIV DL ;AH has remainder, AL has quotient
OR AX,3030H ;convert to ASCII, AH has low digit
RET ;AL has high digit
CONVERT ENDP
;
SETUP_INT PROC
;saves old vector and sets up new vector
;input: AL = interrupt number
;DI = address of buffer for old vector
;SI = address of buffer containing new vector
;save old interrupt vector
MOV AH,35H ;function 35h, get vector
INT 21H ;ES:BX = vector
MOV [DI],BX ;save offset
MOV [DI+2],ES ;save segment
;setup new vector
MOV DX,[SI] ;DX has offset
PUSH DS ;save it
MOV DS,[SI+2] ;DS has segment number
MOV AH,25H ;function 25h, set vector
INT 21H
POP DS ;restore DS
RET
SETUP_INT ENDP
;
C_SEG ENDS
END
The LINK command should be LINK PGM15_3 + PGM15_3B + PGM15_3C + PGM15_3A. Notice that PGM15_3A is linked last so that the procedure INITIALIZE is placed at the end of the program. Writing TSR programs is tricky; if there are other TSR programs on your system, your program may not function properly.
An interrupt may be requested by a hardware device or by a program using the INT instruction or generated internally by the processor.
The INT instruction calls an interrupt routine by using an interrupt number.
The 8086 supports 256 interrupt types and the interrupt vectors (addresses of the procedures) are stored in the first 1 KB of memory.
The interrupts 0-1Fh call BIOS interrupt routines and the interrupt vectors are set up by BIOS when the computer is powered up.
The interrupts 20h-3Fh call DOS interrupt routines.
Users can write their own interrupt routines to perform various tasks.
A memory resident program may be activated by a combination of keystrokes.
Glossary
conventional memory: The first 640 KB of memory
hand-shaking: A protocol for devices to communicate with each other
hardware interrupt: A hardware device interrupting the processor
interrupt acknowledge signal: A signal generated by the processor accepting an interrupt request signal
interrupt number: A number identifying the type of interrupt
interrupt request signal: A signal sent by a hardware device to the processor requesting service
interrupt routine: A procedure invoked by an interrupt
interrupt vector: The address of an interrupt routine
interrupt vector table: The set of all interrupt vectors
memory resident program: A TSR program
NMI (nonmaskable interrupt): A hardware interrupt that cannot be masked out by clearing the IF
processor exception: A condition of the processor that requires special handling
TSR (terminate and stay resident) program: A program that remains in memory after termination
software interrupt: An INT instruction
IRET
OFFSET
SEG
- Compute the location of the interrupt vector for interrupt 20h.
- Use DEBUG to find the value of the interrupt vector for interrupt 0.
- Write instructions that use the BIOS interrupt 17h to print the message "Hello".
- Write instructions that use the INT 21h, function 2Ah, to display the current date.
- Write a program that will output the message "Hello" once every half second to the screen.
- Modify PGM15_2.ASM so that INT 21h, function 9, is called to display the time only when the seconds change.
- Write a memory resident program similar to PGM15_3.ASM using INT 21h, function 31h.
In Chapter 12, we showed how the screen can be manipulated in text mode. In this chapter, we discuss the graphics modes of the PC. There are three common color graphics adapters for the PC: CGA (Color Graphics Adapter), EGA (Enhanced Graphics Adapter), and VGA (Video Graphics Array). We describe their operations and programming, and also show how to write an interactive video game program.
As noted in Chapter 12, the screen display is composed of lines traced by an electron beam; these lines are called scan lines. A dot pattern is created by turning the beam on and off during the scan; the dot patterns generate characters as well as pictures on the screen. The video signal controlling the scan is generated by a video adapter circuit in the computer.
A video adapter can vary the number of dots per line by changing the size of a dot. Some adapters can also change the number of scan lines.
In graphics mode operation, the screen display is divided into columns and rows; and each screen position, given by a column number and row number, is called a pixel (picture element). The number of columns and rows give the resolution of the graphics mode; for example, a resolution of
Figure 16.1 Pixel Coordinates in
Depending on the mapping of rows and columns into the scan lines and dot positions, a pixel may contain one or more dots. For example, in the low- resolution mode of the CGA, there are 160 columns by 100 rows, but the CGA generates 320 dots and 200 lines; so a pixel is formed by a 2
Table 16.1 shows the APA graphics modes of the CGA, EGA, and VGA. To maintain compatibility, the EGA is designed to display all CGA modes and the VGA can display all the EGA modes.
The screen mode is normally set to text mode, hence the first operation to begin a graphics display is to set the display mode. We showed in Chapter 12 that the BIOS interrupt 10h handles all video functions; function 0 sets the screen mode.
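| Adapter | Mode Number (hex) | Resolution | Colors |
|---|---|---|---|
| CGA | 4 | 320 x 200 | 4 color |
| CGA | 5 | 320 x 200 | 4 color (color burst off) |
| CGA | 6 | 640 x 200 | 2 color |
| EGA | D | 320 x 200 | 16 color |
| EGA | E | 640 x 200 | 16 color |
| EGA | F | 640 x 350 | Monochrome |
| EGA | 10 | 640 x 350 | 16 color |
| VGA | 11 | 640 x 480 | 2 color |
| VGA | 12 | 640 x 480 | 16 color |
| VGA | 13 | 320 x 200 | 256 color |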
INT 10h Function 0 Set Screen Mode
Input: AH = 0
AL = mode number
Output: none
Example 16.1 Set the display mode to 640 x 200 two-color mode.
Solution: From Table 16.1, the mode number is 06h; thus, the instructions are
MOV AH, 0 ; function 0
MOV AL, 06H ; mode 6
INT 10H ; select mode
The CGA has three graphics resolutions: a low resolution of
The CGA adapter has a display memory of 16 KB located in segment B800h; the memory addresses are from B800:0000 to B800:3FFF. Each pixel is represented by one or more bits, depending on the mode. For example, high resolution uses one bit per pixel and medium resolution uses two bits per pixel. The pixel value identifies the color of the pixel.
| IRGB | Color |
|---|---|
| 0000 | Black |
| 0001 | Blue |
| 0010 | Green |
| 0011 | Cyan |
| 0100 | Red |
| 0101 | Magenta (purple) |
| 0110 | Brown |
| 0111 | White |
| 1000 | Gray |
| 1001 | Light Blue |
| 1010 | Light Green |
| 1011 | Light Cyan |
| 1100 | Light Red |
| 1101 | Light Magenta |
| 1110 | Yellow |
| 1111 | Intense White |
The CGA can display 16 colors; Table 16.2 shows the 16 colors of the CGA. In medium resolution, four colors can be displayed at one time. This is due to the limited size of the display memory. Because the resolution is
To allow different four-color combinations, the CGA in medium-resolution mode uses two palettes; a palette is a set of colors that can be displayed at the same time. Each palette contains three fixed colors plus a background color that can be chosen from any of the standard 16 colors. The background color is the default color of all pixels. Thus, a screen with the background color would show up if no data have been written. Table 16.3 shows the two palettes.
The default palette is palette 0, but a program can select either palette for display. A pixel value (0-3) identifies the color in the currently selected palette; if we change the display palette, all the pixels change color. INT 10h, function 0Bh, can be used to select a palette or a background color.
INT 10h, Function 0Bh:
Select Palette or Background Color
Subfunction 0: Select Background
Input: AH = 0Bh
BH = 0
BL = color number (0-15)
Output: none
Subfunction 1: Select Palette
Input: AH = 0Bh
BH = 1
BL = palette number (0 or 1)
Output: none
Table 16.3 CGA Mode, Four-Color Palettes
| Palette | Pixel Value | Color |
|---|---|---|
| 0 | 0 | Background |
| 0 | 1 | Green |
| 0 | 2 | Red |
| 0 | 3 | Brown |
| 1 | 0 | Background |
| 1 | 1 | Cyan |
| 1 | 2 | Magenta |
| 1 | 3 | White |
Example 16.2 Write instructions that select palette 1 and a background color of light blue.
Solution: Light blue has color number 9. Thus,
MOV AH,0BH ;function 0Bh
MOV BH,00H ;select background color
MOV BL,9 ;light blue
INT 10H
MOV BH,1 ;select palette
MOV BL,1 ;palette 1
INT 10H
To read or write a pixel, we must identify the pixel by its column and row numbers. The functions 0Dh and 0Ch are for read and write, respectively.
INT 10h, Function 0Ch:
Write Graphics Pixel
Input: AH = 0Ch
AL = pixel value
BH = page (for the CGA, this value is ignored)
CX = column number
DX = row number
Output: none
INT 10h, Function 0Dh:
Read Graphics Pixel
Input: AH = 0Dh
BH = page (for the CGA, this value is ignored)
CX = column number
DX = row number
Output: AL = pixel value
Example 16.3 Copy the pixel at column 50, row 199, to the pixel at column 20, and row 40.
Solution: We first read the pixel at column 50, row 199, and then write to the pixel at column 20, row 40.
MOV AH,0DH ;read pixel
MOV CX,50 ;column 50
MOV DX,199 ;row 199
INT 10H ;AL gets pixel value
MOV AH,0CH ;write pixel, AL is already set
MOV CX,20 ;column 20
MOV DX,40 ;row 40
INT 10H
In high-resolution mode, the CGA can display two colors; each pixel value is either 0 or 1, with 0 for black and 1 for white. It is also possible to select a background color using INT 10h, function 0Bh. When a background color is selected, a pixel value of 0 is the background color, and a pixel value of 1 is white. We now show a complete graphics program.
Example 16.4 Write a program that draws a line in row 100 from column 301 to column 600 in high resolution.
Solution: The organization of the program is as follows: (1) set the display mode to 6 (CGA high resolution), (2) draw the line, (3) read a key input, and (4) set the mode back to 3 (text mode). Step 3 is included so that we can control when to return to text mode; otherwise, the line would disappear before we could take a good look.
TITLE PGM16_1: CGA LINE DRAWING
;draws horizontal line in high res
;in row 100 from col 301 to col 600
.MODEL SMALL
.STACK 100H
.CODE
MAIN PROC
;set graphics mode
MOV AX,6 ;select mode 6, hi res
INT 10H
;draw line
MOV AH,0CH ;write pixel
MOV AL,1 ;white
MOV CX,301 ;beginning col
MOV DX,100 ;row
L1: INT 10H
INC CX ;next col
CMP CX,600 ;more columns?
JLE L1 ;yes, repeat
;read keyboard
MOV AH,0
INT 16H
;set to text mode
MOV AX,3 ;select mode 3, text mode
INT 10H
;return to DOS
MOV AH,4CH ;return
INT 21H ;to DOS
MAIN ENDP
END MAIN
When we wish to do fast screen updates, as in video game playing, we can bypass the BIOS routines and write directly to the CGA video display memory. To do so, we need to understand the organization of the CGA display memory. The CGA's 16-KB display memory is divided into two halves. Pixels in even-numbered rows are stored in the first 8 KB (B800:0000 to B800:1FFF), and pixels in odd-numbered rows are stored in the second 8 KB (B800:2000 to B800:3FFF). Each row is represented by 50h bytes. Figure 16.2 shows the relationship between the display memory address and the screen display.
To locate the bit positions for a particular pixel in a display mode, we first determine the starting byte of that row and then the offset in the row for that pixel. We now show an example.
Example 16.5 Let the graphics mode be mode 4. Determine the byte address and bit positions for the pixel in row 5, column 10.
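Solution: Row 5 is the third odd-numbered row, so the starting byte for row 5 has an offset address of 2000h + 2 x 50h = 20A0h. In mode 4 each pixel takes two bits, so a byte holds four pixels, and column 10 falls in byte 2 of the row (10 divided by 4 is 2, remainder 2). The byte address is therefore B800:20A2. Because the leftmost pixel of a byte occupies bits 7 and 6, the pixel in column 10 occupies bit positions 3 and 2.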
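The address arithmetic of this example can be checked away from the PC. The following Python sketch is an illustration only and is not part of the original programs; the function name `cga_mode4_pixel` is hypothetical. It assumes the layout described above (50h bytes per row, odd rows starting at offset 2000h) and that each byte holds four 2-bit pixels with the leftmost pixel in bits 7 and 6.

```python
BYTES_PER_ROW = 0x50   # each screen row occupies 50h bytes
ODD_BANK = 0x2000      # odd-numbered rows start 8 KB into segment B800h


def cga_mode4_pixel(row, col):
    """Return (offset, high_bit, low_bit) for a pixel in CGA mode 4."""
    bank = ODD_BANK if row % 2 else 0
    row_start = bank + (row // 2) * BYTES_PER_ROW
    offset = row_start + col // 4          # four 2-bit pixels per byte
    high_bit = 7 - 2 * (col % 4)           # leftmost pixel uses bits 7,6
    return offset, high_bit, high_bit - 1


offset, high_bit, low_bit = cga_mode4_pixel(5, 10)
print(hex(offset), high_bit, low_bit)      # 0x20a2 3 2
```

The result agrees with the offset 20A2h and bit positions 3,2 used in the next example.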
Example 16.6 Suppose the current display mode is mode 4. Write a pixel value of 10b at row 5, column 10.
Solution: We use the address computed in the last example. To write a pixel, we first read the byte containing the pixel, change the appropriate bits, and then write back. The reason for read before write is to preserve other pixel values contained in the same byte. To change the bits, we first clear them using an AND operation, and then write the data using an OR operation.
Figure 16.2 CGA Display Address
MOV AX,0B800H ;video memory segment number
MOV ES,AX ;place in ES
MOV DI,20A2H ;offset of byte
MOV AL,ES:[DI] ;move byte into AL
AND AL,11110011B ;clear the data bit positions
OR AL,1000B ;write 10b into bit positions 3,2
STOSB ;store back to memory
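The read, clear, set, and write-back sequence can be modeled on an ordinary byte array. This Python sketch is a simulation under stated assumptions, not the book's code: `video` stands in for the 16 KB at segment B800h, and `write_pixel_mode4` is a hypothetical helper name.

```python
def write_pixel_mode4(memory, offset, high_bit, value):
    """Store a 2-bit pixel value without disturbing its three neighbors."""
    shift = high_bit - 1                    # position of the pixel's low bit
    mask = 0b11 << shift                    # the two bits owned by this pixel
    old = memory[offset]                    # read the byte first
    cleared = old & ~mask & 0xFF            # AND step: clear the pixel bits
    memory[offset] = cleared | ((value & 0b11) << shift)   # OR step, write back


video = bytearray(0x4000)                   # stand-in for CGA display memory
video[0x20A2] = 0b11111111                  # pretend all four pixels are set
write_pixel_mode4(video, 0x20A2, 3, 0b10)
print(format(video[0x20A2], "08b"))         # 11111011
```

Only bits 3 and 2 change; the other three pixels in the byte keep their values, which is the reason for reading before writing.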
It is possible to display text in graphics mode. Text characters in graphics mode are not generated from a character generator circuit as in text mode. Instead, the characters are selected from the character fonts stored in memory. Another difference between text mode and graphics mode is that the cursor is not being displayed in graphics mode. However, the cursor position can still be set by INT 10h, function 2.
Example 16.7 Display the letter "A" in red at the upper right corner of the screen. Use mode 4 and a background color of blue.
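Solution: When we display characters in graphics mode, we use text coordinates. With the 320 x 200 resolution of mode 4 there are 40 text columns and 25 text rows (see Table 16.4), so the upper right corner is at column 39, row 0.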
The steps are as follows: (1) set to mode 4, default palette is 0, (2) set background color to blue, (3) position cursor, and (4) display letter "A" in red.
MOV AH,0 ;set mode
MOV AL,04H ;mode 4
INT 10H
MOV AH,0BH ;function 0Bh
MOV BH,00H ;select background color
MOV BL,1 ;blue
INT 10H
MOV AH,02 ;set cursor
MOV BH,0 ;page 0
MOV DH,0 ;row 0
MOV DL,39 ;col 39
INT 10H
MOV AH,9 ;write char function
MOV AL,'A' ;character to display
MOV BL,2 ;red
MOV CX,1 ;write one character
INT 10H
Table 16.4 Text Columns and Rows in Graphics Mode
| Graphics Resolution | Text Columns | Text Rows |
| 320 x 200 | 40 | 25 |
| 640 x 200 | 80 | 25 |
| 640 x 350 | 80 | 25 |
| 640 x 480 | 80 | 29 |
The EGA adapter can generate either 200 or 350 scan lines. To display the higher resolution, an ECD (enhanced color display) monitor is required. The EGA has sixteen palette registers; these registers store the current display colors. There are six color bits in each palette register, two for each primary color. This means that each palette register is capable of storing any one of 64 colors, and thus the EGA can display 16 colors out of 64 at one time. In the 16-color EGA modes, each pixel value selects a palette register. Initially, the 16 palette registers are loaded with the standard 16 CGA colors. To display other colors on the screen, a program can modify these registers using INT 10h, function 10h, subfunction 0h (see Appendix C).
The EGA adapter can emulate the CGA graphics modes, so that a program written for the CGA can run in EGA with the same colors. Its display memory can be configured by software. Depending on the display mode, the display memory may have a starting address of A0000h, B0000h, or B8000h. In displaying CGA modes, the EGA memory starts at B8000h so as to remain compatible with the CGA display memory.
In displaying EGA modes, the display memory has the following structure. It is located in segment A000h and uses up to 256 KB. To accommodate 256 KB in one segment, the EGA uses four modules of up to 64 KB each. The four modules, called bit planes, share the same 64 K memory addresses; each address refers to four bytes, one in each bit plane. The 8086 cannot access the bit planes directly; instead, all data transfer must go through EGA registers.
With this much storage, we can see that the display memory may hold more than one screen of graphics data. In EGA modes, the display memory is divided into pages, with each page being the size of one screen of data. The number of pages allowed depends on the graphics mode and the display memory size. For example, for the display mode D
When we use functions 0Ch and 0Dh to read or write pixels, the page number is specified in BH. These functions can be used on any page regardless of which page is being displayed.
Example 16.8 Assuming that we are using a 16-color palette, write a green pixel to page 2 at column 0, row 0.
Solution: We use function 0Ch and a color value of 2.
MOV AH,0CH ;write pixel function
MOV AL,2 ;green
MOV BH,2 ;page 2
MOV CX,0 ;column 0
MOV DX,0 ;row 0
INT 10H
When a graphics mode is first selected, the active display page is automatically set to page 0. We can select a different active display page by using function 05h.
INT 10h, Function 5:
Select Active Display Page
Input: AH = 5
AL = page number
Output: none
Example 16.9 Select page 1 to be displayed.
MOV AH,05H ;select active display page
MOV AL,1 ;page 1
INT 10H
Page switching can be used to do simple animation. Suppose we draw a figure in page 0, then draw the same figure at a slightly different position in page 1, and so on. Then, by quickly switching the active display page, we can see the figure move across the screen. This movement is limited by the total number of pages available. We show a more practical animation technique in section 16.5.
The VGA adapter has higher resolution than the EGA; it can display
The VGA adapter can emulate the CGA and EGA graphics modes. In VGA mode, the display memory is organized into bit planes just like the EGA
Let's look at the VGA mode 13h, which supports 256 colors. In this mode, each pixel value is one byte, and it selects a color register. The color registers are loaded initially with a set of default values. It is possible to change the value in a color register; but let us first display the default colors.
Example 16.10 Give the instructions that will display the 256 default colors as 256 pixels in row 100.
Solution: We begin by selecting mode 13h, then we set up a loop to write the value of AL, which goes from 0 to 255 in columns 0 to 255.
We can set the color in a color register with function 10h.
INT 10h, Function 10h, Subfunction 10h: Set Color Register
Input: AH = 10h
AL = 10h
BX = color register
CH = green value
CL = blue value
DH = red value
Output: none
Example 16.11 Put the color values of 30 red, 20 green, and 10 blue into color register 5.
MOV AH, 10H ; set color register
MOV AL, 10H
MOV BX, 5 ; register 5
MOV DH, 30 ; red value
MOV CH, 20 ; green value
MOV CL, 10 ; blue value
INT 10H
It is also possible to set a block of color registers in one call; see Appendix C.
The movement of an object on the screen is simulated by erasing the existing object and then displaying it at a new location. We will use a small ball to illustrate the techniques in animation.
For the display, we need to pick a graphics mode, the ball color, and the background color. Because all adapters support CGA modes, let's choose mode 4. If we select palette 1 with a green background color, we can show
a white ball moving on a green background. The ball will be represented by a square matrix of four pixels; its position is given by the upper left-hand pixel.
We will confine the ball to an area bounded by columns 10 and 300 and rows 10 and 189. The boundary is shown in cyan. Initially, let us set the ball to the middle of the right-hand margin; that is, the ball position is column 298, row 100.
The procedure SET_DISPLAY_MODE sets the display mode to 4, selects palette 1 and a green background color, and then draws a cyan border. The border is drawn by two macros DRAW_ROW and DRAW_COLUMN. The procedure DISPLAY_BALL displays the ball at column CX row DX with the color given in AL. Both procedures are in program listing PGM16_2A.ASM.
TITLE PGM16_2A:
PUBLIC SET_DISPLAY_MODE, DISPLAY_BALL
.MODEL SMALL
DRAW_ROW MACRO X
LOCAL L1
; draws a line in row X from column 10 to column 300
MOV AH, 0CH ; draw pixel
MOV AL, 1 ; cyan
MOV CX, 10 ; column 10
MOV DX, X ; row X
L1: INT 10H
INC CX ; next column
CMP CX, 301 ; beyond column 300?
JL L1 ; no, repeat
ENDM
DRAW_COLUMN MACRO Y
LOCAL L2
; draws a line in column Y from row 10 to row 189
MOV AH, 0CH ; draw pixel
MOV AL, 1 ; cyan
MOV CX, Y ; column Y
MOV DX, 10 ; row 10
L2: INT 10H
INC DX ; next row
CMP DX, 190 ; beyond row 189?
JL L2 ; no, repeat
ENDM
.CODE
SET_DISPLAY_MODE PROC
; sets display mode and draws boundary
MOV AH, 0 ; set mode
MOV AL, 04H ; mode 4, 320 x 200 4 color
INT 10H
MOV AH, 0BH ; select palette
MOV BH, 1
MOV BL, 1 ; palette 1
INT 10H
MOV BH, 0 ; set background color
MOV BL, 2 ; green
INT 10H
; draw boundary
DRAW_ROW 10 ; draw row 10
DRAW_ROW 189 ; draw row 189
DRAW_COLUMN 10 ; draw column 10
DRAW_COLUMN 300 ; draw column 300
RET
SET_DISPLAY_MODE ENDP
DISPLAY_BALL PROC
; displays ball at column CX and row DX with color given
; in AL
; input: AL = color of ball
; CX = column
; DX = row
MOV AH, 0CH ; write pixel
INT 10H
INC CX ;pixel on next column
INT 10H
INC DX ;down 1 row
INT 10H
DEC CX ;previous column
INT 10H
DEC DX ;restore DX
RET
DISPLAY_BALL ENDP
END
Notice that, to erase the ball, all we have to do is display a ball with the background color at the ball position. Thus we can use the DISPLAY_BALL procedure for both displaying and erasing.
To simulate ball movement, we define a ball velocity with two components, VEL_X and VEL_Y; each is a word variable. When VEL_X is positive, the ball is moving to the right, and when VEL_Y is positive, the ball is moving down. The position of the ball is given by CX (column) and DX (row). After displaying the ball at one position, we erase it and compute the new position by adding VEL_X to CX and VEL_Y to DX. The ball is then displayed at the new column and row position, and the process is repeated.
The following instructions display a ball at column CX, row DX; erase it; and display it in a new position determined by the velocity.
MOV AL,3 ;color 3 in palette = white
CALL DISPLAY_BALL ;display white ball
MOV AL,0 ;color 0 is background color
CALL DISPLAY_BALL ;erase ball
ADD CX,VEL_X ;new column
ADD DX,VEL_Y ;new row
MOV AL,3 ;white color
CALL DISPLAY_BALL ;display ball at new position
Because the computer can execute instructions at such a high speed, the ball will be moving too fast on the screen for us to see. One way to solve the
problem is to use a counter-controlled delay loop after each display of the ball. But because the various PC models run at different speeds, such a delay loop cannot give a consistent delay time. A better method is to use the timer. We noted in Chapter 15 that the timer ticks 18.2 times every second.
A timer interrupt procedure is needed for the timing. Each time it is activated, it sets the variable TIMER_FLAG to 1. A ball-moving procedure checks this variable to determine if the timer has ticked; if so, it moves the ball and clears TIMER_FLAG to 0. The timer interrupt procedure TIMER_TICK is given in the program listing PGM16_2B.ASM.
TITLE PGM16_2B: TIMER_TICK
;timer interrupt procedure
EXTRN TIMER_FLAG:BYTE
PUBLIC TIMER_TICK
.MODEL SMALL
.CODE
;timer routine
TIMER_TICK PROC
;save registers
PUSH DS ;save DS
PUSH AX
MOV AX,SEG TIMER_FLAG ;get segment of flag
MOV DS,AX ;put in DS
MOV TIMER_FLAG,1 ;set flag
;restore registers
POP AX
POP DS ;restore DS
IRET
TIMER_TICK ENDP ;end timer routine
END
If we continue to move the ball in the same direction, eventually, the ball will go beyond the boundary. To confine the ball to the given area, we show it bouncing off the boundary. First we test each new position before displaying the ball. If a position is beyond the boundary, we simply set the ball at the boundary; at the same time, we reverse the velocity component that caused the ball to move outside. This will move the ball back as if it bounced off the boundary. The procedure CHECK_BOUNDARY in program listing PGM16_2C.ASM checks for the boundary condition and modifies the velocity accordingly.
With the boundary check procedure written, we can write a MOVE_BALL procedure that waits for the timer and moves the ball. The MOVE_BALL procedure first erases the ball at the current position given by CX,DX; then it computes the new position by adding the velocity and calls CHECK_BOUNDARY to check the new position; finally, it checks the TIMER_FLAG to see if the timer has ticked; if so, it displays the ball at the new position. The MOVE_BALL procedure is in program listing PGM16_2C.ASM.
TITLE PGM16_2C:
;contains MOVE_BALL and CHECK_BOUNDARY procedures
EXTRN DISPLAY_BALL:NEAR
EXTRN TIMER_FLAG:BYTE, VEL_X:WORD, VEL_Y:WORD
PUBLIC MOVE_BALL
.MODEL SMALL
.CODE
MOVE_BALL PROC
;erase ball at current position and display ball at new
;position
;input: CX = column of ball position
; DX = row of ball position
;erase ball
MOV AL,0 ;color 0 is background color
CALL DISPLAY_BALL;erase ball
;get new position
ADD CX,VEL_X
ADD DX,VEL_Y
;check boundary
CALL CHECK_BOUNDARY
;wait for 1 timer tick to display ball
TEST_TIMER:
CMP TIMER_FLAG,1 ;timer ticked?
JNE TEST_TIMER ;no, keep testing
MOV TIMER_FLAG,0 ;yes, reset flag
MOV AL,3 ;white color
CALL DISPLAY_BALL;show ball
RET
MOVE_BALL ENDP
;
CHECK_BOUNDARY PROC
;determine if ball is outside screen, if so move it
;back in and change the ball direction
;input: CX = column of ball position
;DX = row of ball position
;output: CX = column of ball position
;DX = row of ball position
;check column value
CMP CX,11 ;left of 11?
JG L1 ;no, go check right margin
MOV CX,11 ;yes, set to 11
NEG VEL_X ;change direction
JMP L2 ;go test row boundary
L1: CMP CX,298 ;beyond right margin?
JL L2 ;no, go test row boundary
MOV CX,298 ;set column to 298
NEG VEL_X ;change direction
;check row value
L2: CMP DX,11 ;above top margin?
JG L3 ;no, check bottom margin
MOV DX,11 ;set to 11
NEG VEL_Y ;change direction
JMP DONE ;done
L3: CMP DX,187 ;below bottom margin?
JL DONE ;no, done
MOV DX,187 ;yes, set to 187
NEG VEL_Y ;change direction
DONE: RET
CHECK_BOUNDARY ENDP
END
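The bounce logic is easy to experiment with before assembling anything. The Python sketch below mirrors the comparisons in CHECK_BOUNDARY (limits 11 and 298 for the column, 11 and 187 for the row) and the position update in MOVE_BALL. It is a simulation written for this discussion, with hypothetical function names, and it leaves out the display and the timer.

```python
LEFT, RIGHT = 11, 298     # column limits used by CHECK_BOUNDARY
TOP, BOTTOM = 11, 187     # row limits used by CHECK_BOUNDARY


def check_boundary(col, row, vel_x, vel_y):
    """Clamp the ball to the playing area and reverse the offending velocity."""
    if col <= LEFT:
        col, vel_x = LEFT, -vel_x
    elif col >= RIGHT:
        col, vel_x = RIGHT, -vel_x
    if row <= TOP:
        row, vel_y = TOP, -vel_y
    elif row >= BOTTOM:
        row, vel_y = BOTTOM, -vel_y
    return col, row, vel_x, vel_y


def move_ball(col, row, vel_x, vel_y):
    """One step: add the velocity, then apply the boundary check."""
    return check_boundary(col + vel_x, row + vel_y, vel_x, vel_y)


# Start at the right margin, moving up and to the left.
state = (298, 100, -6, -1)
for step in range(200):
    state = move_ball(*state)
    assert LEFT <= state[0] <= RIGHT and TOP <= state[1] <= BOTTOM
print(state)
```

Running the loop for a few hundred steps confirms that the ball never leaves the bounded area, whatever the starting velocity.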
We are now ready to write the main procedure. Our program will use the SETUP_INT procedure in program listing PGM15_2A in Chapter 15 to set up the interrupt vector. The steps in the main procedure are: (1) set up the graphics display and the TIMER_TICK interrupt procedure, (2) display the ball at the right margin with a velocity going up and to the left, (3) wait for the timer to tick, (4) call MOVE_BALL to move the ball, (5) wait for the timer to tick again to allow more time for the ball to stay on the screen, and (6) go to step 3. The main procedure is shown in program listing PGM16_2.ASM.
TITLE PGM16_2: BOUNCING BALL
EXTRN SET_DISPLAY_MODE:NEAR, DISPLAY_BALL:NEAR
EXTRN MOVE_BALL:NEAR
EXTRN SETUP_INT:NEAR, TIMER_TICK:NEAR
PUBLIC TIMER_FLAG, VEL_X, VEL_Y
.MODEL SMALL
.STACK 100H
.DATA
NEW_TIMER_VEC DW ?,?
OLD_TIMER_VEC DW ?,?
TIMER_FLAG DB 0
VEL_X DW -6
VEL_Y DW -1
.CODE
MAIN PROC
MOV AX,@DATA
MOV DS,AX ;initialize DS
;set graphics mode and draw border
CALL SET_DISPLAY_MODE
;set up timer interrupt vector
MOV NEW_TIMER_VEC, OFFSET TIMER_TICK ;offset
MOV NEW_TIMER_VEC+2,CS
MOV AL,1CH ;interrupt type
LEA DI,OLD_TIMER_VEC ;DI points to vector buffer
LEA SI,NEW_TIMER_VEC ;SI points to new vector
CALL SETUP_INT
;start ball at column = 298, row = 100
;for the rest of the program CX will be column position
;of ball and DX will be row position
MOV CX,298
MOV DX,100
MOV AL,3 ;white ball
CALL DISPLAY_BALL
;wait for timer tick before moving the ball
TEST_TIMER:
CMP TIMER_FLAG,1 ;timer ticked?
JNE TEST_TIMER ;no, keep testing
MOV TIMER_FLAG,0 ;yes, clear flag
CALL MOVE_BALL ;move to new position
;delay 1 timer tick
TEST_TIMER_2:
CMP TIMER_FLAG,1 ;timer ticked?
JNE TEST_TIMER_2 ;no, keep testing
MOV TIMER_FLAG,0 ;yes, clear flag
JMP TEST_TIMER ;go get next ball position
MAIN ENDP
END MAIN
To run the program, we need to link the object files PGM16_2 + PGM15_2A + PGM16_2A + PGM16_2B + PGM16_2C. One word of caution, however: this program has no way to terminate. So it may be necessary to reboot the system. In section 16.6.2 we discuss a way to terminate the program.
In the following sections, we'll develop the bouncing ball program into an interactive video game program. First, in section 16.6.1, we add sound to the program; when the ball hits the boundary a tone is generated. Second, in section 16.6.2, we add a paddle to allow the player to hit the ball. To keep things simple, the paddle only slides up and down along the left boundary and is controlled by the up and down arrow keys. If the paddle misses the ball when it arrives at the left margin, the game is terminated. The game can also be terminated by pressing the Esc key.
The PC has a tone generator that can be set to generate particular tones for specified durations. The frequency of the tone generation can be specified by a timer circuit.
The timer circuit is driven by a clock circuit that has a rate of 1.193 MHz. This is beyond the range of human hearing, but the timer can generate output signals with lower frequencies. It does this by generating one pulse for every N incoming pulses, where N can be specified by a program. The number N is first loaded into a counter, then, after counting N incoming pulses, the circuit produces one pulse. The process is repeated until a different value is placed in the counter. For example, by placing a value of 1193 in the counter, the output is 1000 pulses every second, or 1000 Hz.
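The relation between the counter value N and the output frequency is a single division. The Python sketch below is illustrative only; it uses the 1.193 MHz clock rate quoted above, and the function names are hypothetical.

```python
CLOCK_HZ = 1_193_000      # 1.193 MHz input clock, as quoted in the text


def counter_value(frequency_hz):
    """Counter value N that produces roughly the requested frequency."""
    return round(CLOCK_HZ / frequency_hz)


def output_frequency(n):
    """Output frequency when one pulse is produced per n input pulses."""
    return CLOCK_HZ / n


print(counter_value(1000))       # 1193
print(output_frequency(1193))    # 1000.0
```

Because N must be a whole number, most frequencies are only approximated; the 1000 Hz tone used in this chapter happens to come out exactly with the rounded clock rate.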
The next step in tone generation is to determine the duration. To start the tone, we turn on the timer circuit; after a specific amount of time, we must turn it off. To keep time, we can use the TIMER_TICK interrupt routine. Because the TIMER_TICK procedure is activated once every 55 ms, we get about half a second of delay in 9 ticks.
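The tick count for a given delay follows from the tick rate of 18.2 per second noted earlier. A small Python sketch, with hypothetical helper names, shows where the figures of 55 ms and 9 ticks come from.

```python
TICKS_PER_SECOND = 18.2   # timer tick rate quoted from Chapter 15


def tick_period_ms():
    """Length of one timer tick in milliseconds."""
    return 1000 / TICKS_PER_SECOND


def ticks_for_delay(delay_ms):
    """Nearest whole number of ticks for the requested delay."""
    return round(delay_ms / tick_period_ms())


print(round(tick_period_ms(), 1))   # 54.9
print(ticks_for_delay(500))         # 9
```

Nine ticks last a little under 495 ms, which is close enough to half a second for a beep.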
To access the timer circuit, we have to use the I/O instructions IN and OUT. They allow data to be moved between an I/O port and AL or AX. To read an 8-bit I/O port we use
IN AL, port
where port is an I/O port number. Similarly, to write to an 8-bit I/O port we use
OUT port, AL
There are three I/O ports involved here: port 42h for loading the counter, port 43h that specifies the timer operation, and port 61h that enables the timer circuit.
Before loading port 42h with the count, we load port 43h with the command code B6h; this specifies that the timer will generate square waves and that the port 42h will be loaded one byte at a time with the low byte first. The bit positions 0 and 1 in port 61h control the timer and its output. By setting them to 1, the timer circuit will be enabled.
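The command code B6h packs several fields into one byte. The field layout used in the Python sketch below comes from the Intel 8253/8254 timer data sheet rather than from this chapter, so treat it as background: bits 7-6 select the counter, bits 5-4 the load order, bits 3-1 the operating mode, and bit 0 binary or BCD counting. The helper names are hypothetical.

```python
def decode_control_word(code):
    """Split a timer command byte into its fields (8253/8254 layout)."""
    return {
        "counter": (code >> 6) & 0b11,   # which of the three counters
        "access": (code >> 4) & 0b11,    # 0b11 = low byte, then high byte
        "mode": (code >> 1) & 0b111,     # 0b011 = square wave generator
        "bcd": code & 0b1,               # 0 = 16-bit binary count
    }


def count_bytes(count):
    """Return (low byte, high byte) in the order they are written to port 42h."""
    return count & 0xFF, (count >> 8) & 0xFF


print(decode_control_word(0xB6))
print([hex(b) for b in count_bytes(1193)])   # ['0xa9', '0x4']
```

Decoding B6h gives mode 3 (square wave) with the low byte loaded first, matching the description above, and the count 1193 is written as A9h followed by 04h.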
The sound- generating procedure, BEEP, produces a tone of 1000 Hz for half a second. The steps are (1) load the counter (I/O port 42h) with 1193, (2) activate the timer, (3) allow the beep to last for about 500 ms, and (4) deactivate the timer. Procedure BEEP is shown in program listing PGM16_3A.ASM.
TITLE PGM16_3A: BEEP
;sound generating procedure
EXTRN TIMER_FLAG:BYTE
PUBLIC BEEP
.MODEL SMALL
.CODE
BEEP PROC
;generate beeping sound
PUSH CX ;save CX
;initialize timer
MOV AL,0B6H ;specify mode of operation
OUT 43H,AL ;write to port 43h
;load count
MOV AX,1193 ;count for 1000 Hz
OUT 42H,AL ;low byte
MOV AL,AH ;high byte
OUT 42H,AL
;activate speaker
IN AL,61H ;read control port
MOV AH,AL ;save value in AH
OR AL,11B ;set control bits
OUT 61H,AL ;activate speaker
;500 ms delay loop
MOV CX,9 ;do 9 times
B_1: CMP TIMER_FLAG,1 ;check timer flag
JNE B_1 ;not set, loop back
MOV TIMER_FLAG,0 ;flag set, clear it
LOOP B_1 ;repeat for next tick
;turn off tone
MOV AL,AH ;return old control value to AL
OUT 61H,AL ;restore control port, tone off
POP CX ;restore CX
RET
BEEP ENDP
END
We now write a new ball movement procedure that uses the sound-generating procedure BEEP. Whenever the ball hits the boundary, procedure BEEP is called to sound the tone. The new procedures are called MOVE_BALL_A and CHECK_BOUNDARY_A; both are contained in the program listing PGM16_3B.ASM.
TITLE PGM16_3B: Ball Movement
;contains MOVE_BALL_A and CHECK_BOUNDARY_A
EXTRN DISPLAY_BALL:NEAR, BEEP:NEAR
EXTRN TIMER_FLAG:BYTE, VEL_X:WORD, VEL_Y:WORD
PUBLIC MOVE_BALL_A
.MODEL SMALL
.CODE
MOVE_BALL_A PROC
;erase ball at current position and display ball at new
;position
;input: CX = column
; DX = row
;output: CX = column
; DX = row
MOV AL,0 ;color 0 is background color
CALL DISPLAY_BALL;erase ball
;get new position
ADD CX,VEL_X
ADD DX,VEL_Y
;check boundary
CALL CHECK_BOUNDARY_A
;wait for 1 timer tick
TEST_TIMER_1:
CMP TIMER_FLAG,1 ;timer ticked?
JNE TEST_TIMER_1 ;no, keep testing
MOV TIMER_FLAG,0 ;yes, clear flag
MOV AL,3 ;white color
CALL DISPLAY_BALL;show ball
RET
MOVE_BALL_A ENDP
CHECK_BOUNDARY_A PROC
;determine if ball is outside screen, if so move it
;back in and change the ball direction
;input: CX = column
; DX = row
;output: CX = column
; DX = row
; check column value
CMP CX,11 ;left of 11?
JG L1 ;no, go check right margin
MOV CX,11 ;yes, set to 11
NEG VEL_X ;change direction
CALL BEEP ;sound beep
JMP L2 ;go test row boundary
L1: CMP CX,299 ;beyond right margin?
JL L2 ;no, go test row boundary
MOV CX,298 ;set column to 298
NEG VEL_X ;change direction
CALL BEEP ;sound beep
;check row value
L2: CMP DX,11 ;above top margin?
JG L3 ;no, check bottom margin
MOV DX,11 ;set to 11
NEG VEL_Y ;change direction
CALL BEEP ;sound beep
JMP DONE ;done
L3: CMP DX,188 ;below bottom margin?
JL DONE ;no, done
MOV DX,187 ;yes, set to 187
NEG VEL_Y ;change direction
CALL BEEP ;sound beep
DONE: RET
CHECK_BOUNDARY_A ENDP
END
Next, let us add a paddle to the program. The paddle will move up and down along the left boundary as the player presses the up and down arrow keys.
Since the program does not know when a key may be pressed, we need to write an interrupt procedure for interrupt 9, the keyboard interrupt. This interrupt procedure differs from the one in Chapter 15 in that it will access the keyboard I/O port directly and obtain the scan code.
There are three I/O ports to be accessed. When the keyboard generates an interrupt, bit 5 of the I/O port 20h is set, and the scan code comes into port 60h. Port 61h, bit 7, is used to enable the keyboard. Therefore, the interrupt routine should read the data from port 60h, enable the keyboard by setting bit 7 in port 61h, and clear bit 5 of port 20h.
The interrupt procedure is called KEYBOARD_INT. When it obtains a scan code, it first checks to see if the scan code is a make or break code. If it finds a make code, it sets the variable KEY_FLAG and puts the make code in the variable SCAN_CODE. If it finds a break code, the variables are not changed. Procedure KEYBOARD_INT is in program listing PGM16_3C.ASM.
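The make/break decision depends on a single bit: a break code is the make code with bit 7 set, which is why the routine tests the scan code against 80h. The Python sketch below models only that decision; the function name is hypothetical, and the two variables are passed and returned explicitly instead of living in memory.

```python
BREAK_BIT = 0x80   # bit 7 is set in a break (key release) code


def keyboard_event(scan_code, key_flag, saved_code):
    """Return the new (KEY_FLAG, SCAN_CODE) pair after one keyboard interrupt."""
    if scan_code & BREAK_BIT:
        return key_flag, saved_code    # break code: leave both variables alone
    return 1, scan_code                # make code: set flag, record the code


UP_ARROW = 72
print(keyboard_event(UP_ARROW, 0, 0))               # (1, 72)
print(keyboard_event(UP_ARROW | BREAK_BIT, 0, 0))   # (0, 0)
```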
TITLE PGM16_3C: Keyboard Interrupt
EXTRN SCAN_CODE:BYTE, KEY_FLAG:BYTE
PUBLIC KEYBOARD_INT
.MODEL SMALL
.CODE
KEYBOARD_INT PROC
;keyboard interrupt routine
;save registers
PUSH DS
PUSH AX
;set up DS
MOV AX, SEG SCAN_CODE
MOV DS,AX
;input scan code
IN AL,60H ;read scan code
PUSH AX ;save it
IN AL,61H ;control port value
MOV AH,AL ;save in AH
OR AL,80H ;set bit for keyboard
OUT 61H,AL ;write back
XCHG AH,AL ;get back control value
OUT 61H,AL ;reset control port
POP AX ;recover scan code
MOV AH,AL ;save scan code in AH
TEST AL,80H ;test for break code
JNE KEY_0 ;yes, leave variables unchanged, goto KEY_0
;make code
MOV SCAN_CODE,AL ;save in variable
MOV KEY_FLAG,1 ;set key flag
KEY_0:MOV AL,20H ;reset interrupt
OUT 20H,AL
;restore registers
POP AX
POP DS
IRET
KEYBOARD_INT ENDP ;end KEYBOARD routine
;
END
We now add a paddle in column 11, and use the up and down arrow keys to move it. If the ball gets to column 11 and the paddle is not in position to hit the ball, the program terminates. The paddle is made up of 10 pixels; the initial position is from row 45 to row 54. We use two variables, PADDLE_TOP and PADDLE_BOTTOM, to keep track of its current position.
We need two procedures: DRAW_PADDLE, to display and erase the paddle; and MOVE_PADDLE, to move the paddle up and down. Both procedures are in program listing PGM16_3D.ASM.
TITLE PGM16_3D: PADDLE CONTROL
;contains MOVE_PADDLE and DRAW_PADDLE
EXTRN PADDLE_TOP:WORD, PADDLE_BOTTOM:WORD
PUBLIC DRAW_PADDLE, MOVE_PADDLE
.MODEL SMALL
.CODE
DRAW_PADDLE PROC
;draw paddle in column 11
;input: AL contains pixel value
; 2 (red) for display and 0 (green) to erase
;save registers
PUSH CX
PUSH DX
MOV AH,0CH ;write pixel function
MOV CX,11 ;column 11
MOV DX,PADDLE_TOP ;top row
DP1: INT 10H
INC DX ;next row
CMP DX, PADDLE_BOTTOM ;done?
JLE DP1 ;no, repeat
;restore registers
POP DX
POP CX
RET
DRAW_PADDLE ENDP
;
MOVE_PADDLE PROC
;move paddle up or down
;input: AX = 2 (to move paddle down 2 pixels)
; = -2 (to move paddle up 2 pixels)
MOV BX,AX ;copy to BX
;check direction
CMP AX,0
JL UP ;neg, move up
;move down, check paddle position
CMP PADDLE_BOTTOM,188 ;at bottom?
JGE DONE ;yes, cannot move
JMP UPDATE ;no, update paddle
;move up, check if at top
UP: CMP PADDLE_TOP,11 ;at top?
JLE DONE ;yes, cannot move
;move paddle
UPDATE:
; - erase paddle
MOV AL,0 ;green color
CALL DRAW_PADDLE
; - change paddle position
ADD PADDLE_TOP,BX
ADD PADDLE_BOTTOM,BX
; - display paddle at new position
MOV AL,2 ;red
CALL DRAW_PADDLE
DONE: RET
MOVE_PADDLE ENDP
END
MOVE_PADDLE will move the paddle either up two pixels or down two pixels, depending on whether AX is negative or positive. However, if the paddle is already at the top, it will not move up; and if it is already at the bottom, it will not move down.
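The limit checks in MOVE_PADDLE can be tried out with a short Python model. This is a sketch for illustration, with a hypothetical function name; it returns the new top and bottom rows instead of updating PADDLE_TOP and PADDLE_BOTTOM in memory, and it omits the erase and redraw steps.

```python
TOP_LIMIT = 11        # paddle cannot move up when its top row is at 11
BOTTOM_LIMIT = 188    # paddle cannot move down when its bottom row is at 188


def move_paddle(top, bottom, delta):
    """Return the paddle rows after a move of delta (+2 down, -2 up)."""
    if delta < 0:
        if top <= TOP_LIMIT:          # already at the top
            return top, bottom
    else:
        if bottom >= BOTTOM_LIMIT:    # already at the bottom
            return top, bottom
    return top + delta, bottom + delta


print(move_paddle(45, 54, -2))     # (43, 52)
print(move_paddle(11, 20, -2))     # (11, 20)
print(move_paddle(179, 188, 2))    # (179, 188)
```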
We are now ready to write the main procedure.
TITLE PGM16_3: PADDLE_BALL
EXTRN SET_DISPLAY_MODE:NEAR, DISPLAY_BALL:NEAR
EXTRN MOVE_BALL_A:NEAR, DRAW_PADDLE:NEAR
EXTRN MOVE_PADDLE:NEAR
EXTRN KEYBOARD_INT:NEAR, TIMER_TICK:NEAR
EXTRN SETUP_INT:NEAR
PUBLIC TIMER_FLAG, KEY_FLAG, SCAN_CODE
PUBLIC PADDLE_TOP, PADDLE_BOTTOM, VEL_X, VEL_Y
.MODEL SMALL
.STACK 100H
.DATA
NEW_TIMER_VEC DW ?,?
OLD_TIMER_VEC DW ?,?
NEW_KEY_VEC DW ?,?
OLD_KEY_VEC DW ?,?
SCAN_CODE DB 0
KEY_FLAG DB 0
TIMER_FLAG DB 0
PADDLE_TOP DW 45
PADDLE_BOTTOM DW 54
VEL_X DW -6
VEL_Y DW -1
;scan codes
UP_ARROW = 72
DOWN_ARROW = 80
ESC_KEY = 1
.CODE
MAIN PROC
MOV AX,@DATA
MOV DS,AX ;initialize DS
;set graphics mode
CALL SET_DISPLAY_MODE
;draw paddle
MOV AL,2 ;display red paddle
CALL DRAW_PADDLE
;set up timer interrupt vector
MOV NEW_TIMER_VEC,OFFSET TIMER_TICK ;offset
MOV NEW_TIMER_VEC+2,CS ;segment
MOV AL,1CH ;interrupt number
LEA DI,OLD_TIMER_VEC
LEA SI,NEW_TIMER_VEC
CALL SETUP_INT
;set up keyboard interrupt vector
MOV NEW_KEY_VEC,OFFSET KEYBOARD_INT ;offset
MOV NEW_KEY_VEC+2,CS ;segment
MOV AL,9H ;interrupt number
LEA DI,OLD_KEY_VEC
LEA SI,NEW_KEY_VEC
CALL SETUP_INT
;start ball at column = 298, row = 100
MOV CX,298 ;column
MOV DX,100 ;row
MOV AL,3 ;white
CALL DISPLAY_BALL
;check key flag
TEST_KEY:
CMP KEY_FLAG,1 ;check key flag
JNE TEST_TIMER ;not set, go check timer flag
MOV KEY_FLAG,0 ;flag set, clear it and check
CMP SCAN_CODE,ESC_KEY ;Esc key?
JNE TK_1 ;no, check arrow keys
JMP DONE ;Esc, terminate
TK_1: CMP SCAN_CODE,UP_ARROW ;up arrow?
JNE TK_2 ;no, check down arrow
MOV AX,-2 ;yes, move up 2 pixels
CALL MOVE_PADDLE
JMP TEST_TIMER ;go check timer flag
TK_2: CMP SCAN_CODE,DOWN_ARROW ;down arrow?
JNE TEST_TIMER ;no, check timer flag
MOV AX,2 ;yes, move down 2 pixels
CALL MOVE_PADDLE
;check timer flag
TEST_TIMER:
CMP TIMER_FLAG,1 ;flag set?
JNE TEST_KEY ;no, check key flag
MOV TIMER_FLAG,0 ;yes, clear it
CALL MOVE_BALL_A ;move ball to new position
;check if paddle missed ball
CMP CX,11 ;at column 11?
JNE TEST_KEY ;no, check key flag
CMP DX,PADDLE_TOP ;yes, check paddle
JL CP_1 ;missed, ball above
CMP DX,PADDLE_BOTTOM
JG CP_1 ;missed, ball below
;paddle hit the ball, delay one tick then
;move the ball and redraw paddle
DELAY: CMP TIMER_FLAG,1 ;timer ticked?
JNE DELAY ;no, keep checking
MOV TIMER_FLAG,0 ;yes, reset flag
CALL MOVE_BALL_A
MOV AL,2 ;display red paddle
CALL DRAW_PADDLE
JMP TEST_KEY ;check key flag
;paddle missed the ball, erase the ball and terminate
CP_1: MOV AL,0 ;erase ball
CALL DISPLAY_BALL
;reset timer interrupt vector
DONE: LEA DI,NEW_TIMER_VEC
LEA SI,OLD_TIMER_VEC
MOV AL,1CH
CALL SETUP_INT
;reset keyboard interrupt vector
LEA DI,NEW_KEY_VEC
LEA SI,OLD_KEY_VEC
MOV AL,9H
CALL SETUP_INT
;read key
MOV AH,0
INT 16H
;reset to text mode
MOV AH,0
MOV AL,3
INT 10H
;return to DOS
MOV AH,4CH
INT 21H
MAIN ENDP
END MAIN
In the main procedure, we alternate between checking the key flag and the timer flag. If the key flag is set, we check the scan code: (1) the Esc key will terminate the program, (2) the up arrow key will move the paddle up, (3) the down arrow key will move the paddle down, and (4) all other keys are ignored. If the timer flag is set, we call MOVE_BALL_A to move the ball to a new position, and if the ball is at column 11 but missed the paddle, we terminate the program.
To terminate the program, we first reset the interrupt vectors and wait for a key input. When a key is pressed, we reset the screen to text mode and return to DOS.
Screen elements in graphics mode are called pixels.
The common IBM graphics adapters are CGA, EGA, and VGA.
The INT 10h routine handles all graphics operations.
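The CGA has a medium-resolution mode of $320 \times 200$ and a high-resolution mode of $640 \times 200$.
The EGA has all the CGA modes plus a resolution of $640 \times 350$.
The VGA has all the EGA modes plus a resolution of $640 \times 480$.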
Animation involves erasing an object and displaying it at a new location.
Sound generation can be achieved by writing to the I/O ports.
Interactive video game programming requires trapping the keyboard interrupt.
analog monitor
A monitor that can accept multilevel color signals
APA (all points addressable).
Graphics mode that maps a pixel into a single dot
background color
Default color of pixels
bit planes
Memory modules that share the same memory address
ECD (enhanced color display) monitor
A monitor that can display all EGA modes
palette
A collection of colors that can be displayed at the same time
pixel
Picture element
scan lines
Lines on the screen traced by a beam of electrons
New Instructions
IN OUT
- Write instructions that will select graphics mode $320 \times 200$ with 16 colors.
- Write instructions that will select palette 0 with white background for the CGA medium resolution mode.
- Write instructions that will display a $10 \times 10$ green rectangle with the upper left-hand corner at column 150 and row 100 on a white background using CGA medium resolution.
- Write instructions that will change a $10 \times 10$ green rectangle on white background into a cyan rectangle on a white background.
- Modify the video game program in the chapter to add a second paddle in column 299 so that it becomes a 2-player game.
- Modify the video game program in the chapter so that the ball speed decreases when it hits the boundary, but increases when it is hit by a paddle.
A recursive procedure is a procedure that calls itself. Recursive procedures are important in high-level languages, but the way the system uses the stack to implement these procedures is hidden from the programmer. In assembly language, the programmer must actually program the stack operations, so this provides an opportunity to see how recursion really works.
Because you may have had only limited experience with recursion, sections 17.1-17.2 discuss the underlying ideas. Section 17.3 shows how the stack can be used to pass data to a procedure; this topic was also covered in Chapter 14. In sections 17.4-17.5, we apply this method to implement recursive procedures that call themselves once. The chapter ends with a discussion of procedures that make multiple recursive calls.
A process is said to be recursive if it is defined in terms of itself. For example, consider the following definition of a binary tree:
A binary tree is either empty, or consists of a single element called the root, and whose remaining elements are partitioned into two disjoint subsets (the left and right subtrees), each of which is a binary tree.
Let us apply the definition to show that the following tree T is a binary tree:
Choose A as the root of T. The tree T1, consisting of B, D, and E, is the left subtree of A, and the tree T2, consisting of C, is the right subtree. We must show that T1 and T2 are binary trees.
Choose B as the root of T1. The trees T1a consisting of D and T1b consisting of E are the left and right subtrees. We must show that T1a and T1b are binary trees.
Choose D as the root of T1a. The left and right subtrees of D are empty, and since an empty tree is a binary tree, T1a is a binary tree. For the same reason, T1b is a binary tree. Because T1a and T1b are binary trees, T1 must be a binary tree.
Now look at T2. It has a root C whose left and right subtrees are empty, so it is a binary tree.
Since T1 and T2 have been shown to be binary trees, tree T must also be a binary tree.
This simple example illustrates the main characteristics of recursive processes:
-
The main problem (showing that T is a binary tree) breaks down to simpler problems (showing that T1 and T2 are binary trees), and each of these problems is solved in exactly the same way as the main problem.
-
There must be an escape case (empty trees are binary trees) that lets the recursion terminate.
-
Once a subproblem has been solved (T1 is shown to be a binary tree), work proceeds on the next step of the original problem (showing that T2 is a binary tree).
17.2 Recursive Procedures
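A recursive procedure calls itself. As a first example, consider the factorial of a positive integer. It may be defined nonrecursively as
$$\mathrm{FACTORIAL}(1) = 1, \qquad \mathrm{FACTORIAL}(N) = N \times (N - 1) \times \cdots \times 2 \times 1 \quad \text{if } N > 1$$
or, since
$$\mathrm{FACTORIAL}(N - 1) = (N - 1) \times (N - 2) \times \cdots \times 2 \times 1$$
we may write the following recursive definition:
$$\mathrm{FACTORIAL}(1) = 1, \qquad \mathrm{FACTORIAL}(N) = N \times \mathrm{FACTORIAL}(N - 1) \quad \text{if } N > 1$$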
Let's rewrite this as an algorithm for a recursive procedure FACTORIAL:
1: PROCEDURE FACTORIAL (input: N, output: RESULT)
2: IF N = 1
3: THEN
4: RESULT = 1
5: ELSE
6: call FACTORIAL (input: N - 1, output: RESULT)
7: RESULT = N x RESULT
8: END_IF
9: RETURN
In line 7, the value of RESULT on the right side is the value returned by the call to FACTORIAL at line 6.
For N = 4, the actions of the procedure are as follows:
call FACTORIAL(4, RESULT) /* begin first call */
call FACTORIAL(3, RESULT) /* begin second call */
call FACTORIAL(2, RESULT) /* begin third call */
call FACTORIAL(1, RESULT) /* begin fourth call */
RESULT = 1
RETURN /* end fourth call */
The fourth call is the escape case. When it is finished, the third call is resumed at line 7:
RESULT = N x RESULT
On the right side, N = 2 and RESULT = 1 (the value returned by the fourth call), so
RESULT = 2 x 1 = 2
and this call ends. The procedure then resumes the second call at line 7. In this call
RESULT = 3 x RESULT = 3 x 2 = 6
which ends this call. Finally the procedure resumes the first call at line 7. In this call
RESULT = 4 x RESULT = 4 x 6 = 24
and this is the value returned by the procedure.
This procedure has the properties of a recursive process that we noticed in the binary tree example. Each call to procedure FACTORIAL works on a simpler version of the original problem (finding the factorial of a smaller number), there is an escape case (the factorial of 1) and once a call has been completed, work continues on the previous call.
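The trace above can be reproduced with a short model. The following is a minimal Python sketch, not the assembly implementation developed later in the chapter; the `depth` and `trace` parameters are hypothetical additions used only to record the order of calls and returns.

```python
def factorial(n, depth=1, trace=None):
    """Recursive factorial following the FACTORIAL algorithm (escape case N = 1)."""
    if trace is not None:
        trace.append(f"call {depth}: N = {n}")
    if n == 1:
        result = 1                                        # lines 2-4: escape case
    else:
        result = n * factorial(n - 1, depth + 1, trace)   # lines 6-7
    if trace is not None:
        trace.append(f"call {depth} returns {result}")
    return result


trace = []
print(factorial(4, trace=trace))   # 24
print("\n".join(trace))
```

The printed trace shows four nested calls followed by four returns in reverse order, which is the behavior the stack must support.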
As a second example, consider the problem of finding the largest entry in an array A of N elements. A recursive algorithm for this is
1: PROCEDURE FIND_MAX(input: N, output: MAX)
2: IF N = 1
3: THEN
4: MAX = A[1]
5: ELSE
6: call FIND_MAX(N - 1, MAX)
7: IF A[N] > MAX
8: THEN
9: MAX = A[N]
10: ELSE
11: MAX = MAX
12: END_IF
13: RETURN
In lines 7 and 11, the value MAX on the right side is the value returned by the call at line 6.
Let's trace the procedure for an array A of four entries: 10, 50, 20, 4.
call FIND_MAX(4,MAX) /* first call */
call FIND_MAX(3,MAX) /* second call */
call FIND_MAX(2,MAX) /* third call */
call FIND_MAX(1,MAX) /* fourth call */
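As in the factorial example, the fourth call is the escape case. It returns MAX = A[1] = 10.
Now the third call resumes at line 7. Because A[2] = 50 is greater than MAX = 10, this call returns MAX = 50.
Next the second call resumes at line 7. Because A[3] = 20 is not greater than MAX = 50, this call returns MAX = 50.
Finally, we are back in the first call at line 7. Because A[4] = 4 is not greater than MAX = 50, the procedure returns MAX = 50.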
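The FIND_MAX algorithm can be checked against the four-entry array with a small model. This is a hedged Python sketch; the array is passed as an explicit argument `a` (an assumption made for self-containment), and the algorithm's A[N] becomes `a[n - 1]` because Python lists are indexed from 0.

```python
def find_max(a, n):
    """Largest of the first n entries of a (n >= 1), mirroring FIND_MAX."""
    if n == 1:
        return a[0]                 # escape case: MAX = A[1]
    best = find_max(a, n - 1)       # line 6: MAX of the first N - 1 entries
    if a[n - 1] > best:             # line 7: A[N] > MAX?
        best = a[n - 1]             # line 9: MAX = A[N]
    return best


print(find_max([10, 50, 20, 4], 4))   # 50
```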
As we will see later, recursive procedures are implemented in assembly language by passing parameters on the stack (section 14.5.3). To see how this may be accomplished, consider the following simple program. It places the content of two memory words on the stack, and calls a procedure ADD_WORDS that returns their sum in AX.
0: TITLE PGM17_1: ADD WORDS
1: .MODEL SMALL
2: .STACK 100H
3: .DATA
4: WORD1 DW 2
5: WORD2 DW 5
6: .CODE
7: MAIN PROC
8: MOV AX,@DATA
9: MOV DS,AX
10: PUSH WORD1
11: PUSH WORD2
12: CALL ADD_WORDS
13: MOV AH,4CH
14: INT 21H
15: MAIN ENDP
16: ADD_WORDS PROC NEAR
17: ;adds two memory words
18: ;stack on entry: return addr.(top), word2, word1
19: ;output:AX = sum
20: PUSH BP ;save BP
21: MOV BP,SP
22: MOV AX,[BP+6] ;AX gets WORD1
23: ADD AX,[BP+4] ;AX has sum
24: POP BP ;restore BP
25: RET 4 ;exit
26: ADD_WORDS ENDP
27: END MAIN
After initializing DS, the program pushes the contents of WORD1 and WORD2 on the stack, and calls ADD_WORDS. On entry to the procedure, the stack looks like this:
At lines 20-21, the procedure first saves the original content of BP on the stack, and sets BP to point to the stack top. The result is
Now the data can be accessed by indirect addressing. BP is used for two reasons: (1) when BP is used in indirect addressing, SS is the assumed segment register, and (2) SP itself may not be used in indirect addressing. At line 22, the effective address of the source in the instruction
MOV AX,[BP+6]
is the stack top offset plus 6, which is the location of WORD1 content (2). Similarly, at line 23 the source in
ADD AX,[BP+4]
is the location of WORD2 content (5).
After restoring BP to its original value at line 24, the stack becomes
To exit the procedure and restore the stack to its original condition, we use
RET 4
This causes the return address to be popped into IP, and four additional bytes to be removed from the stack.
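The offsets [BP+4] and [BP+6] can be verified with a toy model of the stack. The sketch below is written in Python under stated assumptions: the class `Stack16`, the starting SP of 100h, and the return address 0012h are all illustrative choices, not values taken from the listing.

```python
class Stack16:
    """Toy model of the 8086 stack: word-sized slots, SP decreases on push."""

    def __init__(self, sp=0x100):
        self.sp = sp
        self.mem = {}

    def push(self, word):
        self.sp -= 2
        self.mem[self.sp] = word & 0xFFFF

    def pop(self):
        word = self.mem[self.sp]
        self.sp += 2
        return word


def add_words(word1, word2, return_addr=0x0012, old_bp=0x0000):
    s = Stack16()
    s.push(word1)                        # PUSH WORD1
    s.push(word2)                        # PUSH WORD2
    s.push(return_addr)                  # CALL pushes the return address
    s.push(old_bp)                       # PUSH BP
    bp = s.sp                            # MOV BP,SP
    ax = s.mem[bp + 6]                   # MOV AX,[BP+6]  (WORD1)
    ax = (ax + s.mem[bp + 4]) & 0xFFFF   # ADD AX,[BP+4]  (WORD2)
    s.pop()                              # POP BP
    s.pop()                              # RET pops the return address into IP
    s.sp += 4                            # the 4 in RET 4 discards both parameters
    return ax, s.sp


print(add_words(2, 5))   # (7, 256): sum in AX, SP back at its starting value
```

The second returned value shows that SP ends where it started, which is the purpose of the operand in RET 4.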
17.4 The Activation Record
Before attempting to code a recursive procedure, one issue must be resolved. The parameters (and local variables, if any) of the procedure are reinitialized each time the procedure is called. In both examples of section 17.2, the procedure is first called with parameter N = 4, then with N = 3, N = 2, and N = 1. The values of the parameters, local variables, and return address for a call make up the activation record of that call, and each call needs its own activation record.
To illustrate, suppose we have a procedure that is called once from the main procedure, and then calls itself twice more. Before initiating the first call, the main procedure places the initial activation record on the stack and calls the procedure. The procedure saves BP and sets BP to point to the stack top, as was done in the example of the last section. The stack looks like this:
Using BP to access the parameters and local variables, the procedure executes its instructions. Before calling itself, it places the activation record for the next call on the stack. The return address that the recursive call places on the stack is that of the next instruction to be done in the procedure. As the second call begins, the procedure once again saves BP and sets BP to point to the stack top. The result is
Now, as in the first call, the procedure uses BP to access the data for the second call. Before initiating the third call, its activation record is placed on the stack. The third call saves BP and sets it to point to the stack top. The stack becomes
Let's suppose that the third call is the escape case. The result it computes may be placed in a register or memory location so that it is available to the second call when the second call resumes. After the third call is completed, the second call may be resumed by first popping BP to restore its previous value, and executing a return. The return places in IP the address of the next instruction to be done in the second call. As part of the return, the third call's parameters and local variables are popped off the stack and discarded, as was done in the example in the last section. The stack becomes
Now the second call resumes. It picks up the result of the third call and executes to completion. When it has finished and stored the result, the stack is once again popped into BP, and control returns to the first call. As before, the second call's data are discarded. Now the stack looks like this:
When the first call is done, the procedure restores BP to its original value, and control passes to the main procedure. As before, the parameters are discarded. The procedure stores the final result in a place where the main procedure can pick it up.
17.5 Implementation of Recursive Procedures
In this section, we show how recursive procedures may be implemented in assembly language.
Example 17.1 Code the FACTORIAL procedure of section 17.2. Call it in a program to compute the factorial of 3.
Solution: To make the code easier to follow, the algorithm is repeated here:
1: PROCEDURE FACTORIAL (input: N, output: RESULT)
2: IF N = 1
3: THEN
4: RESULT = 1
5: ELSE
6: call FACTORIAL (input: N - 1, output: RESULT)
7: RESULT = N x RESULT
8: END_IF
9: RETURN
0: TITLE PGM17_2: FACTORIAL PROGRAM
1: .MODEL SMALL
2: .STACK 100H
3: .CODE
4: MAIN PROC
5: MOV AX,3 ;N = 3
6: PUSH AX ;N on stack
7: CALL FACTORIAL ;AX has 3 factorial
8: MOV AH,4CH
9: INT 21H ;dos return
10: MAIN ENDP
11: FACTORIAL PROC NEAR
12: ;computes N factorial
13: ;input: stack on entry - ret. addr.(top), N
14: ;output: AX
15: PUSH BP ;save BP
16: MOV BP,SP ;BP pts to stacktop
17: ;if
18: CMP WORD PTR [BP+4],1 ;N = 1?
19: JG END_IF ;no, N > 1
20: ;then
21: MOV AX,1 ;result = 1
22: JMP RETURN ;go to return
23: END_IF:
24: MOV CX,[BP+4] ;get N
25: DEC CX ;N - 1
26: PUSH CX ;save on stack
27: CALL FACTORIAL ;recursive call
28: MUL WORD PTR [BP+4] ;RESULT = N*RESULT
29: RETURN:
30: POP BP ;restore BP
31: RET 2 ;return and discard N
32: FACTORIAL ENDP
33: END MAIN
The testing program puts 3 on the stack and calls FACTORIAL. At lines 15 and 16 the procedure saves BP and sets BP to point to the stack top. The stack looks like this:
Now, at line 18 the current value of N is examined. We must use CMP WORD PTR [BP+4], 1 rather than CMP [BP+4], 1 because the assembler cannot tell from the source operand 1 whether to code this as a byte or word instruction.
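Because N = 3 is greater than 1, lines 24-26 place N - 1 = 2 on the stack and the procedure calls itself.
At line 27, the second call begins. After it saves BP and sets BP to point to the stack top, the stack looks like this: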
Since N is still not 1, the procedure calls itself one more time, and the stack looks like this:
Since N is now 1, the recursion can terminate. At line 21, the procedure places RESULT = 1 in AX, restores BP to its value in the second call and returns. The RET 2 at line 31 causes the return address in the second call (line 28 in the listing) to be placed in IP. RET 2 also causes parameter 1 to be popped off the stack. The stack becomes
Now execution of the second call continues at line 28. Because the result of the third call is in AX, the procedure can multiply it by the current value of N, yielding RESULT = 2 × 1 = 2. The new result remains in AX. With this call complete, BP is restored and the first call resumed at line 28. The stack is now
As before, the latest result is multiplied by N, yielding RESULT = 3 × 2 = 6. Control passes to line 8 in the main program, with the value of the factorial in AX.
Example 17.2 Code procedure FIND_MAX of section 17.2, and test it in a program.
Solution: The algorithm for the procedure is reproduced here:
1: PROCEDURE FIND_MAX (input: N, output: MAX)
2: IF N = 1
3: THEN
4: MAX = A[1]
5: ELSE
6: call FIND_MAX(N - 1, MAX)
7: IF A[N] > MAX
8: THEN
9: MAX = A[N]
10: ELSE
11: MAX = MAX /* value returned by call at line 6 */
12: END_IF
13: RETURN
0: TITLE PGM17_3: FIND_MAX
1: .MODEL SMALL
2: .STACK 100H
3: .DATA
4: A DW 10,50,20,4
5: .CODE
6: MAIN PROC
7: MOV AX,@DATA
8: MOV DS,AX ;initialize DS
9: MOV AX,4 ;no. of elts in array
10: PUSH AX ;parameter on stack
11: CALL FIND_MAX ;returns MAX in AX
12: MOV AH,4CH
13: INT 21H ;dos exit
14: MAIN ENDP
15: FIND_MAX PROC NEAR
16: ;finds the largest element in array A of N elements
17: ; input: stack on entry - ret. addr. (top), N
18: ; output: AX largest element
19: PUSH BP ;save BP
20: MOV BP,SP ;BP pts to stacktop
21: ;if
22: CMP WORD PTR [BP+4],1 ;N = 1?
23: JG ELSE_ ;no, go to set up next call
24: ;then
25: MOV AX,A ;MAX = A[1]
26: JMP END_IF1
27: ELSE_:
28: MOV CX,[BP+4] ;get N
29: DEC CX ;N - 1
30: PUSH CX ;save on stack
31: CALL FIND_MAX ;returns MAX in AX
32: ;if
33: MOV BX,[BP+4] ;get N
34: SHL BX,1 ;2N
35: SUB BX,2 ;2(N- 1)
36: CMP A[BX],AX ;A[N] > MAX?
37: JLE END_IF1 ;no, go to return
38: ;then
39: MOV AX,A[BX] ;yes, set MAX = A[N]
40: END_IF1:
41: POP BP ;restore BP
42: RET 2 ;return and discard N
43: FIND_MAX ENDP
44: END MAIN
The stacking of the activation records during the recursive calls in this example is similar to that of example 17.1, and is not shown here (see exercises).
At line 32, the procedure begins preparation for comparison of A[N] with the current value of MAX in AX. Recall from chapter 10 that the offset location of the Nth element of a word array A is A + 2 × (N - 1). Lines 33-35 put 2 × (N - 1) in BX, so that based mode may be used in the comparison at line 36. If A[N] is not greater than MAX, we can leave MAX in AX, which means that the ELSE statement at line 11 of the algorithm need not be coded.
In the preceding examples, the code for recursive procedures has involved only one recursive call; for example, the only call that procedure FACTORIAL makes to itself is the one at line 6 of the algorithm. A procedure may also make more than one recursive call.
As an example, suppose we would like to write a procedure to compute the binomial coefficients $C(n, k)$.
These coefficients also are used in the construction of Pascal's Triangle. For
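The coefficients satisfy the following relation:
$$C(n, k) = 1 \quad \text{if } k = n \text{ or } k = 0, \qquad C(n, k) = C(n - 1, k) + C(n - 1, k - 1) \quad \text{if } 0 < k < n$$
This means that in the triangle, the entries along the edges are all 1's, and an interior entry is the sum of the entries in the row above immediately to the left and right. So the triangle computes to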
Let's apply the preceding definition to compute
Here is an algorithm for a procedure to compute $C(n, k)$:
PROCEDURE BINOMIAL (input: N, K; output: RESULT)
IF (K = N) OR (K = 0)
THEN
RESULT = 1
ELSE
CALL BINOMIAL(N - 1,K,RESULT1)
CALL BINOMIAL(N - 1,K - 1,RESULT2)
RESULT = RESULT1 + RESULT2
END_IF
RETURN
Example 17.3 Code the BINOMIAL procedure and call it in a program to compute C(3, 2).
0: TITLE PGM17_4: BINOMIAL COEFFICIENTS
1: .MODEL SMALL
2: .STACK 100H
3: .CODE
4: MAIN PROC
5: MOV AX,2 ;K=2
6: PUSH AX
7: MOV AX,3 ;N=3
8: PUSH AX
9: CALL BINOMIAL ;AX = RESULT
10: MOV AH,4CH
11: INT 21H ;DOS EXIT
12: MAIN ENDP
13: BINOMIAL PROC NEAR
14: PUSH BP
15: MOV BP,SP
16: MOV AX,[BP+6] ;get K
17: ;if
18: CMP AX,[BP+4] ;K=N?
19: JE THEN ;yes, nonrecursive case
20: CMP AX,0 ;K=0?
21: JNE ELSE_ ;no, recursive case
22: THEN:
23: MOV AX,1 ;RESULT = 1
24: JMP RETURN
25: ELSE_:
26: ;compute C(N - 1,K)
27: PUSH [BP+6] ;save K
28: MOV CX,[BP+4] ;get N
29: DEC CX ;N- 1
30: PUSH CX ;save N- 1
31: CALL BINOMIAL ;RESULT1 in AX
32: PUSH AX ;save RESULT1
33: ;compute C(N- 1,K- 1)
34: MOV CX,[BP+6] ;get K
35: DEC CX ;K- 1
36: PUSH CX ;save K- 1
37: MOV CX,[BP+4] ;get N
38: DEC CX ;N- 1
39: PUSH CX ;save N- 1
40: CALL BINOMIAL ;RESULT2 in AX
41: ;compute C(N,K)
42: POP BX ;get RESULT1
43: ADD AX,BX ;RESULT = RESULT1 + RESULT2
44: RETURN:
45: POP BP ;restore BP
46: RET 4 ;return and discard N and K
47: BINOMIAL ENDP
48: END MAIN
Procedure BINOMIAL differs from the procedures of examples 17.1 and 17.2 in the following ways:
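-
There are two escape cases, $k = n$ or $k = 0$; in both cases, the call returns 1 in AX (line 23).
-
In the general case, computation of $C(n,k)$ involves two recursive calls, to compute $C(n - 1,k)$ and $C(n - 1,k - 1)$.
All calls to BINOMIAL return the result in AX. After the first recursive call returns $C(n - 1,k)$ in AX, the procedure saves it on the stack (line 32) so that AX is free to receive the result of the second call; the saved value is popped into BX at line 42 and added to AX at line 43.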
To completely understand how procedure BINOMIAL works, you are encouraged to trace the effect of the procedure on the stack, as was done in example 17.1.
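Before tracing the stack by hand, it can help to confirm the values the procedure should return. The following Python sketch mirrors the BINOMIAL algorithm; the local names `result1` and `result2` correspond to RESULT1 and RESULT2, and nothing here models the stack explicitly.

```python
def binomial(n, k):
    """C(n, k) by the two-call recursion used in procedure BINOMIAL."""
    if k == n or k == 0:               # the two escape cases
        return 1
    result1 = binomial(n - 1, k)       # first recursive call
    result2 = binomial(n - 1, k - 1)   # second recursive call
    return result1 + result2


print(binomial(3, 2))   # 3
```

Note that `result1` must survive the second call, which is why the assembly version saves it on the stack before calling again.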
Recursive problem solving has the following characteristics: (1) the main problem breaks down to simpler problems, each of which is solved in the same way as the main problem; (2) there is a nonrecursive escape case; and (3) once a subproblem has been solved, work proceeds to the next step of the original problem.
In assembly language, recursive procedures are implemented as follows: The calling program places the activation record for the first call on the stack and calls the procedure. The procedure uses BP to access the data it needs from the stack. Before initiating a recursive call, a procedure places the activation record for the call on the stack and calls itself. When a call is completed, BP is restored, the return address popped into IP, and the data for the completed call popped off the stack.
The code for a procedure may involve more than one recursive call. Intermediate results may be saved on the stack, and retrieved when the original call resumes.
activation record
Values of the parameters, local variables, and return address of a procedure call
recursive process
A process that is defined in terms of itself
-
Write a recursive definition of $a^n$, where $n$ is a nonnegative integer.
-
Ackermann's function is defined as follows for nonnegative integers $m$ and $n$:
Use the definition to show that
- Trace the steps in example 17.2 (PGM17_3.ASM) and show the stack
a. At line 20 in the initial (first) call to FIND_MAX.
b. At line 20 in the second call to FIND_MAX.
c. At line 20 in the third call to FIND_MAX.
d. At line 20 in the fourth call to FIND_MAX. This is the escape case.
e. At line 42 in the completion of the third call to FIND_MAX (after RET 2 has been executed). Also give the contents of AX.
f. At line 42 in the completion of the second call to FIND_MAX (after RET 2 has been executed). Also give the contents of AX.
g. At line 42 in the completion of the first call to FIND_MAX (after RET 2 has been executed). Also give the contents of AX. This is the value returned by the procedure.
-
Write a recursive assembly language procedure to compute the sum of the elements of a word array. Write a program to test your procedure on a four-element array.
-
The Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . . may be defined recursively as follows:
$$F(1) = F(2) = 1, \qquad F(n) = F(n - 1) + F(n - 2) \quad \text{if } n > 2$$
Write a recursive assembly language procedure to compute
Programs often must deal with data that are bigger than 16 bits, or contain fractions, or have special encoding. In the first three sections of this chapter, we discuss arithmetic operations on double-precision numbers, BCD (binary-coded decimal) numbers, and floating-point numbers. In section 18.4, we discuss the operation of the 8087 numeric coprocessor.
We have shown that numbers stored in the 8086-based computer can be 8 or 16 bits. But even for 16-bit numbers, the range is limited to 0 to 65535 for unsigned numbers and -32768 to +32767 for signed numbers. To extend this range, a common technique is to use 2 words for each number. Such numbers are called double-precision numbers, and the range here is 0 to $2^{32} - 1$ (4,294,967,295) for unsigned numbers and $-2^{31}$ to $2^{31} - 1$ for signed numbers.
A double-precision number may occupy two registers or two memory words. For example, if a 32-bit number is stored in the two memory words A and A+2, written A+2:A, then the upper 16 bits are in A+2 and the lower 16 bits are in A.
Since the 8086/8088 can only operate on 8- or 16-bit numbers, operations on double-precision numbers must be emulated by software. In section 18.4, we show how the 8087 coprocessor can be used to do double-precision arithmetic.
To add or subtract two 32- bit numbers, we first add or subtract the lower 16 bits and then add or subtract the higher 16 bits. However, the answer would be incorrect if the first addition or subtraction generates a carry or borrow.
One way to handle this problem is to use instructions to test the flags and adjust the result. A better method is to use two new instructions provided by the 8086. The instruction ADC (add with carry) adds the source operand and CF to the destination, and the instruction SBB (subtract with borrow) subtracts the source operand and CF from the destination. The syntax is
ADC destination, source
SBB destination, source
For our first example we'll add two 32-bit numbers.
Example 18.1 Write instructions to add the 32-bit number in A+2:A to the number in B+2:B.
Solution: We have to move the first number to registers before the addition.
MOV AX,A ;AX gets lower 16 bits of A
MOV DX,A+2 ;DX gets upper 16 bits of A
ADD B,AX ;add the lower 16 bits to B
ADC B+2,DX ;add DX and CF to B+2
While the 32- bit sum is stored in B+2:B, the flags may not be set correctly. Specifically, the ZF and the PF, which depend on both values of B+2 and B, are set by the value in B+2 only. When it is important to set the flags correctly, we can use additional instructions.
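The role of CF in the ADD/ADC pair can be modeled in a few lines. This is a hedged Python sketch of the arithmetic only; the helper names `add16` and `add32` are hypothetical, and no flags other than CF are modeled.

```python
MASK16 = 0xFFFF


def add16(dest, src, carry_in=0):
    """Return (16-bit sum, carry out): ADD when carry_in is 0, ADC when it is CF."""
    total = dest + src + carry_in
    return total & MASK16, total >> 16


def add32(b_hi, b_lo, a_hi, a_lo):
    """Add A+2:A to B+2:B one word at a time, as in Example 18.1."""
    lo, cf = add16(b_lo, a_lo)        # ADD B,AX
    hi, cf = add16(b_hi, a_hi, cf)    # ADC B+2,DX
    return hi, lo, cf


# 0001FFFFh + 00000001h = 00020000h: the carry out of the low word is needed
print([hex(v) for v in add32(0x0001, 0xFFFF, 0x0000, 0x0001)])
```

Replacing the second `add16` call with one that ignores `cf` gives 00010000h for the same inputs, which is the error described above.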
The procedure DADD in program listing PGM18_1.ASM performs a double-precision add and leaves the flags in the same state as if the processor had a 32-bit add instruction. We assume the two numbers are in DX:AX and CX:BX, and the result is returned in DX:AX.
;Procedure for double-precision addition with ZF and PF adjusted
DADD PROC
;input: CX:BX = source operand
; DX:AX = destination operand
;output: DX:AX = sum
;save register SI
PUSH SI ;SI is needed in the procedure to store flags
ADD AX,BX ;add lower 16 bits
ADC DX,CX ;add upper 16 bits with carry
PUSHF ;save the flags on the stack
POP SI ;put flags in SI
;test for zero
JNE CHECK_PF ;if DX is not zero then ZF is OK, go check PF
TEST AX,0FFFFH ;DX = 0, check if AX = 0?
JE CHECK_PF ;yes, ZF is OK
AND SI,0FFBFH ;AX not zero, clear ZF bit in SI
;check PF
CHECK_PF:
OR SI,100B ;set SI for even parity
TEST AX,0FFH ;test AL for parity
JP RESTORE ;AL has even parity, PF bit in SI is OK
XOR SI,100B ;AL has odd parity, negate PF bit in SI
RESTORE:
PUSH SI ;place new flags on stack
POPF ;update FLAGS register
;restore SI
POP SI
RET
DADD ENDP
The SI register is used to manipulate the flag bits. We copy the flags into SI by pushing the FLAGS register and then popping to SI because the contents of the FLAGS register cannot be moved to SI directly. To adjust ZF, we examine both DX and AX, and for PF we examine AL; then we copy SI to the FLAGS register, again using the stack. Use the INCLUDE directive to include the file PGM18_1.ASM in your program if you want to use procedure DADD.
To obtain the negation of a double- precision number, we recall that the two's complement of a number is formed by adding a 1 to its one's complement.
Example 18.2 Write instructions to form the negation of A+2:A.
Solution: We first form the one's complement by using the NOT instruction, and then add a 1.
NOT A+2 ;one's complement
NOT A ;one's complement
ADD A,1 ;add 1 (ADD is used because INC does not affect CF)
ADC A+2,0 ;take care of possible carry
For subtraction, again we subtract the low 16 bits first, then subtract the high- order words together with any borrow that might be generated.
Example 18.3 Write instructions to subtract the 32-bit number in A+2:A from B+2:B.
Solution:
MOV AX,A ;AX gets lower 16 bits of A
MOV DX,A+2 ;DX gets upper 16 bits of A
SUB B,AX ;subtract the lower 16 bits from B
SBB B+2,DX ;subtract DX and CF from B+2
To set the flags correctly, we can use the same technique as in the case for addition; we will leave it as an exercise.
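The SUB/SBB pair propagates a borrow in the same way that ADD/ADC propagates a carry. The following Python sketch models only the arithmetic and CF; `sub16` and `sub32` are hypothetical helper names.

```python
def sub16(dest, src, borrow_in=0):
    """Return (16-bit difference, borrow out): SUB when borrow_in is 0, SBB when it is CF."""
    diff = dest - src - borrow_in
    return diff & 0xFFFF, 1 if diff < 0 else 0


def sub32(b_hi, b_lo, a_hi, a_lo):
    """Subtract A+2:A from B+2:B one word at a time, as in Example 18.3."""
    lo, cf = sub16(b_lo, a_lo)        # SUB B,AX
    hi, cf = sub16(b_hi, a_hi, cf)    # SBB B+2,DX
    return hi, lo, cf


# 00010000h - 00000001h = 0000FFFFh: the low word borrows from the high word
print([hex(v) for v in sub32(0x0001, 0x0000, 0x0000, 0x0001)])
```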
Double- Precision Multiplication and Division
Double- precision multiplication and division by powers of 2 can be achieved by using the shift operations, as was done in Chapter 7. To multiply by 2, we perform a left shift. To divide by 2 we perform a right shift.
Example 18.4 Write instructions to perform a left shift operation on A+2:A.
Solution: We start with a left shift on the low-order word, resulting in the msb being shifted into CF. Next, an RCL shifts the CF into the high-order word. The instructions are
SHL A,1 ;low-order word shifted
RCL A+2,1 ;shift CF into high-order word
Again, the OF, ZF, and the PF may be set incorrectly.
The next example shows multiplication by $2^{10}$.
Example 18.5 Write instructions to perform 10 left shifts on A+2:A.
Solution: One may be tempted to place 10 in CL and use CL as the count in the shift operation. However, this causes 9 bits in the number to be lost. In multiple-precision shifts, we must do one shift at a time. The CX register may be used as a counter in a loop.
MOV CX,10 ;initialize counter
L1: SHL A,1 ;shift low-order word
RCL A+2,1 ;shift CF into high-order word
LOOP L1 ;repeat if count is not 0
The other shift and rotate operations are left as exercises.
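The one-bit-at-a-time rule can be checked with a small model of the SHL/RCL pair. This is a Python sketch under the assumption that only CF matters between the two instructions; `shl32_once` and `shl32` are hypothetical names.

```python
def shl32_once(hi, lo):
    """One double-precision left shift: SHL on the low word, then RCL on the high word."""
    cf = (lo >> 15) & 1               # SHL A,1 moves the msb of the low word into CF
    lo = (lo << 1) & 0xFFFF
    carry_out = (hi >> 15) & 1        # bit shifted out of the high word
    hi = ((hi << 1) | cf) & 0xFFFF    # RCL A+2,1 brings CF into bit 0
    return hi, lo, carry_out


def shl32(hi, lo, count):
    """Repeat the one-bit shift, as the LOOP in Example 18.5 does."""
    for _ in range(count):
        hi, lo, _carry = shl32_once(hi, lo)
    return hi, lo


# 00000040h shifted left 10 times is 00010000h
print([hex(v) for v in shl32(0x0000, 0x0040, 10)])
```

Each pass carries exactly one bit across the word boundary, which is why a single shift by a count of 10 cannot work: only the last bit shifted out would reach the high word.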
When the multiplier is not a power of 2, we can simulate a multiplication operation with a series of additions. For example, to multiply two double- precision numbers M and N, we can form the product by adding the number M N times. A more efficient way to do multiplication and division of multiple precision numbers is to use the 8087 numeric processor instructions covered in section 18.4.
Binary- Coded Decimal Numbers
The BCD (binary-coded decimal) number system uses four bits to code each decimal digit, from 0000 to 1001. The combinations 1010 to 1111 are illegal in BCD. For example, the BCD representation of the decimal number 913 is 1001 0001 0011. The reason for using BCD numbers is that the conversion between decimal and BCD is relatively simple. In section 18.4, we give a procedure for conversion between decimal and BCD.
As we saw in Chapter 9, multiplication and division are needed to do decimal I/O. These are notoriously slow operations. For some business programs that perform a lot of I/O and only do simple calculations, much time can be saved if numbers are stored internally in BCD format. Needless to say, the processor must make it easy for programs to do BCD arithmetic if the savings are to be realized.
We first look at the two ways of storing BCD numbers in memory.
Because only four bits are needed to represent a BCD digit, two digits can be placed in a byte. This is known as packed BCD form. In unpacked BCD form, only one digit is contained in a byte. The 8086 has addition and subtraction instructions to perform with both forms, but for multiplication and division, the digits must be unpacked.
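The two storage forms can be illustrated with the number 913 used earlier. The sketch below is in Python and is only a model of the encodings; the function names are hypothetical, and padding an odd number of digits with a leading 0 is an assumption about how a packed number would be stored.

```python
def unpacked_bcd(n):
    """One decimal digit per byte, most significant digit first."""
    return [int(d) for d in str(n)]


def packed_bcd(n):
    """Two decimal digits per byte; a leading 0 pads an odd digit count."""
    digits = unpacked_bcd(n)
    if len(digits) % 2:
        digits.insert(0, 0)
    return [(hi << 4) | lo for hi, lo in zip(digits[0::2], digits[1::2])]


print([format(b, "08b") for b in unpacked_bcd(913)])   # three bytes
print([format(b, "08b") for b in packed_bcd(913)])     # two bytes
```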
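Example 18.6 Give the binary, packed BCD, and unpacked BCD representations of the decimal number 59.
Solution: 59 = 3Bh = 00111011 in binary. The packed BCD representation is 01011001, and the unpacked BCD representation is 00000101 00001001.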
In the following sections, we cover the instructions needed to do arithmetic on unpacked BCD numbers.
In BCD operations, we do one digit at a time. It is possible to add two BCD digits and generate a non- BCD result. For example, suppose we add BL, which has 7, to AL, which has 6. The sum of 13 in AL is no longer a valid BCD digit. To adjust, we subtract 10 from AL and place a 1 in AH; then AX will contain the correct sum.
AH 00000000 AL 00000110
BL + 00000111
AH 00000000 AL 00001101 ;not a BCD digit
+ 1 - 00001010 ;adjust by subtracting 10d from AL and adding 1 to AH
AH 00000001 AL 00000011 ;result is 1 in AH and 3 in AL
We can get the same result by adding 6 to AL and then clearing the high nibble (bits 4- 7) of AL. Because the value 13 in AL is greater than the correct result by 10, adding a 6 will make it too large by 16; clearing the high nibble has the effect of subtracting 16.
AH 00000000 AL 00001101 ;not a BCD digit
+ 1 + 00000110 ;adjust by adding 6 to AL and 1 to AH
AH 00000001 AL 00010011
AH 00000001 AL 00000011 ;after clearing the high nibble of AL
The 8086 does not have a BCD addition instruction, but it does have an instruction that performs the preceding adjustments: AAA (ASCII adjust for addition) instruction.
AAA has no operand (AL is assumed to be the operand). It is used after an add operation to adjust the BCD value in AL. It checks the low nibble of AL and the AF (auxiliary flag). If the low nibble of AL is greater than 9 or the AF is set, then a 6 is added to AL, the high nibble of AL is cleared, and a 1 is added to AH.
Both AF and CF are set if the adjustment is made. Other flags are undefined.
It is also possible to add two ASCII digits and use AAA to adjust the result to obtain BCD digits. This allows a program to input ASCII digits, add them, and store the result in BCD format. For example, suppose AL contains 36h (ASCII 6) and BL contains 37h (ASCII 7). We add BL to AL and then use AAA to adjust the result.
AH 00000000 AL 00110110
BL + 00110111
AH 00000000 AL 01101101 ;low nibble not a BCD digit
+ 1 + 00000110 ;adjust by adding 6 to AL and 1 to AH
AH 00000001 AL 01110011
AH 00000001 AL 00000011 ;after clearing the high nibble of AL
As another example, suppose AL is 39h (ASCII 9) and BL is 37h (ASCII 7).
AH 00000000 AL 00111001
BL + 00110111
AH 00000000 AL 01110000 ;low nibble is a BCD digit but AF is set
+ 1 + 00000110 ;adjust by adding 6 to AL and 1 to AH
AH 00000001 AL 01110110
AH 00000001 AL 00000110 ;after clearing the high nibble of AL
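The three worked additions can be reproduced with a model of the adjustment rule. This is a hedged Python sketch of the behavior described in the text, not a cycle-accurate emulation; `aaa` and `add_digits` are hypothetical helpers, and AF is computed as the carry out of bit 3 of the byte addition.

```python
def aaa(ah, al, af=0):
    """Model of AAA on AH:AL. Returns (AH, AL, AF, CF)."""
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0xFF       # add 6 to AL
        ah = (ah + 1) & 0xFF       # add 1 to AH
        af = cf = 1
    else:
        af = cf = 0
    return ah, al & 0x0F, af, cf   # the high nibble of AL is cleared in both cases


def add_digits(al, bl, ah=0):
    """ADD AL,BL followed by AAA."""
    af = 1 if (al & 0x0F) + (bl & 0x0F) > 0x0F else 0   # carry out of bit 3
    return aaa(ah, (al + bl) & 0xFF, af)


print(add_digits(6, 7))         # (1, 3, 1, 1): 6 + 7 = 13
print(add_digits(0x36, 0x37))   # (1, 3, 1, 1): ASCII '6' + ASCII '7'
print(add_digits(0x39, 0x37))   # (1, 6, 1, 1): ASCII '9' + ASCII '7' = 16
```

The third case is the one where the low nibble of the sum is a valid digit and only AF signals that an adjustment is needed.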
Example 18.7 Write instructions to perform decimal addition on the unpacked BCD numbers in BL and AL.
Solution: The first operation is to clear AH, then we add and adjust the result.
MOV AH,0 ;prepare for possible carry
ADD AL,BL ;binary addition
AAA ;BCD adjust, AX contains sum
Example 18.8 Write instructions to add the two- digit BCD number in bytes B+1:B to the one contained in A+1:A. Assume the result is only two digits.
Solution: We add the low digit before the high digit.
MOV AH,0 ;prepare for possible carry
MOV AL,A ;load BCD digit
ADD AL,B ;binary addition
AAA ;BCD adjust, AX contains sum
MOV A,AL ;store digit
MOV AL,AH ;put carry in AL
ADD AL,A+1 ;add high digit of A, assume no adjustment is needed
ADD AL,B+1 ;add high digit of B, assume no adjustment is needed
MOV A+1,AL ;store high digit
Multiple- digit addition is given as an exercise.
18.2.3
BCD subtraction is again performed one digit at a time. When one BCD digit is subtracted from another, a borrow may result. For example, suppose we subtract 7 from 26; we place 7 in BL, 2 in AH, and 6 in AL. After subtracting BL from AL, the result in AL is incorrect. The adjustment is to subtract 6 from AL, clear the high nibble, and subtract 1 from AH. This has the same effect as borrowing from AH and adding 10 to AL.
AH 00000010 AL 00000110
BL - 00000111
AH 00000010 AL 11111111 ;not a BCD digit
- 1 - 00000110 ;adjust by subtracting 6 from AL and 1 from AH
AH 00000001 AL 11111001
AH 00000001 AL 00001001 ;clear high nibble of AL; result in AH:AL is 19
The AAS (ASCII adjust for subtraction) instruction performs BCD subtraction adjustment on the AL register. If the low nibble of AL is greater than 9 (low nibble of AL contains an invalid BCD number) or if AF is set, AAS will subtract 6 from AL, clear the high nibble of AL, and subtract 1 from AH.
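The subtraction adjustment can be modeled the same way as the addition adjustment. The following Python sketch follows the rule stated above; `aas` and `sub_digit` are hypothetical helpers, and AF is computed as the borrow into bit 3 of the byte subtraction.

```python
def aas(ah, al, af=0):
    """Model of AAS on AH:AL. Returns (AH, AL, AF, CF)."""
    if (al & 0x0F) > 9 or af:
        al = (al - 6) & 0xFF       # subtract 6 from AL
        ah = (ah - 1) & 0xFF       # subtract 1 from AH
        af = cf = 1
    else:
        af = cf = 0
    return ah, al & 0x0F, af, cf   # the high nibble of AL is cleared


def sub_digit(ah, al, bl):
    """SUB AL,BL followed by AAS."""
    af = 1 if (al & 0x0F) < (bl & 0x0F) else 0   # borrow into bit 3
    return aas(ah, (al - bl) & 0xFF, af)


print(sub_digit(2, 6, 7))   # (1, 9, 1, 1): 26 - 7 = 19
```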
Example 18.9 Write instructions to subtract the two- digit BCD number in bytes B+1:B from the one contained in A+1:A. Assume the number in A+1:A is larger.
Solution: We subtract the low digit before the high digit.
MOV AH,A+1 ;load high BCD digit of A
MOV AL,A ;load low digit of A
SUB AL,B ;subtract low digit of B
AAS ;adjust for borrow
SUB AH,B+1 ;subtract high digit of B
MOV A+1,AH ;store high digit
MOV A,AL ;store low digit
In subtracting the high digits, we were able to use the AH register because we assumed that no adjustment was needed; otherwise AL should be used and the result adjusted with AAS. For subtraction of three-digit numbers, we again work from the lowest digit to the highest. Three AAS adjustments are needed. The details are left as an exercise.
In this section, we show only single digit BCD multiplication. In section 18.4, we show how the 8087 can be used to perform multiple- digit BCD multiplication. Two BCD digits can be multiplied to produce a one- or two- digit product. We put the multiplicand in AL and the multiplier in a register or memory byte. After BCD multiplication, AX contains the BCD product.
To multiply 8 by 9, for example, we could put 8 in AL and 9 in BL. After doing the steps in BCD multiplication, the registers AH:AL contain the product 07 02.
The first step in BCD multiplication is to multiply the digits by ordinary binary multiplication. The binary product will be in AL. The second step is to convert the binary product to its BCD equivalent in AX.
With 8 in AL and 9 in BL, to do the first step we execute MUL BL. It puts the binary product 72 = 48h in AX (AH = 00h, AL = 48h).
The AAM (ASCII adjust for multiply) instruction performs the second step. It divides the contents of AL by 10. The quotient, corresponding to the ten's digit (7, in this example), is placed in AH; the remainder, corresponding to the unit's (2, in the example), is placed in AL.
In summary, to multiply the BCD digits in AL and BL, and put the BCD product in AX, execute
MUL BL ;8-bit multiplication
AAM ;BCD adjust, result in AX
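The two steps amount to a binary multiply followed by a division by 10. This can be sketched in Python as follows; `aam` and `bcd_multiply` are hypothetical names, and the model assumes both inputs are valid BCD digits (0 to 9).

```python
def aam(al):
    """Model of AAM: AH gets AL divided by 10, AL gets the remainder."""
    return divmod(al, 10)          # (AH, AL)


def bcd_multiply(al, bl):
    """MUL BL followed by AAM for two single BCD digits."""
    ax = al * bl                   # MUL BL: the product of two digits fits in AL
    return aam(ax & 0xFF)


print(bcd_multiply(8, 9))   # (7, 2): AH:AL = 07 02
```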
In this section, we show the division of a two-digit BCD number by a single-digit BCD number. The quotient is stored as a two-digit BCD number (the leading digit may be 0). We put the dividend in AX and the divisor in a register or memory byte. After the BCD division, AX will contain the BCD digits of the quotient.
For example, suppose we want to divide 97 by 5. Before division, AH:AL contains 09 07. The divisor 5 could be put in BL. Since the quotient is 19, after BCD division, AH:AL = 01 09.
There are three steps in BCD division:
1. Convert the dividend in AX from two BCD digits to their binary equivalent.
2. Do ordinary binary division. This puts the (binary) quotient in AL and the remainder in AH.
3. Convert the binary quotient in AL to its two-digit BCD equivalent in AX.
The instruction AAD (ASCII adjust for division) does step 1. It multiplies AH by 10, adds the product to AL, then clears AH. For AH:AL = 09 07, multiplication of AH by 10 yields 90 = 5Ah, and adding this to the 7 in AL puts 61h = 01100001b in AL.
If the divisor is in BL, step 2 is done by executing DIV BL. AL gets the quotient, 13h = 19, and AH gets the remainder 02h.
Step 3 is done by executing AAM. It converts the 13h in AL to 01 09 in AH:AL. Note that the remainder in AH is overwritten.
In summary, to divide the two BCD digits in AX by the BCD digit in BL, execute
AAD ;convert BCD dividend in AX to binary
DIV BL ;do binary division
AAM ;AX has BCD quotient
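The three-instruction sequence can be followed step by step in a short Python sketch. The helper names (`aad`, `div_bl`, `aam`, `bcd_divide`) are invented for illustration, and the model assumes a nonzero divisor and a quotient that fits in AL.

```python
def aad(ax: int) -> int:
    """Model of AAD: AL = AH*10 + AL, AH = 0."""
    ah, al = (ax >> 8) & 0xFF, ax & 0xFF
    return (ah * 10 + al) & 0xFF


def div_bl(ax: int, bl: int) -> int:
    """Model of DIV BL: AL = quotient, AH = remainder (result returned as AX)."""
    quotient, remainder = divmod(ax, bl)
    if quotient > 0xFF:
        raise OverflowError("divide overflow: quotient does not fit in AL")
    return (remainder << 8) | quotient


def aam(ax: int) -> int:
    """Model of AAM: AH = AL div 10, AL = AL mod 10."""
    al = ax & 0xFF
    return ((al // 10) << 8) | (al % 10)


def bcd_divide(ax: int, bl: int) -> int:
    """AAD, DIV BL, AAM applied in sequence."""
    return aam(div_bl(aad(ax), bl))


print(f"{aad(0x0907):04X}h")            # 0061h  (97 in binary)
print(f"{div_bl(0x0061, 5):04X}h")      # 0213h  (remainder 2, quotient 19)
print(f"{bcd_divide(0x0907, 5):04X}h")  # 0109h  (BCD quotient 19)
```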
By using floating-point numbers, we can represent values that are very large and fractions that are very small in a uniform fashion. Before we look at the floating-point representation, we have to see how decimal fractions can be converted into binary.
Converting Decimal Fractions into Binary
Suppose the decimal fraction
Algorithm to convert a decimal fraction to an M-digit binary fraction
Let X contain the decimal fraction.
For i = 1 step 1 until M do
Y = X × 2
X = fractional part of Y
B_i = integer part of Y
end
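The algorithm translates almost line for line into Python. In this sketch the function name `fraction_to_binary` is made up, and `fractions.Fraction` is used (an implementation choice, not part of the algorithm) so that values such as 0.9 are held exactly instead of as rounded binary floats.

```python
from fractions import Fraction


def fraction_to_binary(x: Fraction, m: int) -> str:
    """Return the first m binary digits of the fraction x, where 0 <= x < 1."""
    bits = []
    for _ in range(m):
        y = x * 2
        bit = int(y)          # integer part of Y: the next binary digit (0 or 1)
        x = y - bit           # fractional part of Y carries on to the next step
        bits.append(str(bit))
    return "".join(bits)


print(fraction_to_binary(Fraction(3, 4), 4))     # 1100, so 0.75 = 0.11b
print(fraction_to_binary(Fraction(9, 10), 13))   # 1110011001100, repeating pattern
```

The second call shows why 4.9 has no finite binary form: the fractional part 0.9 produces the bit group 1100 over and over.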
Now let's look at some examples.
Example 18.10 Convert the decimal fraction 0.75 to binary.
Solution: Step 1,
Example 18.11 Convert the decimal number 4.9 into binary.
Solution: We do this in two parts. First we convert the integer part into binary and get 100b. Next we convert the fractional part: Step 1,
In the floating-point representation, each number is represented in two parts: a mantissa, which contains the leading significant bits in a number, and an exponent, which is used to adjust the position of the binary point. For example, the number 2.5 in binary is 10.1b, and its floating-point representation has a mantissa of 1.01 and an exponent of 1. This is because 10.1b can be written as $1.01\text{b} \times 2^{1}$.
For numbers smaller than 1, if we normalize the mantissa the exponent will be negative. For example, the number 0.0001b is $1.0\text{b} \times 2^{-4}$.
To perform most arithmetic operations on floating-point numbers, the exponent and the mantissa must first be extracted, and then different operations are performed on them. For example, to multiply two real numbers we have to add the exponents and multiply the mantissas; then the result is normalized and stored. However, if two real numbers are to be added, the number with the smaller exponent is shifted to the right so as to adjust the exponent to that of the other number; then the two mantissas are added and the result normalized.
Figure 18.1 Floating-Point Representation
Needless to say, all these operations are time consuming if emulated by software. The floating-point operations can be carried out much faster by using a specially designed circuit chip.
The 8087 chip is designed to perform fast numeric operations for an 8088- or 8086-based system. It can operate on multiple-precision, BCD, and floating-point data.
The 8087 supports three signed integer formats: word integer (16 bits), short integer (32 bits), and long integer (64 bits).
The 8087 supports a 10-byte packed BCD format which consists of a sign byte followed by 9 bytes that contain 18 packed BCD digits; a positive sign is represented by 0h and a negative sign by 80h.
There are three floating-point formats:
Short real: Four data bytes, with an 8-bit exponent and a 24-bit mantissa. The integer part is not stored.
Long real: Eight data bytes, with an 11-bit exponent and a 53-bit mantissa. Again, the integer part is not stored.
Temporary real: Ten data bytes, with a 15-bit exponent and a 64-bit mantissa. All mantissa bits, including the integer part, are stored.
Figure 18.2 shows the data types of the 8087. We give some examples.
Example 18.12 Represent the number -12345 as an 8087 packed BCD number.
Solution: For negative BCD numbers, the sign byte is 80h. There are a total of 18 BCD digits, so 12345 is padded with 13 leading zeros. Thus the number is 80000000000000012345h (bytes 80 00 00 00 00 00 00 01 23 45, most significant first).
Example 18.13 Represent the number 4.9 as an 8087 short real.
Solution: From example 18.11, the binary representation for 4.9 is 100.1110011001100...b. After normalization, the 24-bit mantissa is 1.00111001100110011001100, and the exponent is 2. Adding the bias 127 to 2, we get 129 or 10000001b. The integer part is not stored, so the number is 0 10000001 00111001100110011001100 or 409CCCCCh.
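The encoding steps can be checked with a short Python sketch. The function name `short_real_bits` is hypothetical, and the sketch assumes the mantissa is truncated to 24 bits, as in the solution above; a real 8087 store rounds according to its rounding control, so the last bit can differ. Zero, denormals, and out-of-range exponents are not handled.

```python
from fractions import Fraction


def short_real_bits(x: Fraction) -> int:
    """Encode a nonzero x as a 32-bit short real, truncating the mantissa."""
    sign = 1 if x < 0 else 0
    x = abs(x)
    exponent = 0
    while x >= 2:             # normalize so that 1 <= x < 2
        x /= 2
        exponent += 1
    while x < 1:
        x *= 2
        exponent -= 1
    fraction = int((x - 1) * 2**23)   # 23 stored bits; the integer bit is implied
    return (sign << 31) | ((exponent + 127) << 23) | fraction


print(f"{short_real_bits(Fraction(49, 10)):08X}h")   # 409CCCCCh
print(f"{short_real_bits(Fraction(-3, 4)):08X}h")    # BF400000h
```

For -0.75 = -0.11b = -1.1b × 2^-1, the sign bit is 1, the biased exponent is 126 = 01111110b, and the stored fraction is 100...0, which gives BF400000h.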
Example 18.14 Represent the number -0.75 as an 8087 short real.
Figure 18.2 8087 Data Types (exponent bias = 127 for short real, 1023 for long real, 16383 for temporary real)
The 8087 has eight 80-bit data registers, and they function as a stack. Data may be pushed or popped from the stack. The top of the stack is addressed as ST or ST(0). The register directly beneath the top is addressed as ST(1). In general, the ith register in the stack is addressed as ST(i), where i must be a constant.
The data stored in these registers are in temporary real format. Memory data in other formats may be loaded onto the stack. When that happens, the data are converted into temporary real. Similarly, when storing data into memory, the temporary real data are converted to other data formats specified in the store instructions.
The instructions for the 8087 include add, subtract, multiply, divide, compare, load, store, square root, tangent, and exponentiation. In doing a complex floating-point operation, the 8087 can be 100 times faster than an 8086 using an emulation program.
The coordination between the 8087 and the 8086 works as follows. The 8086 is responsible for fetching instructions from memory. The 8087 monitors this instruction stream but does not execute any instructions until it finds an 8087 instruction. An 8087 instruction is ignored by the 8086, except when it contains a memory operand. In that case, the 8086 accesses the operand and places it on the data bus; this is how the 8087 gains access to memory locations.
In this section, we'll show simple examples of the load, store, add, subtract, multiply, and divide operations. Appendix F contains more information on these instructions, and in the following sections we'll give some program examples.
The load instructions load a source operand onto the top of the 8087 stack. There are three load instructions: FLD (load real), FILD (integer load), and FBLD (packed BCD load). The syntax is
FLD source
FILD source
FBLD source
where source is a memory location.
The type of the memory data is taken from the declared data type. For example, to load a word integer stored in the memory word NUMBER, we write the instruction FILD NUMBER. If the variable DNUM is defined by DD (Define Doubleword), then the instruction FILD DNUM loads a short integer. The instruction FLD can also be used to load an 8087 register to the top of the stack. For example, FLD ST(3).
Once a number is loaded onto the 8087 stack, we can convert it into any data type by simply storing it back into memory. This is a simple way of using the 8087 to perform type conversion. Let's look at the store instructions.
When storing the top of the stack to memory, the stack may or may not be popped. The instructions FST (store real) and FIST (integer store) do not pop the stack, while the instructions FSTP (store real and pop), FISTP (integer store and pop), and FBSTP (packed BCD store and pop) will pop the stack after the store operation. The syntax is
FST destination
FIST destination
FSTP destination
FISTP destination
FBSTP destination
where destination is a memory location. The stored data type depends on the declared size of the memory operand.
Example 18.15 Write instructions to convert the short integer stored in the doubleword variable DNUM into a long real and store it in the quadword variable QNUM.
Solution: We use the load integer and store real instructions.
FILD DNUM ;load short integer
FSTP QNUM ;store long real and pop stack
We can add, subtract, multiply, and divide a memory operand or an 8087 register with the top of the 8087 stack. The instructions for real operands are FADD (add real), FSUB (subtract real), FMUL (multiply real), and FDIV (divide real). Each opcode can take zero, one, or two operands. An instruction with no operands assumes ST(0) as the source and ST(1) as the destination; the instruction also pops the stack. For example, FADD (with no operands) adds ST(0) to ST(1) and pops the stack.
In an instruction with one operand, the operand specifies a memory location as the source; the destination is assumed to be ST(0). For example, to subtract a short real in the double word variable DWORD from ST(0) we write FSUB DWORD.
A two-operand instruction specifies ST(0) as one operand and ST(i) as the other operand. The stack is not popped. For example, the instruction FMUL ST(1),ST(0) multiplies ST(0) into ST(1); and FDIV ST(0),ST(2) divides ST(2) into ST(0). The syntax is
FADD [[destination,]source]
FSUB [[destination,]source]
FMUL [[destination,]source]
FDIV [[destination,]source]
where items in square brackets are optional.
There are also instructions for integer operands. They are FIADD (integer add), FISUB (integer subtract), FIMUL (integer multiply), and FIDIV (integer divide). The syntax is
FIADD source
FISUB source
FIMUL source
FIDIV source
Example 18.16 Write instructions to add the short reals stored in the variables NUM1 and NUM2, and store the sum in NUM3.
Solution: We load the first number, add the second, and store into the third location.
FLD NUM1 ;load first number
FADD NUM2 ;add second number
FSTP NUM3 ;store result and pop
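The stack discipline behind these instructions can be pictured with a toy Python model. Everything here is an illustrative assumption: the class name `Stack8087` and its method names are invented, only FLD, FADD, FSUB, and FSTP are modeled, and Python floats (64-bit) stand in for the 80-bit temporary real registers.

```python
class Stack8087:
    """Toy model of the 8087 register stack; regs[-1] plays the role of ST(0)."""

    def __init__(self):
        self.regs = []

    def st(self, i: int) -> float:
        return self.regs[-1 - i]          # ST(i) counts down from the top

    def fld(self, value: float) -> None:
        if len(self.regs) == 8:           # only eight data registers exist
            raise OverflowError("8087 stack overflow")
        self.regs.append(float(value))

    def fadd(self, source=None) -> None:
        if source is None:                # FADD: ST(1) = ST(1) + ST(0), then pop
            top = self.regs.pop()
            self.regs[-1] += top
        else:                             # FADD mem: ST(0) = ST(0) + source
            self.regs[-1] += source

    def fsub(self, source=None) -> None:
        if source is None:                # FSUB: ST(1) = ST(1) - ST(0), then pop
            top = self.regs.pop()
            self.regs[-1] -= top
        else:                             # FSUB mem: ST(0) = ST(0) - source
            self.regs[-1] -= source

    def fstp(self) -> float:
        return self.regs.pop()            # store the top and pop it


# Example 18.16 with sample values NUM1 = 1.5 and NUM2 = 2.25
fpu = Stack8087()
fpu.fld(1.5)          # FLD  NUM1
fpu.fadd(2.25)        # FADD NUM2
num3 = fpu.fstp()     # FSTP NUM3
print(num3)           # 3.75
```

After the final FSTP the modeled stack is empty again, which is the reason the example uses the popping form of the store.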
Multiple-Precision Integer I/O
A multiple-precision number is a number stored in multiple words. In section 18.4.1, you have already seen the special case of a double-precision number. Normally, conversions of multiple-precision numbers between their decimal and binary representations are very time consuming. We can use the 8087 to speed up the conversion process. To input a multiple-precision decimal number and convert it into binary, we first store it in BCD format. Then the 8087 can be used to convert the BCD into binary. To output a binary multiple-precision number in decimal, we first use the 8087 to convert it into BCD and then output the BCD digits.
The algorithm for reading digits and converting to packed BCD format is as follows:
read first char
case '-': set sign bit of BCD buffer
'0'..'9': convert to binary and push on stack
while char <> CR
read char
case '0'..'9': convert to binary and push on stack
end while
repeat:
pop stack
assemble 2 digits to one byte
until all digits are popped
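A minimal Python sketch of this algorithm follows. The function name `read_packed_bcd` is hypothetical, a string stands in for the keyboard input, and a Python list plays the role of the 8086 stack. The buffer layout assumed is the 8087 one described earlier: byte 0 holds the two least significant digits and byte 9 is the sign byte.

```python
def read_packed_bcd(text: str) -> bytes:
    """Convert a decimal string (optional leading '-') to 10-byte packed BCD."""
    buf = bytearray(10)                  # buffer of 0's, as the procedure expects
    if text.startswith("-"):
        buf[9] = 0x80                    # negative: sign byte is 80h
    stack = []
    for ch in text:
        if "0" <= ch <= "9":
            stack.append(ord(ch) & 0x0F) # ASCII digit to binary, push on stack
    if len(stack) > 18:
        raise ValueError("at most 18 digits fit in packed BCD")
    index = 0
    while stack:
        low = stack.pop()                    # last digit read is least significant
        high = stack.pop() if stack else 0   # odd digit count: high nibble stays 0
        buf[index] = (high << 4) | low       # assemble 2 digits into one byte
        index += 1
    return bytes(buf)


packed = read_packed_bcd("-12345")
print(packed[::-1].hex())    # 80000000000000012345 (most significant byte first)
```

Pushing the digits and then popping them reverses their order, which is what lets the least significant digits land in the lowest buffer bytes.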
The algorithm is coded in procedure READ_INTEGER, which also converts the BCD number into temporary real format. We can convert the number into other binary formats by using different store instructions. The READ_INTEGER procedure is given in program listing PGM18_2.ASM. The input buffer is 10 bytes and contains 0's initially. We also assume the input number is at most 18 digits.
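READ_INTEGER PROC
;read multiple precision integer number and store as
;real number
;input: BX = address of 10-byte buffer of 0's
XOR BP,BP ;BP counts number of digits read
MOV SI,BX ;copy of pointer
;read number and push digits on stack
MOV AH,01 ;read char
INT 21H
;check for negative
CMP AL,'-'
JNE RI_LOOP1 ;not negative
;negative, set sign byte to 80h
MOV BYTE PTR [BX+9],80H
INT 21H ;read next char
;check for CR
RI_LOOP1:
CMP AL,0DH ;CR?
JE RI_1 ;CR, goto RI_1
;digit, convert to binary and save on stack
AND AL,0FH ;convert ASCII to binary value
INC BP ;increment count
PUSH AX ;push on stack
MOV AH,01 ;read next char
INT 21H
JMP RI_LOOP1 ;repeat
;pop number from stack and store as packed BCD
RI_1:
MOV CL,4 ;counter for left shifts
RI_LOOP2:
;the remaining steps follow the same pattern as READ_FLOAT (PGM18_5.ASM)
POP AX ;get low digit
MOV [BX],AL ;move to buffer
DEC BP ;decrement count
JE RI_2 ;done if 0
POP AX ;get high digit
SHL AL,CL ;move to high nibble
OR [BX],AL ;move to buffer
INC BX ;next byte
DEC BP ;more digits?
JG RI_LOOP2 ;yes, repeat
;convert to temporary real
RI_2: FBLD TBYTE PTR [SI] ;load packed BCD
FSTP TBYTE PTR [SI] ;store as temporary real and pop
FWAIT ;synchronize 8086 and 8087
RET
READ_INTEGER ENDP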
Once the numbers are converted to binary format, we may add, subtract, multiply, and divide them. As long as the results do not cause overflow, we can store the results as BCD numbers and print out the results in decimal using the following algorithm:
if sign bit is set, then print '-'
get high order byte
for 9 times do
convert high BCD digit to ASCII and output
convert low BCD digit to ASCII and output
get next byte
end
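The output algorithm can be sketched in Python as follows. The function name `format_packed_bcd` is invented for illustration, and the string it returns stands in for the characters that would be sent to the screen one at a time. As in the algorithm, all 18 digits are produced, including leading zeros.

```python
def format_packed_bcd(buf: bytes) -> str:
    """Render a 10-byte packed BCD buffer (byte 9 = sign) as a decimal string."""
    out = []
    if buf[9] & 0x80:                     # sign bit set: number is negative
        out.append("-")
    for index in range(8, -1, -1):        # high order byte first, 9 bytes
        byte = buf[index]
        out.append(chr((byte >> 4) | 0x30))    # high BCD digit to ASCII
        out.append(chr((byte & 0x0F) | 0x30))  # low BCD digit to ASCII
    return "".join(out)


buffer = bytes.fromhex("80000000000000012345")[::-1]   # stored low byte first
print(format_packed_bcd(buffer))    # -000000000000012345
```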
The algorithm is coded as procedure PRINT_BCD given in program listing PGM18_3.ASM.
PRINT_BCD PROC
;print BCD number in buffer
;input: BX = address of 10-byte buffer
TEST BYTE PTR [BX+9],80H ;check sign bit
JE PB_1 ;positive, skip
MOV DL,'-' ;negative, output '-'
MOV AH,2
INT 21H
PB_1: ADD BX,8 ;start with most significant digit
MOV CH,9 ;9 bytes
MOV CL,4 ;shift 4 times
PB_LOOP:
MOV DL,[BX] ;get byte
SHR DL,CL ;high digit to low nibble
OR DL,30H ;convert to ASCII
MOV AH,2 ;output
INT 21H
MOV DL,[BX] ;get byte again
AND DL,0FH ;mask out high nibble
OR DL,30H ;convert low digit to ASCII
MOV AH,2 ;output
INT 21H
DEC BX ;next byte
DEC CH ;more digits?
JG PB_LOOP ;yes, repeat
RET
PRINT_BCD ENDP
When we combine 8086 and 8087 instructions in a program, we need to make sure the 8086 does not access a memory location for an 8087 result before the 8087 can finish an operation and store the result. To synchronize the 8086 with the 8087 we use the instruction FWAIT, which suspends the 8086 until the 8087 is finished executing.
Program listing PGM18_4 gives a program that reads in two multiple precision numbers, and outputs the sum, difference, product, and quotient.
TITLE PGM18_4: MULTIPLE PRECISION ARITHMETIC
;inputs 2 multiple precision numbers
;outputs the sum, difference, product, and quotient
.MODEL SMALL
.8087
.STACK
.DATA
NUM1 DT 0
NUM2 DT 0
SUM DT ?
DIFFERENCE DT ?
PRODUCT DT ?
QUOTIENT DT ?
CR EQU 0DH
LF EQU 0AH
NEW_LINE MACRO ;output CR and LF
MOV DL,CR
MOV AH,2
INT 21H
MOV DL,LF
INT 21H
ENDM
DISPLAY MACRO X ;output X on screen
MOV DL,X
MOV AH,2
INT 21H
ENDM
.CODE
;include I/O procedures
INCLUDE PGM18_2.ASM
INCLUDE PGM18_3.ASM
MAIN PROC
MOV AX,@DATA
MOV DS,AX ;initialize DS
MOV ES,AX ;initialize ES
DISPLAY '?' ;display prompt
LEA BX,NUM1 ;BX points to buffer
CALL READ_INTEGER ;input first number
NEW_LINE
DISPLAY '?'
LEA BX,NUM2 ;BX points to buffer
CALL READ_INTEGER ;input second number
NEW_LINE
;compute sum
FLD NUM1 ;load first number
FLD NUM2 ;load second number
FADD ;add
FBSTP SUM ;store and pop
FWAIT ;synchronize 8086 and 8087
LEA BX,SUM ;BX points to SUM
CALL PRINT_BCD ;output SUM
NEW_LINE
;compute difference
FLD NUM1 ;load first number
FLD NUM2 ;load second number
FSUB ;subtract second from first
FBSTP DIFFERENCE ;store difference and pop
FWAIT ;synchronize 8086 and 8087
LEA BX,DIFFERENCE ;set pointer
CALL PRINT_BCD ;output DIFFERENCE
NEW_LINE
;compute product
FLD NUM1 ;load first number
FLD NUM2 ;load second number
FMUL ;multiply
FBSTP PRODUCT ;store product and pop
FWAIT ;synchronize 8086 and 8087
LEA BX,PRODUCT ;BX points to PRODUCT
CALL PRINT_BCD ;output PRODUCT
NEW_LINE
;compute quotient
FLD NUM1 ;load first number
FLD NUM2 ;load second number
FDIV ;divide first by second
FBSTP QUOTIENT ;store quotient and pop
FWAIT ;synchronize 8086 and 8087
LEA BX,QUOTIENT ;set pointer
CALL PRINT_BCD ;output QUOTIENT
NEW_LINE
MOV AH,4CH ;return
INT 21H ;to DOS
MAIN ENDP
END MAIN
Numbers with fractions are called real numbers. The algorithm for reading real numbers is similar to that for integers. The digits are read in as BCD, then converted to floating point and scaled. To do the scaling, a counter is set to the number of digits after the decimal point.
repeat:
read char
case '-': set sign bit of BCD buffer
'.': set flag
'0'..'9': convert to binary and push on stack, and if flag is set, increment counter
until CR
repeat:
pop stack
assemble 2 digits to one byte
until all digits are popped
load BCD onto 8087 stack
if counter is nonzero, divide by 10 raised to the counter value
store back as real
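The scaling idea can be sketched in a few lines of Python. The function name `read_real` is hypothetical; the digits are collected as one integer (standing in for the packed BCD buffer), and an exact `Fraction` stands in for the 8087 division so that the result is easy to check.

```python
from fractions import Fraction


def read_real(text: str) -> Fraction:
    """Read a decimal string such as '-12.345' using the digit-count scaling."""
    negative = text.startswith("-")
    digits = ""
    count = 0                  # number of digits after the decimal point
    seen_point = False         # the 'flag' of the algorithm
    for ch in text:
        if ch == ".":
            seen_point = True
        elif "0" <= ch <= "9":
            digits += ch
            if seen_point:
                count += 1
    value = Fraction(int(digits or "0"), 10 ** count)   # divide by 10^count
    return -value if negative else value


print(read_real("-12.345"))    # -2469/200, which equals -12.345
print(read_real("49"))         # 49: no decimal point, so no scaling
```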
The algorithm is coded as procedure READ_FLOAT given in program listing PGM18_5.ASM.
READ_FLOAT PROC
;read and store real number
;input: BX = address of 10-byte buffer of 0's
XOR DX,DX ;DH = 1 for decimal point, DL = no. of digits after decimal point
XOR BP,BP ;BP counts number of digits read
MOV SI,BX ;copy of pointer
;read number and push digits on stack
RF_LOOP1:
MOV AH,01 ;read char
INT 21H
;check for negative
CMP AL,'-'
JNE RF_1 ;not negative, check decimal point
;negative, set sign byte
MOV BYTE PTR [BX+9],80H
JMP RF_LOOP1 ;read next char
RF_1: CMP AL,'.' ;decimal point?
JNE RF_2 ;no, check CR
;decimal point, set DH to 1
INC DH
JMP RF_LOOP1 ;read next char
;check for CR
RF_2: CMP AL,0DH
JE RF_3 ;CR, goto RF_3
;digit, convert to binary and save on stack
AND AL,0FH ;convert ASCII to binary value
INC BP ;increment count
PUSH AX ;push on stack
CMP DH,0 ;decimal point seen?
JE RF_LOOP1 ;no, read next char
INC DL ;yes, increment count
JMP RF_LOOP1 ;read next char
;pop number from stack and store as packed BCD
RF_3:
MOV CL,4 ;counter for left shifts
RF_LOOP2:
POP AX ;get low digit
MOV [BX],AL ;move to buffer
DEC BP ;decrement count
JE RF_4 ;done if 0
POP AX ;get high digit
SHL AL,CL ;move to high nibble
OR [BX],AL ;move to buffer
INC BX ;next byte
DEC BP ;more digits?
JG RF_LOOP2 ;yes, repeat
;convert to real
RF_4: FBLD TBYTE PTR [SI] ;load packed BCD
FWAIT
CMP DL,0 ;any digits after decimal point?
JE RF_5 ;no, store as is
XOR CX,CX
MOV CL,DL ;digit count in CX
MOV AX,1 ;prepare to form
MOV BX,10 ;powers of 10
RF_LOOP3:
IMUL BX ;AX = AX * 10
LOOP RF_LOOP3
MOV [SI],AX ;save scaling factor
FIDIV WORD PTR [SI] ;divide by scaling factor
RF_5: FSTP TBYTE PTR [SI] ;store as temporary real and pop
FWAIT
RET
READ_FLOAT ENDP
Here, we assume that the number of digits after the decimal point is less than 5, which allows the scaling factor to be stored as a one-word signed integer.
To output real numbers, we first multiply the number by a scaling factor. Then we store the real number in BCD format, and output the digits with an appropriate decimal point. We print only four digits after the decimal point. So the scaling factor is 10000.
multiply real number by 10000
store as BCD
output BCD number with '.' inserted before last 4 digits
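A Python sketch of this output scheme is shown below. The function name `format_real` is made up, and `decimal.Decimal` stands in for the 8087 value. The sketch assumes that storing to BCD rounds to the nearest integer (round-half-even is used here); the actual behavior depends on the 8087 rounding control setting.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def format_real(x: Decimal) -> str:
    """Scale by 10000, keep 18 digits, and insert '.' before the last 4."""
    scaled = int((x * 10000).to_integral_value(rounding=ROUND_HALF_EVEN))
    sign = "-" if scaled < 0 else ""
    digits = f"{abs(scaled):018d}"         # 18 BCD digits with leading zeros
    return f"{sign}{digits[:14]}.{digits[14:]}"


print(format_real(Decimal("4.9")))      # 00000000000004.9000
print(format_real(Decimal("-0.75")))    # -00000000000000.7500
```

Fourteen digits appear before the decimal point and four after it, matching the 18 digits of the packed BCD format.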
The algorithm is coded as procedure PRINT_FLOAT given in program listing PGM18_6.ASM.
PRINT_FLOAT PROC
;print top of 8087 stack
;input: BX = address of buffer
MOV WORD PTR [BX],10000
FIMUL WORD PTR [BX] ;scale up by 10000
FBSTP TBYTE PTR [BX] ;store as BCD
FWAIT ;synchronize 8086 and 8087
TEST BYTE PTR [BX+9],80H ;check sign bit
JE PF_1 ;positive, skip
MOV DL,'-' ;negative, output '-'
MOV AH,2
INT 21H
PF_1: ADD BX,8 ;point to high byte
MOV AH,2 ;display character function
MOV CH,7 ;14 digits before decimal point
MOV CL,4 ;4 shifts
MOV DH,2 ;2 times
PF_LOOP:
MOV DL,[BX] ;get BCD digits
SHR DL,CL ;move high digit to low nibble
OR DL,30H ;convert to ASCII
INT 21H ;output
MOV DL,[BX] ;get byte again
AND DL,0FH ;mask out high digit
OR DL,30H ;convert to ASCII
INT 21H ;output
DEC BX ;next byte
DEC CH ;decrement count
JG PF_LOOP ;repeat if more bytes
DEC DH ;second time?
JE PF_DONE ;yes, done
DISPLAY '.' ;no, output decimal point
MOV CH,2 ;4 more digits after decimal point
JMP PF_LOOP ;go print digits
PF_DONE:
RET
PRINT_FLOAT ENDP
The program to combine these procedures is left as an exercise.
- Double-precision numbers increase the range of integers represented.
- The ADC and SBB instructions are used in performing double-precision addition and subtraction.
- Multiplication and division of double-precision numbers by powers of 2 can be implemented by shift and rotate instructions.
- In the BCD system, the decimal digits of a number are expressed in four bits. A number is stored in packed form if two BCD digits are contained in a byte; in unpacked form, one BCD digit is contained in a byte.
- The advantage of the BCD representation is that it is easy to convert decimal character input to BCD and back. The disadvantage is that decimal arithmetic is more complicated for the computer than ordinary binary arithmetic.
- The AAA instruction adjusts the sum in AL after addition.
- The AAS instruction adjusts the difference in AL after a subtraction.
- The AAM instruction takes the binary product of two BCD digits in AL, and produces a two-digit BCD product in AH:AL.
- The AAD instruction converts a two-digit BCD dividend in AH:AL into its binary equivalent in AL.
- Floating-point format consists of a sign bit, an exponent, and a mantissa.
- The 8087 numeric processor can perform a variety of numeric operations on integer, BCD, and real numbers.
BCD (binary-coded decimal) system: A system of coding each decimal digit as four binary digits
bias: A number that is added to the exponents to make them positive
double-precision number: Number stored in two computer words
exponent: The part of a floating-point number consisting of the power
floating-point number: Number represented in memory in the form of exponent and mantissa
mantissa: The part of a floating-point number consisting of the significant digits
multiple-precision number: Number stored in multiple words
packed BCD form: Two BCD digits stored in a byte
unpacked BCD form: One BCD digit stored in a byte
AAA
FDIV
FIADD
FIDIV
FILD
FIMUL
FIMUL
FIST
FISUB
.8087
For exercises 1 to 6, use only the 8086 instructions.
1. Write a procedure DSUB that will perform a double-precision subtraction of CX:BX from DX:AX and return the difference in DX:AX. DSUB should set the flags correctly.
2. Write a procedure DCMP that will perform a double-precision compare of CX:BX with DX:AX. The registers should not be changed, and the flags should be set correctly.
3. Write the instructions that will perform the following double-precision operations. Assume that the number is in DX:AX. Do single shifts and rotates.
a. SHR b. SAR c. ROR d. ROL e. RCR f. RCL
4. A triple-precision number is a three-word (48-bit) number. Write instructions that will perform the following operations on the two triple-precision numbers stored in A+4:A+2:A and B+4:B+2:B.
a. Add the second number to the first. b. Subtract the second number from the first.
5. Write instructions that will perform an arithmetic right shift on a triple-precision number stored in BX:DX:AX.
6. Suppose two unpacked 3-digit BCD numbers are stored in A+2:A+1:A and B+2:B+1:B. Write instructions that will
a. add the second number to the first; assume the result is only three digits. b. subtract the second number from the first; assume that the first number is larger.
7. Represent the number -0.0014 as an 8087 short real.
8. Represent the number -2954683 as an 8087 packed BCD.
9. Write the floating-point instructions that will
a. add an integer variable X to the top of the stack.
b. divide a short real number Y into the top of the stack.
c. store and pop the stack to a BCD number Z.
10. Write a program to read in two decimal numbers from the keyboard and output their sum. The numbers may be negative and have up to 20 digits. Do not use the 8087 instructions.
11. Write a program to read in two real numbers, with up to four decimal digits after the decimal point, and output their sum, difference, product, and quotient.
19.1 Kinds of Disks
Up till now, we have used disk storage exclusively as a repository for system and user program files. Disk files can also be used to store input and output data of a program. Common examples are databases and spreadsheets. In this chapter, we study disk organization, disk operations, and file handling.
There are two kinds of disks, floppy disks and hard disks. Floppy disks are made of mylar and are flexible, hence the name. Hard disks are made of metal and are rigid. The surface of a disk is coated with a metallic oxide, and information is stored as magnetized spots.
Floppy and hard disk operations are similar. A disk drive unit reads and writes data on the disk with a read/write head, which moves radially in and out over the disk surface while the disk spins. Each head position traces a circular path called a track on the disk surface. The movement of the read/write head allows it to access different tracks.
A floppy disk is contained in a protective jacket and comes in 3½-inch or 5¼-inch diameter sizes. The jacket for a 5¼-inch disk is made of flexible plastic and has four cutouts (see Figure 19.1): (1) a center cutout so that the disk drive can clamp down on the disk and spin it; (2) an oval-shaped cutout that allows the read/write head to access the disk surface; (3) a small circular hole that aligns with an index hole on the disk used by the disk drive to identify the beginning of a track; and (4) a write-protect notch: if open, the disk can be read or written; if taped over, the disk can only be read.
Figure 19.1 A 5¼-inch Floppy Disk
The 3½-inch disk has a sturdier construction. Its jacket is made of hard plastic, which makes it more rigid; it has a metal-reinforced hub for longer use and a metal sliding cover that protects the read/write head access opening. The write-protection hole operates differently from that of the 5¼-inch disk; the disk is write-protected when the hole is open. There is no index hole. Figure 19.2 shows a 3½-inch disk.
A hard disk consists of one or more platters mounted on a common spindle. Both sides of a platter are used for recording, and there is one read/write head for each side of a platter. All the heads are connected to a common moving unit. See Figure 19.3.
The read/write head hovers just above the disk surface, never actually touching it during operations (unlike a floppy disk). The space between the head and the disk surface is so small that any dust particle would cause the head to crash onto the disk surface, so hard disks and their disk drives come in hermetically sealed cases.
Figure 19.2 A 3½-inch Floppy Disk
Hard disk access is much faster than for a floppy disk for several reasons:
(1) a hard disk is always rotating, so no time is lost in starting up the disk, (2) hard disks rotate at a much faster rate (usually about 3600 rpm, or revolutions per minute, versus 300 rpm for a floppy disk), and (3) because of its rigid surface and dust-free environment, the recording density is much greater.
Information on a disk is stored in the tracks. When a disk is formatted, tracks are partitioned into 512-byte areas called sectors. DOS numbers tracks, starting with 0. Within a track, sectors are also numbered, starting with 1. The number of tracks and sectors per track depends on the kind of disk.
A cylinder is the collection of tracks that have the same number. For example, cylinder 0 for a floppy disk consists of track 0 on each side of the disk; for a hard disk, cylinder 0 consists of the tracks numbered 0 on both sides of each platter. Cylinders are so named because the tracks that make up a cylinder line up vertically and seem to form a physical cylinder (see Figure 19.3). The number of cylinders a disk has is equal to the number of tracks on each surface.
DOS also numbers the surfaces that make up a disk, beginning with 0. A floppy disk has surfaces 0 and 1. A hard disk can have more surface numbers, because it may consist of several platters.
The capacity in bytes that can be stored on a disk can be calculated as follows:
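capacity in bytes = surfaces × tracks/surface × sectors/track × 512 bytes/sector
For example, a 5¼-inch double-density floppy disk has this capacity:
capacity in bytes = 2 surfaces × 40 tracks/surface × 9 sectors/track × 512 bytes/sector = 368,640 bytes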
Tables 19.1A and 19.1B give the number of cylinders, sectors/track, surfaces, and capacity for some of the floppy and hard disks in use today.
The density of information on a floppy disk depends on the recording technique. Two common recording techniques are double density and high density. A high-density drive uses a narrow head and can read double-density disks; however, a double-density drive cannot read a high-density disk.
Table 19.1A Floppy Disk Capacity

| Kind of Disk | Cylinders | Sectors/Track | Capacity |
| --- | --- | --- | --- |
| 5¼ in. double density | 40 | 9 | 368,640 bytes |
| 5¼ in. high density | 80 | 15 | 1,228,800 bytes |
| 3½ in. double density | 80 | 9 | 737,280 bytes |
| 3½ in. high density | 80 | 18 | 1,474,560 bytes |
Table 19.1B Hard Disk Capacity

| Kind of Disk | Cylinders | Sectors/Track | Sides | Capacity |
| --- | --- | --- | --- | --- |
| 10 MB | 306 | 17 | 4 | 10,653,696 bytes |
| 20 MB | 615 | 17 | 4 | 21,411,840 bytes |
| 30 MB | 615 | 17 | 6 | 32,117,760 bytes |
| 60 MB | 940 | 17 | 8 | 65,454,080 bytes |
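The capacity figures in both tables follow from the same product of surfaces, cylinders, sectors per track, and 512 bytes per sector. A short Python check is given below; the function name `disk_capacity` is made up for illustration, and floppy disks are taken to have 2 surfaces.

```python
BYTES_PER_SECTOR = 512


def disk_capacity(surfaces: int, cylinders: int, sectors_per_track: int) -> int:
    """Capacity in bytes = surfaces x tracks per surface x sectors per track x 512."""
    return surfaces * cylinders * sectors_per_track * BYTES_PER_SECTOR


# (name, surfaces, cylinders, sectors/track) taken from Tables 19.1A and 19.1B
disks = [
    ("5 1/4 in. double density", 2, 40, 9),
    ("5 1/4 in. high density", 2, 80, 15),
    ("3 1/2 in. high density", 2, 80, 18),
    ("20 MB hard disk", 4, 615, 17),
]
for name, surfaces, cylinders, sectors in disks:
    print(f"{name}: {disk_capacity(surfaces, cylinders, sectors):,} bytes")
```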
19.2.2 Disk Access
The method of accessing information for both floppy and hard disks is similar. The disk drive is under the control of the disk controller circuit, which is responsible for moving the heads and reading and writing data. Data are always accessed one sector at a time.
The first step in accessing data is to position the head at the right track. This may involve moving the head assembly, a slow operation. Once the head is positioned on the right track, it waits for the desired sector to come by; this takes additional time. Because all the tracks in a cylinder can be accessed without moving the head assembly, when DOS is writing data to a disk it fills a cylinder before going on to the next cylinder.
19.2.3 File Allocation
To keep track of the data stored on a disk, DOS uses a directory structure. The first tracks and sectors of a disk contain information about the disk's file structure. We'll concentrate on the structure of the 5¼-inch double-density floppy disk, which is organized as follows:
| Surface | Track | Sectors | Information |
| --- | --- | --- | --- |
| 0 | 0 | 1 | boot record (used in start-up) |
| 0 | 0 | 2-5 | file allocation table (FAT) |
| 0 | 0 | 6-9 | file directory |
| 1 | 0 | 1-3 | file directory |
| 1 | 0 | 4-9 | data (as needed) |
| 0 | 1 | 1-9 | data (as needed) |
DOS creates a 32-byte directory entry for each file. The format of an entry is as follows:

| Byte | Function |
| --- | --- |
| 0-7 | filename (byte 0 is also used as a status byte) |
| 8-10 | extension |
| 11 | attribute (see below) |
| 12-21 | reserved by DOS |
| 22-23 | creation hour:minute:second |
| 24-25 | creation year:month:day |
| 26-27 | starting cluster number (see discussion of the FAT) |
| 28-31 | file size in bytes |
There are seven sectors in the directory area, each with 512 bytes. Each file entry contains 32 bytes, so there is room for 7 × 512/32 = 112 entries.
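The 32-byte layout can be unpacked field by field with Python's `struct` module. This is a sketch only: the function name `parse_directory_entry` and the sample entry (a file called PGM18_4.ASM of 1500 bytes starting at cluster 2) are invented, multi-byte fields are assumed to be stored low byte first as usual on the 8086, and the time and date words are returned raw rather than decoded.

```python
import struct

ENTRY_FORMAT = "<8s3sB10sHHHI"    # name, ext, attribute, reserved, time, date, cluster, size


def parse_directory_entry(entry: bytes) -> dict:
    """Split one 32-byte directory entry into its fields."""
    name, ext, attribute, _reserved, time_word, date_word, cluster, size = \
        struct.unpack(ENTRY_FORMAT, entry)
    return {
        "status": entry[0],                     # byte 0 doubles as the status byte
        "name": name.decode("ascii").rstrip(),
        "extension": ext.decode("ascii").rstrip(),
        "attribute": attribute,
        "time_word": time_word,
        "date_word": date_word,
        "starting_cluster": cluster,
        "size": size,
    }


# Hypothetical entry: archive bit set (20h), starting cluster 2, 1500 bytes long
sample = b"PGM18_4 " + b"ASM" + bytes([0x20]) + bytes(10) + struct.pack("<HHHI", 0, 0, 2, 1500)
info = parse_directory_entry(sample)
print(info["name"], info["extension"], info["starting_cluster"], info["size"])
print(7 * 512 // struct.calcsize(ENTRY_FORMAT), "entries fit in the directory area")
```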
The directory is organized as a tree, with the main directory (a.k.a. root directory) as root, and the subdirectories as branches.
In a file directory entry, byte 0 is the file status byte. The FORMAT program assigns 0 to this byte; it means the entry has never been used. E5h means the file has been deleted. 2Eh indicates a subdirectory. Otherwise, byte 0 contains the first character of the filename.
When a new file is created, DOS uses the first available directory field to store information about the file.
Byte 11 is the attribute byte. Each bit specifies a file attribute (see Figure 19.4).
A hidden file is a file whose name doesn't appear in the directory search; that is, the DIR command. Hiding a file provides a measure of security in situations where several people use the same machine. A hidden file may not be run under DOS version 2 (it may be run under DOS version 3). However, the attribute may be changed (see section 19.2.8) and then it can be run.
The archive bit (bit 5) is set when a file is created. It is used by the BACKUP command that saves files. When a file is saved by BACKUP, this bit is cleared but changing the file will cause the archive bit to be set again. This way the BACKUP program knows which file has been saved.
The attribute byte is specified when the file is created, but as mentioned earlier, it may be changed. Normally when a file is created it has attribute 20h (all bits 0 except the archive bit).
An example of a file directory entry is given in section 19.3.
DOS sets aside space for a file in clusters. For a particular kind of disk, a cluster is a fixed number of sectors (2 for a 5¼-in. double-density disk); in any case, the number of sectors in a cluster is always a power of 2.
Clusters are numbered, with cluster 0 being the last two sectors of the directory. Bytes 26 and 27 of the file's directory entry contain the starting cluster number of the file. The first data file on the disk begins at cluster 2.
Even if a file is smaller than a cluster (1024 bytes for a 5¼-in. double-density disk), DOS still sets aside a whole cluster for it. This means the disk is likely to have space that is not being used, even if DOS says it is full.
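The space actually reserved for a file can be worked out by rounding its size up to a whole number of clusters. The following is a minimal sketch, assuming 1024-byte clusters and a file size below 64 MB (so that DIV cannot overflow); FSIZE is a hypothetical variable holding a sample size, and DS is assumed to be initialized.

```asm
.DATA
FSIZE   DD   1500                   ;file size in bytes (sample value)
.CODE
        MOV  AX,WORD PTR FSIZE      ;low word of the size
        MOV  DX,WORD PTR FSIZE+2    ;high word of the size
        ADD  AX,1023                ;round up to a whole cluster
        ADC  DX,0
        MOV  CX,1024                ;bytes per cluster
        DIV  CX                     ;AX = number of clusters needed
```

For the 1500-byte sample, AX ends up as 2, so 2048 bytes are set aside and 548 of them are unused.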
The purpose of the file allocation table (FAT) is to provide a map of how files are stored on a disk. For floppy disks and 10-MB hard disks, FAT entries are 12 bits in length; for larger hard disks, FAT entries are 16 bits long. The first byte of the FAT is used to indicate the kind of disk (Table 19.2). For 12-bit FAT entries, the next two bytes contain FFh.
To see how the FAT is organized, let's take an example of how DOS uses the FAT to read a file (refer to Figure 19.5):
1. DOS gets the starting cluster number from the directory; let's suppose it is 2.
2. DOS reads cluster 2 from the disk and stores it in an area of memory called the data transfer area (DTA). The program that initiated the read retrieves data from the DTA as needed.
3. Since FAT entry 2 contains 4, the next cluster in the file is cluster 4. If the program needs more data, DOS reads cluster 4 into the DTA.
4. Entry 4 in the FAT contains FFFh, which indicates the last cluster in the file. In general, the process of obtaining cluster numbers from the FAT and reading data into the DTA continues until a FAT entry contains FFFh.
Table 19.2
| Kind of Disk | First Byte (hex) |
| 5¼-in. double density | FD |
| 5¼-in. high density | F9 |
| 3½-in. double density | F9 |
| 3½-in. high density | F0 |
| Hard disk | F8 |
As another example, the FAT in Figure 19.5 shows a file that occupies clusters 3, 5, 6, 7, and 8.
To store a disk file, DOS does the following:
1. DOS locates an unused directory entry and stores the filename, attribute, creation time, and date.
2. DOS searches the FAT for the first entry indicating an unused cluster (000 means unused) and stores the starting cluster number in the directory. Let's suppose it finds 000 in entry 9.
3. If the data will fit in a cluster, DOS stores them in cluster 9 and places FFFh in FAT entry 9. If there are more data, DOS looks for the next available entry in the FAT; for example, Ah. DOS stores more data in cluster Ah, and places 00Ah in FAT entry 9. This process of finding unused clusters from the FAT, storing data in those clusters, and making each FAT entry point to the next cluster continues until all the data have been stored. The last FAT entry for the file contains FFFh.
In this section, we discuss a group of INT 21h functions called the file handle functions. These functions were introduced with DOS version 2.0 and make file operations much easier than the previous file control block (FCB) method. In the latter, the programmer was responsible for setting up a table that contained information about open files. With the file handle functions, DOS keeps track of open file data in its own internal tables, thus relieving the programmer of this responsibility. Another advantage of the file handle functions is that a user may specify file path names; this was not possible with the FCB functions.
In the following discussion, reading a file means copying all or part of an existing file into memory; writing a file means copying data from memory to a file; rewriting a file means replacing a file's content with other data.
When a file is opened or created in a program, DOS assigns it a unique number called the file handle. This number is used to identify the file, so the program must save it.
There are five predefined file handles. They are
0 keyboard
1 screen
2 error output (screen)
3 auxiliary device
4 printer
In addition to these files, DOS allows three additional user-defined files to be open (it is possible to raise the limit of open user files; see the DOS manual).
There are many opportunities for errors in INT 21h file handling; DOS identifies each error by a code number. In the functions we describe here, if an error occurs then CF is set and the code number appears in AX. The following list contains the more common file-handling errors.
| Hex Error Code | Meaning |
| 1 | invalid function number |
| 2 | file not found |
| 3 | path not found |
| 4 | all available handles in use |
| 5 | access denied |
| 6 | invalid file handle |
| C | invalid access code |
| F | invalid drive specified |
| 10 | attempt to remove current directory |
| 11 | not the same device |
| 12 | no more files to be found |
In the following sections, we describe the DOS file handle functions. As with the DOS I/O functions we have been using, put a function number in AH and execute INT 21h.
Before a file can be used, it must be opened. To create a new file or rewrite an existing file, the user provides a filename and an attribute; DOS returns a file handle.
The filename may include a path; for example, A:\PROGS\PROG1.ASM. Possible errors for this function are 3 (path doesn't exist), 4 (all file handles in use), or 5 (access denied, which means either that the directory is full or the file is a read-only file).
Example 19.1 Write instructions to open a new read-only file called FILE1.
Solution: Suppose the filename is stored as follows
FNAME DB 'FILE1', 0
HANDLE DW ?
The string FNAME containing the filename must end with a 0 byte. HANDLE will contain the file handle.
MOV AX, @DATA
MOV DS, AX ; initialize DS
MOV AH, 3CH ; open file function
LEA DX, FNAME ; DX has filename address
MOV CX, 1 ; read-only attribute (function 3Ch takes the attribute in CX)
INT 21H ; open file
MOV HANDLE, AX ; save handle or error code
JC OPEN_ERROR ; jump if error
If there were an error, the program would jump to OPEN_ERROR where we could print an error message.
To open an existing file, there is another function:
INT 21h, Function 3Dh:
Open an Existing File
Input: AH = 3Dh
DS:DX = address of filename, which is an ASCIIZ string
AL = access code: 0 means open for reading
1 means open for writing
2 means open for both
Output: If successful, AX = file handle
Error if CF = 1, error code in AX (2, 4, 5, or 0Ch)
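As an illustration of the function just described, the following minimal sketch opens an existing file for reading. It assumes DS has already been initialized; the path stored in FNAME and the label OPEN_ERROR are hypothetical.

```asm
.DATA
FNAME   DB   'A:\PROGS\PROG1.ASM',0 ;ASCIIZ path (sample value)
HANDLE  DW   ?
.CODE
        MOV  AH,3DH         ;open existing file function
        MOV  AL,0           ;access code 0 = open for reading
        LEA  DX,FNAME       ;DS:DX = address of filename
        INT  21H            ;open file
        JC   OPEN_ERROR     ;CF = 1: AX = error code
        MOV  HANDLE,AX      ;CF = 0: AX = file handle, save it
```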
After a file has been processed, it should be closed. This frees the file handle for use with another file. If the file is being written, closing causes any data remaining in memory to be written to the file, and the file's time, date, and size to be updated in the directory.
INT 21H, Function 3Eh:
Close a File
Input: AH = 3Eh
BX = file handle
Output: If CF = 1, error code in AX (6)
Example 19.2 Write some code to close a file. Suppose variable HANDLE contains the file handle.
MOV AH,3EH ;close file function
MOV BX,HANDLE ;get handle
INT 21H ;close file
JC CLOSE_ERROR ;jump if error
The only thing that could go wrong is that there might be no file corresponding to the file handle (error 6).
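Reading a File
The following function reads a specified number of bytes from a file and stores them in memory.
INT 21H, Function 3Fh:
Read a File
Input: AH = 3Fh
BX = file handle
CX = number of bytes to read
DS:DX = memory buffer address
Output: AX = number of bytes actually read
If AX = 0 or AX < CX, end of file was encountered
If CF = 1, error code in AX (5, 6)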
Example 19.3 Write some code to read a 512-byte sector from a file.
Solution: First we must set up a memory block (buffer) to receive the data:
.DATA
HANDLE DW ?
BUFFER DB 512 DUP (0)
The instructions are
MOV AX,@DATA
MOV DS,AX ;initialize DS
MOV AH,3FH ;read file function
MOV BX,HANDLE ;get handle
MOV CX,512 ;read 512 bytes
LEA DX,BUFFER ;DX has buffer address
INT 21H ;read file, AX = bytes read
JC READ_ERROR ;jump if error
In some applications, we may want to read and process sectors until end of file (EOF) is encountered. The program can check for EOF by comparing AX and CX:
CMP AX,CX ;EOF?
JL EXIT ;yes, terminate program
JMP READ_LOOP ;no, keep reading
Function 40h writes a specified number of bytes to a file or device.
INT 21H, Function 40h:
Write File
Input: AH = 40h
BX = file handle
CX = number of bytes to write
DS:DX = data address
Output: AX = bytes written. If AX < CX, error (disk full)
If CF = 1, error code in AX (5, 6)
It is possible that there is not enough room on the disk to accept the data; DOS doesn't regard this as an error, so the program has to check for it by comparing AX and CX.
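The check described here might look as follows. This is a minimal sketch, assuming HANDLE holds an open file handle and BUFFER holds 512 bytes to be written; the labels WRITE_ERROR and DISK_FULL are hypothetical.

```asm
        MOV  AH,40H         ;write file function
        MOV  BX,HANDLE      ;get handle
        MOV  CX,512         ;number of bytes to write
        LEA  DX,BUFFER      ;DS:DX = data address
        INT  21H            ;AX = bytes actually written
        JC   WRITE_ERROR    ;CF = 1: AX = error code
        CMP  AX,CX          ;were all the bytes written?
        JB   DISK_FULL      ;no, the disk is full
```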
Function 40h writes data to a file, but it can also be used to send data to the screen or printer (handles 1 and 4, respectively).
Example 19.4 Use function 40h to display a message on the screen.
Solution: Let's suppose the message is stored as follows:
.DATA
MSG DB 'DISPLAY THIS MESSAGE'
The instructions are
MOV AX, @DATA
MOV DS, AX ; initialize DS
MOV AH, 40H ; write file function
MOV BX, 1 ; screen 'file' handle
MOV CX, 20 ; length of message
LEA DX, MSG ; get address of MSG
INT 21H ; display MSG
A Program to Read and Display a File
To show how the file handle functions work, we will write a program that lets the user enter a filename, then reads and displays the file on the screen.
Get filename from user
Open file
IF open error
THEN
display error code and exit
ELSE
REPEAT
Read a sector into buffer
Display buffer
UNTIL end of file
Close file
ENDIF
0: TITLE PGM19_1: DISPLAY FILE
1: .MODEL SMALL
2:
3: .STACK 100H
4:
5: .DATA
6: PROMPT DB 'FILENAME:$'
7: FILENAME DB 30 DUP (0)
8: BUFFER DB 512 DUP (0)
9: HANDLE DW ?
10: OPENERR DB 0DH, 0AH, 'OPEN ERROR - CODE '
11: ERRCODE DB 30H, '$'
12:
13: .CODE
14: MAIN PROC
15: MOV AX, @DATA
16: MOV DS, AX ; initialize DS
17: MOV ES, AX ; and ES
18: CALL GET_NAME ; read filename
19: LEA DX, FILENAME ; DX has filename offset
20: MOV AL, 0 ; access code 0 for reading
21: CALL OPEN ; open file
22: JC OPEN_ERROR ; exit if error
23: MOV HANDLE, AX ; save handle
24: READ_LOOP:
25: LEA DX, BUFFER ; DX pts to buffer
26: MOV BX, HANDLE ; get handle
27: CALL READ ; read file, AX = bytes read
28: OR AX, AX ; end of file?
29: JE EXIT ; yes, exit
30: MOV CX, AX ; CX gets no. of bytes read
31: CALL DISPLAY ; display file
32: JMP READ_LOOP ; keep reading
33: OPEN_ERROR:
34: LEA DX, OPENERR ; get error message
35: ADD ERRCODE, AL ; convert error code to ASCII
36: MOV AH, 9
37: INT 21H ; display error message
38: EXIT:
39: MOV BX, HANDLE ; get handle
40: CALL CLOSE ; close file
41: MOV AH, 4CH
42: INT 21H ; dos exit
43: MAIN ENDP
44:
45: GET_NAME PROC NEAR
46: ; reads and stores filename
47: ; input: none
48: ; output: filename stored as ASCIIZ string
49: PUSH AX ; save registers used
50: PUSH DX
51: PUSH DI
52: MOV AH, 9 ; display string fcn
53: LEA DX, PROMPT
54: INT 21H ; display data prompt
55: CLD
56: LEA DI, FILENAME ; DI pts to filename
57: MOV AH, 1 ; read char fcn
58: READ_NAME:
59: INT 21H ; get a char
60: CMP AL, 0DH ; CR?
61: JE DONE ; yes, exit
62: STOSB ; no, store in string
63: JMP READ_NAME ; keep reading
64: DONE:
65: MOV AL, 0
66: STOSB ; store 0 byte
67: POP DI ; restore registers
68: POP DX
69: POP AX
70: RET
71: GET_NAME ENDP
72:
73: OPEN PROC NEAR
74: ; opens file
75: ; input: DS:DX filename
76: ; AL access code
77: ; output: if successful, AX handle
78: ; if unsuccessful, CF = 1, AX = error code
79: MOV AH, 3DH ; open file fcn
80: MOV AL, 0 ; input only
81: INT 21H ; open file
82: RET
83: OPEN ENDP
84:
85: READ PROC NEAR
86: ; reads a file sector
87: ; input: BX file handle
88: ; CX bytes to read (512)
89: ; DS:DX buffer
90: ; output: if successful, sector in buffer
91: ; AX number of bytes read
92: ; if unsuccessful, CF = 1
93: PUSH CX
94: MOV AH, 3FH ; read file fcn
95: MOV CX, 512 ; 512 bytes
96: INT 21H ; read file into buffer
97: POP CX
98: RET
99: READ ENDP
100:
101: DISPLAY PROC NEAR
102: ; displays memory on screen
103: ; input: BX = handle (1)
104: ; CX = bytes to display
105: ; DS:DX = data address
106: ; output: AX = bytes displayed
107: PUSH BX
108: MOV AH, 40H ; write file fcn
109: MOV BX, 1 ; handle for screen
110: INT 21H ; display file
111: POP BX
112: RET
113: DISPLAY ENDP
114:
115: CLOSE PROC NEAR
116: ; closes a file
117: ; input: BX = handle
118: ; output: if CF = 1, error code in AX
119: MOV AH, 3EH ; close file fcn
120: INT 21H ; close file
121: RET
122: CLOSE ENDP
123:
124: END MAIN
At line 18, procedure GET_NAME is called to receive the filename from the user and store it in array FILENAME as an ASCII string. After FILENAME's offset is moved to DX, procedure OPEN is called at line 21 to open the file. The most likely errors are nonexistent file or path. If either happens, OPEN returns with CF set and the error code 2 or 3 in AL. The program converts the code to an ASCII character by adding it to the 30h in variable ERRCODE (line 35), and prints an error message with the appropriate code number. Note: typing mistakes will be treated as errors.
If the file opens successfully, AX will contain 5, the first available handle after the predefined handles.
At line 24, the program enters the main processing loop. First, procedure READ is called to read a sector into array BUFFER. CF is set if an error occurred, but the conceivable errors (access denied, illegal file handle) are not possible in this program, so AX will have the actual number of bytes read. If this is zero, EOF was encountered on the previous call to READ, and the program calls procedure CLOSE to close the file.
If AX is not 0, the number of bytes read is moved to CX (line 30) and procedure DISPLAY is called to display the bytes on the screen.
Sample Executions:
C>PGM19_1
FILENAME: A:A.TXT
THIS IS A SMALL TEST FILE
C>PGM19_1
FILENAME: A:B.TXT
OPEN ERROR - CODE 2 (nonexistent file)
C>PGM19_1
FILENAME: A:\PROGS\A.TXT
OPEN ERROR - CODE 3 (illegal path)
The file pointer is used to locate a position in a file. When the file is opened, the file pointer is positioned at the beginning of the file. After a read operation, the file pointer indicates the next byte to be read; after writing a new file, the file pointer is at EOF (end of file). The following function can be used to move the file pointer.
INT 21H, Function 42h:
Move File Pointer
Input: AH = 42h
AL = movement code: 0 means move relative to the beginning of the file
1 means move relative to the current file pointer location
2 means move relative to the end of the file
BX = file handle
CX:DX = number of bytes to move (signed)
Output: DX:AX = new pointer location in bytes from the beginning of the file
If CF = 1, error code in AX (1, 6)
CX:DX contains the number of bytes to move the pointer, expressed as a signed number (positive means forward, negative means backward).
If CX:DX is too large, the pointer could be moved past the beginning or end of the file. This is not an error in itself, but it will cause an error when the next file read or write is executed.
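A backward move needs a negative 32-bit value in CX:DX, so the high word must carry the sign. The following is a minimal sketch, assuming a file of fixed 22-byte records whose handle is in HANDLE; it positions the pointer at the start of the last record. The label MOVE_ERROR is hypothetical.

```asm
        MOV  AH,42H         ;move file ptr function
        MOV  AL,2           ;relative to end of file
        MOV  BX,HANDLE      ;get handle
        MOV  CX,0FFFFH      ;high word of -22 (sign extension)
        MOV  DX,-22         ;low word of -22 (0FFEAh)
        INT  21H            ;DX:AX = new pointer location
        JC   MOVE_ERROR     ;error if CF = 1
```

If the file holds fewer than 22 bytes, this places the pointer before the beginning of the file, which is the situation the preceding paragraph warns about.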
The following code moves the pointer to the end of the file and determines the file size:
MOV AH, 42H ; move file ptr function
MOV BX, HANDLE ; get handle
XOR CX, CX
XOR DX, DX ; 0 bytes to move
MOV AL, 2 ; relative to end of file
INT 21H ; move pointer to end. DX:AX = file size
JC MOVE_ERROR ; error if CF = 1
The following program creates a file of names. It prompts the user to enter names of up to 20 characters, one name per line. After each name is entered, the program appends it to the file and blanks the input line on the screen. The user indicates end of data by typing a CTRL-Z.
Open NAMES file
Move file pointer to EOF
Print data prompt
WHILE a <CTRL-Z> has not been typed DO
Get a name from the user and store in byte array NAMEFLD
Write name to NAMES file
ENDWHILE
Close NAMES file
The program calls procedure GET_NAME to get a name from the user.
Put blanks in first 20 bytes of NAMEFLD (last 2 bytes are <CR><LF>)
REPEAT
Read a character
IF character is <CTRL-Z> (1Ah)
THEN
set CF and exit
ELSE IF character is not <CR>
THEN
store character in NAMEFLD
ENDIF
UNTIL character is <CR>
Blank input line on screen
TITLE PGM19_2: APPEND RECORDS
.MODEL SMALL

.STACK 100H

.DATA
PROMPT  DB 'NAMES:',0DH,0AH,'$'
NAMEFLD DB 20 DUP (0),0DH,0AH
FILE    DB 'NAMES',0
HANDLE  DW ?
OPENERR DB 0DH,0AH,'OPEN ERROR $'
WRITERR DB 0DH,0AH,'WRITE ERROR $'

.CODE
MAIN PROC
        MOV AX,@DATA
        MOV DS,AX ;initialize DS
        MOV ES,AX ;and ES
;open NAMES file
        LEA DX,FILE ;get addr of filename
        CALL OPEN ;open file
        JC OPEN_ERROR ;exit if error
        MOV HANDLE,AX ;save handle
;move file pointer to eof
        MOV BX,HANDLE ;get handle
        CALL MOVE_PTR ;move pointer
;print prompt
        MOV AH,9 ;display string fcn
        LEA DX,PROMPT ;"NAMES:"
        INT 21H ;display prompt
READ_LOOP:
;read names
        LEA DI,NAMEFLD ;DI pts to name
        CALL GET_NAME ;read name
        JC EXIT ;CF = 1 if end of data
;append name to NAMES file
        MOV BX,HANDLE ;get handle
        MOV CX,22 ;22 bytes for name, CR, LF
        LEA DX,NAMEFLD ;get addr of name
        CALL WRITE ;write to file
        JC WRITE_ERROR ;exit if error
        JMP READ_LOOP ;get next name
OPEN_ERROR:
        LEA DX,OPENERR ;get error message
        MOV AH,9
        INT 21H ;display error message
        JMP EXIT
WRITE_ERROR:
        LEA DX,WRITERR ;get error message
        MOV AH,9
        INT 21H ;display error message
EXIT:
        MOV BX,HANDLE ;get handle
        CALL CLOSE ;close NAMES file
        MOV AH,4CH
        INT 21H ;dos exit
MAIN ENDP

GET_NAME PROC NEAR
;reads and stores a name
;input: DI = offset address of NAMEFLD
;output: name stored at NAMEFLD
        CLD
        MOV AH,1 ;read char function
;clear NAMEFLD
        PUSH DI ;save ptr to NAMEFLD
        MOV CX,20 ;name can have up to 20 chars
        MOV AL,' '
        REP STOSB ;store blanks
        POP DI ;restore ptr
READ_NAME:
        INT 21H ;read a char
        CMP AL,1AH ;end of data?
        JNE NO ;no, continue
        STC ;yes, set CF
        RET ;and return
NO:
        CMP AL,0DH ;end of name?
        JE DONE ;yes, exit
        STOSB ;no, store in string
        JMP READ_NAME ;keep reading
;clear input line
DONE:
        MOV AH,2 ;print char fcn
        MOV DL,0DH
        INT 21H ;execute CR
        MOV DL,' ' ;get blank
        MOV CX,20
CLEAR:
        INT 21H
        LOOP CLEAR
        MOV DL,0DH
        INT 21H
        RET
GET_NAME ENDP

OPEN PROC NEAR
;opens file
;input: DS:DX filename
;       AL access code
;output: if successful, AX handle
;        if unsuccessful, CF = 1, AX = error code
        MOV AH,3DH ;open file fcn
        MOV AL,1 ;write only
        INT 21H ;open file
        RET
OPEN ENDP

WRITE PROC NEAR
;writes a file
;input: BX = handle
;       CX = bytes to write
;       DS:DX = data address
;output: AX = bytes written.
;        If unsuccessful, CF = 1, AX = error code
        MOV AH,40H ;write file fcn
        INT 21H ;write file
        RET
WRITE ENDP

CLOSE PROC NEAR
;closes a file
;input: BX = handle
;output: if CF = 1, error code in AX
        MOV AH,3EH ;close file fcn
        INT 21H ;close file
        RET
CLOSE ENDP

MOVE_PTR PROC NEAR
;moves file pointer to eof
;input: BX = file handle
;output: DX:AX = pointer position from beginning
        MOV AH,42H ;move ptr function
        XOR CX,CX ;0 bytes
        XOR DX,DX ;from end of file
        MOV AL,2 ;movement code
        INT 21H ;move ptr
        RET
MOVE_PTR ENDP

END MAIN
The program begins by using INT 21h, function 3Dh, to open the NAMES file. Since this function may only be used to open a file that already exists, a blank file NAMES must be created before the program is run the first time. To create such a file, enter DEBUG and follow these steps:
1. Use the N command to name the file (type N NAMES).
2. Put 0 in BX and CX (specify 0 file length).
3. Write file to disk (type W).
After the program has been run, the DOS TYPE command may be used to view it.
Sample execution: (The input names are actually entered on the same line, but will be shown on separate lines.)
C>PGM19_2
NAMES:
GEORGE WASHINGTON
JOHN ADAMS
<CTRL-Z>
C>TYPE NAMES
GEORGE WASHINGTON
JOHN ADAMS
C>PGM19_2
NAMES:
THOMAS JEFFERSON
HARRY TRUMAN
SUSAN B. ANTHONY
<CTRL-Z>
C>TYPE NAMES
GEORGE WASHINGTON
JOHN ADAMS
THOMAS JEFFERSON
HARRY TRUMAN
SUSAN B. ANTHONY
19.3.8
Changing a File's Attribute
In section 19.1.2, we saw that a file's attribute is specified when it is created (function 3Ch). The following function provides a way to get or change the attribute.
INT 21H, Function 43h:
Get/Change File Attribute
Input: AH = 43h
DS:DX = address of file pathname as ASCIIZ string
AL = 0 to get attribute
= 1 to change attribute
CX = new file attribute (if AL = 1)
Output: If successful, CX = current file attribute (if AL = 0)
Error if CF = 1, error code in AX (2, 3, or 5)
This function may not be used to change the volume label or subdirectory bits of the file attribute (bits 3 and 4).
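The get option (AL = 0) can be used to test a single attribute bit. The following is a minimal sketch, assuming FNAME holds an ASCIIZ path and that the hidden attribute is bit 1 of the attribute byte (value 2), as in the standard DOS attribute layout; the labels ATTR_ERROR and IS_HIDDEN are hypothetical.

```asm
        MOV  AH,43H         ;get/change attribute fcn
        MOV  AL,0           ;get attribute option
        LEA  DX,FNAME       ;DS:DX = address of ASCIIZ path
        INT  21H            ;CX = current file attribute
        JC   ATTR_ERROR     ;CF = 1: AX = error code
        TEST CX,2           ;hidden bit (bit 1) set?
        JNZ  IS_HIDDEN      ;yes, the file is hidden
```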
Example 19.5 Change a file's attribute to hidden.
MOV AH,43H ;get/change attribute fcn
MOV AL,1 ;change attribute option
LEA DX,FLNAME ;get path
MOV CX,2 ;hidden attribute (bit 1 of the attribute byte)
INT 21H ;change attribute
JC ATTR_ERROR ;exit if error, AX = error code
19.4
Direct Disk Operations
Up to now, we have been talking about operations on files using the DOS INT 21h file handle functions. There are two other DOS interrupts for reading and writing disk sectors directly.
19.4.1
INT 25h and INT 26h
The DOS interrupts for reading and writing sectors are INT 25h and INT 26h, respectively. Before invoking these interrupts, the following registers must be initialized:
AL = drive number (0 = drive A, 1 = drive B, etc.)
DS:BX = segment:offset of memory buffer
CX = number of sectors to read or write
DX = starting logical sector number (see following section)
Unlike INT 21h, there is no function number to put in AH. The interrupt routines place the contents of the FLAGS register on the stack, and it should be popped before the program continues. If CF = 1, an error has occurred and AX gets the error code.
Table 19.3 Logical Sectors
| Surface | Track | Sectors | Logical Sectors | Information |
| 0 | 0 | 1 | 0 | Boot record |
| 0 | 0 | 2-5 | 1-4 | FAT |
| 0 | 0 | 6-9 | 5-8 | File directory |
| 1 | 0 | 1-3 | 9-11 (9h-Bh) | File directory |
| 1 | 0 | 4-9 | 12-17 (Ch-11h) | Data (as needed) |
| 0 | 1 | 1-9 | 18-26 (12h-1Ah) | Data (as needed) |
In section 19.1.2, we identified positions on a disk by surface, track, and sector. DOS assigns a logical sector number to each sector, starting with 0. Logical sector numbers proceed along a track on surface 0, then continue on the same track on surface 1. Table 19.3 gives the correspondence between surface-track-sector and logical sector for the first part of a 5¼-inch floppy disk.
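The numbering rule can be written as logical sector = (2 × track + surface) × 9 + sector − 1. The following is a minimal sketch of that calculation, assuming a two-sided disk with 9 sectors per track as in Table 19.3; the byte variables SURFACE, TRACK, and SECTOR are hypothetical, and DS is assumed to be initialized.

```asm
.DATA
SURFACE DB   1              ;sample values: surface 1,
TRACK   DB   0              ;track 0,
SECTOR  DB   4              ;sector 4
.CODE
        MOV  AL,TRACK
        SHL  AL,1           ;AL = 2 x track
        ADD  AL,SURFACE     ;AL = 2 x track + surface
        MOV  BL,9           ;9 sectors per track
        MUL  BL             ;AX = sectors on all preceding tracks
        MOV  BL,SECTOR
        XOR  BH,BH          ;BX = sector number (1-9)
        ADD  AX,BX
        DEC  AX             ;AX = logical sector number
```

For the sample values, AX ends up as 12, which matches the first data sector in Table 19.3.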
As an example of a direct disk operation, the following program reads the first sector of the directory (logical sector 5) of the disk in drive A.
0: TITLE PGM19_3: READ SECTOR
1: .MODEL SMALL
2:
3: .STACK 100H
4:
5: .DATA
6: BUFF DB 512 DUP (0)
7: ERROR_MSG DB 'ERROR$'
8:
9: .CODE
10: MAIN PROC
11: MOV AX,@DATA
12: MOV DS,AX ;initialize DS
13: MOV AL,0 ;drive A
14: LEA BX,BUFF ;BX has buffer offset
15: MOV CX,1 ;read 1 sector
16: MOV DX,5 ;start at sector 5
17: INT 25H ;read sector
18: POP DX ;restore stack
19: JNC EXIT ;jump if no error
20: ;here if error
21: MOV AH,9
22: LEA DX,ERROR_MSG
23: INT 21H ;display error message
24: EXIT:
25: MOV AH,4CH ;dos exit
26: INT 21H
27: MAIN ENDP
28:
29: END MAIN
To demonstrate the program, a disk containing two files A.TXT and B.TXT is placed in drive A, and the program is executed inside DEBUG. In this environment, the program performs the same function as DEBUG's L (load) command.
-G13 (execute through line 17 above)
AX=0100 BX=0000 CX=0000 DX=0005 SP=0062 BP=7420 SI=01F6 DI=0001
DS=0F12 ES=0EFB SS=0F0B CS=0F33 IP=0013 NV UP EI PL ZR NA PE NC
0F33:0013 5A POP DX
-D0 (dump buffer)
0F12:0000 41 20 20 20 20 20 20 20-54 58 54 20 00 00 00 00 A       TXT
0F12:0010 00 00 00 00 00 00 BD 19-22 16 02 00 80 00 00 00
0F12:0020 42 20 20 20 20 20 20 20-54 58 54 20 00 00 00 00 B       TXT
0F12:0030 00 00 00 00 00 00 23 24-8A 16 03 00 80 00 00 00
0F12:0040 00 F6 F6 F6 F6 F6 F6 F6-F6 F6 F6 F6 F6 F6 F6 F6
0F12:0050 F6 F6 F6 F6 F6 F6 F6 F6-F6 F6 F6 F6 F6 F6 F6 F6
0F12:0060 00 F6 F6 F6 F6 F6 F6 F6-F6 F6 F6 F6 F6 F6 F6 F6
0F12:0070 F6 F6 F6 F6 F6 F6 F6 F6-F6 F6 F6 F6 F6 F6 F6 F6
From the display, we can pick out the relative fields of the directory entries. For file A,
| Offset (hex) | Information | Bytes |
| 0-7 | filename | 41 20 20 20 20 20 20 20 (A) |
| 8-A | extension | 54 58 54 (TXT) |
| B | attribute | 20 |
| C-15 | reserved by DOS | |
| 16-17 | creation time | BD 19 |
| 18-19 | creation date | 22 16 |
| 1A-1B | starting cluster | 02 00 |
| 1C-1F | file size | 80 00 00 00 |
The format of the creation hour:minute:second is hhhhhmmmmmmsssss. For this file,
19BDh = 0001100110111101
= 3:13:29
(DOS stores the seconds field in units of 2 seconds, so the value 29 stands for 58 seconds.)
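The three time fields can be separated with shifts and masks. The following is a minimal sketch that uses the time word of file A as a sample value; the choice of registers is arbitrary.

```asm
        MOV  AX,19BDH       ;time word from the directory entry
        MOV  BX,AX
        MOV  CL,11
        SHR  BX,CL          ;BX = hour field (bits 15-11)
        MOV  DX,AX
        MOV  CL,5
        SHR  DX,CL
        AND  DX,3FH         ;DX = minute field (bits 10-5)
        AND  AX,1FH         ;AX = seconds field (bits 4-0)
```

For 19BDh this leaves BX = 3, DX = 13, and AX = 29.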
The year:month:day has the form yyyyyyymmmmddddd, where the year is relative to 1980 for this DOS version. We get
1622h = 0001011000100010
= 11:1:2 (actually 91:1:2)
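The date word can be unpacked in the same way. The following is a minimal sketch that uses the date word of file A as a sample value and adds 1980 to obtain the calendar year; the choice of registers is arbitrary.

```asm
        MOV  AX,1622H       ;date word from the directory entry
        MOV  BX,AX
        MOV  CL,9
        SHR  BX,CL          ;BX = year relative to 1980 (bits 15-9)
        ADD  BX,1980        ;BX = calendar year
        MOV  DX,AX
        MOV  CL,5
        SHR  DX,CL
        AND  DX,0FH         ;DX = month (bits 8-5)
        AND  AX,1FH         ;AX = day (bits 4-0)
```

For 1622h this leaves BX = 1991, DX = 1, and AX = 2.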
As another example, we can put a disk that contains several files in drive A and use the preceding program to display the first part of the FAT, which begins at logical sector 1. If we change line 16 in the program to read MOV DX,1 and run the program inside DEBUG, the result is
-D0
0F12:0000 FD FF FF FF 4F 00 05 60-00 07 F0 FF 09 A0 00 0B
0F12:0010 C0 00 0D E0 00 0F 00 01-11 20 01 13 40 01 15 60
0F12:0020 01 17 80 01 19 A0 01 1B-C0 01 1D E0 01 1F 00 02
0F12:0030 21 20 02 23 40 02 25 60-02 27 80 02 29 A0 02 2B
0F12:0040 C0 02 2D E0 02 2F 00 03-31 20 03 33 40 03 35 60
0F12:0050 03 37 80 03 39 A0 03 3B-C0 03 3D E0 03 3F 00 04
0F12:0060 41 20 04 43 40 04 45 60-04 47 80 04 49 A0 04 4B
0F12:0070 C0 04 4D E0 04 4F 00 05-51 20 05 53 40 05 55 60
The FAT is hard to read in this form because FAT entries are 12 bits = 3 hex digits. To decipher the display, we need to form 3-digit numbers by alternately (1) taking two hex digits from a byte and the rightmost digit from the next byte, and (2) taking the remaining (leftmost) digit from that byte and the two digits from the next byte. Performing this operation on the preceding display, we get
| CLUSTER | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A |
| CONTENTS | FFD | FFF | FFF | 004 | 005 | 006 | 007 | FFF | 009 | 00A | 00B |
The first data file begins in cluster 2. The entry there is FFFh, so the file also ends in this cluster. The next file begins in cluster 3 and ends in cluster 7. The next one starts in cluster 8, and so on.
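The alternating rule above amounts to reading the 16-bit word at byte offset 3N/2 of the FAT and keeping either its low or its high 12 bits. The following is a minimal sketch, assuming the start of the FAT has been read into BUFF (as in PGM19_3 with DX = 1) and the cluster number N is in AX; the labels EVEN_ENTRY and GOT_ENTRY are hypothetical, and an entry that straddles the end of the buffer is not handled.

```asm
        MOV  BX,AX          ;BX = N
        SHR  BX,1           ;BX = N/2
        ADD  BX,AX          ;BX = N + N/2 = byte offset of entry N
        MOV  DX,WORD PTR BUFF[BX] ;word that contains the entry
        TEST AX,1           ;is N even?
        JZ   EVEN_ENTRY     ;yes
        MOV  CL,4
        SHR  DX,CL          ;odd N: entry is the high 12 bits
        JMP  GOT_ENTRY
EVEN_ENTRY:
        AND  DX,0FFFH       ;even N: entry is the low 12 bits
GOT_ENTRY:                  ;DX = contents of FAT entry N
```

With the display above, N = 3 gives BX = 4; the word at offset 4 is 004Fh, and shifting it right by 4 leaves DX = 004h, in agreement with the table.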
- The FORMAT program partitions each side of a disk into concentric circular areas called tracks. Each track is further subdivided into 512-byte sectors. The number of tracks and sectors depends on the kind of disk. A 5¼-inch double-density floppy disk has 40 tracks per surface and 9 sectors per track.
- In storing data, DOS fills a track on one side, then proceeds to a track on the other side.
- Data about files are contained in the file directory. A file entry includes name, extension, attribute, time, date, starting cluster, and file size.
- A file's attribute byte is assigned when it is opened. The attribute specifies whether a file is read-only, hidden, DOS system file, volume label, subdirectory, or has been modified. The usual file attribute is 20h.
- DOS sets aside space for a file in clusters. A cluster is a fixed number of sectors (2 for a double-density floppy disk). The first data file on the disk begins in cluster 2.
- The FAT (file allocation table) provides a map of how files are stored on the disk. Each FAT entry is 12 bits (16 bits for larger hard disks). A file's directory entry contains the first cluster number N1 of the file. FAT entry N1 contains the cluster number N2 of the next cluster of the file if there is one; the last FAT entry for a file contains FFFh.
- The DOS INT 21h file handle functions provide a convenient way to do file operations. With them, a file is assigned a number called a file handle when it is opened, and a program may identify a file by this number.
- File handle functions are specified by putting a function number in AH and invoking INT 21h. The main functions are 3Ch for opening a new file, 3Dh for opening an existing file, 3Eh for closing a file, 3Fh for reading a file, 40h for writing a file, 42h for moving the file pointer, and 43h for changing the file attribute.
- DOS interrupts INT 25h and INT 26h may be used to read and write disk sectors.
archive bit
Used to indicate the most recently modified version of a file
attribute byte
Specifies a file's attribute
cluster
A fixed number of sectors—depends on the kind of disk
cylinder
The collection of tracks on different surfaces that share a track number
data transfer area (DTA)
Area of memory that DOS uses to store data from a file
file allocation table (FAT)
Provides a map of file storage on a disk
file handle
A number used by INT 21h functions to identify a file
file pointer
Used to locate a position in a file
hidden file
A file whose name doesn't appear in a disk's directory search
read a file
Copy all or part of the file to memory
rewrite a file
Replace a file's contents by other data
sector
A 512-byte section of a track
status byte
Byte 0 in a file directory entry
track
A circular area on a disk
write a file
Copy data from memory to the file
1. Verify that 1,228,800 bytes can be stored on a 5¼-inch floppy disk that has 80 cylinders and 15 sectors per track.
2. Suppose FAT entries for a disk are 12 bits = 3 hex digits in length. Suppose also that the disk contains three files: FILE1, FILE2, and FILE3, and the FAT begins like this:
a. If FILE1, FILE2, and FILE3 begin in clusters 2, 3, and 7, respectively, tell which clusters each of the files are in.
b. When a file is erased, all its FAT entries are set to 000. Show the contents of the FAT after each of the following operations is performed (assume the operations occur in the following order):
FILE1 is erased.
A 1500-byte file FILE4 is created.
FILE2 is erased.
A 500-byte file FILE5 is created.
A 1500-byte file FILE6 is created.
3. Write instructions to do the following operations. Assume that the file handle is contained in the word variable HANDLE.
a. Move the file pointer 100 bytes from the beginning of a file.
b. Move the file pointer backward 1 byte from the current location.
c. Put the file pointer location in DX:AX.
4. From the DEBUG display of the file directory in section 19.4.1, determine the creation time, date, and size for file B.TXT.
5. Write a program that will copy a source text file into a destination file, and replace each lowercase letter by a capital letter. Use the DOS TYPE command to display the source and destination files.
6. Write a program that will take two text files, and display them side by side on the screen. You may suppose that the length of lines in each file is less than half the screen width.
7. Modify PGM19_2.ASM in section 19.3.7 so that it prompts the user to enter a name, and determines whether or not the name appears in the NAMES file. If so, it outputs its position in hex.
8. Modify PGM19_2.ASM in section 19.3.7 so that it prompts the user to enter a name. If the name is present in the NAMES file, the program makes a copy of the file with the name removed. Use the DOS TYPE command to display the original file and the changed file.
We have so far been concentrating on the 8086/8088 processors. In this chapter, we take a look at Intel's advanced microprocessors, which have become very popular. We'll show that they are compatible with the 8086 and can execute 8086 programs. In addition, they have features that support memory protection and multitasking.
In section 20.1 we discuss the 80286. The operating system software needed to use the protected mode of the 80286 is discussed in section 20.2. In section 20.3 we discuss the 80386 and 80486 processors.
Like the 8086, the 80286 is a 16-bit processor. It has all the 8086 registers and it can execute all the 8086 instructions. It was designed to be compatible with the 8086 and also support multitasking. This is achieved by having two modes of operation: real address mode (also called real mode), and protected virtual address mode (protected mode, for short).
In real address mode, the 80286 behaves like an 8086 and can execute programs written for the 8086 without modification. In addition to the 8086 instructions, it can execute some new instructions called the extended instruction set.
In protected mode, the 80286 supports multitasking and it can execute additional instructions needed for this purpose. There are also additional registers being used in this mode.
Let us start with the extended instruction set.
The extended instruction set contains some 8086 instructions with additional operand types as well as new instructions. The groups are push and pop, multiply, rotate and shift, string I/O, and high-level instructions.
The 80286 allows constants to be used in the PUSH instruction. The format is
PUSH immediate
With this instruction, we no longer have to put a constant into a register and then push the register. For example, we can use PUSH 25 instead of MOV AX,25 and PUSH AX.
There are also instructions for pushing and popping all general registers. The instruction PUSHA (push all) pushes all the general registers in the following order: AX, CX, DX, BX, SP, BP, SI, and DI; the value pushed for SP is the one it held before PUSHA began. The instruction POPA (pop all) pops all the general registers in the reverse order: DI, SI, BP, SP, BX, DX, CX, and AX, except that the saved SP value is discarded rather than loaded into SP. These two instructions are useful in procedures that need to save and restore all the registers. The formats are
PUSHA
POPA
The 80286 has three new formats for IMUL that permit multiple operands:
IMUL reg16,immed
IMUL reg16,reg16,immed
IMUL reg16,mem16,immed
where immed is a constant, reg16 is a 16-bit register, and mem16 is a memory word. The first format specifies an immediate operand as source and a general 16-bit register as destination. The second and third formats contain three operands: the first operand is a 16-bit register that stores the product, and the multiplier and multiplicand are found in the second and third operands.
Here are some examples:
- IMUL BX,20 ;BX and 20 are multiplied and the product is in BX
- IMUL AX,BX,20 ;BX and 20 are multiplied and the result is stored in AX
- IMUL AX,WDATA,20 ;WDATA and 20 are multiplied and the result is stored in AX
Note that only the low 16 bits of the product are stored. The CF and OF are cleared if the product can be stored as a 16-bit signed number; otherwise, they are set. The other flags are undefined.
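The truncation and flag rule just described can be checked with a short Python model. This is only a sketch of the arithmetic, not 80286 code; the function name `imul16` and the single returned flag value (standing for both CF and OF) are assumptions made for illustration.

```python
def imul16(a, b):
    """Model IMUL reg16,src,immed: return (low 16 bits of product, CF/OF)."""
    def to_signed(x):
        # Interpret a 16-bit pattern as a signed number.
        x &= 0xFFFF
        return x - 0x10000 if x & 0x8000 else x

    product = to_signed(a) * to_signed(b)
    result = product & 0xFFFF                    # only the low 16 bits are kept
    flag = 0 if -32768 <= product <= 32767 else 1
    return result, flag

print(imul16(300, 20))       # (6000, 0)   product fits in 16 signed bits
print(imul16(3000, 20))      # (60000, 1)  60000 does not fit, CF and OF set
print(imul16(0xFFFE, 20))    # (65496, 0)  -2 * 20 = -40, stored as FFD8h
```

The second call shows why the flags matter: the stored word 60000 reads as a negative number when interpreted as signed, so a program that cares about the full product must test CF or OF.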
Shifts and Rotates
The 80286 allows multiple shifts and rotates using a byte constant. There is no need to use the CL register. For example, we may use SHR AX,4 instead of the two instructions MOV CL,4 and SHR AX,CL.
The 80286 allows multiple bytes for input and output operations. The input instructions are INSB (input string byte), and INSW (input string word). The instruction INSB (or INSW) transfers a byte (or a word) from the port addressed by DX into the memory location addressed by ES:DI. DI is then incremented or decremented according to the DF just like other string instructions. The REP prefix can be used to input multiple bytes or words.
The output instructions are OUTSB (output string byte), and OUTSW (output string word). The instruction OUTSB (or OUTSW) transfers a byte (or a word) from the memory location addressed by DS:SI to the port addressed by DX. SI is then incremented or decremented according to the DF just like other string instructions. The REP prefix can again be used to output multiple bytes or words.
High-Level Instructions
The high-level instructions allow block-structured high-level languages to check array limits and to create memory space on the stack for local variables. The instructions are BOUND, LEAVE, and ENTER. Because they are primarily used by compilers, we shall not discuss them further.
One of the major drawbacks of the 8086 lies in its use of a 20-bit address, which gives a memory space of only 1 megabyte. This 1-MB memory is further restricted by the structure of the PC, which reserves the addresses above 640 KB for video and other purposes. The 80286 uses a 24-bit address, so it has a memory address space of 2^24 bytes, or 16 MB.
At first glance, it appears that the 80286 may solve a lot of the memory limitation problems. On closer examination, however, we see that programs running under DOS cannot use the extra memory. DOS is designed for the 8086/8088, which corresponds to the real mode of the 80286. In order to be compatible with the 8086, the 80286 real address mode generates a physical address the same way as the 8086; that is, the 16-bit segment number is shifted left four bits and then the offset is added. The 20-bit number formed becomes the low 20 bits of the 24-bit physical address; the high four bits are cleared. This gives us a limit of 1 MB.
Actually, the 80286 can access slightly more than 1 MB in real mode. To illustrate, let us use a segment number of FFFFh and an offset of FFFFh. The computed address is FFFF0h + FFFFh = 10FFEFh. In the 8086, the extra bit is dropped, resulting in a physical address of 0FFEFh. For the 80286, because there are 24 address lines, the memory location 10FFEFh is addressed. It is simple to see that for the FFFFh segment, bytes with offset addresses 10h to FFFFh have 21-bit addresses. Thus, in the real address mode, the 80286 can access almost 64 KB more than the 8086. This address space above 1 MB is used by DOS version 5.0 to load some of its routines, resulting in more memory for application programs. Note that on many PCs the twenty-first address bit must be activated by software before the higher memory can be accessed.
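The wraparound on the 8086 and the extra 64 KB on the 80286 can be seen in a small Python model of real-mode address generation. It is a sketch only; the function name and the `address_lines` parameter are introduced here for illustration.

```python
def real_mode_address(segment, offset, address_lines):
    """Shift the segment left 4 bits, add the offset, keep `address_lines` bits."""
    linear = (segment << 4) + offset
    return linear & ((1 << address_lines) - 1)

# Segment FFFFh, offset FFFFh: the highest address real mode can form.
print(hex(real_mode_address(0xFFFF, 0xFFFF, 20)))   # 0xffef    (8086: bit 20 dropped)
print(hex(real_mode_address(0xFFFF, 0xFFFF, 24)))   # 0x10ffef  (80286: bit 20 kept)

# Offset 10h in segment FFFFh is the first byte above 1 MB on the 80286.
print(hex(real_mode_address(0xFFFF, 0x0010, 24)))   # 0x100000
print(hex(real_mode_address(0xFFFF, 0x0010, 20)))   # 0x0       (wraps to address 0)
```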
Under DOS, the 80286 must operate in real mode. Any program written for the 8086 will run on an 80286 machine under DOS. A program for the 80286 may also contain extended instructions. To assemble a program with extended instructions, we must use the .286 assembly directive to avoid assembly errors.
As an example of extended instructions, let's write a procedure to output the contents of BX in hex. The algorithm is given in Chapter 7.
.286
HEX_OUT PROC
;output contents of BX in hex
PUSHA ;save all registers
MOV CX,4 ;CX counts # of hex digits
;repeat loop 4 times
REPEAT:
MOV DL,BH ;get the high byte
SHR DL,4 ;shift out low hex digit
CMP DL,9 ;see if output digit or letter
JG LETTER ;go to LETTER if > 9
OR DL,30H ;<=9, change to ASCII
JMP PRINT ;output
LETTER:
ADD DL,37H ;>9, convert to letter
PRINT:
MOV AH,2 ;output function
INT 21H ;output hex digit
SHL BX,4 ;shift next digit into first
;position
LOOP REPEAT
POPA ;restore registers
RET
HEX_OUT ENDP
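To check the digit conversion used in HEX_OUT without running it under DOS, the loop can be mirrored in Python. This is a model of the same steps (take the top hex digit, convert it with 30h or 37h, shift BX left four bits), not a replacement for the procedure; the function name `hex_out` is chosen here for illustration.

```python
def hex_out(bx):
    """Return the four characters HEX_OUT would display for a 16-bit value."""
    digits = []
    for _ in range(4):                    # CX counts four hex digits
        dl = (bx >> 12) & 0x0F            # MOV DL,BH / SHR DL,4
        if dl > 9:
            dl += 0x37                    # 10..15 become 'A'..'F'
        else:
            dl |= 0x30                    # 0..9 become '0'..'9'
        digits.append(chr(dl))
        bx = (bx << 4) & 0xFFFF           # SHL BX,4 brings up the next digit
    return "".join(digits)

print(hex_out(0x1A2F))   # 1A2F
print(hex_out(0x0009))   # 0009
```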
20.1.3
Protected Mode
To fully utilize the power of the 80286, we need to operate it in protected mode. When executing in protected mode, the 80286 supports virtual addressing, which allows programs to be much bigger than the machine's physical memory size. Another protected mode feature is the support for multitasking, which allows several programs to be running at the same time. The 80286 is designed to execute in real mode when it is powered up. Switching it into protected mode is normally the job of the operating system. In section 20.2 we look at some software that executes in protected mode.
Application programs running in protected mode still use segment and offset to refer to memory locations. However, the segment number no longer corresponds to a specific memory segment. Instead, it is now called a segment selector and is used by the system to locate a physical segment that may be anywhere in memory. Figure 20.1 shows a segment selector.
To keep track of the physical segments used by each program, the operating system maintains a set of segment descriptor tables. Each application program is given a local descriptor table, which contains information about the program's segments. In addition, there is a global descriptor table, which contains information on segments that can be accessed by all programs.
The segment selector is used to access a segment descriptor contained in a segment descriptor table. As we see in Figure 20.2, a segment descriptor describes the type and size of the segment, whether the segment is present, and a 24-bit base address of the segment in memory.
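Figure 20.2 Segment Descriptor
The process of translating the segment and offset used in an application program into a 24-bit physical address goes like this. First, the TI bit in the selector is used to select the descriptor table: the global descriptor table if TI is 0, the local descriptor table if TI is 1. The index field of the selector then picks out a descriptor in that table, and the 24-bit base address in the descriptor is added to the 16-bit offset to form the physical address.
A descriptor table may have up to 64 KB. Since a descriptor is 8 bytes, each descriptor table can have up to 8 K descriptors. A task can therefore reach up to 16 K segments through its local table and the global table, and with segments of up to 64 KB each this gives 2^30 bytes, or 1 GB (gigabyte), of memory. This memory is known as virtual memory, because the 80286 only has 16 MB of physical memory.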
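The translation from selector and offset to a 24-bit physical address can be modeled in a few lines of Python. This is a simplified sketch: the `Descriptor` tuple keeps only the base, the limit (size minus one), and the present bit, and the names `translate` and `SegmentNotPresent` are invented for this illustration. The selector layout assumed is the 80286 one: index in bits 15-3, TI in bit 2, and the requested privilege level in bits 1-0.

```python
from collections import namedtuple

Descriptor = namedtuple("Descriptor", "base limit present")

class SegmentNotPresent(Exception):
    """Stands in for the interrupt raised when the P bit is clear."""

def translate(selector, offset, gdt, ldt):
    index = selector >> 3                 # which descriptor in the table
    ti = (selector >> 2) & 1              # 0 = global table, 1 = local table
    table = ldt if ti else gdt
    desc = table[index]
    if not desc.present:
        raise SegmentNotPresent(index)    # operating system would load the segment
    if offset > desc.limit:
        raise ValueError("offset beyond segment limit")
    return (desc.base + offset) & 0xFFFFFF   # 24-bit physical address

gdt = [None, Descriptor(0x110000, 0xFFFF, True)]
ldt = [Descriptor(0x020000, 0x0FFF, True), Descriptor(0, 0xFFFF, False)]

print(hex(translate(0x0008, 0x0010, gdt, ldt)))   # 0x110010  (GDT entry 1)
print(hex(translate(0x0007, 0x0020, gdt, ldt)))   # 0x20020   (LDT entry 0, RPL 3)
try:
    translate(0x000C, 0, gdt, ldt)                # LDT entry 1, not in memory
except SegmentNotPresent:
    print("segment must be loaded first")
```

Note that the application never sees the base address; two tasks can use the same selector value and still reach different physical segments because each has its own local table.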
The virtual segments of a program are maintained on the disk drive. The operating system may load the segments into memory as they are needed. It uses the P bit in a descriptor to keep track of whether the corresponding segment has been loaded into memory. If a virtual segment is not loaded, the P bit in the corresponding descriptor is cleared.
An example is a program that is bigger than the physical memory size. It must be loaded incrementally. When an instruction addresses a segment that is not loaded, the operating system is notified by the hardware in the form of an interrupt. The operating system then loads the segment and restarts the instruction. It may be necessary to save a memory segment to disk to make room for this new segment.
The basic unit of execution in protected mode is a task, which is similar to a program execution in real mode. Each task has its own local descriptor table. At any one time, only one task can be executing, but the operating system can switch between tasks using an interrupt. Also, one task may call another task.
Because one task cannot access another task's local descriptor table, the memory segments of one task are protected from other tasks. To provide further protection, each task is assigned a privilege level. There are four privilege levels, 0-3. Level 0 is the most privileged, and level 3 is the least. The operating system operates at level 0, and application programs operate at level 3. There are privileged instructions such as loading descriptor table registers that can be executed only by a task at level 0. A task operating at one level cannot access data at a more privileged level, and it cannot call a procedure at a less privileged level.
As we have seen, the 80286 cannot access all its potential memory when operating in real mode; this is also true for the 80386 and 80486. The memory above 1 MB, called extended memory, is normally not available for DOS application programs. However, a program could access extended memory by using INT 15h. The two functions for dealing with extended memory are 87h and 88h. A program uses function 88h to determine the size of the extended memory available, and then uses function 87h to transfer data to and from the extended memory. A word of caution in using INT 15h to manipulate extended memory: parts of the extended memory may be used by other programs such as VDISK, and the memory may be corrupted by your program. A better method is for the program to call an extended-memory manager program for extended-memory access.
INT 15h Function 87h: Move Extended Memory Block
Input: AH = 87h
CX = number of words to move
ES:SI = address of Global Descriptor Table
Output: AH = 0 if successful
When function 87h is called, the interrupt routine temporarily switches the processor to protected mode. After the data transfer, the processor is switched back to real mode. This is why a Global Descriptor Table is needed.
INT 15h Function 88h: Get Extended Memory Size
Input: AH = 88h
Output: AX = amount of extended memory (in KB)
Program PGM20_1 copies the data from the array SOURCE to extended memory at 110000h and then copies back the information from extended memory at 110000h to the array DESTINATION. Since the program does not do any I/O, the memory can be examined in DEBUG.
Program Listing PGM20_1.ASM
TITLE PGM20_1: COPY_EXTENDED MEMORY
.MODEL SMALL
.286
.STACK
.DATA
SOURCE DB 'HI, THERE!'
DESTINATION DB 10 DUP(0)
GDT DB 48 DUP(?) ;global table
SRC_ADDR DB ?, ?, ? ;24-bit source address
DST_ADDR DB ?, ?, ? ;24-bit dest. address
.CODE
MAIN PROC
MOV AX,@DATA
MOV DS,AX
MOV ES,AX
;put 24-bit source address in SRC_ADDR
MOV WORD PTR SRC_ADDR,DS ;get segment address
SHL WORD PTR SRC_ADDR,4 ;shift seg no. 4 places
MOV AX,DS ;get highest 4 bits
SHR AH,4
MOV SRC_ADDR+2,AH
LEA SI,SOURCE ;source offset address
ADD WORD PTR SRC_ADDR,SI ;add offset to segment
ADC SRC_ADDR+2,0 ;take care of carry
;put 24-bit destination address in DST_ADDR
MOV DST_ADDR,0 ;destination address is
MOV DST_ADDR+1,0 ;110000h
MOV DST_ADDR+2,11H
;set up registers
LEA SI,SRC_ADDR ;source address
LEA DX,DST_ADDR ;destination address
MOV CX,5 ;number of words
LEA DI,GDT ;global table
;transfer data
CALL COPY_EMEM
;set up source address
MOV SRC_ADDR,0 ;source address is
MOV SRC_ADDR+1,0 ;110000h
MOV SRC_ADDR+2,11H
;set up destination address
MOV WORD PTR DST_ADDR,DS ;get segment address
SHL WORD PTR DST_ADDR,4 ;shift seg no. 4 places
MOV AX,DS
SHR AH,4
MOV DST_ADDR+2,AH
LEA SI,DESTINATION
ADD WORD PTR DST_ADDR,SI ;add offset to segment
ADC DST_ADDR+2,0 ;take care of carry
;set up registers
LEA SI,SRC_ADDR
LEA DX,DST_ADDR
MOV CX,5
LEA DI,GDT
CALL COPY_EMEM
MOV AH,4CH
INT 21H
MAIN ENDP
COPY_EMEM PROC
;move block to and from extended memory
;input: ES:DI - address of 48-byte buffer to be used as GDT
;CX - number of words to transfer
;SI - source address (24 bits)
;DX - destination address (24 bits)
;initialize global descriptor table by setting up six
;descriptors
PUSHA ;save registers
CLD ;string instructions move forward
;- first descriptor is null, i.e. 8 bytes of 0
MOV AX,0
STOSW
STOSW
STOSW
STOSW
;- second descriptor is set to 0, i.e. 8 bytes of 0
STOSW
STOSW
STOSW
STOSW
;- third descriptor is source segment
SHL CX,1 ;convert to number of bytes
DEC CX
MOV AX,CX ;size of segment, in bytes
STOSW
;
;source address, 3 bytes
MOVSB
MOVSB
MOVSB
MOV AL,93H ;access rights byte
STOSB
MOV AX,0
STOSW
;- fourth descriptor is destination segment
MOV AX,CX ;size of segment, in bytes
STOSW
MOV SI,DX ;SI points to destination address
;destination address, 3 bytes
MOVSB
MOVSB
MOVSB
MOV AL,93H ;access rights byte
STOSB
MOV AX,0
STOSW
;- fifth descriptor is set to 0
STOSW
STOSW
STOSW
STOSW
; - sixth descriptor is set to 0
STOSW
STOSW
STOSW
STOSW
; restore registers
POPA
; transfer data
MOV SI,DI ;ES:SI points to GDT
MOV AH,87H
INT 15H
RET
COPY_EMEM ENDP
END MAIN
The copying is done by procedure COPY_EMEM. It receives in CX the number of words to transfer, in SI the location of a 24-bit source address, in DX the location of a 24-bit destination address, and in ES:DI the address of a 48-byte buffer to be used as the global descriptor table. The source and destination buffers can be anywhere in the 16-MB physical address space of the 80286. COPY_EMEM first sets up the global descriptor table, which describes the source and destination buffers as segments. It then uses INT 15h, function 87h to perform the transfer.
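The byte layout that COPY_EMEM writes into the 48-byte table can be reproduced in Python, which makes it easy to compare against a DEBUG dump. The sketch below follows the listing: each 8-byte descriptor holds a 2-byte size minus one, a 3-byte base address, the access rights byte 93h, and a zero word. The function names, and the example segment value 1234h, are assumptions for illustration only.

```python
def linear_address(segment, offset):
    """24-bit address of segment:offset, as computed for SRC_ADDR in MAIN."""
    return ((segment << 4) + offset) & 0xFFFFFF

def make_descriptor(base, size_in_bytes, access=0x93):
    limit = size_in_bytes - 1             # SHL CX,1 then DEC CX in the listing
    return bytes([limit & 0xFF, (limit >> 8) & 0xFF,
                  base & 0xFF, (base >> 8) & 0xFF, (base >> 16) & 0xFF,
                  access, 0, 0])

def build_gdt(source, destination, words):
    null = bytes(8)
    size = words * 2
    return (null + null                   # descriptors 1 and 2 are zero
            + make_descriptor(source, size)
            + make_descriptor(destination, size)
            + null + null)                # descriptors 5 and 6 are zero

src = linear_address(0x1234, 0x0010)      # hypothetical DS and offset of SOURCE
gdt = build_gdt(src, 0x110000, 5)
print(len(gdt))                           # 48
print(gdt[16:24].hex())                   # 0900502301930000
print(gdt[24:32].hex())                   # 0900000011930000
```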
Now that we have some idea of how the hardware functions in protected mode, let's turn to the software. At present, there is no standard multitasking operating system for the PC. We'll look at Windows 3 and OS/2. First, let's consider the process of multitasking.
In a single-task environment like DOS, one program controls the CPU and releases control only when it chooses to. An exception to this scenario is that of an interrupt. In a multitasking environment, however, such as Windows and OS/2, the operating system determines which program has control and several programs can be running at the same time. Actually, a program is given a small amount of time to execute, and when the time is up, another program is allowed to execute. By rotating quickly among several programs, the computer gives the impression that all the programs are executing at the same time.
Windows 3 is the most popular graphical user interface (GUI) on the PC. Each executing task is shown in a box on the screen, called a window. A window may be enlarged to occupy the entire screen or shrunk to a single graphics element called an icon. A Windows 3 application program may provide services, identified by a menu, to the user. To select an item in the menu, a user simply positions a screen pointer with a mouse at the item and clicks it.
Windows 3 can operate in one of three modes: real mode, standard mode, and 386 enhanced mode. When Windows 3 runs on an 8086 machine or in the real address modes of the advanced processors, it operates in real mode. An application program must end before another one can be executed.
The standard mode of Windows 3 corresponds to the protected mode of the 80286. Windows 3 uses the multitasking features of the 80286 to support multiple Windows 3 applications. It can also execute a program written for DOS. However, to run such a program it must switch the processor back to real address mode. In this case, other applications cannot execute in the background. Windows 3 requires at least 192 KB of extended memory to run in this mode; otherwise it can only run in real mode.
The 386 enhanced mode of Windows corresponds to the protected mode of the 386. In the next section, we'll see that the 386 can execute multiple 8086 applications in protected mode. So, in 386 enhanced mode Windows 3 can perform multitasking on Windows 3 applications as well as DOS applications. A machine must have a 386 or 486 processor chip and at least 1 MB of extended memory to run Windows 3 in this mode.
Windows 3 is not a complete operating system, because it still needs DOS for many file operations. To run Windows 3, we must start in DOS and then execute the Windows 3 program.
Unlike Windows 3, OS/2 is a complete operating system. OS/2 version 1 was designed for the protected mode of the 80286. It requires at least 2 MB of extended memory. OS/2 version 2 supports the 80386 protected mode.
Under OS/2, it is possible for a program to be doing several things simultaneously. For example, a program may display one file on the screen while at the same time it is copying another file to disk. The program itself is called a process, and each of the two tasks here is known as a thread. A thread is the basic unit of execution in OS/2, and we can see that it corresponds to a task supported by the hardware. A thread can create another thread by calling a system service routine.
To summarize, a process consists of one or more threads together with a number of system resources, such as open files and devices, that are shared by all the threads in the process. The concept of a process is similar to the notion of a program execution in DOS.
We only show some simple OS/2 programs as an illustration. More complex OS/2 programs and Windows 3 application programs are beyond the scope of this book.
One noticeable difference between DOS and OS/2 for the programmer is that, to do I/O and system calls in OS/2, a program must do a far call to a system procedure, instead of using the INT instruction. Parameters are to be pushed onto the stack before the call is made. This is done to optimize the high-level language interface. The system procedures can be linked to the application program by including the appropriate system library. Actually, the library only contains a reference to the procedure and not the code. The system procedure is contained in a .DLL file and is linked when the program is loaded. Linking modules at loading time is called dynamic linking and is used by OS/2. OS/2 function calls are known as the application program interface (API).
As a first example, we show a program that prints out 'Hello!'. The program is shown in program listing PGM20_2.ASM.
Program Listing PGM20_2.ASM
TITLE PGM20_2: PRINT HELLO
.286
.MODEL SMALL
.STACK
.DATA
MSG DB 'HELLO!'
NUM_BYTES DW 0
.CODE
EXTRN DOSWRITE:FAR, DOSEXIT:FAR
MAIN PROC
;put arguments for DosWrite on stack
PUSH 1 ;file handle for screen
PUSH DS ;address of message: segment
PUSH OFFSET MSG ;offset
PUSH 6 ;length of message
PUSH DS ;addr of number of bytes written: segment
PUSH OFFSET NUM_BYTES ;offset
CALL DOSWRITE ;write to screen
; put arguments for DosExit on stack
PUSH 1 ; action code 1 = end all threads
PUSH 0 ; return code 0
CALL DOSEXIT ; exit
MAIN ENDP
END MAIN
Notice that we do not have to initialize DS. When a program is loaded, OS/2 sets DS to the data segment and it does not create a PSP for the program; furthermore, OS/2 supports only .EXE files.
We have used two API functions, DosWrite to write to the screen, and DosExit to terminate the program. DosWrite writes to a file; the arguments are file handle, address of buffer, length of buffer, and address of bytes-out variable. The file handle for the screen is 1. The bytes-out variable receives the number of bytes written to the file; the value can be used to check for errors. The arguments must be pushed on the stack in the order given before calling DosWrite.
DosExit can be used to terminate a thread or all threads in a process. The arguments are (1) an action code to terminate a thread or all threads, and (2) a return code that is passed back to the system that created the process. The arguments for normal exit consist of an action code 1 to end all threads and a return code 0.
In OS/2, the called procedures are responsible for clearing the stack of arguments sent to them when they return. Thus there are no POP instructions in our program.
The API functions DosWrite and DosExit are defined in the library file called DOSCALLS.LIB. As a matter of fact, all API functions used in this book are contained there. To link the program, we use the following command:
LINK PGM20_2, , DOSCALLS.
Echo Program
As a second program, we write a program to echo a string typed at the keyboard.
TITLE PGM20_3: ECHO PROGRAM
.286
.MODEL SMALL
.STACK
.DATA
BUFFER DB 20 DUP(0)
NUM_CHARS DW 0
NUM_BYTES DW 0
.CODE
EXTRN DOSREAD:FAR, DOSWRITE:FAR, DOSEXIT:FAR
MAIN PROC
;put arguments for DosRead on stack
PUSH 0 ;file handle for keyboard
PUSH DS ;address of buffer: segment
PUSH OFFSET BUFFER ;offset
PUSH 20 ;length of buffer
PUSH DS ;addr of no. of chars read: segment
PUSH OFFSET NUM_CHARS ;offset
CALL DOSREAD ;read from keyboard
;put arguments for DosWrite on stack
PUSH 1 ;file handle for screen
PUSH DS ;address of message: segment
PUSH OFFSET BUFFER ;offset
PUSH NUM_CHARS ;length of message
PUSH DS ;addr of no. of bytes written: segment
PUSH OFFSET NUM_BYTES ;offset
CALL DOSWRITE ;write to screen
;put arguments for DosExit on stack
PUSH 1 ;action code 1 = end all threads
PUSH 0 ;return code 0
CALL DOSEXIT ;exit
MAIN ENDP
END MAIN
We use the API function DosRead to read from the keyboard. DosRead inputs from a file and it takes the arguments: file handle, buffer address, buffer length, and address of chars_read variable. The file handle for the keyboard is 0. DosRead reads in the keys until the buffer is filled or a carriage return is typed. The number of characters read is returned in the chars_read variable.
We have treated the screen and keyboard as files for DosWrite and DosRead. There are also Vio (video) and Kbd (keyboard) API functions that can perform more I/O operations.
The preceding two programs are only meant as an introduction to OS/2 programming. A full treatment requires a separate book.
20.3 80386 and 80486 Microprocessors
The 80386 and 80486 are both 32-bit microprocessors. As noted in Chapter 3, they are very similar, with the exception that the 80486 contains the floating-point processor circuits. In the following treatment, we'll concentrate on the 80386, because the 80486 can be treated like a fast 80386 with a floating-point processor.
The 80386 has both a real address mode and a protected mode of operation, just like the 80286.
20.3.1 Real Address Mode
The 80386 has eight 32-bit general registers: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP. Each register contains a 16-bit 8086 counterpart; for example, AX is the lower 16 bits of EAX. There are six 16-bit segment registers: CS, SS, DS, ES, FS, and GS, with DS, ES, FS, and GS being data segment registers. The 32-bit EFLAGS register contains in it the 16-bit FLAGS register, and the 32-bit EIP register contains the 16-bit IP register. There are also debug registers, control registers, and test registers. In addition, there are registers for protected mode memory management and protection.
In real address mode, the 80386 can execute all of the 80286 real address mode instructions. Hence, to the programmer the 80386 real address mode is similar to the 8086 with extensions to the instruction set and registers.
The 80386 uses 32-bit addresses; but in real address mode it generates an address like an 8086, so it can address at most 1 MB plus 64 KB just like an 80286.
20.3.2 Protected Mode
The 80386 in protected mode can execute all 80286 instructions. When an 80286 protected mode operating system is used on an 80386, a segment descriptor contains a 24-bit base address so only 16 MB of physical memory are available. Actually, because there is no wraparound it can access 16 MB plus 64 KB.
The 80386 in protected mode allows a segment descriptor to contain a 32-bit address and the offset can also have 32 bits, giving a segment size of 2^32 bytes, or 4 gigabytes; this is also the size of the physical memory address space. A program can still use 2^14 segments, so the virtual memory space is 2^32 x 2^14, which is 2^46 bytes, or 64 terabytes. This should be sufficient for any application program in the foreseeable future.
It is possible to organize the virtual memory into pages. The operating system can set a bit in a control register to indicate the use of page tables. When this happens, the 32-bit address formed from the segment base and the offset is treated as three fields: a page-table selector that selects one of the 1 K page tables from a page directory, a page number in the selected page table, and an offset in the page. The page directory contains 1 K tables, each table contains 1 K pages, and each page is 4 KB. Hence, the total address space possible is 1 K x 1 K x 4 KB = 4 GB.
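The three fields of a paged address can be pulled apart with shifts and masks. The Python sketch below assumes the standard 80386 split of 10 bits for the page directory entry, 10 bits for the page table entry, and 12 bits for the offset within a 4-KB page; the function name is hypothetical.

```python
def split_linear_address(address):
    """Return (directory index, page table index, offset in page)."""
    directory = (address >> 22) & 0x3FF   # selects one of 1 K page tables
    page = (address >> 12) & 0x3FF        # selects one of 1 K pages in that table
    offset = address & 0xFFF              # byte within the 4-KB page
    return directory, page, offset

print(split_linear_address(0x00403025))   # (1, 3, 37)
print(1024 * 1024 * 4096 == 2 ** 32)      # True: 1 K x 1 K x 4 KB = 4 GB
```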
The 80386 supports execution of one or more 8086 programs in an 80386 protected mode environment. The processor executes in virtual 8086 (V86) mode when the VM (virtual machine) bit in the EFLAGS register is set. In V86 mode, the segment registers are used in the same fashion as in real address mode; that is, an address is computed by adding the offset to the segment number shifted four bits. This linear address can be mapped to any physical address by the use of paging.
The real mode 80386 instructions with only 16-bit operands are essentially 80286 instructions. There are some new instructions, and they are given in Appendix F.
In 32-bit programming, both operand size and offset address are 32 bits. The machine opcodes for 32-bit 386 instructions are actually the same as those for 16-bit instructions. It turns out that the 80386 has two modes of operations, 16-bit mode and 32-bit mode. Since the instruction opcodes for 32-bit and 16-bit are the same, the operand type must depend on the current mode of the 386. Byte-size operands are not affected by the operating mode.
When the 80386 is in protected mode, it can operate in either 16-bit or 32-bit mode; the operating mode is identified in the segment descriptor in each task. However, it can only operate in 16-bit mode when it is in real address mode.
It is possible to mix 16-bit and 32-bit instructions in the same program. An operand-size override prefix (66h) can be placed before an instruction to override the default operand size. In 16-bit mode, the prefix switches the operand size to 32 bits, and in 32-bit mode the same prefix switches the operand size to 16 bits. One prefix must be used for each instruction.
There is also an address-size override prefix (67h), which overrides the offset address size. It is used in a similar manner as the operand-size prefix, and they both can be used in the same instruction.
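The effect of the operand-size prefix on the machine code can be illustrated with a register-to-register MOV. The Python sketch below uses opcode 89h (MOV r/m,reg) with a register-form ModR/M byte; an assembler is free to pick the equivalent 8Bh form instead, so the bytes shown are one valid encoding rather than what any particular assembler emits. The function name `encode_mov` is invented for this illustration.

```python
OPERAND_SIZE_PREFIX = 0x66
REG_NUMBER = {"AX": 0, "CX": 1, "DX": 2, "BX": 3,
              "SP": 4, "BP": 5, "SI": 6, "DI": 7}

def encode_mov(dest, src, mode_bits):
    """Encode MOV dest,src for 16- or 32-bit registers in the given mode."""
    dest_bits = 32 if dest.startswith("E") else 16
    src_bits = 32 if src.startswith("E") else 16
    if dest_bits != src_bits:
        raise ValueError("operands must be the same size")
    # EAX and AX share register number 0, EBX and BX share 3, and so on.
    modrm = 0xC0 | (REG_NUMBER[src[-2:]] << 3) | REG_NUMBER[dest[-2:]]
    code = [0x89, modrm]
    if dest_bits != mode_bits:            # operand size differs from the default
        code.insert(0, OPERAND_SIZE_PREFIX)
    return bytes(code)

print(encode_mov("AX", "BX", 16).hex())     # 89d8
print(encode_mov("EAX", "EBX", 16).hex())   # 6689d8
print(encode_mov("EAX", "EBX", 32).hex())   # 89d8
print(encode_mov("AX", "BX", 32).hex())     # 6689d8
```

The same two bytes 89h D8h mean MOV AX,BX in 16-bit mode and MOV EAX,EBX in 32-bit mode, which is why the prefix is needed whenever an instruction departs from the default.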
We demonstrate by writing a program for DOS (16-bit real mode) using 32-bit operands. The program given in program listing PGM20_4.ASM reads in two unsigned double-precision numbers and outputs their sum. The addition is performed by 32-bit registers.
TITLE PGM20_4: 32-BIT OPERATIONS
;input two 32-bit numbers and output their sum
;uses 386 32-bit operations
.386
.MODEL SMALL
S_SEG SEGMENT USE16 STACK
DB 256 DUP (?)
S_SEG ENDS
D_SEG SEGMENT USE16
FIRST DD 0 ;stores first 32-bit number
D_SEG ENDS
NEW_LINE MACRO
;go to next line
MOV AH,2
MOV DL,0AH
INT 21H
MOV DL,0DH
INT 21H
ENDM
PROMPT MACRO
;output prompt
MOV DL,'?'
MOV AH,2
INT 21H
ENDM
C_SEG SEGMENT USE16
ASSUME CS:C_SEG,DS:D_SEG,SS:S_SEG
MAIN PROC
MOV AX,D_SEG
MOV DS,AX; initialize DS
;output prompt
PROMPT
;clear EBX
MOV EBX,0
;read character
L1: MOV AH,1
INT 21H
;check for CR
CMP AL,0DH
JE NEXT ;CR, get next number
;place digit in EBX
AND AL,0FH
IMUL EBX,10 ;multiply EBX by 10
MOVZX ECX,AL ;move AL to ECX and extend with 0's
ADD EBX,ECX ;add digit
;repeat
JMP L1
;save first number
NEXT: MOV FIRST,EBX
;next line
NEW_LINE
;output prompt
PROMPT
;clear EBX
MOV EBX,0
;read character
L2: MOV AH,1
INT 21H
;check for CR
CMP AL,0DH
JE SUMUP ;CR, go add the numbers
;place digit in EBX
AND AL,0FH
IMUL EBX,10
MOVZX ECX,AL
ADD EBX,ECX
;repeat
JMP L2
;sum up
SUMUP:ADD EBX,FIRST
;convert to decimal
MOV EAX,EBX
MOV EBX,10
MOV CX,0
L3: MOV EDX,0
DIV EBX ;divide EBX into EDX:EAX
PUSH DX ;save remainder (a decimal digit)
INC CX ;increment count
CMP EAX,0 ;done?
JNE L3 ;no, repeat (quotient is unsigned)
;next line
NEW_LINE
;output
MOV AH,2 ;output function
L4: POP DX ;get digit
OR DL,30H ;convert to ASCII
INT 21H ;output
LOOP L4
MOV AH,4CH ;return
INT 21H ;to DOS
MAIN ENDP
C_SEG ENDS
END MAIN
We have used an 80386 instruction MOVZX, which moves a source operand into a bigger size register and zero-extends the leading bits. To assemble 386 instructions, we need to use the .386 directive. However, when the .386 directive is used, the assembler assumes that the operating mode is 32-bit mode. When we are running programs under the real address mode of the 386, we have to specify a default mode of 16 bits. This can only be done with the full segment directives. A segment can be specified with a use type. For example, to specify a 16-bit use type we wrote D_SEG SEGMENT USE16 in the program. The use type USE16 specifies that both operand size and offset address size are 16 bits, and all our segments have the use type USE16.
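The arithmetic performed by PGM20_4 can be followed in a short Python model: digits are accumulated as EBX = EBX x 10 + digit, the two values are added in 32 bits, and the sum is converted to decimal by repeated division by 10 with the remainders pushed and then popped. This is a sketch of the algorithm only, with names chosen for illustration; like the program, it wraps around silently when a value exceeds 32 bits.

```python
MASK32 = 0xFFFFFFFF

def read_number(text):
    """Mirror loops L1/L2: build a 32-bit value from ASCII digits."""
    ebx = 0
    for ch in text:
        digit = ord(ch) & 0x0F            # AND AL,0FH
        ebx = (ebx * 10 + digit) & MASK32 # IMUL EBX,10 / MOVZX / ADD EBX,ECX
    return ebx

def to_decimal(eax):
    """Mirror loops L3/L4: divide by 10, push remainders, pop and print."""
    stack = []
    while True:
        eax, remainder = divmod(eax, 10)  # DIV EBX with EDX = 0
        stack.append(remainder)           # PUSH DX
        if eax == 0:
            break
    return "".join(chr(d | 0x30) for d in reversed(stack))

def add_numbers(first, second):
    total = (read_number(first) + read_number(second)) & MASK32
    return to_decimal(total)

print(add_numbers("70000", "80000"))      # 150000
print(add_numbers("4294967295", "1"))     # 0  (sum wraps past 32 bits)
```

The first example is a sum that would not fit in a 16-bit register, which is the point of using EBX and EAX in the program.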
-
The 80286 can operate in either real address mode or protected mode.
-
In real address mode, the 80286 operates like an 8086.
-
The 80286 uses 24-bit addresses, allowing it a total memory space of 16 MB. However, in real address mode, it can only access 1 MB.
-
In protected mode, the 80286 can use 1 gigabyte (GB) of virtual memory.
-
Windows 3 has three modes of operations: real mode, standard mode, and 386 enhanced mode.
-
OS/2 version 1 supports the 80286 protected mode and version 2 supports the 80386 protected mode.
-
System services in OS/2 are coded as far calls.
-
The 80486 is like an 80386 with a floating-point unit.
-
The 80386/80486 operates as an 80286 in real address mode.
-
In protected mode, the 80386/80486 supports paging and a virtual 8086 mode. It can also execute all 80286 instructions.
dynamic linking
Linking modules at the time of loading
extended instruction set
Set of new instructions first used by the 80186 and 80188 processors; can also be executed by the 80286, 80386, and 80486 processors
extended memory
Memory above 1 MB
global descriptor table
A segment descriptor table that contains information about the segments that can be accessed by all tasks
graphical user interface, GUI
A user interface that uses pointers to commands, and special graphics symbols
icon
A graphical element representing a command or program
local descriptor table register (LDTR)
A register that holds the address of a local descriptor table
menu
A set of command selections displayed in a window
mouse
A pointing device used to control cursor position on a display screen
multitasking
A technique that allows more than one program (task) to run concurrently
privilege level
A measure of a program's ability to execute special commands
process
A program execution
protected (address) mode
A mode of operation by the advanced processors that protects the memory used by one program from other concurrent programs
real (address) mode
The mode of operation in which an address contained in an instruction corresponds to a physical address
segment descriptor
An entry in a descriptor table that describes a program segment
segment descriptor table
A table that contains segment descriptors; there are two kinds of segment descriptor tables, global descriptor table and local descriptor table
segment selector
The value of a segment register when the processor is running under protected mode; it identifies a segment in a descriptor table
task
A program unit with its own segments
thread
A subtask of a process
virtual address
An address contained in an instruction that does not correspond to any particular physical address
virtual memory
Disk memory used by the operating system to store segments of a task that are not needed currently
window
A rectangular area on the screen
New Instructions
BOUND, ENTER, INS, INSB, LEAVE, MOVZX, OUTS, OUTSB, OUTSW, POPA, PUSHA
New Pseudo-ops
.286, .386
Exercises
-
Write a procedure for OS/2 that will input a string and then echo the string ten times on ten different lines.
-
Use 80386 instructions to multiply two 32-bit numbers.
-
Use the 386 instructions given in Appendix I to write a procedure that outputs the position of the leftmost set bit in the register BX.
Programming Exercises
- Modify program PGM20_4.ASM so that it will output the sum of two signed double-precision numbers.
The IBM PC uses an extended set of ASCII characters for its screen display. Table A.1 shows the ASCII characters. The control characters BS (backspace), HT (tab), CR (carriage return), ESC (escape), SP (space) correspond to the keys Backspace, Tab, Enter, Esc, and space bar; LF (line feed) advances the cursor to the next line, BEL (bell) sounds the beeper, and FF (form feed) advances the printer to the next page.
Table A.2 shows the extended set of 256 display characters. When a display code is written to the active page of the display memory, the corresponding character shows up on the screen. To write to the display memory, we can use INT 10h functions 9h, 0Ah, 0Eh, and 13h. Functions 9h and 0Ah write all values to the display memory. Functions 0Eh and 13h recognize the control character codes 07h (bell), 08h (backspace), 0Ah (line feed), and 0Dh (carriage return) and perform the control functions instead of writing these codes to the display memory.
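The difference between the two groups of functions can be seen by sending the same code, 0Dh, through function 9h and then through function 0Eh. The sketch below assumes the display is in an 80 x 25 text mode with page 0 active; the attribute value 07h is also an assumption.

```asm
.MODEL SMALL
.STACK 100H
.CODE
MAIN    PROC
;function 9h writes the code itself to display memory
        MOV     AH, 9               ; write character and attribute
        MOV     AL, 0DH             ; shown as a display character, not treated as CR
        MOV     BH, 0               ; page 0 (assumed active)
        MOV     BL, 07H             ; attribute (assumed: normal video)
        MOV     CX, 1               ; write one character
        INT     10H
;function 0Eh performs the control function instead
        MOV     AH, 0EH             ; write character in teletype mode
        MOV     AL, 0DH             ; carriage return: cursor moves to start of line
        MOV     BH, 0
        INT     10H
;return to DOS
        MOV     AH, 4CH
        INT     21H
MAIN    ENDP
        END     MAIN
```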
Table A.1 ASCII Code
| DEC | HEX | CHAR | DEC | HEX | CHAR | DEC | HEX | CHAR | DEC | HEX | CHAR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 00 | | 32 | 20 | (SP) | 64 | 40 | @ | 96 | 60 | ` |
| 1 | 01 | | 33 | 21 | ! | 65 | 41 | A | 97 | 61 | a |
| 2 | 02 | | 34 | 22 | " | 66 | 42 | B | 98 | 62 | b |
| 3 | 03 | | 35 | 23 | # | 67 | 43 | C | 99 | 63 | c |
| 4 | 04 | | 36 | 24 | $ | 68 | 44 | D | 100 | 64 | d |
| 5 | 05 | | 37 | 25 | % | 69 | 45 | E | 101 | 65 | e |
| 6 | 06 | | 38 | 26 | & | 70 | 46 | F | 102 | 66 | f |
| 7 | 07 | (BEL) | 39 | 27 | ' | 71 | 47 | G | 103 | 67 | g |
| 8 | 08 | (BS) | 40 | 28 | ( | 72 | 48 | H | 104 | 68 | h |
| 9 | 09 | (HT) | 41 | 29 | ) | 73 | 49 | I | 105 | 69 | i |
| 10 | 0A | (LF) | 42 | 2A | * | 74 | 4A | J | 106 | 6A | j |
| 11 | 0B | | 43 | 2B | + | 75 | 4B | K | 107 | 6B | k |
| 12 | 0C | (FF) | 44 | 2C | , | 76 | 4C | L | 108 | 6C | l |
| 13 | 0D | (CR) | 45 | 2D | - | 77 | 4D | M | 109 | 6D | m |
| 14 | 0E | | 46 | 2E | . | 78 | 4E | N | 110 | 6E | n |
| 15 | 0F | | 47 | 2F | / | 79 | 4F | O | 111 | 6F | o |
| 16 | 10 | | 48 | 30 | 0 | 80 | 50 | P | 112 | 70 | p |
| 17 | 11 | | 49 | 31 | 1 | 81 | 51 | Q | 113 | 71 | q |
| 18 | 12 | | 50 | 32 | 2 | 82 | 52 | R | 114 | 72 | r |
| 19 | 13 | | 51 | 33 | 3 | 83 | 53 | S | 115 | 73 | s |
| 20 | 14 | | 52 | 34 | 4 | 84 | 54 | T | 116 | 74 | t |
| 21 | 15 | | 53 | 35 | 5 | 85 | 55 | U | 117 | 75 | u |
| 22 | 16 | | 54 | 36 | 6 | 86 | 56 | V | 118 | 76 | v |
| 23 | 17 | | 55 | 37 | 7 | 87 | 57 | W | 119 | 77 | w |
| 24 | 18 | | 56 | 38 | 8 | 88 | 58 | X | 120 | 78 | x |
| 25 | 19 | | 57 | 39 | 9 | 89 | 59 | Y | 121 | 79 | y |
| 26 | 1A | | 58 | 3A | : | 90 | 5A | Z | 122 | 7A | z |
| 27 | 1B | (ESC) | 59 | 3B | ; | 91 | 5B | [ | 123 | 7B | { |
| 28 | 1C | | 60 | 3C | < | 92 | 5C | \ | 124 | 7C | \| |
| 29 | 1D | | 61 | 3D | = | 93 | 5D | ] | 125 | 7D | } |
| 30 | 1E | | 62 | 3E | > | 94 | 5E | ^ | 126 | 7E | ~ |
| 31 | 1F | | 63 | 3F | ? | 95 | 5F | _ | 127 | 7F | |
Blank spaces indicate control characters that are not used on the IBM PC.
Table A.2. IBM Extended Character Set
In this appendix, we give some common DOS commands.
Note: in the following, two special characters can be used within a file name or extension. The ? character used in any position indicates that any character can occupy that position in the file name or extension. The * character used in any position indicates that any character can occupy that position and all remaining positions in the file name or extension.
BACKUP
Creates a backup of disk files.
Example: BACKUP C: A:
Copies the files in the current directory of drive C to a backup on the disk in drive A.
CLS (Clear Screen)
Clears the display screen and moves the cursor to the upper left corner.
Example: CLS
COPY
Copies files from one disk and directory to another.
Example 1: COPY A:FILE1.TXT B:
Copies the file FILE1.TXT from drive A to drive B. The current drive need not be specified in the command. It is also possible to give the copy a different name.
Example 2: COPY FILE1.TXT B:FILE2.TXT
Copies FILE1.TXT from the disk in the current drive to FILE2.TXT on the disk in drive B.
Example 3: COPY A:*.* B:
Copies all files from drive A to drive B.
DATE
Changes the date known to the system. The date is recorded in the directory entry of any files you create. The format is mm-dd-yy.
Example: DATE 07-14-90
DIR
Lists the directory entries.
Example 1: DIR
Lists all directory entries in the current drive. Each entry has a file name, size, and date. The entries in a different directory or different drive can also be listed by specifying the name of the drive or directory.
Example 2: DIR C*.*
Lists all directory entries of files that begin with C and have any extension.
ERASE
Erases a file.
Example 1: ERASE FILE1.TXT
Erases the file called FILE1.TXT from the current drive and directory.
Example 2: ERASE *.OBJ
Erases all files with an .OBJ extension in the current drive.
FORMAT
Initializes a disk.
Example: FORMAT A:
Formats the disk in drive A. Caution: formatting a disk destroys any previous contents of the disk. A new disk must be formatted before it can be used.
PRINT
Prints files on the printer.
Example: PRINT A:MYFILE.TXT
Prints the file called MYFILE.TXT in drive A.
REN
Changes the name of a file.
Example: REN FILE1.TXT MYFILE.TXT
Renames the file FILE1.TXT to MYFILE.TXT.
RESTORE
Restores files from a backup disk.
Example: RESTORE A: C:
Copies the backup files from disk A to disk C.
TIME
Changes the time known to the system. The time is recorded in the directory entry of any files you create. The format is hh:mm:ss. The range of hours is 0-23.
Example: TIME 16:47:00
TYPE
Displays the contents of a file on the display screen.
Example: TYPE MYFILE.TXT
Displays the file called MYFILE.TXT.
DOS versions 2.1 and later provide the capability of placing related disk files in their own directories.
When a disk is formatted, a single directory called the root directory is created. It can hold up to 112 files on a double-sided, double-density 5¼-inch floppy disk.
The root directory can contain the names of other directories called subdirectories. These subdirectories are treated just like ordinary files; they have names of 1- 8 characters and an optional one- to three- character extension.
To illustrate the following commands, we'll use the following tree- structured directory as an example.
Here, PROGS is a subdirectory of the root directory. PRO1 and PRO2 are subdirectories of PROGS. P1A.EXE is a file in PRO1.
A path to a file consists of a sequence of subdirectory names separated by backslashes (\) and ending with the file name. If the sequence begins with a backslash, then the path begins at the root directory. If not, it begins with the current directory.
CHDIR (or CD)
Changes the current directory.
Example 1: CD\
Makes the root directory the current directory of the logged drive.
Example 2: CD\PROGS
Makes PROGS the current directory of the logged drive.
Example 3: CD PRO1
After example 2, makes PRO1 the current directory.
Example 4: CD\PRO1
DOS would reply "invalid directory" because PRO1 is not a subdirectory of the root directory.
Example 5: CD
This command causes the path to the current directory to be displayed. After example 3, if C is the logged drive, DOS would respond with C:\PROGS\PRO1.
MKDIR (or MD)
Creates a subdirectory on the specified disk.
As examples, we'll create the preceding tree structure on the disk in drive C:
C>CD\
C>MD\PROGS
C>MD\PROGS\PRO1
C>MD\PROGS\PRO2
RMDIR (or RD)
Removes a subdirectory from a disk. The subdirectory must be empty. The last directory in a specified path is the one removed.
As examples, we'll erase file P1A.EXE and remove all the preceding directories from the disk in drive C:
C>ERASE \PROGS\PRO1\P1A.EXE
C>RD\PROGS\PRO1
C>RD\PROGS\PRO2
C>RD\PROGS
C.1 Introduction
In this appendix, we show some of the common BIOS and DOS interrupt calls. We begin with interrupt 10h; interrupts 0 to 0Fh are not normally used by application programs, and their names are given in Table C.1.
Table C.1 Interrupts 0 to 0Fh
0h Divide by zero
1h Single step
2h NMI
3h Breakpoint
4h Overflow
5h Print screen
6h Reserved
7h Reserved
8h Timer tick
9h Keyboard
0Ah Reserved
0Bh Serial communications (COM2)
0Ch Serial communications (COM1)
0Dh Fixed disk
0Eh Floppy disk
0Fh Parallel printer
C.2 BIOS Interrupts
Interrupt 10h: Video
Function 0h:
Select Display Mode
Selects video display mode.
Input: AH = 0h
AL = video mode
Output: none
Function 1h:
Change Cursor Size
Selects the starting and ending lines for the cursor.
Input: AH = 1h
CH (bits 0-4) = starting line for cursor
CL (bits 0-4) = ending line for cursor
Output: none
Function 2h:
Positions the cursor.
Input: AH = 2h
BH = page
DH = row
DL = column
Output: none
Function 3h:
Obtains the current position and size of the cursor.
Input: AH = 3h
BH = page
Output: CH = starting line for cursor
CL = ending line for cursor
DH = row
DL = column
Function 5h:
Selects the active display page.
Input: AH = 5h
AL = page
Output: none
Function 6h:
Scroll Window Up
Scrolls the entire screen or a window up by a specified number of lines.
Input: AH = 6h
AL = number of lines to scroll
(if zero, entire window is blanked)
BH = attribute for blanked lines
CH,CL = row, column of upper left corner of window
DH,DL = row, column of lower right corner of window
Output: none
Function 7h:
Scroll Window Down
Scrolls the entire screen or a window down by a specified number of lines.
Input: AH = 7h
AL = number of lines to scroll
(if zero, entire window is blanked)
BH = attribute for blanked lines
CH,CL = row, column of upper left corner of window
DH,DL = row, column of lower right corner of window
Output: none
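A common use of function 6h is to clear the screen by scrolling zero lines over the whole display. The sketch below assumes an 80 x 25 text mode (rows 0-24, columns 0-79) and uses 07h as the attribute for the blanked lines; both values are assumptions.

```asm
.MODEL SMALL
.STACK 100H
.CODE
MAIN    PROC
;blank the entire screen
        MOV     AH, 6               ; scroll window up
        MOV     AL, 0               ; zero lines: entire window is blanked
        MOV     BH, 07H             ; attribute for blanked lines (assumed)
        MOV     CH, 0               ; upper left row
        MOV     CL, 0               ; upper left column
        MOV     DH, 24              ; lower right row (assumes 25 rows)
        MOV     DL, 79              ; lower right column (assumes 80 columns)
        INT     10H
;move the cursor to the upper left corner
        MOV     AH, 2               ; position cursor
        MOV     BH, 0               ; page 0
        MOV     DH, 0               ; row 0
        MOV     DL, 0               ; column 0
        INT     10H
;return to DOS
        MOV     AH, 4CH
        INT     21H
MAIN    ENDP
        END     MAIN
```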
Function 8h:
Obtains the ASCII character and its attribute at the cursor position.
Input: AH = 8h
BH = page
Output: AH = attribute
AL = character
Function 9h:
Writes an ASCII character and its attribute at the cursor position.
Input: AH = 9h
AL = character
BH = page
BL = attribute (text mode) or color (graphics mode)
CX = count of characters to write
Output: none
Function 0Ah:
Writes an ASCII character at the cursor position. The character receives the attribute of the previous character at that position.
Input: AH = 0Ah
AL = character
BH = page
CX = count of characters to write
Output: none
Function 0Bh:
Set Palette, Background, or Border
Selects a palette, background color, or border color.
Input: To select the background color and border color
AH = 0Bh
BH = 0
BL = color
To select palette (320 x 200 four-color mode)
AH = 0Bh
BH = 1
BL = palette
Output: none
Function 0Ch:
Write Graphics Pixel
Input: AH = 0Ch
AL = pixel value
BH = page
CX = column
DX = row
Output: none
Function 0Dh:
Read Graphics Pixel
Obtains a pixel value.
Input: AH = 0Dh
BH = page
CX = column
DX = row
Output: AL = pixel value
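As an illustration, the following sketch writes one pixel and reads it back. It assumes a CGA-compatible adapter and uses mode 4 (320 x 200 four-color graphics) and mode 3 (80 x 25 color text); the pixel position and value are arbitrary choices.

```asm
.MODEL SMALL
.STACK 100H
.CODE
MAIN    PROC
;select a graphics mode
        MOV     AH, 0               ; select display mode
        MOV     AL, 4               ; 320 x 200 four-color graphics (assumed adapter)
        INT     10H
;write a pixel near the middle of the screen
        MOV     AH, 0CH             ; write graphics pixel
        MOV     AL, 2               ; pixel value (arbitrary)
        MOV     BH, 0               ; page 0
        MOV     CX, 160             ; column
        MOV     DX, 100             ; row
        INT     10H
;read the same pixel back
        MOV     AH, 0DH             ; read graphics pixel
        MOV     BH, 0
        MOV     CX, 160
        MOV     DX, 100
        INT     10H                 ; AL = pixel value at (column 160, row 100)
;wait for a key so the pixel can be seen
        MOV     AH, 0
        INT     16H
;restore text mode and return to DOS
        MOV     AH, 0
        MOV     AL, 3               ; 80 x 25 color text
        INT     10H
        MOV     AH, 4CH
        INT     21H
MAIN    ENDP
        END     MAIN
```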
Function 0Eh:
Write Character in Teletype Mode
Writes an ASCII character at the cursor position, then increments cursor position.
Input: AH = 0Eh
AL = character
BH = page
BL = color (graphics mode)
Output: none
Note: the attribute of the character cannot be specified.
Function 0Fh:
Get Video Mode
Obtains current display mode.
Input: AH = 0Fh
Output: AH = number of character columns
AL = display mode
BH = active display page
Function 10h, Subfunction 10h:
Set Color Register
Sets individual VGA color register.
Input: AH = 10h
AL = 10h
BX = color register
CH = green value
CL = blue value
DH = red value
Output: none
Function 10h, Subfunction 12h:
Set Block of Color Registers
Sets a group of VGA color registers.
Input: AH = 10h
AL = 12h
BX = first color register
CX = number of color registers
ES:DX = segment:offset of color table
Output: none
Note: the table consists of a group of three- byte entries corresponding to red, green, and blue values for each color register.
Function 10h, Subfunction 15h:
Obtains the red, green, and blue values of a VGA color register.
Input: AH = 10h
AL = 15h
BX = color register
Output: CH = green value
CL = blue value
DH = red value
Function 10h, Subfunction 17h:
Obtains the red, green, and blue values of a group of VGA color registers.
Input: AH = 10h
AL = 17h
BX = first color register
CX = number of color registers
ES:DX = segment:offset of buffer to receive color list
Output: ES:DX = segment:offset of buffer
Note: the color list consists of a group of three- byte entries corresponding to red, green, and blue values for each color register.
Interrupt 11h: Get Equipment Configuration
Obtains the equipment list code word.
Input: none
Output: AX = equipment list code word
(bits 14-15 = number of printers installed;
bit 8 is reserved;
bit 2 is used by PS/2)
Interrupt 12h: Get Conventional Memory Size
Returns the amount of conventional memory.
Input: none
Output: AX = memory size (in KB)
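The memory size call takes no input, so it is easy to try. The sketch below gets the size with INT 12h and displays it in decimal using the DOS display output function (INT 21h, function 2h); the labels NEXT_DIGIT and PRINT_DIGIT are names chosen for illustration.

```asm
.MODEL SMALL
.STACK 100H
.CODE
MAIN    PROC
        INT     12H                 ; AX = conventional memory size in KB
;convert AX to decimal digits, pushing them on the stack
        MOV     BX, 10              ; divisor
        XOR     CX, CX              ; CX counts the digits
NEXT_DIGIT:
        XOR     DX, DX              ; DX:AX = dividend
        DIV     BX                  ; AX = quotient, DX = remainder (0-9)
        PUSH    DX                  ; save digit
        INC     CX
        OR      AX, AX              ; quotient zero?
        JNZ     NEXT_DIGIT          ; no, get next digit
;pop the digits and display them, most significant first
PRINT_DIGIT:
        POP     DX
        ADD     DL, '0'             ; convert digit to ASCII
        MOV     AH, 2               ; display output
        INT     21H
        LOOP    PRINT_DIGIT
;return to DOS
        MOV     AH, 4CH
        INT     21H
MAIN    ENDP
        END     MAIN
```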
Interrupt 13h: Disk I/O
Function 2h:
Read Sector
Reads one or more sectors.
Input: AH = 2h
AL = number of sectors
CH = cylinder
CL = sector
DH = head
DL = drive (0-7Fh = floppy disk, 80h-FFh = fixed disk)
ES:BX = segment:offset of buffer
Output:
If function successful
CF = clear
AH = 0
AL = number of sectors transferred
If function unsuccessful
CF = set
AH = error status
Function 3h:
Write Sector
Writes one or more sectors.
Input: AH = 3h
AL = number of sectors
CH = cylinder
CL = sector
DH = head
DL = drive (0- 7Fh = floppy disk, 80h- FFh = fixed disk).
ES:BX = segment:offset of buffer
Output: If function successful
CF = clear
AH = 0
AL = number of sectors transferred
If function unsuccessful
CF = set
AH = error status
Interrupt 15h: Cassette I/O and Advanced Features for AT, PS/2
Function 87h:
Move Extended Memory Block
Transfers data between conventional memory and extended memory.
Input: AH = 87h
CX = number of words to move
ES:SI = segment:offset of Global Descriptor Table
Output: If function successful
CF = clear
AH = 0
If function unsuccessful
CF = set
AH = error status
Function 88h:
Get Extended Memory Size
Obtains amount of extended memory.
Input: AH = 88h
Output: AX = extended memory size (in KB)
Interrupt 16h: Keyboard
Function 0h:
Read Character from Keyboard
Input: AH = 0h
Output: AH = keyboard scan code
AL = ASCII character
Function 2h:
Get Keyboard Flags
Obtains key flags that describe the status of the function keys.
Input: AH = 2h
Output: AL = flags
Bit If Set
7 Insert on
6 Caps Lock on
5 Num Lock on
4 Scroll Lock on
3 Alt key is down
2 Ctrl key is down
1 left shift key is down
0 right shift key is down
Function 10h:
Read Character from Enhanced Keyboard
Input: AH = 10h
Output: AH = keyboard scan code
AL = ASCII character
Note: this function can be used to return scan codes for control keys such as F11 and F12.
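The flag byte returned by function 2h is tested one bit at a time. As a minimal sketch, the program below checks bit 6 (mask 40h) to report the Caps Lock state; the message names are chosen for illustration.

```asm
.MODEL SMALL
.STACK 100H
.DATA
ON_MSG  DB      'Caps Lock is on$'
OFF_MSG DB      'Caps Lock is off$'
.CODE
MAIN    PROC
        MOV     AX, @DATA
        MOV     DS, AX              ; initialize DS
        MOV     AH, 2               ; get keyboard flags
        INT     16H                 ; AL = flags
        LEA     DX, OFF_MSG         ; assume Caps Lock is off
        TEST    AL, 40H             ; bit 6 set?
        JZ      SHOW                ; no, keep the "off" message
        LEA     DX, ON_MSG          ; yes, use the "on" message
SHOW:
        MOV     AH, 9               ; DOS print string function
        INT     21H
;return to DOS
        MOV     AH, 4CH
        INT     21H
MAIN    ENDP
        END     MAIN
```

The other flags can be tested the same way with the masks 80h, 20h, 10h, 08h, 04h, 02h, and 01h for bits 7, 5, 4, 3, 2, 1, and 0.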
Interrupt 17h: Printer
Function 0h:
Write Character to Printer
Input: AH = 0
AL = character
DX = printer number
Output: AH = status
Bit If Set
7 printer not busy
6 printer acknowledge
5 out of paper
4 printer selected
3 I/O error
2 unused
1 unused
0 printer timed out
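C.3 DOS Interrupts
Interrupt 21h: DOS Function Calls
Function 0h:
Terminates the execution of a program.
Input: AH = 0h
CS = segment of the program segment prefix (PSP)
Output: none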
Function 1h:
Keyboard Input
Waits for a character to be read at the standard input device (unless one is ready), then echoes the character to the standard output device and returns the ASCII code in AL.
Input: AH = 1h
Output: AL = character from the standard input device
Function 2h:
Display Output
Outputs the character in DL to the standard output device.
Input: AH = 2h
DL = character
Output: none
Function 5h:
Printer Output
Outputs the character in DL to the standard printer device.
Input: AH = 5h
DL = character
Output: none
Function 9h:
Print String
Outputs the characters in the print string to the standard output device.
Input: AH = 9h
DS:DX = pointer to the character string ending with '$'
Output: none
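The three character functions above are often used together. The following sketch prompts for a key with function 9h, reads it with function 1h, and displays it again with function 2h; the message text and names are chosen for illustration.

```asm
.MODEL SMALL
.STACK 100H
.DATA
PROMPT  DB      'Type a character: $'
REPLY   DB      0DH, 0AH, 'You typed: $'
.CODE
MAIN    PROC
        MOV     AX, @DATA
        MOV     DS, AX              ; initialize DS
;display the prompt
        LEA     DX, PROMPT
        MOV     AH, 9               ; print string
        INT     21H
;read a character (with echo)
        MOV     AH, 1               ; keyboard input
        INT     21H                 ; AL = character
        MOV     BL, AL              ; save it
;display the reply on a new line
        LEA     DX, REPLY
        MOV     AH, 9
        INT     21H
        MOV     DL, BL              ; character that was typed
        MOV     AH, 2               ; display output
        INT     21H
;return to DOS
        MOV     AH, 4CH
        INT     21H
MAIN    ENDP
        END     MAIN
```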
Function 2Ah:
Get Date
Returns the day of the week, year, month and date.
Input: AH = 2Ah
Output: AL = day of the week (0=SUN, 6=SAT)
CX = year (1980-2099)
DH = month (1-12)
DL = day (1-31)
Function 2Bh:
Set Date
Sets the date.
Input: AH = 2Bh
CX = year (1980-2099)
DH = month (1-12)
DL = day (1-31)
Output: AL = 00h if the date is valid
FFh if the date is not valid
Function 2Ch:
Returns the time: hours, minutes, seconds and hundredths of seconds.
Input: AH = 2Ch
Output: CH = hours (0-23)
CL = minutes (0-59)
DH = seconds (0-59)
DL = hundredths (0-99)
Function 2Dh:
Set Time
Sets the time.
Input: AH = 2Dh
CH = hours (0-23)
CL = minutes (0-59)
DH = seconds (0-59)
DL = hundredths (0-99)
Output: AL = 00h if the time is valid
FFh if the time is not valid
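As an example of using the time function, the sketch below reads the time with function 2Ch and displays it as hh:mm. PUT2 is a hypothetical helper written for this example; it relies on AAM to split a value in the range 0-99 into two decimal digits.

```asm
.MODEL SMALL
.STACK 100H
.CODE
MAIN    PROC
        MOV     AH, 2CH             ; get time
        INT     21H                 ; CH = hours, CL = minutes
        MOV     AL, CH
        CALL    PUT2                ; display hours
        MOV     DL, ':'
        MOV     AH, 2               ; display output
        INT     21H
        MOV     AL, CL
        CALL    PUT2                ; display minutes
;return to DOS
        MOV     AH, 4CH
        INT     21H
MAIN    ENDP
;PUT2 (hypothetical helper): displays AL (0-99) as two decimal digits
;destroys AX, BX, DX
PUT2    PROC
        AAM                         ; AH = AL / 10, AL = AL mod 10
        ADD     AX, 3030H           ; convert both digits to ASCII
        MOV     BX, AX              ; BH = tens digit, BL = units digit
        MOV     DL, BH
        MOV     AH, 2
        INT     21H                 ; display tens digit
        MOV     DL, BL
        MOV     AH, 2
        INT     21H                 ; display units digit
        RET
PUT2    ENDP
        END     MAIN
```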
Function 30h:
Get DOS Version Number
Returns the DOS version number.
Input: AH = 30h
Output: BX = 0000H
CX = 0000H
AL = major version number
AH = minor version number
Function 31h:
Terminate Process and Remain Resident
Terminates the process and leaves it resident in memory.
Input: AH = 31h
AL = return code
DX = memory size in paragraphs
Output: none
Function 33h:
Ctrl-break Check
Sets or gets the state of Ctrl-break checking.
Input: AH = 33h
AL = 00h to request current state
01h to set the current state
DL = 00h to set current state OFF
01h to set current state ON
Output: DL = the current state (00h=OFF, 01h=ON)
Function 35h:
Get Vector
Obtains the address in an interrupt vector.
Input: AH = 35h
AL = interrupt number
Output: ES:BX = pointer to the interrupt handling routine
Function 36h:
Returns the disk free space (available clusters, clusters/drive, bytes/sector).
Input: AH = 36h
DL = drive (0=default, 1=A)
Output: BX = available clusters
DX = clusters/drive
CX = bytes/sector
AX = FFFFh if the drive in DL is invalid,
otherwise the number of sectors per cluster
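The values returned by function 36h can be combined to give the free space in bytes: sectors per cluster x bytes per sector x available clusters. The sketch below stores the 32-bit result in two words, where it can be inspected with a debugger. The variable names are chosen for illustration, and the code assumes that the number of bytes per cluster fits in 16 bits.

```asm
.MODEL SMALL
.STACK 100H
.DATA
FREE_LO DW      ?                   ; low word of free bytes
FREE_HI DW      ?                   ; high word of free bytes
.CODE
MAIN    PROC
        MOV     AX, @DATA
        MOV     DS, AX              ; initialize DS
        MOV     AH, 36H             ; get disk free space
        MOV     DL, 0               ; default drive
        INT     21H
        CMP     AX, 0FFFFH          ; invalid drive?
        JE      DONE                ; yes, nothing to compute
        MUL     CX                  ; AX = bytes per cluster (assumed < 64 KB)
        MUL     BX                  ; DX:AX = bytes per cluster x free clusters
        MOV     FREE_LO, AX
        MOV     FREE_HI, DX
DONE:
        MOV     AH, 4CH
        INT     21H                 ; return to DOS
MAIN    ENDP
        END     MAIN
```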
Function 39h:
Creates the specified directory.
Input: AH = 39h
DS:DX = pointer to an ASCIIZ string
Output: AX = error codes if carry flag is set
Function 3Ah:
Removes the specified directory.
Input: AH = 3Ah
DS:DX = pointer to an ASCIIZ string
Output: AX = error codes if carry flag is set
Function 3Bh:
Changes the current directory to the specified directory.
Input: AH = 3Bh
DS:DX = pointer to an ASCIIZ string
Output: AX = error codes if carry flag is set
Function 3Ch:
Create a File (CREAT)
Creates a new file or truncates an old file to zero length in preparation for writing.
Input: AH = 3Ch
DS:DX = pointer to an ASCIIZ string
CX = attribute of the file
Output: AX = error codes if carry flag is set
16-bit handle if carry flag not set
Function 3Dh:
Open a File
Opens the specified file.
Input: AH = 3Dh
DS:DX = pointer to an ASCIIZ path name
AL = access code
Output: AX = error codes if carry flag is set
16-bit handle if carry flag not set
Function 3Eh:
Close a File Handle
Closes the specified file handle.
Input: AH = 3Eh
BX = file handle returned by open or create
Output: AX = error codes if carry flag is set
none if carry flag not set
Function 3Fh:
Read from a File or Device
Transfers the specified number of bytes from a file into a buffer location.
Input: AH = 3Fh
BX = file handle
DS:DX = buffer address
CX = number of bytes to be read
Output: AX = number of bytes read
error codes if carry flag set
Function 40h:
Write to a File or Device
Transfers the specified number of bytes from a buffer into a specified file.
Input: AH = 40h
BX = file handle
DS:DX = address of the data to write
CX = number of bytes to write
Output: AX = number of bytes written
error codes if carry flag set
Function 41h:
Delete a File from a Specified Directory (UNLINK)
Removes a directory entry associated with a file name.
Input: AH = 41h
DS:DX = address of an ASCIIZ string
Output: AX = error codes if carry flag set
none if carry flag not set
Function 42h:
Move File Read Write Pointer (LSEEK)
Moves the read/write pointer according to the method specified.
Input: AH = 42h
CX:DX = distance (offset) to move in bytes
AL = method of moving (0,1,2)
BX = file handle
Output: AX = error codes if carry flag set
DX:AX = new pointer location if carry flag not set
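The file functions are normally used in sequence: open, read or write, then close. The sketch below opens a file, reads up to 128 bytes from it, copies them to the screen, and closes the file. The file name MYFILE.TXT and the buffer size are assumptions; access code 0 (read only) and handle 1 (standard output) are standard DOS conventions that are not listed in the tables above.

```asm
.MODEL SMALL
.STACK 100H
.DATA
FNAME   DB      'MYFILE.TXT', 0     ; ASCIIZ file name (hypothetical file)
HANDLE  DW      ?                   ; file handle returned by open
BUFFER  DB      128 DUP (?)         ; data buffer (size is an assumption)
.CODE
MAIN    PROC
        MOV     AX, @DATA
        MOV     DS, AX              ; initialize DS
;open the file for reading
        MOV     AH, 3DH             ; open a file
        MOV     AL, 0               ; access code 0 = read only
        LEA     DX, FNAME
        INT     21H
        JC      DONE                ; carry set: AX = error code
        MOV     HANDLE, AX          ; save the handle
;read up to 128 bytes
        MOV     AH, 3FH             ; read from a file
        MOV     BX, HANDLE
        LEA     DX, BUFFER
        MOV     CX, 128
        INT     21H                 ; AX = number of bytes read
        JC      CLOSE_FILE
        MOV     CX, AX              ; bytes actually read
        JCXZ    CLOSE_FILE          ; nothing to display
;write the bytes to the standard output device
        MOV     AH, 40H             ; write to a file or device
        MOV     BX, 1               ; handle 1 = standard output
        LEA     DX, BUFFER
        INT     21H
CLOSE_FILE:
        MOV     AH, 3EH             ; close a file handle
        MOV     BX, HANDLE
        INT     21H
DONE:
        MOV     AH, 4CH
        INT     21H                 ; return to DOS
MAIN    ENDP
        END     MAIN
```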
Function 47h:
Get Current Directory
Places the full path name (starting from the root directory) of the current directory for the specified drive in the area pointed to by DS:SI.
Input: AH = 47h
DS:SI = pointer to a 64- byte- user memory area
DL = drive number (0=default, 1=A, etc.)
Output: DS:SI = filled out with full path name from the root if carry flag is not set
AX = error codes if carry flag is set
Function 48h:
Allocates the requested number of paragraphs of memory.
Input: AH = 48h
BX = number of paragraphs of memory requested
Output: AX:0 = points to the allocated memory block
AX = error codes if carry flag set
BX = size of the largest block of memory available (in paragraphs) if the allocation fails
Function 49h:
Frees the specified allocated memory.
Input: AH = 49h
ES = segment of the block to be returned
Output: AX = error codes if carry flag set
none if carry flag not set
Function 4Ch:
Terminates the current process and transfers control to the invoking process.
Input: AH = 4Ch
AL = return code
Output: none
Interrupt 25h: Absolute Disk Read
Input: AL = drive number
CX = number of sectors to read
DX = beginning logical sector number
DS:BX = transfer address
Output: If successful CF = 0
If unsuccessful CF = 1 and AX contains error code
Interrupt 26h: Absolute Disk Write
Input: AL = drive number
CX = number of sectors to write
DX = beginning logical sector number
DS:BX = transfer address
Output: If successful CF = 0
If unsuccessful CF = 1 and AX contains error code
Interrupt 27h: Terminate but Stay Resident
Input: DX = offset of beginning of free space,
segment is with respect to PSP.
Output: none
The MASM assembler translates an assembly language source file into a machine language object file. It generates three files, as shown:
The object file contains the machine language translation of the assembly language source code, plus other information needed to produce an executable file.
The list file is a text file that gives assembly language code and the corresponding machine code, a list of names used in the program, error messages, and other statistics. It is helpful in debugging.
The cross-reference file lists names used in the program and line numbers where they appear. It makes large programs easier to follow. As generated, it is not readable; the CREF utility program may be used to convert it to a legible form.
For MASM version 5.0, the most general command line is
MASM options source_file, object_file, list_file, crossref_file
MASM 4.0 has the same command line, except that the options appear last.
The default extension for the object file is .OBJ, for the listing file it is .LST, and for the cross-reference file it is .CRF.
For example, suppose MASM is on a disk in drive C, source file FIRST.ASM is on a disk in drive A, and C is the logged drive. To create object file FIRST.OBJ, listing file FIRST.LST, and cross-reference file FIRST.CRF on drive A, we could type
C>MASM A:FIRST.ASM, A:FIRST.OBJ, A:FIRST.LST, A:FIRST.CRF
A simpler way to get the same result is
C>MASM A:FIRST,A:,A:,A:
A semicolon instead of a comma on the MASM command line tells the assembler not to generate any more files. For example, if we type
C>MASM A:FIRST,A:;
then MASM will generate only FIRST.OBJ. If we type
C>MASM A:FIRST,A:,A:;
then we get FIRST.OBJ and FIRST.LST, but not FIRST.CRF.
It's also possible to let MASM prompt you for the files you want. For example, suppose we want .OBJ and .CRF files only.
C>MASM A:FIRST
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
Object filename [FIRST.OBJ]: A:
Source listing [NUL.LST]:
Cross-reference [NUL.CRF]: A:FIRST
50140 + 234323 Bytes symbol space free
0 Warning Errors
0 Severe Errors
The first response just given means that we accept the name FIRST.OBJ for the object file. The second one means that we don't want a listing file (NUL means no file). The third one means we want a cross-reference file called FIRST.CRF.
The MASM options control the operations of the assembler and the format of the output files. Table D.1 gives a list of some commonly used ones. For a complete list, see the Microsoft Programmer's Guide. Several options may be specified on a command line.
Table D.1 MASM Options
/A Arrange source segments in alphabetical order.
/C Create a cross-reference file.
/D Create pass 1 listing (see below).
/ML Make names case sensitive.
/R Accept 8087 floating-point instructions.
/S Leave source segments in original order.
/W{0|1|2} Set error level display (default = 1); the higher levels also report questionable statements and inefficient code.
/Z Display the lines containing errors.
/ZI Write symbolic information to the object file (use with CODEVIEW).
To show what the MASM output files look like, the following program, PGMD_1.ASM, will be assembled. It swaps the contents of two memory words.
TITLE PGMD_1: SWAP WORDS
.MODEL SMALL
.STACK 100H
.DATA
WORD1   DW      10
WORD2   DW      20
.CODE
MAIN    PROC
        MOV     AX,@DATA
        MOV     DS,AX           ;initialize DS
        MOV     AX,WORD1        ;AX = WORD1
        XCHG    AX,WORD2        ;WORD2 = old WORD1, AX = old WORD2
        MOV     WORD1,AX        ;WORD1 = old WORD2
        MOV     AH,4CH
        INT     21H             ;return to DOS
MAIN    ENDP
        END     MAIN
C>MASM A:PGMD_1,A:,A:,A:
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
47358 + 390893 Bytes symbol space free
0 Warning Errors
0 Severe Errors
The listing file is shown in Figure D.1.
C>TYPE A:PGMD_1.LST
Down the left side of the listing are the line numbers. Next we have a column of offset addresses (in hex), relative to stack, data, and code segments. After that comes the machine code translation (in hex) of the instructions.
MASM makes two passes through the source file. On the first pass, MASM checks for syntax errors and creates a symbol table of names and their relative locations within a segment. To keep track of locations, it uses a location counter. The location counter is reset to 0 at the beginning of a segment. When an instruction is encountered, the location counter is increased by the number of bytes needed for the machine code of the instruction. When a name is encountered, it is entered in the symbol table along with the location counter's value. The symbol table appears near the bottom of the .LST file; in the preceding example, the symbols are MAIN, WORD1, and WORD2. The MASM /D option causes the .LST file to include pass 1 error messages. Whether these are actually errors is determined in pass 2.
On the second pass, MASM completes error checking and machine codes the instructions, except for those instructions that refer to names in other object modules. The .LST file is also created.
The reason MASM needs two passes to assemble a program is that some instructions may refer to names that appear later on in the source file. These instructions can be machine-coded only after their relative locations have been determined from the symbol table.
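A forward reference of this kind can be seen in the short sketch below, which is not one of the book's programs. The label SHOW is used by the JMP instruction before it is defined, so its offset is not in the symbol table when JMP is first encountered; the label and message names are chosen for illustration.

```asm
.MODEL SMALL
.STACK 100H
.DATA
MSG     DB      'DONE$'
.CODE
MAIN    PROC
        MOV     AX, @DATA
        MOV     DS, AX              ; initialize DS
        JMP     SHOW                ; forward reference: SHOW is defined below
        MOV     DL, '?'             ; these three instructions are skipped;
        MOV     AH, 2               ; their sizes still advance the location
        INT     21H                 ; counter and so fix the offset of SHOW
SHOW:                               ; entered in the symbol table on pass 1
        LEA     DX, MSG
        MOV     AH, 9               ; print string
        INT     21H
        MOV     AH, 4CH
        INT     21H                 ; return to DOS
MAIN    ENDP
        END     MAIN
```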
The object file (PGMD_1.OBJ) that MASM creates is not executable. The final addresses of the variables need to be determined by the LINK program (see later description). In the .LST file, these addresses are marked by an "R" (relocatable) symbol (lines 9, 10, 11, 12, 13).
The cross-reference file (here PGMD_1.CRF) contains information on names: where they are defined and the line numbers where they appear in the .LST file. The .CRF file is not printable; the CREF program converts it to a .REF file that has an ASCII format:
C>CREF A:PGMD_1;
Microsoft (R) Cross-Reference Utility Version 5.10
Copyright (C) Microsoft Corp 1981-1985, 1987. All rights reserved.
11 Symbols
The output is the file PGMD_1.REF, which can be printed by using the TYPE command (Figure D.2).
C>TYPE PGMD_1.REF
Microsoft Cross-Reference Version 5.10 Fri Sep 06 01:33:52 1991
PGMD_1: SWAP WORDS
Symbol Cross Reference (# definition, + modification) Cref-1
@CPU 1#
@VERSION 1#
CODE 7
DATA 4
DGROUP 9
MAIN 8# 16 17
STACK 3# 3
WORD1 5# 11 13+
WORD2 6# 12+
_DATA 4#
_TEXT 7#
11 Symbols
D.2 LINK
The job of the LINK program is to link object files (and possibly library files) into a single executable file. To do this, it must resolve references to names used in one module but defined in another. The mechanism for doing this is explained in Chapter 14. LINK must be used even if there is only one object file.
The input to LINK is one or more object and library files, and the output is a run file and an optional loadmap file.
The run file is an executable machine language program. The loadmap file gives the size and relative location of the program segments.
For LINK version 5.0, the most general command line is
LINK options object_file_list, run_file, loadmap_file, library_list
The only option you will be likely to use is /CO, which causes extra information for CODEVIEW to be included.
The object_file_list is a list of object files to be linked. It begins with the name of the object file containing the main program; the other object files usually contain procedures that are called by the main program and by each other. The file names are separated by blanks or "+".
The run_file has an .EXE extension. It is an executable file unless the program is a .COM format program, in which case one more step is needed to produce an executable file. .COM programs are discussed in Chapter 14.
The library_list consists of library files, if any, separated by blanks or "+".
For example, suppose LINK is on a disk in drive C and the files to be linked are in drive A. The main object file is FIRST.OBJ; other object files are SECOND.OBJ and THIRD.OBJ. To create a run file FIRST.EXE and a loadmap file FIRST.MAP, we could type
C>LINK A:FIRST+SECOND+THIRD, A:FIRST, A:FIRST;
or just
C>LINK A:FIRST+SECOND+THIRD, A:, A:;
The semicolon at the end means that there are no library files. As with MASM, it's possible to run LINK interactively:
C>LINK FIRST+SECOND+THIRD
Microsoft (R) Overlay Linker Version 3.64
Copyright (C) Microsoft Corp 1983-1988. All rights reserved.
Run File [FIRST.EXE]:
List File [NUL.MAP]: A:FIRST
Libraries [.LIB]:
The first response means that we accept the name FIRST.EXE for the run file. The second response means we want to call the loadmap file FIRST.MAP. The third response means that there are no library files.
Let's link PGMD_1 above:
C>LINK A:PGMD_1,A:,A:;
Microsoft (R) Overlay Linker Version 3.64
Copyright (C) Microsoft Corp 1983-1988. All rights reserved.
C>
Here is the loadmap file:
C>TYPE A:PGMD_1.MAP
Start Stop Length Name Class
00000H 00012H 00013H _TEXT CODE
00014H 00017H 00004H _DATA DATA
00020H 0011FH 00100H STACK STACK
Origin Group
0001:0 DGROUP
Program entry point at 0000:0000
The file gives the relative size and location of the program segments.
This appendix covers the DEBUG and CODEVIEW debuggers. DEBUG is available on the DOS disk, and CODEVIEW comes with the Microsoft Macro Assembler, version 5.0 or later. DEBUG is a primitive but utilitarian program with a small, easy- to- learn command set. CODEVIEW is a much more sophisticated program that may be used to debug Pascal, BASIC, FORTRAN, C, or assembly language code. The user can simultaneously view source code, registers, flags, and selected variables.
Since most of the DEBUG commands will work in CODEVIEW, you should read the sections on DEBUG even if you will ultimately be using CODEVIEW. Table E.1 summarizes the most useful DEBUG commands. For a complete list, see the DOS user's manual.
To demonstrate the DEBUG commands, we'll use PGM4_2.ASM, which displays "HELLO!" on the screen.
TITLE PGM4_2: PRINT STRING PROGRAM
.MODEL SMALL
.STACK 100H
.DATA
MSG DB 'HELLO!$'
.CODE
MAIN PROC
;initialize DS
    MOV AX,@DATA
    MOV DS,AX      ;initialize DS
;display message
    LEA DX,MSG     ;get message
    MOV AH,9       ;display string function
    INT 21h        ;display message
;return to DOS
    MOV AH,4CH
    INT 21h        ;DOS exit
MAIN ENDP
    END MAIN
After assembling and linking the program, we take it into DEBUG (the user's response appears in boldface).
C>DEBUG PGM4_2.EXE
DEBUG comes back with its "-" command prompt. To view the registers, type "R":
-R
AX=0000 BX=0000 CX=0121 DX=0000 SP=0100 BP=0000 SI=0000 DI=0000
DS=0EFB ES=0EFB SS=0F0B CS=0F1C IP=0000 NV UP DI PL NZ NA PO NC
0F1C:0000 B81B0F MOV AX,0F1B
The display shows the contents of the registers in hex. The third line of the display gives the segment:offset address of the first instruction in the program, along with its machine code and assembly code. The letter pairs at the end of the second line are the current settings of some of the status and control flags. The flags displayed and the symbols DEBUG uses are the following:
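Flag             Clear (0) Symbol   Set (1) Symbol
Overflow Flag    NV                 OV
Direction Flag   UP                 DN
Interrupt Flag   DI                 EI
Sign Flag        PL                 NG
Zero Flag        NZ                 ZR
Auxiliary Flag   NA                 AC
Parity Flag      PO                 PE
Carry Flag       NC                 CY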
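To make the flag symbols concrete, here is a small Python sketch that turns the letter pairs from a register display into flag values. The dictionary holds DEBUG's standard symbol pairs; the function name and the returned dictionary layout are assumptions made for illustration.

```python
# DEBUG's two-letter flag symbols: symbol -> (flag name, bit value).
FLAG_SYMBOLS = {
    "NV": ("OF", 0), "OV": ("OF", 1),
    "UP": ("DF", 0), "DN": ("DF", 1),
    "DI": ("IF", 0), "EI": ("IF", 1),
    "PL": ("SF", 0), "NG": ("SF", 1),
    "NZ": ("ZF", 0), "ZR": ("ZF", 1),
    "NA": ("AF", 0), "AC": ("AF", 1),
    "PO": ("PF", 0), "PE": ("PF", 1),
    "NC": ("CF", 0), "CY": ("CF", 1),
}


def decode_flags(text):
    """Map a string such as 'NV UP DI PL NZ NA PO NC' to {flag: 0 or 1}."""
    flags = {}
    for symbol in text.split():
        name, value = FLAG_SYMBOLS[symbol]
        flags[name] = value
    return flags


if __name__ == "__main__":
    print(decode_flags("NV UP DI PL NZ NA PO NC"))
```

For the display shown earlier, every flag decodes to 0, which is the state DEBUG reports when a program has just been loaded.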
Table E.1 DEBUG Commands
Optional parameters are enclosed in curly brackets. All constants are hexadecimal.

Command                                     Action
D {start {end}} {range}                     Dump bytes in hex format
  D 100                                     Dump 80h bytes starting at DS:100h
  D CS:100 120                              Dump bytes from CS:100h to CS:120h
  D                                         Dump 80h bytes starting at DS:last+1, where last is the last byte displayed
E start {list}                              Enter data in list beginning at start
  E DS:0 A B C                              Enter Ah, Bh, Ch in bytes DS:0, DS:1, DS:2
  E ES:100 1 2 'ABC'                        Enter 1 in ES:100h, 2 in ES:101h, 41h in ES:102h, 42h in ES:103h, 43h in ES:104h
  E 25                                      Enter bytes interactively starting at DS:25h. Space bar moves to next byte, Return terminates
G {=start} {addr1 addr2 ... addrn}          Go (execute) at start, with breakpoints at addr1, addr2, ..., addrn
  G                                         Execute at CS:IP to completion
  G =100                                    Execute at CS:100h to completion
  G 100 300 200                             Execute at CS:IP; stop at the first of breakpoints CS:100h, CS:300h, or CS:200h encountered
  G =100 150                                Execute at CS:100h, breakpoint at CS:150h
L address {drive start_sector end_sector}   Load absolute disk sectors or named program (see N command). Drive specified by number
  L DS:100 0 C 1B                           Load sectors Ch to 1Bh from the disk in drive A at DS:100h
  L 8FD0:0 1 2A 3B                          Load sectors 2Ah to 3Bh from the disk in drive B at address 8FD0:0
  L DS:100                                  Load named file at DS:100h
N filename                                  Set current filename for L and W commands
  N myfile                                  Set load/write name to myfile
Q                                           Quit DEBUG and return to DOS
R {register}                                Display/change contents of register
  R                                         Display registers and flags
  R AX                                      Display AX and change contents if desired
T {=start} {value}                          Trace "value" instructions from start
  T                                         Trace the instruction at CS:IP
  T =100                                    Trace the instruction at CS:100h
  T =100 5                                  Trace 5 instructions starting at CS:100h
  T 4                                       Trace 4 instructions starting at CS:IP
U {start {end}} {range}                     Unassemble data in instruction format
  U 100                                     Unassemble about 32 bytes starting at CS:100h
  U CS:100 110                              Unassemble from CS:100h to CS:110h
  U 200 L 20                                Unassemble 20h bytes starting at CS:200h
  U                                         Unassemble about 32 bytes starting at last+1, where last is the last byte unassembled
W {start}                                   Write the BX:CX bytes to file (see N command)
  W 100                                     Write the BX:CX bytes stored at CS:100h
To change the contents of a register—for example, DX—to 1ABCh, type
-RDX
DX 0000
:1ABC
DEBUG responds by displaying the current content of DX, then displays a colon and waits for the user to enter the new content. We enter 1ABC and press the Enter key. (DEBUG assumes that all numbers the user types are expressed in hex, so no "h" is needed.) To retain the current content of DX, we would just hit the Enter key after the colon.
To verify the change, we can display the registers again.
-R
AX=0000 BX=0000 CX=0121 DX=1ABC SP=0100 BP=0000 SI=0000 DI=0000
DS=0EFB ES=0EFB SS=0F0B CS=0F1C IP=0000 NV UP DI PL NZ NA PO NC
0F1C:0000 B81B0F MOV AX,0F1B
Now let's trace down to the INT 21h.
-T
AX=0F1B BX=0000 CX=0121 DX=1ABC SP=0100 BP=0000 SI=0000 DI=0000
DS=0EFB ES=0EFB SS=0F0B CS=0F1C IP=0003 NV UP DI PL NZ NA PO NC
0F1C:0003 8ED8 MOV DS,AX
-T
AX=0F1B BX=0000 CX=0121 DX=0002 SP=0100 BP=0000 SI=0000 DI=0000
DS=0F1B ES=0EFB SS=0F0B CS=0F1C IP=0009 NV UP DI PL NZ NA PO NC
0F1C:0009 B409 MOV AH,09
Note that DEBUG seemingly "skipped" the instruction LEA DX,MSG. Actually, that instruction was executed (we can tell because DX has new contents). DEBUG occasionally executes an instruction without pausing to display the registers.
-T
AX=091B BX=0000 CX=0121 DX=0002 SP=0100 BP=0000 SI=0000 DI=0000
DS=0F1B ES=0EFB SS=0F0B CS=0F1C IP=000B NV UP DI PL NZ NA PO NC
0F1C:000B CD21 INT 21
If we were to hit "T" again, DEBUG would start to trace INT 21h, which is not what we want.
From the last register display, we see that INT 21h is a two-byte instruction. Since IP is currently 000Bh, the next instruction must be at 000Dh, and we can set up a breakpoint there:
-GD
HELLO!
AX=091B BX=0000 CX=0121 DX=0002 SP=0100 BP=0000 SI=0000 DI=0000
DS=0F1B ES=0EFB SS=0F0B CS=0F1C IP=000D NV UP DI PL NZ NA PO NC
0F1C:000D B44C MOV AH,4C
The INT 21h, function 9, displays "HELLO!" and execution stops at the breakpoint 000Dh. To finish execution, just type "G":
-G
Program terminated normally
This message indicates the program has run to completion. The program must be reloaded to be executed again. So let's leave DEBUG.
To demonstrate the U command, let's reenter DEBUG and use it to list our program:
C>DEBUG PGM4_2.EXE
-U
0F1C:0000 B81B0F    MOV AX,0F1B
0F1C:0003 8ED8      MOV DS,AX
0F1C:0005 8D160200  LEA DX,[0002]
0F1C:0009 B409      MOV AH,09
0F1C:000B CD21      INT 21
0F1C:000D B44C      MOV AH,4C
0F1C:000F CD21      INT 21
0F1C:0011 015BE8    ADD [BP+DI-18],BX
0F1C:0014 3BEE      CMP BP,SI
0F1C:0016 E88AF3    CALL F3A3
0F1C:0019 E97E08    JMP 089A
0F1C:001C 8D1E8E09  LEA BX,[098E]
DEBUG has unassembled about 32 bytes; that is, interpreted the contents of these bytes as instructions. The program ends at 0010h (the last instruction begins at 000Fh), and the rest is DEBUG's interpretation of the garbage that follows as assembly code. To list just our program, we type
-U 0 F
0F1C:0000 B81B0F    MOV AX,0F1B
0F1C:0003 8ED8      MOV DS,AX
0F1C:0005 8D160200  LEA DX,[0002]
0F1C:0009 B409      MOV AH,09
0F1C:000B CD21      INT 21
0F1C:000D B44C      MOV AH,4C
0F1C:000F CD21      INT 21
In the unassembly listing, DEBUG replaces names by the segments or offsets assigned to those names. For example, instead of MOV AX,@DATA we have MOV AX,0F1B. LEA DX,MSG becomes LEA DX,[0002] because 0002h is the offset in segment .DATA assigned to MSG.
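The way names turn into numbers can be seen by decoding the machine code bytes directly. The Python sketch below handles only the five instruction forms that occur in PGM4_2; it is an illustration of how the operand bytes are read (16-bit values are stored low byte first), not a model of DEBUG's unassembler, and all function names are assumptions.

```python
# Machine code of PGM4_2 as shown in the unassembly listing.
PROGRAM = bytes.fromhex("B81B0F 8ED8 8D160200 B409 CD21 B44C CD21")


def word(code, i):
    """Little-endian 16-bit value stored at code[i], code[i + 1]."""
    return code[i] | (code[i + 1] << 8)


def unassemble(code):
    """Decode only the handful of instruction forms used by PGM4_2."""
    lines, ip = [], 0
    while ip < len(code):
        op = code[ip]
        if op == 0xB8:                                 # MOV AX,imm16
            text, size = f"MOV AX,{word(code, ip + 1):04X}", 3
        elif op == 0x8E and code[ip + 1] == 0xD8:      # MOV DS,AX
            text, size = "MOV DS,AX", 2
        elif op == 0x8D and code[ip + 1] == 0x16:      # LEA DX,[disp16]
            text, size = f"LEA DX,[{word(code, ip + 2):04X}]", 4
        elif op == 0xB4:                               # MOV AH,imm8
            text, size = f"MOV AH,{code[ip + 1]:02X}", 2
        elif op == 0xCD:                               # INT imm8
            text, size = f"INT {code[ip + 1]:02X}", 2
        else:
            raise ValueError(f"opcode {op:02X} not handled in this sketch")
        lines.append((ip, text))
        ip += size
    return lines


if __name__ == "__main__":
    for offset, text in unassemble(PROGRAM):
        print(f"{offset:04X}  {text}")
```

The bytes 1B 0F following opcode B8 are read as the word 0F1Bh, which is why the segment number assigned to @DATA appears in that form in the listing.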
To demonstrate the D command, let's dump that part of memory that contains the message "HELLO!". First, we execute the two statements that initialize DS:
-G5
AX=0F1B BX=0000 CX=0121 DX=0000 SP=0100 BP=0000 SI=0000 DI=0000
DS=0F1B ES=0EFB SS=0F0B CS=0F1C IP=0005 NV UP DI PL NZ NA PO NC
0F1C:0005 8D160200 LEA DX,[0002] DS:0002=4548
Now we dump memory starting at DS:0:
-D0
0F1B:0000  21 00 48 45 4C 4C 4F 21-24 C4 02 BB 1E 46 43 D1   !.HELLO!$....FC.
0F1B:0010  E3 D1 E3 8B 87 BC 3D 8B-97 BE 3D 89 86 7C FF 89   ......=...=..|..
0F1B:0020  96 7E FF 05 0C 00 52 50-E8 7D 6A 83 C4 04 50 E8   .~....RP.}j...P.
0F1B:0030  6C FB 83 C4 02 0A C0 75-03 E9 F6 FE C6 06 D9 37   l......u.......7
0F1B:0040  FF 8B 1E 46 43 D1 E3 8B-87 A0 3C A3 60 3E 8B 1E   ...FC.....<.`>..
0F1B:0050  46 43 8A 97 E6 3C 2A E4-A3 5A 3C D1 E3 D1 E3 8B   FC...<*..Z<.....
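Each dump line has three parts: the segment:offset address, sixteen bytes in hex, and the same bytes as characters, with a period standing for anything unprintable. The following Python sketch reproduces that layout for one row; the exact spacing and the function name are assumptions chosen for illustration.

```python
def dump_line(segment, offset, data):
    """Format up to 16 bytes in the style of DEBUG's D command."""
    hex_bytes = [f"{b:02X}" for b in data]
    # A hyphen separates the 8th and 9th bytes of the row.
    left, right = " ".join(hex_bytes[:8]), " ".join(hex_bytes[8:])
    # Printable ASCII is shown as is; everything else becomes a period.
    ascii_col = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in data)
    return f"{segment:04X}:{offset:04X}  {left}-{right}   {ascii_col}"


if __name__ == "__main__":
    row = bytes.fromhex("21 00 48 45 4C 4C 4F 21 24 C4 02 BB 1E 46 43 D1")
    print(dump_line(0x0F1B, 0x0000, row))
```

In the first row, the bytes 48 45 4C 4C 4F 21 24 at offsets 2 through 8 are the message HELLO! followed by its $ terminator.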
Now let's execute to completion.
The E command can also be used to enter data interactively. Suppose, for example, we would like to change the contents of bytes 200h-204h. Before doing so, let's have a look at the current content:
Now let's put 1,2,3,4,5 in these bytes.
DEBUG begins by displaying the current content of byte 0200h, namely 0Ch, and waits for us to enter the new content. We type 1 and hit the space bar. Next DEBUG displays the content of byte 0201h, which is FFh, and again waits for us to enter the new content. We type 2 and hit the space bar to go on to the next byte. After 5 has been entered in byte 204h, DEBUG displays the content of byte 205h, which is F3h. Since we don't want to enter any more data, we just hit the Enter key to get back to the command prompt. Now let's have a look at memory:
In the process of entering data, if we had wanted to leave the contents of a byte unchanged, we would just hit the space bar to go on to the next byte, or hit return to get back to the command prompt.
CODEVIEW is a powerful debugger that enables the user to view both high-level and assembly language source code during the debugging process. There are two operating modes: window and sequential. In sequential mode, CODEVIEW behaves more or less like DEBUG; sequential mode must be used if your machine is not an IBM compatible or the program is assembled and linked without options. In window mode, all the capabilities of CODEVIEW are available, and for that reason it is the only mode we will discuss. Because CODEVIEW is a large program with many features, we do not attempt to be comprehensive in the following discussion.
To debug in window mode, the code segment of the program must have class 'CODE', for example,
C_SEG SEGMENT 'CODE'
Note: the simplified segment directive .CODE generates a code segment with default class 'CODE'.
When the program is assembled and linked, the /ZI and /CO options should be specified; for example,
MASM /ZI MYPROG;
LINK /CO MYPROG;
These options cause symbolic information for CODEVIEW to be included in the .EXE file. Because this makes the file a lot bigger, the program should be assembled and linked in the ordinary way after it has been debugged.
The command line for entering CODEVIEW is
CV {options} filename
where filename is the name of an executable file. The options control CODEVIEW's start-up behavior. Here is a partial list (see the Microsoft manual for the complete set):
Option  Action
/D      You are using an IBM compatible that does not support certain IBM-specific trapping functions.
/I      You are using a non-IBM-compatible computer and want to be able to use CTRL-C and CTRL-BREAK to stop a program.
/M      You have a mouse but don't want to use it.
/P      You have a non-IBM EGA and have problems running CODEVIEW.
/S      You have a non-IBM compatible and want to be able to see the output screen.
/W      You have an IBM compatible and want to use window mode.
More than one option may be specified. For example,
CV /D /M /W Myprog
Note that with CODEVIEW, unlike DEBUG, it is not necessary to use a file extension.
To demonstrate some of CODEVIEW's features, we will assemble and link the program we used to demonstrate DEBUG (PGM4_2.ASM) and take it into CODEVIEW.
C>MASM /ZI A:PGM4_2;
Microsoft (R) Macro Assembler Version 5.10
Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
50094 + 289327 Bytes symbol space free
0 Warning Errors
0 Severe Errors
C>LINK /CO A:PGM4_2;
Microsoft (R) Overlay Linker Version 3.64
Copyright (C) Microsoft Corp 1983-1988. All rights reserved.
C>CV A:PGM4_2
Figure E.1 shows the window mode display. We see three windows: a display window at the top, a dialog window at the bottom, and a register window at the right side.
The display window shows the source code, with the current instruction in reverse video or in a different color. Lines with previously set breakpoints are highlighted.
The dialog window is where you can enter commands, but as we will see, the function keys can be used for many commands.
The register window shows the contents of the registers and the flags. The flag symbols are the same as DEBUG's.
It is also possible to activate a watch window, which will display selected variables and conditions.
The appearance of the display may be controlled with the keyboard or a mouse. Table E.2 shows the keys and key combinations. For mouse operations, see the Microsoft manual.
Table E.3 shows the function keys that may be used to set and clear breakpoints, trace through a program, or execute to a breakpoint.
Table E.2 Display Control Keys
Key         Function
F1          Displays initial on-line help screen.
F2          Toggles the register window.
F3          Switches between source, mixed, or assembly modes. Source mode shows source code in the display window, assembly mode shows assembly language instructions, and mixed mode shows both.
F4          Switches to the output screen. The output screen shows output from the program. Press any key to return to the display screen.
F6          Moves cursor between display and dialog windows.
CTRL-G      Increases size of the window the cursor is in.
CTRL-T      Decreases size of the window the cursor is in.
Up arrow    Moves cursor up one line.
Down arrow  Moves cursor down one line.
PgUp        Scrolls up one page.
PgDn        Scrolls down one page. Stops at bottom of file if in source mode, behaves like DEBUG's U command in other modes.
Home        Scrolls to top of file if cursor is in display window, or to top of command buffer if in dialog window.

Table E.3 Function Keys for Breakpoints, Tracing, and Execution
Key   Function
F5    Executes to the next breakpoint or to the end of the program if no breakpoint is encountered.
F7    Sets a temporary breakpoint on the line with the cursor and executes to that line, unless another breakpoint or end of program is encountered.
F8    Traces the next source line, if in source mode, or the next instruction if in assembly mode. If the source line is a call, it enters the called routine. Note: it will execute through DOS function calls.
F9    Sets or clears a breakpoint on the line with the cursor. If the line does not have a breakpoint, it sets one on the line. If it already has a breakpoint, the breakpoint is cleared.
F10   Executes the next program step. Like F8 except that calls are executed rather than traced.

The menu bar at the top of the screen has nine titles. The two commands at the end (TRACE and GO) are provided for mouse users.
- To open a menu, press Alt and the first letter of the title; for example, Alt-F opens the File menu. This causes a menu box to be displayed.
- Use the up and down arrow keys to make a selection. When the item you want is highlighted, press Enter.
- For most menu selections, the choice is executed immediately; however, some selections require a response.
- If a response is needed, a dialog box opens up and you type the needed information.
The escape key can be pressed to cancel a menu. When a menu is open, the left and right arrow keys may be used to move from one menu to another.
The RUN Menu
This menu contains selections for running the program. Table E.4 gives the choices.

Table E.4 RUN Menu Selections
Selection          Action
Start              Runs the program from the beginning. Program will run to completion unless a breakpoint or watch statement (see below) is encountered.
Restart            Restarts the program but doesn't begin to execute it. Any previously set breakpoints or watch statements will still be in effect.
Execute            Executes in slow motion from the current instruction. To stop execution, press a key or mouse button.
Clear breakpoints  Clears all breakpoints. Doesn't affect watch statements.

Watch Commands
One of the most useful of CODEVIEW's features is the ability to monitor variables and expressions. The watch commands described hereafter specify the variables and expressions to be watched.
The watch commands can be entered from the watch menu, but it's easier to enter them as dialog commands; also, with dialog watch commands a range of variables can be specified.
The watch command to monitor memory is
W{type} range
where range is either
start_address end_address
or
L count
and count is the number of values to be displayed.
Type is one of the following:
Type   Meaning
None   default
B      hex byte
A      ASCII
I      signed decimal word
U      unsigned decimal word
W      hex word
D      hex doubleword
S      short real
L      long real
T      10-byte real
The default type is the last type specified by a DUMP, ENTER, WATCH, or TRACEPOINT command; otherwise it is B.
For example, suppose that array A has been declared as
A DW 37,12,18,96,45,3
and DS has been initialized to 4A7Dh, the segment number of the .DATA segment.
The dialog commands
W A
WI A
WI A L6
WW A L6
WI A 4
W 100 104
create the following watch window:
In (0), CODEVIEW displays both hex and ASCII values of variable A. In (1), we ask for A to be displayed as a signed integer. In (2), we want to see the array A of six words, displayed as integers. In (3), we ask for array A to be displayed in hex. In (4), we ask for the following range to be displayed in decimal: start_address
Now as the program is traced or executed, the values in the watch window will change as the program changes memory.
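The type letter only changes how the same bytes are interpreted. The Python sketch below stores the array A as it would sit in memory (little-endian words) and views it under a few of the type letters from the list above. The function name and return format are assumptions for illustration; this is not a reproduction of CODEVIEW's actual window layout.

```python
import struct

# Array A from the text (A DW 37,12,18,96,45,3) as little-endian words.
MEMORY = struct.pack("<6H", 37, 12, 18, 96, 45, 3)

# struct codes for a subset of the type letters (assumed mapping).
STRUCT_CODES = {"B": "B", "I": "h", "U": "H", "W": "H", "D": "I"}
HEX_WIDTHS = {"B": 2, "W": 4, "D": 8}


def view(memory, type_letter, count):
    """Interpret the start of memory as count items of the given type."""
    fmt = f"<{count}{STRUCT_CODES[type_letter]}"
    values = struct.unpack_from(fmt, memory)
    if type_letter in HEX_WIDTHS:                 # hex display types
        width = HEX_WIDTHS[type_letter]
        return [f"{v:0{width}X}" for v in values]
    return list(values)                           # decimal display types


if __name__ == "__main__":
    print(view(MEMORY, "I", 6))   # signed decimal words
    print(view(MEMORY, "W", 6))   # hex words
    print(view(MEMORY, "B", 2))   # first two bytes in hex
```

Viewed as type I the six words read 37, 12, 18, 96, 45, 3; viewed as type W the first word reads 0025.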
We can monitor the stack as a special case of a memory range. For example, suppose
and the watch window shows
Also, BP may be used as a stack pointer; for example,
and the watch window shows
The watch window may also be used to monitor the value of a symbolic expression. The syntax is
W? expression {,format}
where expression can be a single variable or a complex expression involving several variables and constants. The optional format is a single letter that specifies how the expression will be displayed. Some possibilities are
Format   Output Format
d        signed decimal integer
i        signed decimal integer
u        unsigned decimal integer
x        hexadecimal integer
c        single character
Here are some examples, using the array A defined earlier. Suppose that AX contains 1 and BX contains 4, and that the dialog commands are
>W? A
>W? A,d
>W? AX + BX
>W? A + 2*AX
and the watch window is
0) A : 0x0025
1) A,d : 37
2) AX + BX : 0x0005
3) A + 2*AX : 0x0027
In (0), the expression to be displayed is just the variable A. It appears as 0x0025 (the notation 0xdigits is the C language notation for hexadecimal digits). In (1), we ask for A to be displayed with a decimal format. In (2), we get the sum of the contents of registers AX and BX. In (3), we ask for the sum of A and 2 times the contents of AX.
Sometimes we would like to keep track of a byte or word that is being pointed to by a register; for example, [BX] or
Assembly Language Symbol   Codeview Symbol
BYTE PTR [register]        BY register
WORD PTR [register]        WO register
DWORD PTR [register]       DW register
For example, suppose that BX contains 0100h, and memory looks like this:
Offset 0100h 0101h 0102h 0103h
Contents ABh CDh EFh 00h
The following watch commands
produce this watch window:
0) BY BX : 0x00ab
1) WO BX : 0xcdab
2) WO BX+2 : 0x00ef
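These values follow from the 8086's byte ordering: a word at address n has its low byte at n and its high byte at n+1. A minimal Python sketch of the three operators, using the memory contents given above (the function names by, wo, and dw are illustrative stand-ins for the CODEVIEW symbols):

```python
# Memory contents from the text: offset -> byte.
MEMORY = {0x0100: 0xAB, 0x0101: 0xCD, 0x0102: 0xEF, 0x0103: 0x00}


def by(addr):
    """Byte at addr (like BYTE PTR)."""
    return MEMORY[addr]


def wo(addr):
    """Word at addr: low byte first, then high byte (like WORD PTR)."""
    return MEMORY[addr] | (MEMORY[addr + 1] << 8)


def dw(addr):
    """Doubleword at addr: low word first, then high word (like DWORD PTR)."""
    return wo(addr) | (wo(addr + 2) << 16)


if __name__ == "__main__":
    bx = 0x0100
    print(f"BY BX   : 0x{by(bx):04x}")
    print(f"WO BX   : 0x{wo(bx):04x}")
    print(f"WO BX+2 : 0x{wo(bx + 2):04x}")
    print(f"DW BX   : 0x{dw(bx):08x}")
```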
To remove a line from the watch window, the Y (yank) command can be used. Its syntax is
Y number
where number is the number of the line to be removed. The command Y * causes all the lines to be removed.
You can specify a variable or range of variables as a trace point. When the variable(s) change, the program will break execution. The syntax is
TP? expression {,format}
or
TP{type} range
where format, type, and range are the same as for the W command. CODEVIEW displays the expression, variable, or range of variables in the same format as the W command, except that the display is in high intensity. For example, a tracepoint on the six words of array A would appear as
0) A L6 : 4A7D:0000  37  12  18  96  45  3
in the watch window. If any element of the array A changes, execution would break.
A watchpoint breaks execution when a specified expression becomes nonzero (true). The command line for setting a watchpoint is
WP? expression {,format}
where expression is a relational expression involving variables and possibly constants.
For example, suppose that A is defined as
A DW 25h
and the current values of AX and BX are 5 and 2, respectively. The dialog commands
>WP? AX>BX
>WP? AX - BX - 3
>WP? A > 25
>WP? A = 25
will create the following watch window
0) AX>BX : 0x0001
1) AX - BX - 3 : 0x0000
2) A > 25 : 0x0000
3) A = 25 : 0x0025
The display following (0) indicates that execution will break if AX > BX is true. Because AX has 5 and BX has 2, this is currently true and execution would break immediately. CODEVIEW indicates true by the notation 0x0001.
In (1), execution will break if AX - BX - 3 is nonzero. Currently, AX - BX - 3 = 5 - 2 - 3 = 0, so this condition is false. CODEVIEW indicates false by the notation 0x0000.
In (2), execution will break if A > 25, which is currently false.
In (3), execution will break if A = 25. This is currently true, so execution would break immediately. CODEVIEW shows the current value of A as 0x0025.
Most of the DEBUG commands can be used as dialog commands in CODEVIEW. When working in source mode, symbolic labels may be used in commands. For example, if BELOW is a label in a program, then
G BELOW
causes execution to break at this label if encountered. In the D and E commands, a type can be specified. The syntax for E is
D has the same syntax. Type comes from the same list of one- letter specifiers that are used for the W command. For example,
EI A 17 -1 456 8900 -29
will let the user enter the preceding five decimal integers in array A.
In this appendix, we show the binary encoding of a typical 8086 instruction and give a summary of the common 8086, 8087, 80286, and 80386 instructions.
A machine instruction for the 8086 occupies from one to six bytes. For most instructions, the first byte contains the opcode, the second byte contains the addressing modes of the operands, and the other bytes contain either address information or immediate data. A typical two-operand instruction has the format given in Figure F.1.
In the first byte, we see a six-bit opcode that identifies the operation. The same opcode is used for both 8- and 16-bit operations. The size of the operands is given by the W bit: W = 0 means 8-bit data and W = 1 means 16-bit data.
For register-to-register, register-to-memory, and memory-to-register operations, the REG field in the second byte contains a register number and the D bit specifies whether the register in the REG field is a source or destination operand: D = 0 means source and D = 1 means destination.
The combination of the W bit and the REG field can specify a total of 16 registers; see Table F.1.
The second operand is specified by the MOD and R/M fields. Figure F.2 shows the various modes.
For segment registers, the field is indicated by SEG. Table F.2 shows the segment register encodings.
The following set of 8086 instructions appears in alphabetical order. In the set
- (register) stands for the contents of the register
- (EA) stands for the contents of the memory location given by the effective address EA
- flags affected means those flags that are modified by the instruction according to the result
- flags undefined means the values of those flags are unreliable
- disp means 8-bit displacement
- disp-low disp-hi means 16-bit displacement

Table F.1 Register Encoding
REG   W = 0   W = 1
000   AL      AX
001   CL      CX
010   DL      DX
011   BL      BX
100   AH      SP
101   CH      BP
110   DH      SI
111   BH      DI

Table F.2 Segment Register Encoding
SEG   Register
00    ES
01    CS
10    SS
11    DS
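To see how the fields fit together, the Python sketch below splits the first two bytes of an instruction into opcode, D, W, MOD, REG, and R/M and looks the registers up in Tables F.1 and F.2. It handles only the register-to-register case, which on the 8086 is MOD = 11; the function names are assumptions made for this illustration.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]    # Table F.1, W = 0
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]   # Table F.1, W = 1
SEG = ["ES", "CS", "SS", "DS"]                             # Table F.2


def split_fields(opcode_byte, modrm_byte):
    """Break the first two instruction bytes into their bit fields."""
    return {
        "opcode": opcode_byte >> 2,        # six-bit opcode
        "d": (opcode_byte >> 1) & 1,       # direction bit
        "w": opcode_byte & 1,              # word/byte bit
        "mod": modrm_byte >> 6,
        "reg": (modrm_byte >> 3) & 7,
        "rm": modrm_byte & 7,
    }


def decode_reg_to_reg(opcode_byte, modrm_byte):
    """Return (destination, source) for a register-to-register form."""
    f = split_fields(opcode_byte, modrm_byte)
    if f["mod"] != 0b11:
        raise ValueError("only MOD = 11 (register operand) is handled here")
    table = REG16 if f["w"] else REG8
    reg, rm = table[f["reg"]], table[f["rm"]]
    # D = 1: REG is the destination; D = 0: REG is the source.
    return (reg, rm) if f["d"] else (rm, reg)


if __name__ == "__main__":
    # 01 D8 and 03 C3 are two encodings of ADD AX,BX.
    print(decode_reg_to_reg(0x01, 0xD8))
    print(decode_reg_to_reg(0x03, 0xC3))
    # MOV DS,AX is 8E D8; bits 4-3 of the second byte hold the SEG field.
    print(SEG[(0xD8 >> 3) & 3])
```

The same operation can be encoded two ways because either operand may occupy the REG field, with the D bit recording which role it plays.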
AAA: ASCII Adjust for Addition
Corrects the result in AL of adding two unpacked BCD digits or two ASCII digits.
Format: AAA
Operation: If the lower nibble of AL is greater than 9 or if AF is set to 1, then AL is incremented by 6, AH is incremented by 1, and AF is set to 1. This instruction always clears the upper nibble of AL and copies AF to CF.
Flags: Affected: AF, CF
Undefined: OF, PF, SF, ZF
Encoding: 00110111
37
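The AAA rule can be traced by hand with a short Python model. This is a sketch of the operation as worded above, not a cycle-accurate emulation; the function name and the (AX, AF, CF) return convention are assumptions.

```python
def aaa(ax, af):
    """Apply the AAA rule to AX given the incoming AF; returns (ax, af, cf)."""
    al, ah = ax & 0xFF, (ax >> 8) & 0xFF
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0xFF
        ah = (ah + 1) & 0xFF
        af = 1
    else:
        af = 0
    al &= 0x0F                        # upper nibble of AL is always cleared
    return (ah << 8) | al, af, af     # CF is a copy of AF


if __name__ == "__main__":
    # ASCII '5' + '7' = 35h + 37h = 6Ch, no carry out of bit 3 (AF = 0).
    print(f"{aaa(0x006C, 0)[0]:04X}")
    # 09h + 08h = 11h with a carry out of bit 3 (AF = 1).
    print(f"{aaa(0x0011, 1)[0]:04X}")
```

In the first case AX becomes 0102h, the unpacked BCD form of 12; in the second it becomes 0107h, the unpacked form of 17.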
AAD: ASCII Adjust for Division
Adjusts the unpacked BCD dividend in AX in preparation for division.
Format: AAD
Operation: The unpacked BCD operand in AX is converted into binary and stored in AL. This is achieved by multiplying AH by 10 and adding the result to AL. AH is then cleared.
Flags: Affected: PF, SF, ZF
Undefined: AF, CF, OF
Encoding: 11010101 00001010
D5 0A
AAM: ASCII Adjust for Multiply
Converts the result of multiplying two BCD digits into unpacked BCD format. Can be used in converting numbers lower than 100 into unpacked BCD format.
Format: AAM
Operation: The contents of AL are converted into two unpacked BCD digits and placed in AX. AL is divided by 10 and the quotient is placed in AH and the remainder in AL.
Flags: Affected: PF, SF, ZF
Undefined: AF, CF, OF
Encoding: 11010100 00001010
D4 0A
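AAM and AAD are inverse conversions between a binary value and a pair of unpacked BCD digits. The following Python sketch models the register arithmetic described in the two entries above; the function names and the choice to return the new AX value are assumptions for illustration.

```python
def aam(al):
    """AAM: AH = AL div 10, AL = AL mod 10; returns the new AX."""
    al &= 0xFF
    return ((al // 10) << 8) | (al % 10)


def aad(ax):
    """AAD: AL = AH * 10 + AL, AH = 0; returns the new AX."""
    ah, al = (ax >> 8) & 0xFF, ax & 0xFF
    return (ah * 10 + al) & 0xFF


if __name__ == "__main__":
    product = 7 * 9                       # 63 = 3Fh, as MUL would leave in AL
    unpacked = aam(product)
    print(f"AAM: AL=3Fh -> AX={unpacked:04X}")
    print(f"AAD: AX={unpacked:04X} -> AX={aad(unpacked):04X}")
```

Multiplying the BCD digits 7 and 9 leaves 3Fh in AL; AAM turns it into 0603h, and AAD turns 0603h back into 003Fh.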
AAS: ASCII Adjust for Subtraction
Corrects the result in AL of subtracting two unpacked BCD numbers.
Format: AAS
Operation: If the lower nibble of AL is greater than 9 or if AF is set to 1, then AL is decremented by 6, AH is decremented by 1, and AF is set to 1. This instruction always clears the upper nibble of AL and copies AF to CF.
Flags: Affected—AF, CF
Undefined—OF, PF, SF, ZF
Encoding: 00111111
3F
ADC: Add with Carry
The carry flag is added to the sum of the source and destination.
Format: ADC destination, source
Operation: If CF = 1, then (dest) = (source) + (dest) + 1
If CF = 0, then (dest) = (source) + (dest)
Flags: Affected: AF, CF, OF, PF, SF, ZF
Encoding: Memory or register with register
000100dw mod reg r/m
Immediate to accumulator
0001010w data
Immediate to memory or register
100000sw mod 010 r/m data
(s is set if a byte of data is added to 16-bit memory or register.)
ADD: Addition
Format: ADD destination, source
Operation: (dest) = (source) + (dest)
Flags: Affected: AF, CF, OF, PF, SF, ZF
Encoding: Memory or register with register
000000dw mod reg r/m
Immediate to accumulator
0000010w data
Immediate to memory or register
100000sw mod 000 r/m data
(s is set if a byte of data is added to 16-bit memory or register.)
AND: Logical AND
Format: AND destination, source
Operation: Each bit of the source is ANDed with the corresponding bit in the destination, with the result stored in the destination.
CF and OF are cleared.
Flags: Affected: CF, OF, PF, SF, ZF
Undefined: AF
Encoding: Memory or register with register
001000dw mod reg r/m
Immediate to accumulator
0010010w data
Immediate to memory or register
1000000w mod 100 r/m data
CALL: Procedure Call
Format: CALL target
Operation: The offset address of the next sequential instruction is pushed onto the stack, and control is transferred to the target operand. The target address is computed as follows: (1) intrasegment direct, offset
Flags: Affected—none
Encoding: Intrasegment Direct
11101000 disp-low disp-high
Intrasegment Indirect
11111111 mod 010 r/m
Intersegment Direct
10011010 offset-low offset-high seg-low seg-high
Intersegment Indirect
11111111 mod 011 r/m
CBW: Convert Byte to Word
Converts the signed 8-bit number in AL into a signed 16-bit number in AX.
Format: CBW
Operation: If bit 7 of AL is set, then AH gets FFh.
If bit 7 of AL is clear, then AH is cleared.
Flags: Affected—none
Encoding: 10011000
98
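The effect of CBW is that AX holds the same signed value that AL held. A brief Python sketch of the rule above, with a helper that interprets a bit pattern as a signed number (both function names are assumptions for illustration):

```python
def cbw(al):
    """CBW: fill AH with copies of bit 7 of AL; returns the new AX."""
    al &= 0xFF
    ah = 0xFF if al & 0x80 else 0x00
    return (ah << 8) | al


def as_signed(value, bits):
    """Interpret a bit pattern as a two's complement number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


if __name__ == "__main__":
    for al in (0xFB, 0x7F):
        ax = cbw(al)
        print(f"AL={al:02X} ({as_signed(al, 8)}) -> AX={ax:04X} ({as_signed(ax, 16)})")
```

FBh, which is -5 as a byte, becomes FFFBh, which is -5 as a word; 7Fh becomes 007Fh.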
CLC: Clear Carry Flag
Format: CLC
Operation: Clears CF
Flags: Affected—CF
Encoding: 11111000
F8
CLD: Clear Direction Flag
Format: CLD
Operation: Clears DF
Flags: Affected—DF
Encoding: 11111100
FC
CLI: Clear Interrupt Flag
Disables maskable external interrupts.
Format: CLI
Operation: Clears IF
Flags: Affected—IF
Encoding: 11111010
FA
CMC: Complement Carry Flag
Format: CMC
Operation: Complements CF
Flags: Affected: CF
Encoding: 11110101
F5
CMP: Compare
Compares two operands by subtraction. The flags are affected, but the result is not stored.
Format: CMP destination, source
Operation: The source operand is subtracted from the destination and the flags are set according to the result. The operands are not affected.
Flags: Affected: AF, CF, OF, PF, SF, ZF
Encoding: Memory or register with register
001110dw mod reg r/m
Immediate with accumulator
0011110w data
Immediate with memory or register
100000sw mod 111 r/m data
CMPS/CMPSB/CMPSW: Compare Byte or Word String
Compares two memory operands. If preceded by a REP prefix, strings of arbitrary size can be compared.
Format: CMPS source-string, dest-string
or
CMPSB
or
CMPSW
Operation: The dest-string indexed by ES:DI is subtracted from the source-string indexed by SI. The status flags are affected. If the control flag DF is 0, then SI and DI are incremented; otherwise, they are decremented. The increments are 1 for byte strings and 2 for word strings.
Flags: Affected: AF, CF, OF, PF, SF, ZF
Encoding: 1010011w
CWD: Convert Word to Doubleword
Converts the signed 16-bit number in AX into a signed 32-bit number in DX:AX.
Format: CWD
Operation: If bit 15 of AX is set, then DX gets FFFFh.
If bit 15 of AX is clear, then DX is cleared.
Flags: Affected—none
Encoding: 10011001
99
DAA: Decimal Adjust for Addition
Corrects the result in AL of adding two packed BCD operands.
Format: DAA
Operation: If the lower nibble of AL is greater than 9 or if AF is set to 1, then AL is incremented by 6, and AF is set to 1. If AL is greater than 9Fh or if CF is set, then 60h is added to AL and CF is set to 1.
Flags: Affected—AF, CF, PF, SF, ZF
Undefined—OF
Encoding: 00100111
27
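The two-step DAA correction can be followed with a small Python model. This sketch applies the rule exactly as worded above and derives AF and CF from the preceding byte addition; the function names and the (AL, AF, CF) return convention are assumptions for illustration.

```python
def daa(al, af, cf):
    """Apply the DAA rule to AL; returns (al, af, cf)."""
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0xFF
        af = 1
    if al > 0x9F or cf:
        al = (al + 0x60) & 0xFF
        cf = 1
    return al, af, cf


def add_packed_bcd(a, b):
    """Add two packed BCD bytes as ADD would, then apply DAA."""
    total = a + b
    af = 1 if (a & 0x0F) + (b & 0x0F) > 0x0F else 0   # carry out of bit 3
    cf = 1 if total > 0xFF else 0                     # carry out of bit 7
    return daa(total & 0xFF, af, cf)


if __name__ == "__main__":
    for a, b in ((0x35, 0x48), (0x28, 0x19), (0x99, 0x01)):
        al, af, cf = add_packed_bcd(a, b)
        print(f"{a:02X} + {b:02X} -> AL={al:02X} CF={cf}")
```

35h + 48h gives the binary sum 7Dh, which DAA corrects to 83h; 99h + 01h gives 9Ah, which becomes 00h with CF = 1, representing 100.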
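DAS: Decimal Adjust for Subtraction
Corrects the result in AL of subtracting two packed BCD operands.
Format: DAS
Operation: If the lower nibble of AL is greater than 9 or if AF is set to 1, then 6 is subtracted from AL and AF is set to 1. If AL is greater than 9Fh or if CF is set, then 60h is subtracted from AL and CF is set to 1.
Flags: Affected: AF, CF, PF, SF, ZF
Undefined: OF
Encoding: 00101111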
2F
DEC: Decrement
Format: DEC destination
Operation: Decrements the destination operand by 1.
Flags: Affected: AF, OF, PF, SF, ZF
Encoding: Register (word)
01001 reg
Memory or register
1111111w mod 001 r/m
DIV: Divide
Performs unsigned division.
Format: DIV source
Operation: The divisor is the source operand, which is either memory or register. For byte division (8-bit source) the dividend is AX, and for word division (16-bit source) the dividend is DX:AX. The quotient is returned to AL (AX for word division), and the remainder is returned to AH (DX for word division). If the quotient is greater than 8 bits (16 bits for word division), then an INT 0 is generated.
Flags: Undefined: AF, CF, OF, PF, SF, ZF
Encoding: 1111011w mod 110 r/m
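The register conventions for DIV, including the overflow case, can be modeled in a few lines of Python. This is a sketch under stated assumptions: the exception class stands in for INT 0, the function names are illustrative, and division by zero is treated as the same divide error, which is how the 8086 handles it.

```python
class DivideError(Exception):
    """Stands in for the INT 0 that the processor generates."""


def div_byte(ax, source):
    """Byte DIV: AX / source; returns new AX (AH = remainder, AL = quotient)."""
    source &= 0xFF
    if source == 0:
        raise DivideError("division by zero")
    quotient, remainder = divmod(ax & 0xFFFF, source)
    if quotient > 0xFF:
        raise DivideError("quotient does not fit in AL")
    return (remainder << 8) | quotient


def div_word(dx, ax, source):
    """Word DIV: DX:AX / source; returns (DX, AX) = (remainder, quotient)."""
    source &= 0xFFFF
    if source == 0:
        raise DivideError("division by zero")
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    quotient, remainder = divmod(dividend, source)
    if quotient > 0xFFFF:
        raise DivideError("quotient does not fit in AX")
    return remainder, quotient


if __name__ == "__main__":
    print(f"{div_byte(0x0064, 7):04X}")        # 100 / 7 = 14 remainder 2
    try:
        div_byte(0x0400, 2)                    # quotient 512 needs 10 bits
    except DivideError as err:
        print("INT 0:", err)
```

Dividing 1024 by 2 as a byte operation fails because the quotient 512 cannot be held in AL; the same division done as a word operation would succeed.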
ESC: Escape
Allows other processors, such as the 8087 coprocessor, to access instructions.
The 8086 processor performs no operation except to fetch a memory operand for the other processor.
Format: ESC external- opcode, source
Flags: none
Encoding: 11011xxx mod xxx r/m
(The xxx sequence indicates an opcode for the coprocessor.)
HLT: Halt
Causes the processor to enter its halt state to wait for an external interrupt.
Format: HLT
Flags: none
Encoding: 11110100
F4
IDIV: Integer Divide
Performs signed division.
Format: IDIV source
Operation: The divisor is the source operand, which is either memory or register. For byte division (8-bit source) the dividend is AX, and for word division (16-bit source) the dividend is DX:AX. The quotient is returned to AL (AX for word division), and the remainder is returned to AH (DX for word division). If the quotient is greater than 8 bits (16 bits for word division), then an INT 0 is generated.
Flags: Undefined: AF, CF, OF, PF, SF, ZF
Encoding: 1111011w mod 111 r/m
IMUL: Integer Multiply
Performs signed multiplication.
Format: IMUL source
Operation: The multiplier is the source operand, which is either memory or register. For byte multiplication (8-bit source) the multiplicand is AL, and for word multiplication (16-bit source) the multiplicand is AX. The product is returned to AX (DX:AX for word multiplication). The flags CF and OF are set if the upper half of the product is not the sign-extension of the lower half.
Flags: Affected: CF, OF
Undefined: AF, PF, SF, ZF
Encoding: 1111011w mod 101 r/m
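The CF/OF rule says the flags are set exactly when the product does not fit in the lower half as a signed number. A Python sketch of the byte form follows; the function names and the (AX, flag) return convention are assumptions for illustration.

```python
def to_signed(value, bits):
    """Interpret a bit pattern as a two's complement number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def imul_byte(al, source):
    """Byte IMUL: returns (AX, flag) where flag is the common CF/OF value."""
    product = to_signed(al & 0xFF, 8) * to_signed(source & 0xFF, 8)
    ax = product & 0xFFFF
    # AH is the sign extension of AL exactly when the signed product
    # equals the signed value of the low byte alone.
    flag = 0 if product == to_signed(ax & 0xFF, 8) else 1
    return ax, flag


if __name__ == "__main__":
    for al, src in ((0x05, 0x06), (0xFF, 0x02), (0x40, 0x04)):
        ax, flag = imul_byte(al, src)
        print(f"AL={al:02X} * {src:02X} -> AX={ax:04X} CF=OF={flag}")
```

The product of FFh and 02h (that is, -1 times 2) is FFFEh with the flags clear, since AH = FFh merely extends the sign of AL = FEh; 40h times 04h gives 0100h with the flags set.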
IN: Input Byte or Word
Format: IN accumulator, port
Operation: The contents of the accumulator are replaced by the contents of the designated I/O port. The port operand is either a constant (for fixed port), or DX (for variable port).
Flags: Affected: none
Encoding: Fixed port
1110010w port
Variable port
1110110w
INC: Increment
Format: INC destination
Operation: Increments the destination operand by 1.
Flags: Affected—AF, OF, PF, SF, ZF
Encoding: Register (word)
01000 reg
Memory or register
1111111w mod 000 r/m
INT: Interrupt
Transfers control to one of 256 interrupt routines.
Format: INT interrupt-type
Operation: The FLAGS register is pushed onto the stack, then TF and IF are cleared, CS is pushed onto the stack and then filled by the high-order word of the interrupt vector, IP is pushed onto the stack and then filled by the low-order word of the interrupt vector.
Flags: Affected: IF, TF
Encoding: Type 3
11001100
Other types
11001101 type
Generates an INT 4 if OF is set.
Format:
INTO
Operation: If OF = 1, then same operation as INT 4. If OF = 0, then no operation takes place.
Flags: If OF=1 then IF and TF are cleared.
If OF=0 then no flags are affected.
Encoding: 11001110
CE
Provides a return from an interrupt routine.
Format: IRET
Operation: Pops the stack into the registers IP, CS, and FLAGS.
Flags: Affected- all
Encoding: 11001111
CF
Format: J(condition) short- label
Operation: If the condition is true, then a short jump is made to the label. The label must be within -128 to +127 bytes of the next instruction.
Flags: Affected- none
| Instruction | Jump If | Condition | Encoding |
| --- | --- | --- | --- |
| JA | above | CF = 0 and ZF = 0 | 77 disp |
| JAE | above or equal | CF = 0 | 73 disp |
| JB | below | CF = 1 | 72 disp |
| JBE | below or equal | CF = 1 or ZF = 1 | 76 disp |
| JC | carry | CF = 1 | 72 disp |
| JCXZ | CX is 0 | CX = 0 | E3 disp |
| JE | equal | ZF = 1 | 74 disp |
| JG | greater | ZF = 0 and SF = OF | 7F disp |
| JGE | greater or equal | SF = OF | 7D disp |
| JL | less | (SF xor OF) = 1 | 7C disp |
| JLE | less or equal | (SF xor OF) = 1 or ZF = 1 | 7E disp |
| JNA | not above | CF = 1 or ZF = 1 | 76 disp |
| JNAE | not above nor equal | CF = 1 | 72 disp |
| JNB | not below | CF = 0 | 73 disp |
| JNBE | not below nor equal | CF = 0 and ZF = 0 | 77 disp |
| JNC | not carry | CF = 0 | 73 disp |
| JNE | not equal | ZF = 0 | 75 disp |
| JNG | not greater | (SF xor OF) = 1 or ZF = 1 | 7E disp |
| JNGE | not greater nor equal | (SF xor OF) = 1 | 7C disp |
| JNL | not less | SF = OF | 7D disp |
| JNLE | not less nor equal | ZF = 0 and SF = OF | 7F disp |
| JNO | not overflow | OF = 0 | 71 disp |
| JNP | not parity | PF = 0 | 7B disp |
| JNS | not sign | SF = 0 | 79 disp |
| JNZ | not zero | ZF = 0 | 75 disp |
| JO | overflow | OF = 1 | 70 disp |
| JP | parity | PF = 1 | 7A disp |
| JPE | parity even | PF = 1 | 7A disp |
| JPO | parity odd | PF = 0 | 7B disp |
| JS | sign | SF = 1 | 78 disp |
| JZ | zero | ZF = 1 | 74 disp |
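To see why the signed and unsigned jumps test different flags, the following Python sketch computes the flags an 8-bit CMP would produce and evaluates four of the conditions. The names `flags_after_cmp8`, `CONDITIONS`, and `taken` are hypothetical; only CF, ZF, SF, and OF are modeled.

```python
def flags_after_cmp8(a, b):
    """Return the CF, ZF, SF, and OF flags produced by an 8-bit CMP a, b."""
    a &= 0xFF
    b &= 0xFF
    result = (a - b) & 0xFF
    cf = 1 if a < b else 0            # unsigned borrow
    zf = 1 if result == 0 else 0
    sf = result >> 7
    # Signed overflow: the operands differ in sign and the result's
    # sign differs from the sign of the first operand.
    of = ((a ^ b) & (a ^ result)) >> 7
    return {"CF": cf, "ZF": zf, "SF": sf, "OF": of}

CONDITIONS = {
    "JA": lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "JB": lambda f: f["CF"] == 1,
    "JG": lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],
    "JL": lambda f: f["SF"] != f["OF"],
}

def taken(mnemonic, a, b):
    """True if the jump would be taken after CMP a, b."""
    return CONDITIONS[mnemonic](flags_after_cmp8(a, b))
```

Comparing 0FFh with 01h, JA is taken (255 is above 1 as unsigned numbers) and so is JL (-1 is less than 1 as signed numbers), which is why the choice of jump must match the interpretation of the data.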
Format: JMP target
Operation: Control is transferred to the target label.
Flags: Affected—none
Encoding: Intrasegment direct
11101001 disp-low disp-high
Intrasegment direct short
11101011 disp
Intersegment direct
11101010
Intersegment indirect
11111111 mod 101 r/m
Intrasegment indirect
11111111 mod 100 r/m
Format: LAHF
Operation: The low eight bits of the FLAGS register are transferred to AH.
Flags: Affected—none
Encoding: 10011111
9F
Loads the DS register with a segment address and a general register with an offset so that data at the segment:offset may be accessed.
Format: LDS destination, source
Operation: The source is a doubleword memory operand. The lower word is placed in the destination register, and the upper word is placed in DS.
Flags: Affected—none
Encoding: 11000101 mod reg r/m
Loads an offset memory address to a register.
Format: LEA destination, source
Operation: The offset address of the source memory operand is placed in the destination, which is a general register.
Flags: Affected: none
Encoding: 10001101 mod reg r/m
Loads the ES register with a segment address and a general register with an offset so that data at the segment:offset may be accessed.
Format: LES destination, source
Operation: The source is a doubleword memory operand. The lower word is placed in the destination register, and the upper word is placed in ES.
Flags: Affected: none
Encoding: 11000100 mod reg r/m
In a multiprocessor environment, locks the bus.
Format: LOCK
Operation: LOCK is used as a prefix that can precede any instruction. The bus is locked for the duration of the execution of the instruction to prevent other processors from accessing memory.
Flags: Affected: none
Encoding: 11110000
F0
LODS/LODSB/LODSW: Load Byte or Word String
Transfers a memory byte or word indexed by SI to the accumulator.
Format: LODS source-string
or
LODSB
or
LODSW
Operation: The source byte (word) is loaded into AL (or AX). SI is incremented by 1 (or 2) if DF is clear; otherwise SI is decremented by 1 (or 2).
Flags: Affected: none
Encoding: 1010110w
Loop until count is complete.
Format: LOOP short-label
Operation: CX is decremented by 1, and if the result is not zero then control is transferred to the labeled instruction; otherwise control flows to the next instruction.
Flags: Affected: none
Encoding: 11100010 disp
E2
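A loop is controlled by the counter and the ZF.
Format: LOOPE short-label
or
LOOPZ short-label
Operation: CX is decremented by 1; if the result is not zero and ZF = 1, then control is transferred to the labeled instruction; otherwise control flows to the next instruction.
Flags: Affected—none
Encoding: 11100001 disp
E1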
A loop is controlled by the counter and the ZF.
Format: LOOPNE short- label
or
LOOPNZ short- label
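Operation: CX is decremented by 1; if the result is not zero and ZF = 0, then control is transferred to the labeled instruction; otherwise control flows to the next instruction.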
Flags: Affected—none
Encoding: 11100000 disp
E0
Move data.
Format: MOV destination, source
Operation: Copies the source operand to the destination operand.
Flags: Affected—none
Encoding: To memory from accumulator
1010001w addr- low addr- high
To accumulator from memory
1010000w addr- low addr- high
To segment register from memory or register
10001110 mod 0 seg r/m
To memory or register from segment register
10001100 mod 0 seg r/m
To register from memory or register / To memory from register
100010dw mod reg r/m (addr-low addr-high)
To register from immediate- data
1011w reg data (data- high)
To memory or register from immediate- data
1100011w mod 000 r/m data (data- high)
Transfers memory data addressed by DS:SI to the memory location addressed by ES:DI. Multiple bytes (or words) can be transferred if the prefix REP is used.
Format: MOVS dest-string, source-string
or
MOVSB
or
MOVSW
Operation: The source string byte (or word) is transferred to the destination operand. Both SI and DI are then incremented by 1 (or 2 for word strings) if DF = 0; otherwise, both are decremented by 1 (or 2 for word strings).
Flags: Affected—none
Encoding: 1010010w
Unsigned multiplication.
Format: MUL source
Operation: The multiplier is the source operand which is either memory or register. For byte multiplication (8- bit source) the multiplicand is AL and for word multiplication (16- bit source) the multiplicand is AX. The product is returned to AX (DX:AX for word multiplication). The flags CF and OF are set if the upper half of the product is not zero.
Flags: Affected—CF, OF
Undefined—AF, PF, SF, ZF
Encoding: 1111011w mod 100 r/m
Forms two's complement.
Format: NEG destination
Operation: The destination operand is subtracted from all 1's (0FFh for bytes and 0FFFFh for words). Then a 1 is added and the result is placed in the destination.
Flags: Affected—AF, CF, OF, PF, SF, ZF
Encoding: 1111011w mod 011 r/m
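The two steps in the NEG operation (subtract from all 1's, then add 1) can be written out directly. The Python function below is a minimal sketch for byte operands; the name `neg8` is hypothetical and the flags are not modeled.

```python
def neg8(value):
    """Form the two's complement of a byte as NEG does."""
    complement = 0xFF - (value & 0xFF)   # subtract from all 1's (one's complement)
    return (complement + 1) & 0xFF       # add 1 and keep the low 8 bits
```

Note that `neg8(0x80)` returns 80h again, because +128 cannot be represented in a signed byte.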
Format: NOP
Operation: No operation is performed.
Flags: Affected—none
Encoding: 10010000
90
Format: NOT destination
Operation: Forms the one's complement of the destination.
Flags: Affected—none
Encoding: 1111011w mod 010 r/m
Format: OR destination, source
Operation: Performs the logical OR operation on each bit position of the operands and places the result in the destination.
Flags: Affected—CF, OF, PF, SF, ZF
Undefined—AF
Encoding: Memory or register with register
000010dw mod reg r/m
Immediate to accumulator
0000110w data
Immediate to memory or register
1000000w mod 001 r/m data
Format: OUT port, accumulator
Operation: The contents of the designated I/O port are replaced by the contents of the accumulator. The port is either a constant (for fixed port) or DX (for variable port).
Flags: Affected: none
Encoding: Fixed port
1110011w port
Variable port
1110111w
Format: POP destination
Operation: The contents of the destination are replaced by the word at the top of the stack. The stack pointer is incremented by 2.
Flags: Affected- none
Encoding: General register
01011 reg
Segment register
000 seg 111
Memory or register
10001111 mod 000 r/m
Format: POPF
Operation: Transfers flag bits from the top of the stack to the FLAGS register and then increments SP by 2.
Flags: Affected: all
Encoding: 10011101
9D
Format: PUSH source
Operation: Decrements the SP register by 2 and then transfers a word from the source operand to the new top of stack.
Flags: Affected: none
Encoding: General register
01010 reg
Segment register
000 seg 110
Memory or register
11111111 mod 110 r/m
Format:
PUSHF
Operation: Decrements SP by 2 and transfers flag bits to the top of the stack.
Flags: Affected: none
Encoding: 10011100
9C
Rotates destination left through the CF flag one or more times.
Format: RCL destination, 1
or
RCL destination, CL
Operation: The first format rotates the destination once through CF, so that the msb is placed in CF and the old CF ends up in the lsb. To rotate more than once, the count must be placed in CL. When the count is 1 and the leftmost two bits of the old destination are equal, then OF is cleared; if they are unequal, OF is set to 1. When the count is not 1, then OF is undefined. CL is not changed.
Flags: Affected: CF, OF
Encoding: 110100vw mod 010 r/m
If v = 0, count = 1
If v = 1, count = (CL)
Rotates destination right through the CF flag one or more times.
Format: RCR destination, 1
or
RCR destination, CL
Operation: The first format rotates the destination once through CF, so that the lsb is placed in CF and the old CF ends up in the msb. To rotate more than once, the count must be placed in CL. When the count is 1 and the leftmost two bits of the new destination are equal, then OF is cleared; if they are unequal, OF is set to 1. When the count is not 1, then OF is undefined. CL is not changed.
Flags: Affected: CF, OF
Encoding: 110100vw mod 011 r/m
If v = 0, count = 1
If v = 1, count = (CL)
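Because RCL and RCR rotate through CF, the destination and CF together behave like a 9-bit (or 17-bit) ring. The Python sketch below models the byte case; `rcl8` and `rcr8` are hypothetical names, and OF is not modeled.

```python
def rcl8(value, cf, count=1):
    """Rotate a byte left through CF; returns (new value, new CF)."""
    for _ in range(count):
        msb = (value >> 7) & 1
        value = ((value << 1) & 0xFF) | cf   # old CF enters the lsb
        cf = msb                             # old msb becomes CF
    return value, cf

def rcr8(value, cf, count=1):
    """Rotate a byte right through CF; returns (new value, new CF)."""
    for _ in range(count):
        lsb = value & 1
        value = (value >> 1) | (cf << 7)     # old CF enters the msb
        cf = lsb                             # old lsb becomes CF
    return value, cf
```

Rotating nine times in either direction restores both the byte and CF, which is a quick way to confirm that nine bits take part in the rotation.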
The string operation that follows is repeated while (CX) is not zero.
Format: REP/REPZ/REPE/REPNE/REPNZ string- instruction
Operation: The string operation is carried out until (CX) is decremented to 0. For CMPS and SCAS operations, the ZF is also used in terminating the iteration. For REP/REPZ/REPE the CMPS and SCAS operations are repeated if (CX) is not zero and ZF is 1. For REPNE/REPNZ, the CMPS and SCAS operations are repeated if (CX) is not zero and ZF is 0.
Flags: See the associated string instruction.
Encoding: REP/REPZ/REPE 11110011
REPNE/REPNZ 11110010
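The termination rule for REPE with CMPS can be sketched in Python. This is a model under stated assumptions: DF = 0, both strings start at index 0, and the hypothetical function `repe_cmpsb` reports ZF as 0 when CX is 0 on entry (on the processor, ZF would simply keep its previous value).

```python
def repe_cmpsb(src, dst, cx):
    """Model REPE CMPSB with DF = 0; returns (CX, ZF) when the repetition stops."""
    si = di = 0
    zf = 0  # assumption: value reported when CX is 0 on entry
    while cx != 0:
        zf = 1 if src[si] == dst[di] else 0  # CMPSB sets ZF from the comparison
        si += 1
        di += 1
        cx -= 1
        if zf == 0:                          # REPE stops on the first mismatch
            break
    return cx, zf
```

With the strings "ABCD" and "ABXD" and CX = 4, the loop stops after the third comparison with CX = 1 and ZF = 0; with two identical strings it runs until CX = 0 and leaves ZF = 1.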
Returns control after a called procedure has been executed.
Format: RET [pop-value]
Operation: If RET is within a NEAR procedure, it is translated into an intrasegment return, which updates the IP by popping one word from the stack. If RET is within a FAR procedure, it is translated into an intersegment return that updates both the IP and CS. The optional pop value specifies a number of bytes in the stack to be discarded. These are parameters passed to the procedure.
Flags: Affected: none
Encoding: Intrasegment
11000011
Intrasegment with pop value
11000010 data-low data-high
Intersegment
11001011
Intersegment with pop value
11001010 data-low data-high
Rotates destination left one or more times.
Format: ROL destination,1
or
ROL destination,CL
Operation: The first format rotates the destination once; CF also gets
the msb. To rotate more than once, the count must be
placed in CL. When the count is 1 and the new CF is not
the same as the msb, then the OF is set, otherwise, OF is
cleared. When the count is not 1, then OF is undefined. CL is not changed.
Flags: Affected: CF, OF
Encoding: 110100vw mod 000 r/m
If v = 0, count = 1
If v = 1, count = (CL)
Rotates destination right one or more times.
Format: ROR destination,1
or
ROR destination,CL
Operation: The first format rotates the destination once; CF also gets
the lsb. To rotate more than once, the count must be placed
in CL. When the count is 1 and the leftmost two bits of the
new destination are equal, then OF is cleared; if they are unequal,
OF is set to 1. When the count is not 1, then OF is undefined. CL is not changed.
Flags: Affected: CF, OF
Encoding: 110100vw mod 001 r/m
If v = 0, count = 1
If v = 1, count = (CL)
Format:
SAHF
Operation: Stores five bits of AH into the lower byte of the FLAGS register. Only the bits corresponding to the flags are transferred. The flags in the lower byte of the FLAGS register are SF = bit 7, ZF = bit 6, AF = bit 4, PF = bit 2, and CF = bit 0.
Flags: Affected: AF, CF, PF, SF, ZF
Encoding: 10011110
9E
Format: SAL/SHL destination,1
or
SAL/SHL destination,CL
Operation: The first format shifts the destination once; CF gets the msb and a 0 is shifted into the lsb. To shift more than once, the count must be placed in CL. When the count is 1 and the
new CF is not the same as the msb, then the OF is set; otherwise, OF is cleared. When the count is not 1, then OF is undefined. CL is not changed.
Flags: Affected—CF, OF, PF, SF, ZF
Undefined—AF
Encoding: 110100vw mod 100 r/m
If v = 0, count = 1
If v = 1, count = (CL)
Format: SAR destination, 1
or
SAR destination, CL
Operation: The first format shifts the destination once; CF gets the lsb and the msb is repeated (the sign is retained). To shift more than once, the count must be placed in CL. When the count is 1, OF is cleared. When the count is not 1, then OF is undefined. CL is not changed.
Flags: Affected—CF, OF, PF, SF, ZF
Undefined—AF
Encoding: 110100vw mod 111 r/m
If v = 0, count = 1
If v = 1, count = (CL)
Format: SBB destination, source
Operation: Subtracts source from destination; and if CF is 1 then subtract 1 from the result. The result is placed in the destination.
Flags: Affected—AF, CF, OF, PF, SF, ZF
Encoding: Memory or register with register
000110dw mod reg r/m
Immediate from accumulator
0001110w data
Immediate from memory or register
100000sw mod 011 r/m data
(s is set if an immediate- data- byte is subtracted from 16- bit memory or register.)
Compares memory against the accumulator. Used with REP, it can scan multiple memory locations for a particular value.
Format: SCAS dest- string
or
SCASB
or
SCASW
Operation: Subtracts the destination byte (or word) addressed by ES:DI from AL (or AX). The flags are affected but the result is not saved. DI is incremented (if DF = 0) or decremented (if DF = 1) by 1 (byte strings) or 2 (word strings).
Flags: Affected—AF, CF, OF, PF, SF, ZF
Encoding: 1010111w
Format: SHR destination, 1
or
SHR destination, CL
Operation: The first format shifts the destination once; CF gets the lsb and a 0 is shifted into the msb. To shift more than once, the count must be placed in CL. When the count is 1 and the leftmost two bits are equal, then OF is cleared; otherwise, OF is set to 1. When the count is not 1, then OF is undefined. CL is not changed.
Flags: Affected—CF, OF, PF, SF, ZF
Undefined—AF
Encoding: 110100vw mod 101 r/m
If v = 0, count = 1
If v = 1, count = (CL)
Format: STC
Operation: CF is set to 1.
Flags: Affected—CF
Encoding: 11111001
F9
Format: STD
Operation: DF is set to 1.
Flags: Affected—DF
Encoding: 11111101
FD
Format: STI
Operation: IF is set to 1, thus enabling external interrupts.
Flags: Affected—IF
Encoding: 11111011
FB
Stores the accumulator into memory. When used with REP, it can store multiple memory locations with the same value.
Format: STOS dest- string
or
STOSB
or
STOSW
Operation: Stores AL (or AX) into the destination byte (or word) addressed by ES:DI. DI is incremented (DF = 0) or decremented (DF = 1) by 1 (byte strings) or 2 (word strings).
Flags: Affected—none
Encoding: 1010101w
Format: SUB destination, source
Operation: Subtracts source from destination. The result is placed in the destination.
Flags: Affected—AF, CF, OF, PF, SF, ZF
Encoding: Memory or register with register
001010dw mod reg r/m
Immediate from accumulator
0010110w data
Immediate from memory or register
100000sw mod 101 r/m data
(s is set if an immediate- data- byte is subtracted from 16- bit
memory or register.)
Format: TEST destination, source
Operation: The two operands are ANDed to affect the flags. The operands are not affected.
Flags: Affected—CF, OF, PF, SF, ZF
Undefined—AF
Encoding: Memory or register with register
1000010w mod reg r/m
Immediate with accumulator
1010100w data
Immediate with memory or register
1111011w mod 000 r/m data
Format: WAIT
Operation: The processor is placed in a wait state until activated by an external interrupt.
Flags: Affected—none
Encoding: 10011011
9B
Format: XCHG destination, source
Operation: The source operand and the destination operand are interchanged.
Flags: Affected—none
Encoding: Register with accumulator
10010 reg
Memory or register with register
1000011w mod reg r/m
Performs a table lookup translation.
Format: XLAT source-table
Operation: BX must contain the offset address of the source table, which is at most 256 bytes. AL should contain the index of the table element. The operation replaces AL by the contents of the table element addressed by BX and AL.
Flags: Affected—none
Encoding: 11010111
D7
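A common use of XLAT is converting a 4-bit value to its ASCII hex digit. The Python sketch below models the lookup only; the table `HEX_TABLE` and the function `xlat` are hypothetical stand-ins for the table addressed by BX and for the instruction itself.

```python
HEX_TABLE = b"0123456789ABCDEF"   # stands in for the table whose offset is in BX

def xlat(table, al):
    """Model XLAT: replace AL by the table byte at index AL."""
    return table[al & 0xFF]
```

For instance, `xlat(HEX_TABLE, 0x0C)` returns 43h, the ASCII code of "C". The model raises an error for an index beyond the table, whereas the processor would simply read whatever byte follows the table in memory.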
Format: XOR destination, source
Operation: The exclusive OR operation is performed bit- wise with the source and destination operands; the result is stored in the destination. CF and OF are cleared.
Flags: Affected—CF, OF, PF, SF, ZF
Undefined: AF
Encoding: Memory or register with register
001100dw mod reg r/m
Immediate to accumulator
0011010w data
Immediate to memory or register
1000000w mod 110 r/m data
The 8087 uses several data types. When data are transferred to or from memory, the memory data definition determines the data type format. Table F.3 shows the association between the 8087 data types and the memory data definitions. In this section we give only the 8087 instructions for simple arithmetic operations. Check the 8087 manual for other instructions.
Format: FADD
or
FADD source
or
FADD destination, source
Operation: Adds a source operand to the destination. For the first form, the source operand is the top of the stack and the destination is ST(1). The top of the stack is popped, and its value is added to the new top. For the second form, the source is either short real or long real in memory; the destination is the top of the stack. For the third form, one of the operands is the top of the stack and the other is another stack register; the stack is not popped.
Table F.3 8087 Data Types
| Data Type | Size (bits) | Memory Definition | Pointer Type |
| --- | --- | --- | --- |
| Word integer | 16 | DW | WORD PTR |
| Short integer | 32 | DD | DWORD PTR |
| Long integer | 64 | DQ | QWORD PTR |
| Packed decimal | 80 | DT | TBYTE PTR |
| Short real | 32 | DD | DWORD PTR |
| Long real | 64 | DQ | QWORD PTR |
| Temporary real | 80 | DT | TBYTE PTR |
FBLD: Packed Decimal Load
Format: FBLD source
Operation: Loads a packed decimal number to the top of the stack. The source operand is of type DT (10 bytes).
FBSTP: Packed BCD Store and Pop
Format: FBSTP destination
Operation: Converts the top of the stack to a packed BCD format and stores the result in the memory destination. Then the stack is popped.
Format:
FDIV
or
FDIV source
or
FDIV destination, source
Operation: Divides the destination by the source. For the first form, the source operand is the top of the stack and the destination is ST(1). The top of the stack is popped and the new top is divided by its value. For the second form, the source is either short real or long real in memory; the destination is the top of the stack. For the third form, one of the operands is the top of the stack and the other is another stack register; the stack is not popped.
FIADD: Integer Add
Format: FIADD source
Operation: Adds the source operand to the top of the stack. The source operand can be either a short integer or a word integer.
FIDIV: Integer Divide
Format: FIDIV source
Operation: Divides the top of the stack by the source. The source operand can be either a short integer or a word integer.
FILD: Integer Load
Format: FILD source
Operation: Loads a memory integer operand onto the top of the stack. The source operand is either word integer, short integer, or long integer.
FIMUL: Integer Multiply
Format: FIMUL source
Operation: Multiplies the top of the stack by the source operand. The source operand can be either a short integer or a word integer.
FIST: Integer Store
Format: FIST destination
Operation: Rounds the top of the stack to an integer value and stores it to a memory location. The destination may be word integer or short integer. The stack is not popped.
FISTP: Integer Store and Pop
Format: FISTP destination
Operation: Rounds the top of the stack to an integer value and stores it to a memory location. Then the stack is popped. The destination may be word integer, short integer, or long integer.
FISUB: Integer Subtract
Format: FISUB source
Operation: Subtracts the source operand from the top of the stack. The source operand can be either a short integer or a word integer.
Format: FLD source
Operation: Loads a real operand onto the top of the stack. The source may be a stack register ST(i), or a memory location. For a memory operand, the data type may be any of the real formats.
Format: FMUL
or
FMUL source
or
FMUL destination, source
Operation: Multiplies the destination by a source operand. For the first form, the source operand is the top of the stack and the destination is ST(1). The top of the stack is popped and the new top is multiplied by its value. For the second form, the source is either short real or long real in memory; the destination is the top of the stack. For the third form, one of the operands is the top of the stack and the other is another stack register; the stack is not popped.
FST: Store Real
Format: FST destination
Operation: Stores the top of the stack to a memory location or another stack register. The memory destination may be short real (doubleword) or long real (quadword). The stack is not popped.
FSTP: Store Real and Pop
Format: FSTP destination
Operation: Stores the top of the stack to a memory location or another stack register. Then the stack is popped. The memory destination may be short real (doubleword), long real (quadword), or temporary real (10 bytes).
Format:
FSUB
or
FSUB source
or
FSUB destination, source
Operation: Subtracts a source operand from the destination. For the first form, the source operand is the top of the stack and the destination is ST(1). The top of the stack is popped and its value is subtracted from the new top. For the second form, the source is either short real or long real in memory; the destination is the top of the stack. For the third form, one of the operands is the top of the stack and the other is another stack register; the stack is not popped.
The real-mode 80286 instruction set includes all 8086 instructions plus the extended instruction set. The extended instruction set contains five groups of instructions: (1) multiply with immediate values (IMUL), (2) input and output strings (INS and OUTS), (3) stack operations (POPA, PUSH immediate, PUSHA), (4) shifts and rotates with immediate count values, and (5) instructions for translating high-level language constructs (BOUND and ENTER). We give only the instructions in groups 1-4.
Format: IMUL destination, immediate
or
IMUL destination, source, immediate
Operation: For the first format, the immediate operand, which must be a byte, is multiplied with the destination, which must be a 16-bit register. The lower 16 bits of the result are stored in the register. For the second format, the 8- or 16-bit immediate operand is multiplied with the source operand, which may be a 16-bit register or a memory word. The lower 16 bits of the result are stored in the destination, which must be a 16-bit register. The flags CF and OF are set if the upper half of the product is not the sign extension of the lower half.
Flags: Affected—CF, OF
Undefined—AF, PF, SF, ZF
Encoding: 011010s1 mod reg r/m data [data if s = 0]
Transfers a byte or word string element from a port to memory. Multiple bytes or words can be transferred if the prefix REP is used.
Format: INS destination-string, port
or
INSB
or
INSW
Operation: A byte or word is transferred from the port designated by DX to the location ES:DI. DI is then incremented by 1 (or 2 for word strings) if DF = 0; otherwise, DI is decremented by 1 (or 2 for word strings).
Flags: Affected—none
Encoding: 0110110w
Transfers a byte or word string element from memory to a port. Multiple bytes or words can be transferred if the prefix REP is used.
Format: OUTS port, source-string
or
OUTSB
or
OUTSW
Operation: A byte or word is transferred from memory located at DS:SI to the port designated by DX. SI is then incremented by 1 (or 2 for word strings) if DF = 0; otherwise, SI is decremented by 1 (or 2 for word strings).
Flags: Affected—none
Encoding: 0110111w
Format: POPA
Operation: The registers are popped in the order DI, SI, BP, SP, BX, DX, CX, and AX.
Flags: Affected: none
Encoding: 01100001
61
Format: PUSH data
Operation: The data may be 8 or 16 bits. A data byte is sign-extended to 16 bits before it is pushed onto the stack.
Flags: Affected—none
Encoding: 011010s0 data [data if s = 0]
Format: PUSHA
Operation: The registers are pushed in the order AX, CX, DX, BX, original SP, BP, SI, and DI.
Flags: Affected—none
Encoding: 01100000
60
The general format of shifts and rotates with immediate count values is
Opcode destination, immediate
where opcode is any one of RCL, RCR, ROL, ROR, SAL, SHL, SAR, and SHR. If the immediate value is 1, then the instruction is the same as an 8086 instruction. For an immediate value of 2- 31, the instruction operates like an 8086 instruction in which CL contains the value. The 80286 does not allow a constant count value to be greater than 31.
The encodings for immediate values of 2- 31 are
RCL 1100000w mod 010 r/m
RCR 1100000w mod 011 r/m
ROL 1100000w mod 000 r/m
ROR 1100000w mod 001 r/m
SAL/SHL 1100000w mod 100 r/m
SAR 1100000w mod 111 r/m
SHR 1100000w mod 101 r/m
The real- mode 80386 instruction set includes all real- mode 80286 instructions plus their 32- bit extensions, together with six groups of new instructions, (1) bit scans, (2) bit tests, (3) move with extensions, (4) set byte on condition, (5) double- precision shifts, and (6) move to or from special registers. We only give instructions in groups 1- 5.
The bit scan instructions are BSF (bit scan forward) and BSR (bit scan reverse). They are used to scan an operand to find the first set bit, and they differ only in the direction of the scan.
Formats: BSF destination, source
or
BSR destination, source
Operation: The destination must be a register; the source is either a register or a memory location. Both must be words or both doublewords. The source is scanned for the first set bit. If the bits are all 0, then ZF is set; otherwise, ZF is cleared and the destination register is loaded with the bit position of the first set bit. For BSF the scanning is from bit 0 to the msb, and for BSR the scanning is from the msb to bit 0.
Flags: Affected: ZF
Encoding:
BSF 00001111 10111100 mod reg r/m
BSR 00001111 10111101 mod reg r/m
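The two scan directions can be compared with a short Python model. It is a sketch that assumes the behavior of 80386 hardware, in which ZF is set when the source is zero and the destination is then left undefined (represented here by `None`); `bsf` and `bsr` are hypothetical function names.

```python
def bsf(source, bits=16):
    """Model BSF: returns (bit index or None, ZF), scanning from bit 0 upward."""
    source &= (1 << bits) - 1
    if source == 0:
        return None, 1              # destination undefined, ZF = 1
    index = 0
    while not (source >> index) & 1:
        index += 1
    return index, 0

def bsr(source, bits=16):
    """Model BSR: returns (bit index or None, ZF), scanning from the msb downward."""
    source &= (1 << bits) - 1
    if source == 0:
        return None, 1
    return source.bit_length() - 1, 0
```

For the word 0050h, which has bits 4 and 6 set, the forward scan finds bit 4 and the reverse scan finds bit 6.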
The bit test instructions are BT (bit test), BTC (bit test and complement), BTR (bit test and reset), and BTS (bit test and set). They are used to copy a bit from the destination operand to the CF so that the bit can be tested by a JC or JNC instruction.
Format: BT destination, source
or
BTC destination, source
or
BTR destination, source
or
BTS destination, source
Operation: The source specifies a bit position in the destination to be copied to the CF. BT simply copies the bit to CF, BTC copies the bit and complements it in the destination, BTR copies the bit and resets it in the destination, and BTS copies the
bit and sets it in the destination. The source is either a 16- bit register, 32- bit register, or an 8- bit constant. The destination may be a 16- bit or 32- bit register or memory. If the source is a register, then the source and destination must have the same size.
Flags: Affected—CF
Encoding: Source is 8- bit immediate data:
BT
00001111 10111010 mod 100 r/m
BTC
00001111 10111010 mod 111 r/m
BTR
00001111 10111010 mod 110 r/m
BTS
00001111 10111010 mod 101 r/m
Source is register:
BT
00001111 10100011 mod reg r/m
BTC
00001111 10111011 mod reg r/m
BTR
00001111 10110011 mod reg r/m
BTS
00001111 10101011 mod reg r/m
The move with extension instructions are MOVSX (move with sign-extend) and MOVZX (move with zero-extend). These instructions move a small source into a bigger destination and fill the upper half with the sign bit or with zeros.
Format: MOVSX destination, source
or
MOVZX destination, source
Operation: The destination must be a register; the source is either a register or memory. If the source is a byte (or word), the destination is a word (or doubleword). MOVSX copies and sign-extends the source into the destination. MOVZX copies and zero-extends the source into the destination.
Flags: Affected—none
Encoding:
MOVSX
00001111 1011111w mod reg r/m
MOVZX
00001111 1011011w mod reg r/m
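The difference between the two extensions shows up only when the msb of the source is 1. The following Python sketch models the byte-to-word case; `movsx8to16` and `movzx8to16` are hypothetical names.

```python
def movzx8to16(source):
    """Model MOVZX from a byte to a word: the upper byte is filled with zeros."""
    return source & 0xFF

def movsx8to16(source):
    """Model MOVSX from a byte to a word: the upper byte repeats the sign bit."""
    source &= 0xFF
    return source | 0xFF00 if source & 0x80 else source
```

A source of 80h becomes 0080h (128) under zero extension and 0FF80h (-128) under sign extension, while 7Fh becomes 007Fh in both cases.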
The set byte on condition instructions set the destination byte to 1 if the condition is true and clear it if the condition is false.
Format: SET(condition) destination
Operation: The destination is either an 8- bit register or memory. It is set to 1 if condition is true and to 0 if condition is false.
Flags: Affected—none
Encoding: 00001111 opcode mod 000 r/m
(the opcode byte is given in the following in hex)
The double- precision shift instructions are SHLD (double- precision shift left) and SHRD (double- precision shift right).
Format: SHLD destination, source, count
or
SHRD destination, source, count
Operation: The destination is either register or memory, the source is a register, and both must be of the same size (either 16 or 32 bits). Count is either an 8- bit constant or CL. The count specifies the number of shifts for the destination. Instead of shifting in zeros as in the case of the single- precision shifts, the bits shifted into the destination are from the source. However, the source is not altered. The SF, ZF, and PF flags are set according to the result; CF is set to the last bit shifted out; OF and AF are undefined.
Flags:
Affected—SF, ZF, PF, CF
Undefined—OF, AF
Encoding: Count is immediate data:
SHLD
00001111 10100100 mod reg r/m [disp] data
SHRD
00001111 10101100 mod reg r/m [disp] data
Count is CL:
SHLD
00001111 10100101 mod reg r/m [disp]
SHRD
00001111 10101101 mod reg r/m [disp]
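The way the source supplies the incoming bits can be illustrated with a Python model of the 16-bit forms. This is a sketch that accepts only counts from 1 to 16 and models only the result and CF; `shld16` and `shrd16` are hypothetical names.

```python
def shld16(dest, source, count):
    """Model 16-bit SHLD: returns (new destination, CF)."""
    if not 1 <= count <= 16:
        raise ValueError("this sketch handles counts 1..16 only")
    combined = ((dest & 0xFFFF) << 16) | (source & 0xFFFF)
    shifted = combined << count
    result = (shifted >> 16) & 0xFFFF   # source bits enter from the right
    cf = (shifted >> 32) & 1            # last bit shifted out of the destination
    return result, cf

def shrd16(dest, source, count):
    """Model 16-bit SHRD: returns (new destination, CF)."""
    if not 1 <= count <= 16:
        raise ValueError("this sketch handles counts 1..16 only")
    combined = ((source & 0xFFFF) << 16) | (dest & 0xFFFF)
    result = (combined >> count) & 0xFFFF   # source bits enter from the left
    cf = (dest >> (count - 1)) & 1          # last bit shifted out of the destination
    return result, cf
```

With a destination of 1234h, a source of 0ABCDh, and a count of 4, SHLD produces 234Ah with CF = 1, and SHRD produces 0D123h with CF = 0. The source value itself is never modified.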
This appendix describes the most important assembler directives. To explain the syntax, we will use the following notation:
| separates choices
{ } enclosed items are optional
[ ] repeat the enclosed items 0 or more times
If syntax is not given, the directive has no required or optional arguments.
Tells the assembler to arrange segments in alphabetical order. Placed before segment definitions.
Syntax: ASSUME segment_register:name [,segment_register:name]
Tells the assembler to associate a segment register with a segment name.
Example: ASSUME CS:C_SEG, DS:D_SEG, SS:S_SEG, ES:D_SEG
Note: the name NOTHING cancels the current segment register association. In particular, ASSUME NOTHING cancels segment register associations made by previous ASSUME statements.
Syntax: .CODE {name}
A simplified segment directive (MASM 5.0) for defining a code segment.
COMM
Syntax: COMM definition [,definition]
where definition has the syntax NEAR|FAR label:size{:count}
label is a variable name.
size is BYTE, WORD, DWORD, QWORD, or TBYTE
count is the number of elements contained in the variable (default = 1)
Defines a communal variable; such a variable has both PUBLIC and EXTRN attributes, so it can be used in different assembly modules.
Examples: COMM NEAR WORD1:WORD
COMM FAR ARR1:BYTE:10, ARR2:BYTE:20
Syntax: COMMENT delimiter {text} {text} delimiter {text}
where delimiter is the first nonblank character after the COMMENT directive. Used to define a comment. Causes the assembler to ignore all text between the first and second delimiters. Any text on the same line as the second delimiter is ignored as well.
COMMENT * Uses an asterisk as the delimiter. All this text is ignored * COMMENT
A simplified segment directive for defining a segment containing data that will not be changed by the program. Used mostly in assembly language routines to be called by a high- level language.
Syntax: .CREF {name [,name]}
.XCREF {name [,name]}
In the generation of the cross-reference (.CRF) file, .CREF directs the generation of cross-referencing of names in a program. .CREF with no arguments causes cross-referencing of all names. This is the default directive.
.XCREF turns off cross-referencing in general, or just for the specified names.
.XCREF ;turns off cross-referencing
.CREF ;turns cross-referencing back on
.XCREF NAME1,NAME2 ;turns off cross-referencing
;of NAME1 and NAME2
.DATA and .DATA?
Simplified segment directives for defining data segments. .DATA defines an initialized data segment and .DATA? defines an uninitialized data segment. Uninitialized data consist of variables defined with "?". .DATA? is used mostly with assembly language routines to be called from a high-level language. For stand-alone assembly language programs, the .DATA segment may contain uninitialized data.
DB define byte
DD define doubleword (4 bytes)
DF define farword (6 bytes); used only with 80386 processor
DQ define quadword (8 bytes)
DT define tenbyte (10 bytes)
DW define word (2 bytes)
Syntax: {name} directive initializer [,initializer]
where name is a variable name. If name is missing, memory is allocated but no name is associated with it. Initializer is a constant, constant expression, or ?. Multiple values may be defined by using the DUP operator. See Chapter 10.
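The following fragment is a brief sketch of how these directives might appear in a data segment. All variable names and values are chosen only for illustration.

```asm
.DATA
COUNT   DB      10              ;one byte, initialized to 10
TOTAL   DW      ?               ;one word, uninitialized
MSG     DB      'HELLO','$'     ;six bytes: five letters and '$'
TABLE   DW      5 DUP (0)       ;five words, each initialized to 0
        DB      3 DUP (?)       ;three uninitialized bytes, no name
BIG     DD      100000          ;one doubleword (4 bytes)
```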
DOSSEG
Tells the assembler to adopt the DOS segment-ordering convention. For a SMALL model program, the order is code, data, stack. This directive should appear before any segment definitions.
ELSE
Used in a conditional block. The syntax is
Condition
    statements1
ELSE
    statements2
ENDIF
If Condition is true, statements1 are assembled; if Condition is false, statements2 are assembled. See Chapter 13 for the form of Condition.
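As a sketch of such a conditional block, the program below uses the IF form of Condition, which is true when its expression is nonzero. The symbol DEBUG is a hypothetical switch introduced only for this illustration; changing its value to 0 and reassembling would select the ELSE branch instead.

```asm
.MODEL  SMALL
.STACK  100H
DEBUG   =       1               ;hypothetical switch: 1 = debugging version
.CODE
MAIN    PROC
        IF      DEBUG
        MOV     AH,2            ;assembled only when DEBUG is nonzero:
        MOV     DL,'*'          ;display a marker character
        INT     21H
        ELSE
        NOP                     ;assembled only when DEBUG is zero
        ENDIF
        MOV     AH,4CH          ;return to DOS
        INT     21H
MAIN    ENDP
        END     MAIN
```

Only one of the two branches ever reaches the object file; the choice is made at assembly time, not at run time.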
END
Syntax: END {start_address}
Ends a source program. Start_address is a name where execution is to begin when the program is loaded. For a program with only one source module, start_address would ordinarily be the name of the main procedure or a label indicating the first instruction. For a program with several modules, each module must have an END but only one of them can specify a start_address.
ENDIF
Ends a conditional block. See Chapter 13.
ENDM
Ends a macro or repeat block. See MACRO and REPT.
ENDP
Ends a procedure. See PROC.
ENDS
Ends a segment or structure. See SEGMENT and STRUC.
EQU
Syntax: There are two forms, numeric equates and string equates. A numeric equate has the form
name EQU numeric_expression
A string equate has the form
name EQU <string>
The EQU directive assigns the expression following EQU to the constant symbol name. Numeric_expression must evaluate to a number. The assembler replaces each occurrence of name in a program by numeric_expression or string. No memory is allocated for name. Name may not be redefined.
Examples:
MAX EQU 32767
MIN EQU MAX - 10
PROMPT EQU <'type a line of text:$'>
ARG EQU <[DI + 2]>
Use in a program:
.DATA
MSG DB PROMPT
.CODE
MAIN PROC
MOV AX,MIN ;equivalent to MOV AX,32757
MOV BX,ARG ;equivalent to MOV BX,[DI+2]
= (equal)
Syntax: name = expression
where expression is an integer, a constant expression, or a one- or two-character string constant.
The directive = works like EQU, except that names defined with = can be redefined later in a program.
Examples:
CTR = 1
MOV AX,CTR ;translates to MOV AX,1
CTR = CTR + 5 ;CTR now has the value 6
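Because a name defined with = can be redefined, it can serve as an assembly-time counter. The sketch below, with the illustrative names N and TABLE, combines it with a repeat block (see REPT) to generate a small table.

```asm
.DATA
N       =       1               ;assembly-time counter, starts at 1
TABLE   LABEL   BYTE            ;name for the bytes generated below
        REPT    3               ;repeat the enclosed statements 3 times
        DB      N               ;generates DB 1, DB 2, DB 3 in turn
N       =       N + 1           ;redefinition is legal with =
        ENDM
```

The same redefinition written with EQU would be rejected by the assembler, since an EQU name may not be redefined.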