GNU 8086 Assembly tutorial
marco corvi
oct 2003
version 1.1
-------------------------------------------------------------

This tutorial focuses on AT&T assembly as implemented by the GNU
gas assembler for the 8086.

Contents.

- MEMORY AND IO PORTS
- - 386 Memory Map
- - 386 IO Ports Map
- - BIOS Interrupts
- DIRECTIVES
- - Assembly Directives
- - Other Directives
- REGISTER SET
- - Intel Register Set
- - Flag Register
- - Floating Point Registers
- INSTRUCTION SET
- - Move Instructions
- - Transfer Instructions
- - Register Exchange
- - Conversion Instructions
- - Loading Instructions
- - Memory Instructions
- - Stack Instructions
- - Interrupt Instruction
- - Input/Output Instructions
- - Arithmetics
- - Binary Operations
- - Bitwise Instructions
- - Bit Instructions
- - Comparison
- - Flow Control
- - Flow Control for String Operations
- - Jump Instructions
- - Cycle Instructions
- - Flag Instruction
- - Binary Coded Decimal
- STACK MANAGEMENT
- FLOATING POINT INSTRUCTIONS
- INLINE ASSEMBLY
==================================================================
i386 Architecture

Selector (16 bits)

            15-3    2      1-0
  Selector  Index   Table  PL

  Table: 0 = gdt, 1 = ldt

Segment descriptor

          63-56   55 54 53 52   51-48    47 46-45 44-40  39-16   15-0
  Data    Base_H  G  B  0  Avl  Limit_H  P  DPL   10EWA  Base_L  Limit_L
  Code    Base_H  G  D  0  Avl  Limit_H  P  DPL   11CRA  Base_L  Limit_L
  System  Base_H  G  x  0  Avl  Limit_H  P  DPL   0xxxx  Base_L  Limit_L
  TSS     Base_H  G  0  0  Avl  Limit_H  P  DPL   010b1  Base_L  Limit_L

  G = granularity
  B = big
  D = default
  E = expand downward
  W = writable
  A = accessed
  C = conforming
  R = readable
  b = busy
  P = present (P=1)

Gates

             63-48     47 46-45 44-40  39-37  36-32  31-16     15-0
  Call gate  Offset_H  P  DPL   01100  000    count  selector  offset_L
  Task gate  unused    P  DPL   00101  unused        selector  unused
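The selector layout above can be sketched in C. This is only an illustration: the struct and function names are invented here, and the code assumes the usual 16-bit selector with the index in bits 15-3, the table indicator in bit 2 and the privilege level in bits 1-0.

```c
#include <stdint.h>

/* Fields of a 16-bit segment selector (names are illustrative only). */
struct selector_fields {
    unsigned index;  /* bits 15-3: entry number in the descriptor table */
    unsigned table;  /* bit 2: 0 = gdt, 1 = ldt */
    unsigned pl;     /* bits 1-0: privilege level */
};

/* Split a selector into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (unsigned)sel >> 3;
    f.table = ((unsigned)sel >> 2) & 1u;
    f.pl    = (unsigned)sel & 3u;
    return f;
}
```

For instance, the selector 0x0023 decodes to entry 4 of the gdt with privilege level 3.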
==================================================================
Memory and IO Ports

386 Memory Map

00000-003FF  Interrupt descriptors. In real mode: four bytes for each
             interrupt (offset and segment of the handler). In protected
             mode each entry of the IDT is an eight-byte gate:

               63-48        47  46-45  44-40  39-37  36-32     31-16     15-0
               offset high  P   DPL    flags  0      reserved  selector  offset low

             DPL (descriptor privilege level): a software interrupt is
             served only if the current privilege level is at least as
             privileged as the DPL (numerically less than or equal).
             Kernel DPL is 00, user DPL is 11.
             The flag (type) is
               01110 interrupt gate
               01111 trap gate
               00101 task gate

  Interrupts are identified by a number between 0 and 255.
    0x00 - 0x1F  exceptions and non-maskable interrupts
    0x20 - 0x2F  maskable interrupts
    0x30 - 0xFF  software interrupts

    0-7    80x86 interrupts
    8-F    8259 interrupts
    10-1F  BIOS interrupts
    20-2F  DOS (???)
    30-4F  Device and assignable interrupts

  Linux reassigns the interrupts, writing the address of the handler
  in the IDT (the DOS interrupts are in parentheses).

  80x86
    INT 0   Divide by zero
    INT 1   Single Step (Debug)
    INT 2   NMI Non-Maskable Interrupt
    INT 3   Breakpoint
    INT 4   Overflow
    INT 5   Bound check (Print Screen)
    INT 6   Invalid Op. Code (reserved)
    INT 7   Device not available (reserved)
  8259
    INT 8   Double fault (Timer 18.2 Hz)
    INT 9   Coprocessor segment overrun (BIOS Keyboard Interrupt)
    INT A   TSS not valid (reserved)
    INT B   Segment not present (reserved)
    INT C   Stack exception (reserved)
    INT D   General protection (reserved)
    INT E   Page Fault (Disk Interrupt)
    INT F   reserved
  BIOS
    INT 10  Floating-point error (Video Interrupt)
    INT 11  Alignment check (Equipment Check)
    INT 12  Machine check (Memory Check)
    INT 13  Disk i/o
    INT 14  RS232 Interrupt
    INT 15  unused
    INT 16  Keyboard
    INT 17  Printer
    INT 18  ROM BASIC
    INT 19  Bootstrap load
    INT 1A  Time of day
    INT 1B  Control on keyboard break
    INT 1C  Control on timer interrupt
    INT 1D  Video init. table pointer
    INT 1E  Disk param. table pointer
    INT 1F  ASCII char. gener. table pointer
  Devices
    INT 33  Mouse Interrupt

  Some functions of the mouse interrupt (int 33).
  The function code is passed in AX, parameters in the other registers,
  return values are in registers.

  function  description           parameters                       return
  ---------------------------------------------------------------------------
  0x00      initialize            -                                AX != 0 if o.k.
                                                                   BX number of buttons
  0x01      incr. display cnt     -                                -
  0x02      decr. display cnt     -                                -
  0x03      get pos. and btns     -                                BX btn status (0=L, 1=R, 2=C)
                                                                   CX:DX h:v coord
  0x04      moveto                CX:DX h:v pos                    -
  0x05      press status          BX btn to check                  AX btns status
                                                                   BX press count
                                                                   CX:DX h:v pos of last press
  0x06      release status        BX btn to check                  similar as above
  0x07      set h limits          CX:DX min:max                    -
  0x08      set v limits          CX:DX min:max                    -
  0x09      set graphic shape     BX:CX h:v hotsp                  -
                                  ES:DX mask seg:off
  0x0A      set text shape        BX type (0=sw 1=hw)              -
                                  CX screen mask / start scanline
                                  DX cursor mask / end scanline
  0x0B      move since last call  -                                CX:DX h:v move
  0x0C      set handler           CX event mask                    AX event mask
                                  ES:DX handler                    BX btns status
                                                                   CX:DX h:v coord
  0x2a      status + hotspot                                       BX:CX h:v off of hotspot
  ---------------------------------------------------------------------------

00400-00407  RS232 addresses of COM1, COM2, COM3, COM4
00408-0040F  Addresses of the parallel printer ports
             Example: if LPT1=0378 then 0408=0x78 and 0409=0x03
00413        System memory size (units of 0x400=1 KB)
00417        Keyboard Flag Register: the bit-map (1=on 0=off) is
               Insert Caps Num Scroll Alt Ctrl LShift RShift
0041A        Head of Keyboard Buffer (next char to get)
0041C        Tail of Keyboard Buffer (next unused position)
0041E        Keyboard Buffer
             contains 15 usable entries, each consisting of two bytes
             for ASCII/scancode
0043F        Drive Motor State: bit-map (1 for running) is
               anyone ... ... ... Drive_D Drive_C Drive_B Drive_A
0046C        Timer (dword) incremented by IRQ0 (int 8)
00475        Number of HardDisks on the system
00480        Pointer to Start of Keyboard Buffer (offset by 0x0400)
00482        Pointer to End of Keyboard Buffer
04DB9        Start of Device drivers, Buffers, File control entries,
             if in Low Memory
053F0        Start of resident COMMAND.COM (???)
00700-9FFFF  Program and Data area
07C00        Boot record
A0000-AFFFF  EGA/VGA Video Graphics Buffer Area
B0000-B0FFF  Monochrome Video Buffer Area
B8000-B8FA0  Color Text Screen Area
C8000        Address of BIOS init routine
F0000-FFFFF  ROM BIOS information area
FFFF0        Power on Reset Vector
             points to the address CS:IP when the PC is first turned on.
             A call to this vector will reboot the system.
FFFFE        Model Value: FF=PC, FE=XT, FC=AT

==================================================================
386 System Descriptor Table

   0  reserved
   1  available 16-bit TSS     Sys segment
   2  LDT                      Sys segment
   3  active 16-bit TSS        Sys segment
   4  16-bit call gate         Gate
   5  Task gate                Gate
   6  16-bit interrupt gate    Gate
   7  16-bit trap gate         Gate
   8  reserved
   9  available 32-bit TSS     Sys segment
  10  reserved
  11  active 32-bit TSS        Sys segment
  12  32-bit call gate         Gate
  13  reserved
  14  32-bit interrupt gate    Gate
  15  32-bit trap gate         Gate

==================================================================
386 IO Ports Map

0x0020  8259 Port Address
        (used to signal end-of-interrupt when 0x20 is written to it)
0x0021  8259 Interrupt Mask Register (1:disabled 0:enabled)
        Masks for interrupts 0-7 (bit j to int j):
          IRQ0 System Timer
          IRQ1 Keyboard
          IRQ2 Secondary I/O Channel
          IRQ3 COM2:COM4
          IRQ4 COM1:COM3
          IRQ5 LPT2 Parallel Printer (I/O Channel available for user)
          IRQ6 Floppy Controller (DO NOT USE)
          IRQ7 LPT1 Parallel Printer
0x0040  8253 Programmable Interval Timer Channel-0
        A 16-bit counter that runs through its 65536 values approx.
        every 55 msec (18.2 times per second).
        This port is routed through IRQ0 to INT8
0x0041  8253 Programmable Interval Timer Channel-1
        Interrupts the direct memory access controller as part of the
        memory refresh cycle. DO NOT MODIFY IT.
0x0042  8253 Programmable Interval Timer Channel-2
        Used to specify the speaker period.
0x0060  Keyboard Scancode
        Keyboard is interfaced to this port: each key-press writes a
        scancode to it:

        Scancode Table (Partial)
           1 Escape   11 0         21 Y       31 S   41 ~   51 ,
           2 1        12 -         22 U       32 D          52 .
           3 2        13 =         23 I       33 F          53 /
           4 3        14 BackSp.   24 O       34 G   44 Z
           5 4        15 Tab       25 P       35 H   45 X
           6 5        16 Q         26 [       36 J   46 C
           7 6        17 W         27 ]       37 K   47 V   57 Space
           8 7        18 E         28 Enter   38 L   48 B
           9 8        19 R                    39 ;   49 N
          10 9        20 T         30 A       40 '   50 M

0x0061  8255 Interface Channel
          bit-7 keyboard enable (0)
          bit-6 keyboard clicking on (1)
          bit-4 RAM parity error enable (0)
          bit-1 speaker enable (1)
          bit-0 8253 Channel-2 clock enable (1)
0x0200-0x0207  Joystick Controller (0:pressed, 1:no-contact)
0x0201  bits (j:joystick, b:button)
          7     6     5     4     3    2    1    0
          jB_b2 jB_b1 jA_b2 jA_b1 jB_y jB_x jA_y jA_x
        pinout (15-pin female D-shell):
          1, 8, 9, 15: power (+5 Volt)
          4, 5, 12: ground
          2:jA_b1  7:jA_b2  10:jB_b1  14:jB_b2
          3:jA_x   6:jA_y   11:jB_x   13:jB_y
0x0278-0x027F  Possible locations of parallel printer
0x02F8-0x02FF  Secondary Asynchr. Communication Adapter (COM2)
0x0378-0x037F  Possible locations of parallel printer
0x0378  Data byte:
          bits 7 6 5 4 3 2 1 0
          pins 9 8 7 6 5 4 3 2
0x0379  Printer Status: bits (pins in parentheses)
          7(11-i):busy (0)
          6(10-i):ack (1)
          5(12-i):out-of-paper (1)
          4(13-i):SLCT online (1)
          3(15-i):error (1)
          2:IRQ status (not a pin)
          1:reserved
          0:reserved
0x037A  bits (pins in parentheses)
          4:enable parallel port IRQ (ie, low-to-high on ACK) (1)
          3(17-o):SLCT_IN printer reads output (1)
          2(16-o):initialize printer (0)
          1(14-o):auto line feed (1)
          0(1-io):STROBE (0), output data to printer (1)
0x03B8-0x03BF  Possible locations of parallel printer
0x03F8-0x03FF  Primary Asynchr. Communication Adapter (COM1)
0x03F8  RS232 Transmit/Receive buffer:
          store char to transmit here (if bit 7 of port 3FB=0)
          also receive char here (if bit 7 of port 3FB=0)
          LSB of baud rate divisor (if bit 7 of port 3FB=1)
0x03F9  RS232 Interrupt Enable Register
          MSB of baud rate divisor (if bit 7 of port 3FB=1)
0x03FB  RS232 Line Control Register
          bit-7 Divisor Latch Access Bit (1)
          bit-6 Break Enabled (1)
          bit-5 Stick Parity (1)
          bit-4 Parity 0:odd, 1:even
          bit-3 Parity 0:none, 1:parity
          bit-2 Stop bits 0:one 1:two
          bit-1,0 bits/char 00:five, 01:six, 10:seven, 11:eight
0x03FC  Modem Control Register
          bit-1 Activate RTS (1)
          bit-0 Activate DTR (1)
0x03FD  Line Status Register
          bit-6 Empty transmit shift register (1)
          bit-5 Empty transmit hold register (1)
          bit-4 Break error received (1)
          bit-3 Framing error received (1)
          bit-2 Parity error received (1)
          bit-1 Overrun error received (1)
          bit-0 Data received (1)
0x03FE  Modem Status Register
          bit-7 Receive line signal detected (1)
          bit-6 Ring indicator (1)
          bit-5 Data Set Ready (1)
          bit-4 Clear To Send (1)
          bit-3 Change in line signal detect (1)
          bit-2 Change in ring indicator (1)
          bit-1 Change in Data Set Ready (1)
          bit-0 Change in Clear To Send (1)

==================================================================
BIOS Interrupts

The function code is passed in AH. "<=" marks a returned value.

0x10 Video Functions
  AH=0x00 Set Display Function as
            AL=0000.0000 Black/White 40x25
            AL=0000.0001 16-color text 40x25
            AL=0000.0010 B/W 80x25
            AL=0000.0011 16-color 80x25
            AL=0000.0100 320x200 CGA 4-color graphics
            AL=0000.0110 640x200 CGA B/W
            AL=0000.1001 320x200 MCGA 16-color
            AL=0001.0000 640x350 EGA 16-color
            AL=0001.0010 640x480 VGA 16-color
            AL=0001.0011 320x200 VGA 256-color
  AH=0x02 Set Cursor Position
            BH=Page_Number, DH=Row, DL=Column
  AH=0x03 Get Cursor Position
            BH=Page_Number, DH<=Row, DL<=Column
  AH=0x06 Clear Screen / Scroll Window
            AL=N.of_Scroll_Lines, BH=Attr.for_New_Lines,
            CH=Upper_Left_Row, CL=Upper_Left_Column
            DH=Lower_Right_Row, DL=Lower_Right_Column
  AH=0x08 Read Character at Cursor
            BH=Page_Number, AH<=Attribute, AL<=ASCII_code
  AH=0x09 Write Attribute and Character at Cursor
            BH=Page_Number, BL=Attribute, AL=ASCII_code
  AH=0x0A Write Character(s) at Cursor
            BH=Page_Number, BL=Color, AL=ASCII_code, CX=N.of_Characters
  AH=0x0B Set Background Color
            BH=0x00, BL=Color
  AH=0x0C Set Pixel
            BH=Page, AL=Color, CX=Column, DX=Row
  AH=0x0D Read Pixel
            BH=Page, AL<=Color, CX=Column, DX=Row
  AH=0x0F Get Video Mode
            BH<=Page, AH<=N.of_Columns, AL<=Display_Mode
  AH=0x12 BL=0x10 Configuration Information
            BH<={0:color, 1:mono},
            BL<=VGA_Memory{0:64K, 1:128K, 2:192K, 3:256K}
  AH=0x12 BL=0x32 Enable/Disable video
            AL={0:enable, 1:disable}, AL<=0x12 (on success)

  Video colors
    0 black     4 red       8 gray          C light red
    1 blue      5 magenta   9 light blue    D light magenta
    2 green     6 brown     A light green   E yellow
    3 cyan      7 white     B light cyan    F bright white

0x14 RS232 Functions
  AH=0x00 Set Parameters on Serial Port
            DX={0:COM1, 1:COM2}, AL=Parameters as follows:
              bit7-5 (baud):  010-> 300  011-> 600  100->1200
                              101->2400  110->4800  111->9600
              bit4-3 (parity) 00->None   01->Odd    11->Even
              bit2   (stop)   0->1       1->2
              bit1-0 (bits)   10->7      11->8
  AH=0x01 Transmit Character
            AL=Char_to_Transmit, AH<=(bit7=1 if unable to send)
  AH=0x02 Receive Character
            AL<=Char_received, AH<=(greater than zero if error)
  AH=0x03 Get Line and Modem control Status
            AH<= bit7 timeout
                 bit6-0 see port 3FD
            AL<= bit7-0 see port 3FE

0x16 Keyboard Functions
  AH=0x00 Read Character (pause until key is pressed)
            AH<=scancode, AL<=ASCII_code
  AH=0x01 Keyboard Status (no pause)
            AH<=Keyboard scancode (if ZF=0)
            AL<=Keyboard ASCII_code (if ZF=0)
  AH=0x02 Keyboard Flag Status
            AL<=Keyboard Flags :
              bits Insert Caps Num Scroll Alt Ctrl LShift RShift
  AH=0x03 Set Repeat Rate
            AL=0x05
            BH=Delay before first repeat:
               0:250ms, 1:500ms, 2:750ms, 3:1000ms
            BL=Repeat Rate :
               eg. 0:30.0, 1:26.7, 2:24.0, ..., 30:2.1, 31:2.0

0x19 Reboot Function
  ...

0x33 Mouse Function
  AX=0x03 Button Press
  AX=0x0B Mouse Movement
            CX={neg.:left, pos.:right}, DX={neg.:up, pos.:down}
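As an illustration of the parameter byte of the "Set Parameters on Serial Port" function of int 0x14, the bit fields listed above can be packed with a small C helper. This is a sketch only: the enum and the function name are made up for this example, and the helper returns -1 for any setting that is not in the table.

```c
#include <stdint.h>

/* Parity codes as listed for bit4-3 (enum name is illustrative). */
enum serial_parity { PARITY_NONE = 0, PARITY_ODD = 1, PARITY_EVEN = 3 };

/* Build the AL parameter byte for int 0x14, function 0x00.
 * Returns the byte (0..255), or -1 if a setting is not in the table. */
static int serial_param_byte(unsigned baud, enum serial_parity parity,
                             unsigned stop_bits, unsigned data_bits)
{
    unsigned baud_code, stop_code, bits_code;

    switch (baud) {                 /* bit7-5 */
    case 300:  baud_code = 2; break;
    case 600:  baud_code = 3; break;
    case 1200: baud_code = 4; break;
    case 2400: baud_code = 5; break;
    case 4800: baud_code = 6; break;
    case 9600: baud_code = 7; break;
    default:   return -1;
    }
    if (parity != PARITY_NONE && parity != PARITY_ODD && parity != PARITY_EVEN)
        return -1;
    if (stop_bits == 1)      stop_code = 0;   /* bit2 */
    else if (stop_bits == 2) stop_code = 1;
    else                     return -1;
    if (data_bits == 7)      bits_code = 2;   /* bit1-0 */
    else if (data_bits == 8) bits_code = 3;
    else                     return -1;

    return (int)((baud_code << 5) | ((unsigned)parity << 3) |
                 (stop_code << 2) | bits_code);
}
```

With these codes, 9600 baud, no parity, one stop bit and eight data bits give the byte 0xE3.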
==================================================================
Directives

Assembly Directives

In the listings of this section the text after "#" is a remark;
N stands for a number.

    #                      # starts a single-line comment
                           # (";" separates statements on the same line)
    /* starts a multiline comment (can't be nested) */

    .file "..."            # refers to the source file (useless)
    .version "..."
    .copyright "..."
    .include "file"        # includes "file" in the compilation
                           # this is equivalent to compiling several files
                           # together; it is useful for including header files
    .block N               # reserves N blocks of space
    .blockz N              # also initializes them to zero
    .ascii "STRING"        # assembles the string literal(s) in consecutive addresses
    .asciz "STRING"        # also 0-terminates each string

    .data                  # starts a data section
    .data SUBSECT          # the statements that follow are assembled at the end of SUBSECT
    .align N               # defines the byte alignment in the section, eg, 4
    .balign N              # does a boundary alignment
    .type name, @object
    .size name, N
    label:                 # a colon immediately follows a label definition
    .size name, label-name
    .ident "..."
    name: .long value      # initializes the object "name" to "value" of type ".long"
    name: .string "..."    # initializes the object "name" to "..." of type "string"

    .byte EXPR             # one-byte value expression(s)
    .word EXPR
    .short EXPR            # two-byte integer
    .hword EXPR            # same as .short
    .int EXPR
    .long EXPR             # same as .int
    .float NUMS
    .single NUMS           # same as .float
    .double NUMS
    .string EXPR           # ascii string
    .octa BIGNUMS
    .quad BIGNUMS

    .section NAME, FLAGS, @TYPE
                           # start a section
                           # FLAGS is optional (a: allocatable, w: write, x: exec)
                           # TYPE is optional (@progbits: has data, @nobits: no data)
                           # (COFF has different FLAGS, and no TYPE)
    .struct EXPR           # switch to absolute section and set offset to EXPR

    .text SUBSECT          # starts the code section
                           # if SUBSECT is omitted the number 0 is used
    .align N               # specify the alignment for the current section
    .align N, P            # can also give the padding value (if unspecified
                           # zero, or no-op instructions in a code section)
    .globl name            # define the routine "name" as global (available to "ld")
    .type name, @function
    .lcomm SYMBOL, LEN     # allocates LEN bytes for the local SYMBOL
                           # it goes in bss (is zeroed) and is not visible to "ld"
    name:
      ...                  # here goes the code
    label:
    .size name, label-name
    .ident "..."

    .if EXPR               # process the following if EXPR is not zero
    .ifeq .ifne .ifge .ifgt .ifle .iflt .ifeqs
    .ifc .ifnc
    .ifdef .ifndef .ifnotdef
    .elseif
    .else
    .endif                 # end of if directive

    .irp SYMBOL, VALUES    # evaluates the following statements assigning
                           # each VALUE to SYMBOL in turn
    .irpc SYMBOL, VALUES   # uses all the chars of the VALUES
    .rept NUM              # repeat the following statements NUM times
    .endr

    .equ SYMBOL, EXPR      # set SYMBOL to the value of EXPR
    .set SYMBOL, EXPR      # same
    .equiv SYMBOL, EXPR    # same but "as" signals an error if SYMBOL is already defined

    .internal              # define symbol visibility
    .hidden
    .protected

    .fill RPT, SZ, VAL     # produces RPT copies of SZ bytes (max 8, def. 1)
                           # with value VAL (def. 0)
    .skip SIZE, FILL       # if FILL is omitted, zero is used
    .space SIZE

    .func NAME             # begin of function (for debug info)
                           # all functions have return type void
    .endfunc               # end of function

    .comm SYMBOL, LEN      # define a common symbol (exported by "ld")
                           # can also give the alignment as third arg

    .err                   # error
    .print MESSAGE         # print message
    .fail EXPR             # generates (and prints) a warning or an error

    .macro                 # defines a macro
    .exitm                 # early exit from macro definition
    .purgem MACRO          # undef a macro

    .end                   # end of assembly, "as" does not process further

Other directives (practically useless):

    .abort                 # makes "as" stop the compilation
    .def NAME              # define debug info for symbol NAME (COFF only)
      .dim                 # (compiler generated)
      .size                # aux. debug info
      .tag STRUCTNAME      # aux. debug info
      .type INT
      .val ADDR
    .endef                 # end define
    .desc NAME, EXPR       # set the descriptor of symbol NAME to the value of EXPR
                           # (a.out only)
    .eject                 # eject listing
    .extern                # ignored: "as" takes all undefined symbols as extern
    .ident                 # ignored
    .lflags                # ignored
    .line EXPR             # set line number (a.out and COFF)
    .ln EXPR               # same
    .linkonce
    .list                  # enable listing
    .nolist                # disable listing
    .psize ROWS, COLS      # set page size
    .title TITLE
    .sbttl TITLE           # use as subtitle on the listing
    .mri VAL               # if VAL is not zero enter MRI mode
    .org NEW_LC            # advance location counter
    .p2align ...           # pad location counter to boundary alignment
    .scl CLASS             # storage class (COFF)

NOTES

[1] a routine named "main" must be present in any executable linked
    with the C runtime, as the runtime _start expects so.

[2] 386 assembly has the following units of storage:
      bit
      nibble    = 4 bits
      byte      = 8 bits
      word      = 2 bytes = 16 bits
      long      = 4 bytes = 32 bits (an integer)
      paragraph = 16 bytes = 128 bits
      page      = 16 paragraphs = 256 bytes = 2048 bits
    N.B. the size of the word can vary on some systems.

[3] Invocation of an assembly routine from C code assumes that
    - the routine parameters are pushed on the stack, starting with
      the last parameter (see the section "Stack Management");
    - the return value is in the register EAX.
======================================================================
Other i386 Directives

    .code16                # switch "as" to 16-bit mode (real mode or
                           # 16-bit protected mode)
    .code16gcc             # same but using 32-bit addressing and stack operations
    .code32                # switch "as" to 32-bit mode
    .arch CPU_TYPE         # specify the cpu (iX86, pentium, pentiumpro, k6, athlon)
==================================================================
Register Set

Intel Register Set

The processor manipulates data that are in its registers.
The following registers are available on Intel platforms:

32-bit registers:
    eax ebx ecx edx
    edi (destination index)
    esi (source index)
    ebp (frame pointer)
    esp (stack pointer)
  these are general purpose registers and can be used by the applications

16-bit registers (low ends):
    ax bx cx dx di si bp sp

8-bit registers (high/low parts):
    ah bh ch dh
    al bl cl dl

instruction pointer register: eip

flag register (extended 32 bits, lower 16 bits are usual flags)

segment registers (16-bit):
    cs (code segment)
    ds (data segment)
    ss (stack segment)
    es fs gs (extra segment)

control registers:
    cr0 Paging, Coprocessor type, Task switch, Emulate coproc., Math, Protection
    cr1 reserved
    cr2 page fault linear address
    cr3 page directory base register

debug registers: db0 db1 db2 db3 db6 db7

test registers: tr6 tr7

floating-point 80-bit registers: st(0) thru st(7)
  arranged in a stack with st(0) at the top (ie, entry slot)
  these registers are overloaded by the 8 MMX registers mm0 thru mm7

floating point 16-bit control register: fc
floating point 16-bit status register: fs
floating point ...
floating point ...
floating point ...

there are also 8 SSE registers: xmm0 thru xmm7

======================================================================
Flag Register

Flag-Register:

    bits 31-16:   **** **** **Id *LVR
    bits 15-0:    0Npp ODIT SZ0A 0P1C
                  FEDC BA98 7654 3210

    Flags
    ---------
    Id  Identification         OF  Overflow      SF  Sign
    AL  Alignment              DF  Direction     ZF  Zero
    VM  Virtual-8086 mode      IF  Interrupt     AF  Auxiliary
    RF  Resume                 TF  Trace         PF  Parity
    NT  Nested-task                              CF  Carry
    pp  I/O prot.

    ZF=1  if the result (eg, a&b) is 0x00
          ZF=0 if the result of the last instruction affecting ZF was not 0
    SF=1  if the result (eg, a&b) has bit-7 set
          SF is set to 1 by certain instructions if the result is negative
    PF=1  if the result (eg, a&b) has an even number of bits set
    CF=1  depends on the arithmetic operation
          CF=1 usually denotes an arithmetic over/underflow
    OF=1
    AF=1
    DF=1  user settable with cld, std
    IF=1  hardware interrupts are enabled

======================================================================
Floating point registers

FP Control:   **** rrpp **PU OZDI

  where:
    rr round:      00 to nearest or even,
                   01 down
                   10 up
                   11 truncate
    pp precision:  00 24-bits (IEEE float 32)
                   01 reserved
                   10 53-bits (IEEE double 64)
                   11 64-bits (IEEE 80)
    exceptions:    P precision
                   U underflow
                   O overflow
                   Z zero divide
                   D denormalize
                   I invalid op.
    the exception bits are masks: an interrupt is raised if a floating
    point condition occurs and the corresponding bit is cleared

FP status:    BCSS SCCC EFPU OZDI
               3    210

    B busy
    C coprocessor condition codes (C3, C2, C1, C0)
    S top of stack: phys_reg = top_stack + logical_reg
    E Exception flag (set if any error condition bit (UOZDI) is set)
    F Stack fault (overflow C1=1, or underflow C1=0)
    P Precision
    U Underflow
    O Overflow
    Z Zero divide
    D Denormalization
    I Invalid operation

    UOZDI are set if the corresponding condition exists.
    They are set and cleared independently of the control mask.

Floating point values in normalized form have the h.o. bit of the
mantissa (the bit before the point) set to 1. In the 32-bit and 64-bit
formats this bit is not stored. However, values close to zero must be
stored "denormalized".
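The bit layouts of the FP control and status words can be checked with a few C helpers. The following is a rough sketch with invented names; it assumes the layouts given above (rr in bits 11-10, pp in bits 9-8, the six exception bits in bits 5-0, top of stack in bits 13-11 of the status word) and, as an assumption of this example, wraps the physical register number modulo 8.

```c
#include <stdint.h>

/* Decoded fields of the FP control word (names are illustrative). */
struct fp_control_fields {
    unsigned rounding;   /* rr: 0 nearest, 1 down, 2 up, 3 truncate */
    unsigned precision;  /* pp: 0 24-bit, 2 53-bit, 3 64-bit */
    unsigned masks;      /* PUOZDI exception bits, bit 5 .. bit 0 */
};

static struct fp_control_fields decode_fp_control(uint16_t cw)
{
    struct fp_control_fields f;
    f.rounding  = ((unsigned)cw >> 10) & 3u;
    f.precision = ((unsigned)cw >> 8) & 3u;
    f.masks     = (unsigned)cw & 0x3Fu;
    return f;
}

/* Top-of-stack field (the three S bits) of the FP status word. */
static unsigned fp_status_top(uint16_t sw)
{
    return ((unsigned)sw >> 11) & 7u;
}

/* Physical register behind st(logical); the wrap modulo 8 is an
 * assumption of this sketch, there being eight registers. */
static unsigned fp_physical_reg(unsigned top, unsigned logical)
{
    return (top + logical) & 7u;
}
```

For example, the control word 0x037F decodes to round-to-nearest, 64-bit precision and all six exception bits set.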
======================================================================
Instruction Set

The assembly language is specified by its instruction set, ie, by the
commands that can be given to the processor. Each instruction consists
of an opcode (code of the operation) followed by the operands.
Certain opcodes have no operand. Thus the instruction format is

    operation source, destination

In AT&T syntax the first operand is the "source" and the second is the
"destination". This is different from Intel syntax, which lists the
operands the opposite way, "operation destination, source".

The source is unaffected; the destination is usually changed by the
instruction (in some cases the instruction does not change the
destination).

Memory references are

    immed_32(base_pointer, index_pointer, scale)

where scale can be 1, 2, 4, 8. The actual address is

    immed_32 + base_pointer + index_pointer * scale

On targets that prepend an underscore to C symbols (a.out, COFF),
C variables are referenced in asm by prefixing them with an
underscore (_); for example

    _array(%eax)

refers to the content of the element of the C variable "array" at
the byte offset held in eax. Another example,

    _var

refers to the C variable "var". On ELF targets (Linux) the C name is
used unchanged, as the examples in this tutorial do with "printf".

Opcodes are qualified by a type character: b, w, l for byte, word and
double word (long), respectively.
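The address computation of a memory reference can be written down as a small C function. This is only a sketch: the function names are invented for this example, and the arithmetic is done modulo 2^32 as for 32-bit registers.

```c
#include <stdint.h>

/* Return 1 if "scale" is one of the values allowed in a memory reference. */
static int valid_scale(unsigned scale)
{
    return scale == 1 || scale == 2 || scale == 4 || scale == 8;
}

/* Address of the operand  immed_32(base, index, scale),
 * computed modulo 2^32 like the 32-bit registers do. */
static uint32_t effective_address(int32_t immed_32, uint32_t base,
                                  uint32_t index, unsigned scale)
{
    return (uint32_t)immed_32 + base + index * (uint32_t)scale;
}
```

For instance, with ESP=0x100 and ECX=2 the operand 8(%esp, %ecx, 4) refers to the address 0x110, and -4(%ebp) with EBP=0x2000 refers to 0x1FFC.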
Groups of instructions:

- MOVE and TRANSFER (mov)
- REGISTER EXCHANGE (xchg)
- CONVERSION (cbw, cwd, cwde, cdq)
- LOADING and STORING (lods, lahf, lea, lds, les, lfs, lgs, lss, stos, xlat)
- STACK (push, pop, pusha, popa, pushf, popf)
- INTERRUPT (int, int3)
- INPUT/OUTPUT (in, out, ins, outs)
- ARITHMETIC (add, sub, mul, div, adc)
- BIT OPS (bt, bsr, bsf, btc, btr, bts)
- BINARY (and, or, xor, not, neg)
- BITWISE (shr, shl, rol, ror, rcl, rcr)
- COMPARISON (cmp, test)
- FLOW CONTROL, JUMP and LOOP (call, ret, iret, loop, jmp)
- FLAG (cli, sti)
- BCD (daa, das, aaa, aas, aam, aad, fbld, fbstp)
- STRING (movs, cmps, scas, lods, stos)
- SYSTEM (lmsw, smsw, hlt, arpl, lar, lsl, verr, verw, lldt, sldt,
  lgdt, sgdt, lidt, sidt, ltr, str, clts, wait)
- OTHER (rdtsc, rdmsr, ...)

Prefixes:

    rep
    lock
    0x66
    0x67
    esc (five bits: ...)
==============================================================
MOVE AND TRANSFER

MOVE INSTRUCTIONS (no flag affected)

    movl %ebx, %eax              //move the content of EBX in EAX
    movl $0x00d8, %eax           //move the immediate value 0x00d8 in EAX
    movw %ax, %bx
    movl (%edx, %ecx, 4), %eax   //moves memory[EDX+4*ECX] in EAX
                                 //scale can be 1,2,4,8
    movl 4(%edx, %ecx, 1), %eax  //can specify an offset before the parenthesis
    movb %ah, %al
    movl (%esp), %edx
    movw %ax, 8(%esp, %ecx, 4)

    movzx    // move and zero h.o. bits (move with zero extension)
             // written movzbw, movzbl, movzwl in AT&T syntax
    movsx    // move with sign extension
             // written movsbw, movsbl, movswl in AT&T syntax; the plain
             // movsb, movsw, movsl are the string moves
             // (see Transfer Instructions)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Example (1): outreg.s

An assembly routine that prints out the content pushed on the stack.
This routine also uses other instructions, explained later.

    /* outreg.s */
            .data
            .section .rodata
    .Lout:
            .string "Data %02x %02x %02x %02x \n"
            .text
            .align 4
            .globl outreg
            .type outreg,@function
    outreg:
            pushl %ebp              /* push Frame Pointer on the stack */
            movl %esp, %ebp         /* save Stack Pointer in Frame Pointer */
            pushl %eax              /* save EAX: avoid side effects */
            movl 8(%ebp), %eax      /* move the parameter in the register EAX */
            andl $0x000000ff, %eax  /* Least Significant Byte */
            pushl %eax
            movl 8(%ebp), %eax
            shrl $8, %eax           /* shift 8 bit to the right */
            andl $0x000000ff, %eax  /* Second Least Significant Byte */
            pushl %eax
            movl 8(%ebp), %eax
            shrl $16, %eax
            andl $0x000000ff, %eax  /* Third Byte */
            pushl %eax
            movl 8(%ebp), %eax
            shrl $24, %eax
            andl $0x000000ff, %eax  /* Most Significant Byte */
            pushl %eax
            pushl $.Lout            /* push the (address of the) format string */
            call printf
            addl $20, %esp          /* 5 args * 4 = 20 bytes */
            popl %eax               /* restore register EAX */
            movl %ebp, %esp
            popl %ebp
            ret
    Lfoutreg:
            .size outreg,Lfoutreg-outreg

In order to test this routine you can write a simple test program like
this one:

    /* test_outreg.c */
    #include <stdlib.h>

    void outreg(long value);    /* implemented in outreg.s */

    int main()
    {
      long i;
      for (i=0xabcd; i<0xbfff; i+= 0x1111) {
        outreg( i );
      }
      exit(0);
    }
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Example (2): inreg.s

Input of a number from the stdin.

    /* inreg.s */
            .data
            .section .rodata
    .Lout:
            .string "%x"
    .Lprompt:
            .string "Enter a number: "
            .text
            .align 4
            .globl inreg
            .type inreg,@function
    inreg:
            pushl %ebp          /* push Frame Pointer on the stack */
            movl %esp, %ebp     /* save Stack Pointer in Frame Pointer */
            subl $4, %esp       /* local variable on the stack */
            pushl $.Lprompt     /* push the (address of the) format string */
            call printf         /* warn: the ret value of printf is in EAX */
            addl $4, %esp
            movl %esp, %eax     /* EAX = address of local variable */
            pushl %eax
            pushl $.Lout        /* push the (address of the) format string */
            call scanf
            addl $8, %esp
            popl %eax           /* pop value of local variable in EAX */
            movl %ebp, %esp
            popl %ebp
            ret
    Lfinreg:
            .size inreg,Lfinreg-inreg

And to test this routine,

    /* test_inreg.c */
    #include <stdio.h>
    #include <stdlib.h>

    long inreg(void);           /* implemented in inreg.s */

    int main()
    {
      long j;
      do {
        j = inreg();
        printf(" You entered %lx\n", j);
      } while ( j != 0 );
      exit(0);
    }

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
More on input.

Suppose you have a local integer variable, ie, an int allocated on the
stack, say at -4(%ebp), and you want to read a value into it.
The following does this

            .section .rodata
    .LC0:
            .string "%d"
            .text
            leal -4(%ebp),%eax  /* load effective address in EAX */
            pushl %eax
            pushl $.LC0
            call scanf
            addl $8,%esp

------------------------------------------------------------------
TRANSFER INSTRUCTIONS (no flag affected)

    movsb    // moves a byte from DS:ESI to ES:EDI (a sequence of bytes
             // when used with the rep prefix),
             // increasing ESI, EDI if DF=0, and decreasing them if DF=1
    movsw    // same, moving words

For an example see the "Cycle Instruction" section.
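The effect of a repeated movsb on the index registers can be modelled in C. The sketch below is a simplification with made-up names: memory is a flat byte array, the segment registers are ignored, and ECX is assumed to hold the repeat count as with the rep prefix.

```c
#include <stdint.h>

/* Registers involved in a string move (struct name is illustrative). */
struct string_regs {
    uint32_t esi;  /* source index */
    uint32_t edi;  /* destination index */
    uint32_t ecx;  /* repeat count */
    int df;        /* direction flag: 0 = increase, 1 = decrease */
};

/* Model of "rep movsb" on a flat memory array (segments ignored). */
static void rep_movsb(uint8_t *mem, struct string_regs *r)
{
    while (r->ecx != 0) {
        mem[r->edi] = mem[r->esi];      /* move one byte */
        if (r->df) {                    /* DF=1: walk downward */
            r->esi--;
            r->edi--;
        } else {                        /* DF=0: walk upward */
            r->esi++;
            r->edi++;
        }
        r->ecx--;
    }
}
```

After the loop ECX is zero and both index registers have moved by the number of bytes copied, upward or downward according to DF.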
==================================================================
REGISTER EXCHANGE

REGISTER EXCHANGE (no flag affected)

    xchgl %eax, %ebx

Example: (3)
    ...
    [ TODO ]
==================================================================
CONVERSION

CONVERSION (no flag affected)

    cbw     // convert a byte in AL to a word in AX by extending the sign-bit
    cwd     // convert word in AX to double-word in DX:AX
    cwde    // convert word in AX to double-word in EAX
    cdq     // convert double-word in EAX to quad-word in EDX:EAX

    gas also accepts the AT&T names cbtw, cwtd, cwtl, cltd for these.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Example (4)

We consider a simple main that utilizes the routine "outreg.s" to show
the effect of the instruction "cbw".

            .text
            .align 4
            .globl main
            .type main,@function
    main:
            pushl %ebp
            movl %esp, %ebp
            movl $0x00fedcba, %eax
            pushl %eax
            call outreg         /* prints "Data 00 fe dc ba" */
            popl %eax           /* pop the value of EAX from the stack */
            cbw
            pushl %eax
            call outreg         /* prints "Data 00 fe ff ba" */
            addl $4, %esp
            movl %ebp, %esp
            popl %ebp
            pushl $0
            call exit
    .Lfmain:
            .size main,.Lfmain-main
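What the conversion instructions do to the register contents can be modelled in C. This is a sketch with invented function names: each function takes the old value of EAX and returns the new value of the register named in the comment.

```c
#include <stdint.h>

/* cbw: sign-extend AL into AX; the upper 16 bits of EAX are unchanged.
 * Returns the new EAX. */
static uint32_t model_cbw(uint32_t eax)
{
    uint32_t al = eax & 0xFFu;
    uint32_t ax = (al & 0x80u) ? (0xFF00u | al) : al;
    return (eax & 0xFFFF0000u) | ax;
}

/* cwde: sign-extend AX into EAX. Returns the new EAX. */
static uint32_t model_cwde(uint32_t eax)
{
    uint32_t ax = eax & 0xFFFFu;
    return (ax & 0x8000u) ? (0xFFFF0000u | ax) : ax;
}

/* cwd: DX gets the sign of AX. Returns the new DX. */
static uint16_t model_cwd_dx(uint32_t eax)
{
    return (eax & 0x8000u) ? 0xFFFFu : 0x0000u;
}

/* cdq: EDX gets the sign of EAX. Returns the new EDX. */
static uint32_t model_cdq_edx(uint32_t eax)
{
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0x00000000u;
}
```

With the value of Example (4), model_cbw(0x00fedcba) gives 0x00feffba: AL=0xba has its sign bit set, so AH becomes 0xff.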
==================================================================
LOADING AND STORING

LOADING INSTRUCTIONS (no flag affected)

    // for loading sequences of bytes into registers
    lods    // load AL, AX or EAX (from address ESI),
            // increment or decrement ESI, depending on DF
    lahf    // load flag register to AH as SZ*A.*P*C
    lea     // load address of source operand (Load Effective Address)
    lds     // load the far pointer stored at source: the offset goes to
            // the register and the selector that follows it goes to DS
    les lfs lgs lss // same with ES, FS, GS, SS
    lmsw    // load 16-bit reg/mem to machine status word (CR0)
            // can be used to switch to protected-mode (must be followed by an
            // intrasegment jump to flush the instruction queue)
            // Better use 'mov' to move to/from CR's
    ltr     // load reg/mem to task register

Notice.

    movl $label, %eax   // EAX contains the address of the "label" statement
    leal label, %ecx    // ECX contains the address of the "label" statement
    movl label, %ebx    // EBX contains the value stored at the "label" statement

For example, if 0x8048420