

Everything you need to know about x86 architecture

Writing, compiling, and testing assembly code is most easily done on Linux (harder but possible on Windows). At the very least, being able to compile code gives you a way to verify that your assembly programs are syntactically correct. The instruction names used here, such as addl, movl, and leal, follow the AT&T assembly syntax rather than the Intel syntax. The underlying concepts are still the same in both cases, but the notation is a bit different.

Basic execution environment

An x86 CPU has eight 32-bit general-purpose registers, named eax, ecx, edx, ebx, esp, ebp, esi, and edi for historical reasons. Other CPU architectures would simply name them r0, r1, and so on up to r7. Each register can hold any 32-bit integer value. The x86 architecture actually has over a hundred registers, but we will only cover specific ones when needed.


As a first approximation, a CPU executes a list of instructions sequentially, one by one, in the order listed in the source code. Later on, we will see how the code path can go non-linearly, covering concepts like if-then, loops, and function calls. There are actually eight 16-bit and eight 8-bit registers that are subparts of the eight 32-bit general-purpose registers.

These features come from the 16-bit era of x86 CPUs, but still have some occasional use in 32-bit mode. Whenever the value of a 16-bit or 8-bit register is modified, the upper bits belonging to the full 32-bit register will remain unchanged.
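The rule about sub-registers can be sketched with plain integer masking. This is only a model in Python, not assembly; the names ax, al, and ah are the standard 16-bit and 8-bit parts of eax (they are not named in the text above), and the helper function names are made up for illustration.

```python
def write_ax(eax, value16):
    """Model a write to ax: replace bits 0-15 of eax, keep bits 16-31."""
    return (eax & 0xFFFF0000) | (value16 & 0xFFFF)

def write_al(eax, value8):
    """Model a write to al: replace bits 0-7 of eax, keep bits 8-31."""
    return (eax & 0xFFFFFF00) | (value8 & 0xFF)

def write_ah(eax, value8):
    """Model a write to ah: replace bits 8-15 of eax, keep everything else."""
    return (eax & 0xFFFF00FF) | ((value8 & 0xFF) << 8)

eax = 0x12345678
eax_after = write_ax(eax, 0xABCD)   # upper half 0x1234 stays in place
```

In each case only the named slice of the 32-bit value changes, which is the behavior the paragraph describes.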

Basic arithmetic instructions

The most basic x86 arithmetic instructions operate on two 32-bit registers. The first operand acts as a source, and the second operand acts as both a source and destination. Many instructions fit this important schema: for example, addl %ecx, %eax adds ecx to eax and stores the sum in eax. A few arithmetic instructions take only one register as an argument, such as notl %eax (flip every bit of eax) and incl %ecx (add 1 to ecx). The bit shifting and rotation instructions take a 32-bit register for the value to be shifted, and the fixed 8-bit register cl for the shift count; for example, shll %cl, %ebx shifts ebx left by cl bits.
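As a rough model of the two-operand schema, the sketch below mimics addl, shll, and rorl on 32-bit values in Python. The function names simply reuse the mnemonics, the argument order follows the source-then-destination convention, and the masking of the shift count to its low 5 bits is a hardware detail not stated in the text above.

```python
MASK32 = 0xFFFFFFFF

def addl(src, dst):
    """dst = dst + src, wrapping around at 32 bits."""
    return (dst + src) & MASK32

def shll(cl, dst):
    """dst = dst << cl; only the low 5 bits of the count are used."""
    return (dst << (cl & 31)) & MASK32

def rorl(cl, dst):
    """Rotate dst right by cl bits within a 32-bit word."""
    n = cl & 31
    return ((dst >> n) | (dst << (32 - n))) & MASK32
```

Each function returns the new value of the destination register, since the destination is both read and overwritten.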

Many arithmetic instructions can take an immediate value as the first operand. The immediate value is fixed (not variable), and is coded into the instruction itself.


An aside

Now is a good time to talk about one principle in assembly programming: not every desirable operation is directly expressible in one instruction. In typical programming languages that most people use, many constructs are composable and adaptable to different situations, and arithmetic can be nested. In assembly language, however, you can only write what the instruction set allows. To illustrate: when performing bit shifting, the shift count must be either a hard-coded immediate value or the register cl. It cannot be any other register. If the shift count is in another register, then the value needs to be copied to cl first.

Flags register and comparisons

There is a 32-bit register named eflags which is implicitly read or written in many instructions. In other words, its value plays a role in the instruction execution, but the register is not mentioned in the assembly code. Arithmetic instructions such as addl usually update eflags based on the computed result. Some instructions read the flags: for example, adcl adds two numbers and uses the carry flag as a third operand, so adcl %ebx, %eax computes eax = eax + ebx + CF. Some instructions directly affect a single flag bit, such as cld clearing the direction flag (DF). Comparison instructions affect eflags without changing any general-purpose registers.

Most of the time, the instruction after a comparison is a conditional jump (covered later). So far, we know that some flag bits are related to arithmetic operations. Other flag bits are concerned with how the CPU behaves, such as whether to accept hardware interrupts, virtual 8086 mode, and other system management stuff that is mostly of concern to OS developers, not to application developers.


For the most part, the eflags register is largely ignorable. The system flags are definitely ignorable, and the arithmetic flags can be forgotten except for comparisons and bigint arithmetic operations.
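To see why the carry flag matters for bigint arithmetic, here is a small Python model of chaining add-with-carry across 32-bit limbs. The function names and the list-of-limbs representation (least significant limb first) are assumptions chosen for illustration, not anything prescribed by the instruction set.

```python
MASK32 = 0xFFFFFFFF

def add_with_carry(a, b, carry_in):
    """Return (32-bit sum, carry_out) for a + b + carry_in, like adcl."""
    total = a + b + carry_in
    return total & MASK32, total >> 32

def bigint_add(x_limbs, y_limbs):
    """Add two equal-length lists of 32-bit limbs, least significant first."""
    result, carry = [], 0
    for a, b in zip(x_limbs, y_limbs):
        limb, carry = add_with_carry(a, b, carry)
        result.append(limb)
    return result, carry
```

The carry out of each limb feeds the next addition, which is the role the carry flag plays between consecutive adcl instructions.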

Memory addressing, reading, writing

The CPU by itself does not make a very useful computer; it needs memory to hold more data than fits in its registers. When storing a value longer than a byte, the value is encoded in little endian: the least significant byte goes to the lowest memory address. When reading values from memory, the same rule applies: the bytes at lower memory addresses get loaded into the lower parts of a register. It should go without saying that the CPU has instructions to read and write memory.
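The little-endian rule can be checked with Python's struct module, treating a bytearray as a stand-in for RAM. The helper names store_le32 and load_le32 are hypothetical; the "<I" format code means a little-endian unsigned 32-bit integer.

```python
import struct

def store_le32(memory, address, value):
    """Store a 32-bit value at memory[address:address+4], low byte first."""
    struct.pack_into("<I", memory, address, value & 0xFFFFFFFF)

def load_le32(memory, address):
    """Load a 32-bit value; the byte at the lowest address becomes bits 0-7."""
    return struct.unpack_from("<I", memory, address)[0]

memory = bytearray(8)
store_le32(memory, 2, 0x12345678)
# memory now holds the bytes 78 56 34 12 at addresses 2 through 5
```

Reading the value back with load_le32(memory, 2) gives the original 0x12345678, because loads and stores follow the same byte order.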

You can load or store one or more bytes at any memory address you choose. The simplest thing you can do with memory is to read or write a single byte. Next, many arithmetic instructions can take one memory operand (never two). The address of a memory operand can be written in several ways, called addressing modes, which is probably easier to illustrate than describe: in AT&T syntax, 8(%ebx,%ecx,4) refers to the address ebx + ecx*4 + 8. The memory addressing modes are valid wherever a memory operand is permitted.

Also note that the address being computed is a temporary value that is not saved in any register. This is good because if you wanted to compute the address explicitly, you would need to allocate a register for it, and having only 8 GPRs is rather tight when you want to store other variables.


There is one special instruction that uses memory addressing but does not actually access memory. The leal (load effective address) instruction computes the final memory address according to the addressing mode, and stores the result in a register. For example, leal 5(%eax,%ebx,8), %ecx sets ecx = eax + ebx*8 + 5. Note that this is entirely an arithmetic operation and does not involve dereferencing a memory address.
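The address arithmetic behind leal can be modeled as a plain function. This is a sketch under the assumption of the general base + index*scale + displacement form; the parameter names are illustrative, and the allowed scale factors 1, 2, 4, and 8 are a hardware fact not spelled out in the text above.

```python
MASK32 = 0xFFFFFFFF

def effective_address(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK32

def leal(base=0, index=0, scale=1, disp=0):
    """Model leal: return the computed address itself; no memory is read."""
    return effective_address(base, index, scale, disp)
```

For instance, leal(base=0x1000, index=3, scale=8, disp=5) yields 0x101D, a multiply-and-add done purely with the addressing hardware.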

Jumps, labels, and machine code

Each assembly language instruction can be prefixed by zero or more labels. These labels will be useful when we need to jump to a certain instruction. For example, a label named top followed later by jmp top forms a simple infinite loop.

Conditional jump instructions, such as je (jump if equal) and jne (jump if not equal), jump only when the flags satisfy a certain condition. There are 16 of them in all, and some have synonyms. Jump targets also need not be fixed labels. In particular, it is possible to jump to the value of a register, as in jmp *%ecx.

Now is a perfect time to discuss a concept that was glossed over earlier, in the discussion of instructions and execution. Each instruction in assembly language is ultimately translated into 1 to 15 bytes of machine code, and these machine instructions are strung together to create an executable file.

The CPU has a 32-bit register named eip (extended instruction pointer) which, during program execution, holds the memory address of the current instruction being executed. Note that there are very few ways to read or write the eip register, which is why it behaves very differently from the 8 main general-purpose registers. Whenever an instruction is executed, the CPU knows how many bytes long it was, and advances eip by that amount so that it points to the next instruction.

Writing machine code by hand is very unpleasant (I mean, assembly language is unpleasant enough already), but there are a couple of minor capabilities gained. By writing machine code, you can encode some instructions in alternate ways.
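How eip advances over variable-length instructions can be illustrated with a toy interpreter in Python. It is only a sketch: it handles three real opcodes (0x90 nop, 0xB8 movl $imm32, %eax, and 0xEB jmp with an 8-bit relative offset) and nothing else, and the function name run and its step limit are assumptions made for the example.

```python
def run(code, max_steps):
    """Step through a tiny subset of x86 machine code.

    Returns (eax, trace), where trace lists the eip value of each
    instruction executed. Unsupported opcodes raise ValueError.
    """
    eip, eax, trace = 0, 0, []
    for _ in range(max_steps):
        if not 0 <= eip < len(code):
            break
        trace.append(eip)
        opcode = code[eip]
        if opcode == 0x90:                  # nop: 1 byte long
            eip += 1
        elif opcode == 0xB8:                # movl $imm32, %eax: 5 bytes long
            eax = int.from_bytes(code[eip + 1:eip + 5], "little")
            eip += 5
        elif opcode == 0xEB:                # jmp rel8: 2 bytes long
            rel = code[eip + 1]
            if rel >= 128:                  # interpret the offset as signed
                rel -= 256
            eip += 2 + rel                  # offset counts from the next instruction
        else:
            raise ValueError(f"unsupported opcode {opcode:#x}")
    return eax, trace

# nop; movl $42, %eax; then a jmp that targets itself (an infinite loop)
code = bytes([0x90, 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xEB, 0xFE])
eax, trace = run(code, 5)
```

With the step limit of 5, the trace visits addresses 0 and 1 once and then stays at address 6, because the two-byte jump with offset -2 lands back on itself.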


The stack

The stack is conceptually a region of memory addressed by the esp register. The x86 ISA has a number of instructions for manipulating the stack. Although all of this functionality can be achieved with movl, addl, etc., the dedicated stack instructions such as pushl and popl are more concise.

In x86, the stack grows downward, from larger memory addresses toward smaller ones. The stack is important for function calls.
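The downward growth of the stack can be modeled with a bytearray and an esp index. This Python sketch assumes the usual behavior of the pushl and popl instructions (4 bytes per value, little-endian storage); the class name Stack and the 64-byte size are arbitrary choices for the example.

```python
import struct

class Stack:
    """Toy model of the x86 stack: esp is an index into a bytearray."""

    def __init__(self, size=64):
        self.memory = bytearray(size)
        self.esp = size                  # empty stack: esp starts at the top

    def pushl(self, value):
        self.esp -= 4                    # grow toward lower addresses first
        struct.pack_into("<I", self.memory, self.esp, value & 0xFFFFFFFF)

    def popl(self):
        value = struct.unpack_from("<I", self.memory, self.esp)[0]
        self.esp += 4                    # shrink back toward higher addresses
        return value
```

Pushing two values moves esp from 64 down to 56, and popping returns them in reverse order while esp climbs back up.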


The call instruction is like jmp, except that before jumping it first pushes the next instruction address onto the stack. The matching ret instruction pops that address and jumps back to it. Also, the standard C calling convention puts some or all the function arguments on the stack. The stack can also be used to read or write the eflags register and to read the eip register. Accessing these two registers is awkward because they cannot be used in typical movl or arithmetic instructions.
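The return-address bookkeeping can be sketched with a Python list standing in for the stack. The function ret models the standard counterpart of call and is included here for completeness; the addresses used are made-up values.

```python
def call(stack, next_eip, target):
    """Push the address of the next instruction, then continue at target."""
    stack.append(next_eip)
    return target

def ret(stack):
    """Pop the saved return address and continue execution there."""
    return stack.pop()

stack = []
eip = call(stack, 0x1005, 0x2000)   # outer call: return address 0x1005 is saved
eip = call(stack, 0x2008, 0x3000)   # nested call from inside the first function
eip = ret(stack)                    # back to 0x2008
eip = ret(stack)                    # back to 0x1005
```

Because return addresses are pushed and popped in last-in, first-out order, nested calls unwind correctly without any extra bookkeeping.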

Calling convention

When we compile C code, it is translated into assembly code and ultimately machine code. The calling convention applies to a C function calling another C function, a piece of assembly code calling a C function, or a C function calling an assembly function. It does not apply to a piece of assembly code calling an arbitrary piece of assembly code; there are no restrictions in this case. On 32-bit x86 on Linux, the calling convention is named cdecl.

String instructions such as movsb copy a byte from the memory address in esi to the memory address in edi, then advance both registers. Actually, esi and edi increment if the direction flag (DF) is 0; otherwise they decrement if DF is 1. Examples of other string instructions include cmpsb, scasb, and stosb. A string instruction can be modified with the rep prefix (see also repe and repne) so that it gets executed ecx times, with ecx decrementing automatically.
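The interplay of rep, ecx, esi, edi, and the direction flag can be modeled in a few lines of Python. This sketch assumes movsb as the string instruction being repeated (a byte copy from the address in esi to the address in edi); the function name and the use of a bytearray as memory are illustrative.

```python
def rep_movsb(memory, esi, edi, ecx, df=0):
    """Copy ecx bytes from memory[esi] to memory[edi], one byte per step.

    Returns the final (esi, edi, ecx), mirroring how the registers end up.
    """
    step = 1 if df == 0 else -1      # DF = 0 increments, DF = 1 decrements
    while ecx != 0:
        memory[edi] = memory[esi]    # movsb: copy a single byte
        esi += step
        edi += step
        ecx -= 1                     # rep: count down after each iteration
    return esi, edi, ecx
```

Because the copy proceeds strictly one byte at a time, overlapping source and destination ranges behave differently depending on the direction, which is one reason the direction flag exists.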

They represent some of the mindset of the CISC design, where it is normal for programmers to code directly in assembly, so the instruction set provides higher-level features to make the work easier. But the modern solution is to write in C or an even higher-level language, and rely on a compiler to generate the tedious assembly code.

As for SSE, a 128-bit xmm register can be interpreted in many ways depending on the instruction being executed. For example, one SSE instruction would copy 16 bytes (128 bits) from memory into an xmm register, and one SSE instruction would add two xmm registers together treating each one as eight 16-bit words in parallel.

The idea behind SIMD is to execute one instruction to operate on many data values at once, which is faster than operating on each value individually because fetching and executing every instruction incurs a certain amount of overhead.
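The "eight 16-bit words in parallel" interpretation can be modeled in Python by unpacking 16 bytes into lanes. The function is named paddw after the SSE2 packed word add it imitates; representing an xmm register as a 16-byte bytes object is an assumption made for the sketch, and of course the Python loop gains none of the hardware speedup.

```python
import struct

def paddw(xmm_a, xmm_b):
    """Add two 16-byte values lane by lane as eight 16-bit words (wrapping)."""
    a = struct.unpack("<8H", xmm_a)
    b = struct.unpack("<8H", xmm_b)
    return struct.pack("<8H", *((x + y) & 0xFFFF for x, y in zip(a, b)))

a = struct.pack("<8H", 1, 2, 3, 4, 5, 6, 7, 0xFFFF)
b = struct.pack("<8H", 1, 1, 1, 1, 1, 1, 1, 1)
lanes = struct.unpack("<8H", paddw(a, b))
```

Each lane wraps around independently, so the overflow in the last lane does not carry into any neighboring lane.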

A cautious programmer might choose to prototype a program using scalar operations, verify its correctness, and gradually convert it to use the faster SSE instructions while ensuring it still computes the same results.

Virtual memory

Up to now, we assumed that when an instruction requests to read from or write to a memory address, it will be exactly the address handled by the RAM. But if we introduce a translation layer in between, we can do some interesting things.


This concept is known as virtual memory, paging, and other names. The basic idea is that there is a page table, which describes what each page (block of 4096 bytes) of the 32-bit virtual address space is mapped to. For example, the same virtual address 0x08000000 could be mapped to a different page of physical RAM in each application process that is running. Also, each process could have its own unique set of pages, and never see the contents of other processes or the operating system kernel.
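The page-table idea can be sketched as a dictionary lookup in Python. This is a deliberate simplification: a flat dict from virtual page number to physical page number stands in for the real multi-level hardware structure, and the physical page numbers below are invented for the example.

```python
PAGE_SIZE = 4096

def translate(page_table, virtual_address):
    """Map a 32-bit virtual address to a physical address.

    page_table maps virtual page number -> physical page number.
    An unmapped page raises LookupError, standing in for a page fault.
    """
    page_number, offset = divmod(virtual_address & 0xFFFFFFFF, PAGE_SIZE)
    if page_number not in page_table:
        raise LookupError(f"page fault at {virtual_address:#010x}")
    return page_table[page_number] * PAGE_SIZE + offset

# Two processes map the same virtual page to different physical pages.
process_a = {0x08000: 0x00123}
process_b = {0x08000: 0x00456}
```

Here translate(process_a, 0x08000ABC) and translate(process_b, 0x08000ABC) land in different physical pages, while the offset 0xABC within the page is preserved.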

The concept of paging is mostly of concern to OS writers, but its behavior sometimes affects the application programmer so they should be aware of its existence. Note that the address mapping need not be 32 bits to 32 bits.

For example, 32 bits of virtual address space can be mapped onto 36 bits of physical memory space (PAE). Or a 64-bit virtual address space can be mapped onto 32 bits of physical memory space on a computer with only 1 GiB of RAM.

As for x86-64 mode, the experience is generally better because there are more registers to work with, and a few minor unnecessary features have been removed. Elsewhere on the web there are plenty of articles and reference materials that explain all the differences in detail.