            © 2003 by Charles C. Lin. All rights reserved.

            Background

            You should know what the UB (unsigned binary) and 2C (two's complement) representations are. You should also know about sign-extension.

            ISA

            When you started learning how to program, you were told that your program had to be compiled. That is, it had to be converted from a high-level language into a low-level language. For C and C++, the low-level language is basically machine code.

            An ISA defines the machine and assembly code used by a CPU.

            ISA stands for "Instruction Set Architecture". Effectively, the ISA is the programmer's view of the computer.

            An ISA consists of:

            • The instruction set: This is the set of instructions supported. This is the part that's usually called the assembly language.
            • The register set: This is the set of registers you can use. (There are other hidden registers which you can't use directly. They are used indirectly, however.)
            • The address space: This is the set of memory addresses that can be used by your program.

            The ISA is basically a hardware specification. It's the view of the hardware as seen by an assembly language programmer.

            The ISP (instruction set processor) is an implementation of the ISA. There may be many implementations for a given ISA. For example, IA32 is the instruction set architecture for x86 processors. Intel has the Pentium and Celeron lines of CPUs that implement this ISA. AMD also has its own CPUs that implement the ISA. Each implementation is different, but they all run code written in IA32.

            Why You Need to Know About Instructions

            We study instruction sets because that's what CPUs process. They run one instruction after another. In order to understand how a computer works, you need to know what instructions are, and more importantly, how to write them.

            There are two ways to write instructions. You can write them in assembly language, which is human-readable, or you can write them in machine code, which is basically 0's and 1's. CPUs process machine code, but humans usually program in assembly language.

            You need to know both, in order to understand how a CPU works.

            The MIPS ISA

            As noted above, instructions can be written either in assembly language, which is human-readable, or in machine code, which is 0's and 1's. For MIPS32, each machine code instruction is a 32-bit bitstring.

            The "32" in MIPS32 refers to the size of the registers (i.e., how many bits each register holds) and to the number of bits used in an address. There is also a MIPS64, which has 64 bit addresses and 64 bit registers.

            The MIPS32 architecture contains 32 general purpose int registers. The registers are named $r0, $r1, ..., $r31. Each register can store 32 bits. Most of the time, the registers store either signed or unsigned ints. However, sometimes they store addresses, and occasionally ASCII characters, etc.

            MIPS also has 32 floating point registers, but we won't worry about them too much.

            Unlike programming languages where you can declare as many variables as you want, you can't create any more registers. The number of registers doesn't change.

            MIPS32 allows you to access data in memory using 32-bit addresses. In principle, you can access up to 2^32 different addresses using 32 bits. In practice, some of those addresses may be invalid. For example, the machine may simply not have that much memory (2^32 byte addresses cover 4 GB). Thus, you might be able to generate the 32-bit address, but there may be nothing stored at that address (an error usually occurs when you access an invalid address).
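
The size of the address space can be checked with a couple of lines of arithmetic. This is only a sketch in Python; the variable names are made up, and it assumes each address names one byte.

```python
# Number of distinct addresses that 32 address bits can name.
ADDRESS_BITS = 32
num_addresses = 2 ** ADDRESS_BITS

# Assuming one byte per address, convert the count to gigabytes (2^30 bytes).
size_in_gb = num_addresses / (2 ** 30)

print(num_addresses)  # 4294967296
print(size_in_gb)     # 4.0
```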

            In MIPS, nearly all registers are general purpose. You can classify ISAs into those that use general purpose registers (i.e., instructions can refer to any register---all registers perform the same operations) or special purpose (certain instructions can only be used on specific, i.e., not all, registers).

            However, there is at least one exception. $r0 is not general purpose. It is hardwired to 0. No matter what you do to this register, it always has a value of 0. You might wonder why such a register is needed in MIPS.
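
One way to picture a register hardwired to 0 is a small model of the register file in which writes to register 0 are silently dropped. This is a Python sketch, not a description of the real hardware; the class name RegisterFile and its methods are invented for illustration.

```python
class RegisterFile:
    """Hypothetical model of 32 integer registers, each 32 bits wide."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, number):
        return self._regs[number]

    def write(self, number, value):
        if number == 0:
            return  # $r0 is hardwired to 0, so the write is discarded
        self._regs[number] = value & 0xFFFFFFFF  # keep only 32 bits


regs = RegisterFile()
regs.write(0, 123)  # has no effect
regs.write(5, 123)
print(regs.read(0), regs.read(5))  # 0 123
```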

            The designers of MIPS used benchmarks (programs used to determine the performance of a CPU), which convinced them that having a register hardwired to 0 would improve the performance (speed) of the CPU as opposed to not having it. Not everyone agrees a register hardwired to 0 is essential, so not all ISAs have a zero register.

            Assembly vs. Machine Code

            CPUs process binary bitstrings. These bitstrings are really instructions, encoded in 0's and 1's. When people began to write programs for computers, they wrote them in binary. That is, programs were written in 0's and 1's. The code probably looked something like this:

            0000 0000 0101 1000 0000 0000 0101 1000
            1010 1101 0000 1011 1000 1100 1001 0110
            

            This is called machine code.

            As you might imagine, machine code was difficult to read and difficult to debug. The amount of time wasted trying to find whether you had accidentally written a 0 instead of a 1 led to the invention of assembly language.

            Assembly language is a somewhat more human-readable version of machine code. For example, assembly code might look something like:

            add   $r2, $r3, $r4
            addi  $r2, $r3, -10
            

            While you may not understand the code above, you've certainly got a much better chance of figuring it out than the machine code equivalent. Each line of assembly code contains an instruction. Each instruction tells the computer one small task to accomplish. Instructions are the building blocks of programs.

            CPUs can't handle assembly code directly. Instead, assembly code is translated to machine code. If this sounds like compiling, that's because it basically is compiling. However, people usually call the process of translating assembly to machine code assembling, instead of compiling.

            You'll write code in assembly, and learn how to translate some instructions from assembly to machine code. It's very important that you understand the machine code, because that's what the CPU processes. Furthermore, by studying machine code, you get to see how information is encoded into 0's and 1's, and you get to see how the CPU uses these binary values to execute the instruction in hardware.

            Encoding Registers

            In the previous set of notes, we talked about how many bits you need to create N different labels. We assume each label is k bits long.

            You need k = ceil( lg N ) bits to uniquely label N items.

            MIPS32 has 32 integer registers. We want to label each register by a number, so instructions can refer to registers by number. Since MIPS has 32 registers, you need ceil( lg 32 ) = 5 bits.
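
The formula k = ceil( lg N ) can be tried out directly. The following is a minimal Python sketch; the function name bits_needed is made up for this example.

```python
def bits_needed(n_items):
    """Return k = ceil(lg N), the number of bits needed to label n_items items."""
    if n_items < 1:
        raise ValueError("need at least one item to label")
    # For N >= 1, (N - 1).bit_length() equals ceil(lg N) and avoids
    # floating-point rounding.
    return (n_items - 1).bit_length()


print(bits_needed(32))  # 5
print(bits_needed(33))  # 6
```

With 32 registers the answer is exactly 5 bits; one more register would push it to 6.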

            If we think of the 5-bit numbers as unsigned binary numbers, then the registers are numbered from 0 up to 31, inclusive. In fact, that's exactly how MIPS numbers its registers. Registers are numbered from $r0 up to $r31. The binary equivalents run from 00000 to 11111.

            In assembly language, you'd write $r6. In machine code, you'd write the same register as: 00110. In assembly language, you'd write $r30. In machine code, you'd write 11110.

            This is important because we're going to use register encoding in the machine language instructions for MIPS. Recall that machine code is a 32-bit bitstring. When we refer to registers within the instruction, it's going to be using the 5 bit binary numbers written in UB (unsigned binary).
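
The translation from a register name to its 5-bit UB encoding can be sketched in Python. The helper name encode_register is an assumption made for this example, and it only accepts the $rN spelling used in these notes.

```python
def encode_register(name):
    """Turn a register name such as '$r6' into its 5-bit UB string."""
    if not name.startswith("$r"):
        raise ValueError("expected a name like $r6")
    number = int(name[2:])
    if not 0 <= number <= 31:
        raise ValueError("register number must be between 0 and 31")
    return format(number, "05b")  # binary, zero-padded to 5 digits


print(encode_register("$r6"))   # 00110
print(encode_register("$r30"))  # 11110
```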

            What is an instruction?

            An assembly language instruction is basically a function call. Like C functions, assembly language instructions have a fixed number of arguments. You can't add or remove arguments.

            Like C functions, arguments of assembly language instructions have type. Or at least, something that resembles type. Basically, there are 4 kinds of "types" for MIPS.

            • Registers ($r0, $r1,..., $r31)

            • Immediates: Constants such as 10, -20, etc. Sometimes written in hexadecimal, e.g., 0x3a.

            • Register Offset: This is a constant and a register, written as -10($r3) or 214($r4). That is, you write the immediate (constant) value, then a left parenthesis, then a register, then a right parenthesis.

              The computation is performed by adding the contents of the register to the offset, usually resulting in a 32-bit address. Thus, -10($r3) is -10 added to the contents of register 3. This result is "temporary" and register 3 is not modified (just like x + y in a programming language merely adds x to y, but the sum does not change x or y).

            • Labels: These are identifiers for locations in memory. Generally, you write labels in uppercase letters and underscores, such as FOR_LOOP.
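
The register offset computation described above can be sketched as follows. This is illustrative Python, not MIPS; the function name effective_address and the sample register contents are made up.

```python
def effective_address(offset, register_value):
    """Add an offset to a register's contents, keeping a 32-bit result."""
    return (register_value + offset) & 0xFFFFFFFF


r3 = 0x1000                       # pretend contents of register 3
addr = effective_address(-10, r3) # models -10($r3)

print(hex(addr))  # 0xff6
print(hex(r3))    # 0x1000, register 3 itself is unchanged
```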

            For the most part, we'll only consider registers and immediate values.

            Let's consider two examples of instructions and their operands:

            • add $r2, $r3, $r4: This instruction adds the contents of register 3 and register 4, and places the result in register 2. It's basically R[2] = R[3] + R[4], if you pretend that the registers form an array.

            • addi $r2, $r3, -10: This instruction adds the contents of register 3 to -10 and places the result in register 2. It's basically R[2] = R[3] - 10.
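
Taking the "registers form an array" picture literally gives a small Python sketch of the two instructions. The function names mirror the mnemonics but are only stand-ins; the starting register values are made up, and overflow handling is ignored.

```python
R = [0] * 32  # R[n] stands for the contents of register $rn


def add(rd, rs, rt):
    """Model of add $rd, $rs, $rt: R[rd] = R[rs] + R[rt]."""
    R[rd] = R[rs] + R[rt]


def addi(rt, rs, immediate):
    """Model of addi $rt, $rs, IMMED: R[rt] = R[rs] + IMMED."""
    R[rt] = R[rs] + immediate


R[3] = 25
R[4] = 5

add(2, 3, 4)     # add  $r2, $r3, $r4
print(R[2])      # 30

addi(2, 3, -10)  # addi $r2, $r3, -10
print(R[2])      # 15
```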

            The first instruction is an add instruction. add requires exactly 3 operands (arguments). Each operand must be a valid register. The operand can not be anything besides a register. In particular, you can not create expressions such as:

            # WRONG! Operands can't be expressions
            add $r2, $r3, (add $r4, $r5, $r6) 
            
            The second instruction is an addi instruction. addi must also have three operands. The first two operands must be registers, while the third one must be an integer between -2^15 and 2^15 - 1, inclusive.

            There is a reason for this restriction in value, which we will discuss momentarily.

            Unlike higher level programming languages, you can't create new registers. You're forced to use the ones available. You can't create new instructions either. You must use the ones provided in the instruction set.

            Machine Code

            A machine language instruction usually consists of:

            • opcode: This is a binary representation of the instruction. For example, an add instruction has an opcode of 000 000.
            • operands: This means the same thing as arguments. It's older terminology usually associated with assembly/machine code instructions.

            MIPS divides instructions into three formats. Instructions are either R-type (register type), I-type (immediate type), or J-type (jump type). The types refer to an instruction's format, not its purpose. (For example, branch instructions are I-type because of their format, even though it might seem they should be J-type.)

            Here are the layouts of the three kinds of instructions.

            R-type Instruction

            Opcode    Register s   Register t   Register d   Shift Amt   Function
            B31..26   B25..21      B20..16      B15..11      B10..6      B5..0
            ooo ooo   sssss        ttttt        ddddd        aaaaa       ffffff

            • R-type is short for "register type".
            • Bits B31..26 are used for the opcode. For R-type instructions, the opcode is almost always 000 000. Normally, this would make no sense, because every instruction should have a unique opcode. However, bits B5..0 (the function part) use 6 bits to specify the instruction. Only R-type instructions use a function code.
            • Bits B25..21 specify a 5-bit UB encoding for the first source register.
            • Bits B20..16 specify a 5-bit UB encoding for the second source register.
            • Bits B15..11 specify a 5-bit UB encoding for the destination register. This specifies which register stores the result of the operation.
            • Bits B10..6 specify the 5-bit shift amount. This is usually 00000, except for shift instructions.
            • Bits B5..0 specify a 6-bit function. Each R-type instruction has a unique 6 bit value. For example, add has a 6-bit value that's different from sub. add and sub are two different instructions.
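
The R-type fields can be pulled out of a 32-bit word with shifts and masks. This is a hedged Python sketch; the function name decode_r_type and the dictionary keys are chosen for this example, and the sample word is just one convenient bit pattern.

```python
def decode_r_type(word):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits
        "rs":     (word >> 21) & 0x1F,  # first source register, 5 bits
        "rt":     (word >> 16) & 0x1F,  # second source register, 5 bits
        "rd":     (word >> 11) & 0x1F,  # destination register, 5 bits
        "shamt":  (word >> 6) & 0x1F,   # shift amount, 5 bits
        "funct":  word & 0x3F,          # function code, low 6 bits
    }


fields = decode_r_type(0x00641020)
print(fields)
# {'opcode': 0, 'rs': 3, 'rt': 4, 'rd': 2, 'shamt': 0, 'funct': 32}
```

The six field widths (6 + 5 + 5 + 5 + 5 + 6) add up to the full 32 bits of the instruction.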

            I-type Instruction

            Opcode    Register s   Register t   Immediate
            B31..26   B25..21      B20..16      B15..0
            ooo ooo   sssss        ttttt        iiii iiii iiii iiii

            • I-type is short for "immediate type".
            • Bits B31..26 are used for the opcode. Unlike R-type instructions, the 6-bit value is NOT 000 000. There is no function code for I-type instructions.
            • Bits B25..21 specify a 5-bit UB encoding for the source register.
            • Bits B20..16 specify a 5-bit UB encoding for the destination register. Although this is called register t, instead of register d, it is treated as the destination register for I-type instructions.
            • Bits B15..0 hold the 16-bit immediate value. This may be a 16-bit UB number or a 16-bit 2C number. Notice that the immediate value is encoded directly into the instruction.

            J-type Instruction

            Opcode    Target
            B31..26   B25..0
            ooo ooo   tt tttt tttt tttt tttt tttt tttt

            • J-type is short for "jump type".
            • Bits B31..26 are used for the opcode. Unlike R-type instructions, the 6-bit value is NOT 000 000. There is no function code for J-type instructions.
            • Bits B25..0 are used for the target. This is usually used to generate an address.

            Notice that the J-type instruction has no source or destination registers.
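
Packing a J-type instruction only involves two fields. The sketch below is in Python with an invented function name, encode_j_type. The opcode 000010 is the one MIPS uses for its j (jump) instruction; the target value is made up for the example.

```python
def encode_j_type(opcode, target):
    """Pack a 6-bit opcode and a 26-bit target into a 32-bit word."""
    if not 0 <= opcode < 2 ** 6:
        raise ValueError("opcode must fit in 6 bits")
    if not 0 <= target < 2 ** 26:
        raise ValueError("target must fit in 26 bits")
    return (opcode << 26) | target


word = encode_j_type(0b000010, 0x400)  # 0x400 is an arbitrary sample target
print(hex(word))  # 0x8000400
```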

            add, an R-type instruction

            The general format for an add instruction is:
            add $rd, $rs, $rt
            
            $rd, $rs, and $rt are not real registers. They are merely placeholders. For example, if we write add $r2, $r3, $r4, then for this particular example, $rd = $r2, $rs = $r3, and $rt = $r4.

            In assembly language, the instructions are written with the destination register (i.e. register d), then the first source register, (i.e. register s) then the second source register (i.e. register t).

            Note: This is NOT the same order as it is written in machine code. In assembly, it's destination, source 1, source 2. In MIPS machine code, it's written source 1, source 2, destination.

            Don't ask me why the MIPS folks did it that way. They just did.

            Let's translate the following instruction into MIPS machine code.

            add $r2, $r3, $r4
            

            For add, the opcode is 000 000. The function code is 100 000. Since the shift amount isn't used, it's set to 00000.

            We encode $r2 as 00010, $r3 as 00011, and $r4 as 00100.

            This is how the machine code equivalent looks:

            Opcode    Register s   Register t   Register d   Shift Amt   Function
            B31..26   B25..21      B20..16      B15..11      B10..6      B5..0
                      $r3          $r4          $r2
            000 000   00011        00100        00010        00000       100 000

            Again, notice that bits B25..21 hold source 1 (i.e., $r3), then B20..16 hold source 2 (i.e., $r4), then B15..11 hold the destination register (i.e., $r2).
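
The same translation can be done mechanically with shifts. This is a minimal Python sketch; the function name encode_r_type and its parameter names are assumptions for illustration, while the opcode, function code and register numbers are the ones given above for add $r2, $r3, $r4.

```python
def encode_r_type(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-type fields into one 32-bit instruction word."""
    return (
        (opcode << 26)  # bits 31..26
        | (rs << 21)    # bits 25..21, first source register
        | (rt << 16)    # bits 20..16, second source register
        | (rd << 11)    # bits 15..11, destination register
        | (shamt << 6)  # bits 10..6, shift amount
        | funct         # bits 5..0, function code
    )


# add $r2, $r3, $r4: destination is $r2, sources are $r3 and $r4
word = encode_r_type(rs=3, rt=4, rd=2, shamt=0, funct=0b100000)
print(format(word, "032b"))  # 00000000011001000001000000100000
print(hex(word))             # 0x641020
```

Note how the destination register comes first in the assembly text but is passed as the third register field here, matching the machine code order.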

            It's important that you learn how to translate a few instructions, because the CPU manipulates the binary version of this, not the assembly version. In particular, pay attention to how the registers are encoded, and just as importantly, which bits refer to which registers.

            addi, an I-type instruction

            addi stands for add immediate. It's an I-type instruction.

            The general format for an addi instruction is:

            addi $rt, $rs, IMMED
            
            For I-type instructions, $rt is the destination register (not $rs). $rs is still the first source register. For addi, the immediate value is written in base 10 (or possibly, hexadecimal), but it eventually gets translated to 2C.

            Let's look at a specific example.

            addi $r3, $r10, -3
            

            This instruction adds the contents of register 10, to the value -3, and stores the result in register 3.

            The opcode for addi is 001 000. In 16-bit 2C, you write -3 (base ten) as 1111 1111 1111 1101.

            This is how the instruction is encoded.

            Opcode    Register s   Register t   Immediate
            B31..26   B25..21      B20..16      B15..0
                      $r10         $r3          -3, represented in 2C
            001 000   01010        00011        1111 1111 1111 1101

            Again, notice that in the assembly code $r3 (i.e., the destination register) appears first, while in the machine code $r3 appears second. Also, notice that the immediate value is written in 16 bits, two's complement.
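
The encoding of addi $r3, $r10, -3 can be reproduced with a short Python sketch. The function name encode_i_type is invented for this example; the opcode and operands are the ones from the example above.

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack opcode, two registers and a 16-bit 2C immediate into 32 bits."""
    if not -(2 ** 15) <= immediate <= 2 ** 15 - 1:
        raise ValueError("immediate does not fit in 16-bit two's complement")
    return (
        (opcode << 26)          # bits 31..26
        | (rs << 21)            # bits 25..21, source register
        | (rt << 16)            # bits 20..16, destination register
        | (immediate & 0xFFFF)  # bits 15..0, low 16 bits give the 2C pattern
    )


# addi $r3, $r10, -3: destination $r3 is rt, source $r10 is rs
word = encode_i_type(0b001000, rs=10, rt=3, immediate=-3)
print(format(word, "032b"))  # 00100001010000111111111111111101
```

Masking a negative Python integer with 0xFFFF keeps its low 16 bits, which is exactly the 16-bit two's complement pattern.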

            Now that you see that it's written in 16 bits, 2C, you see why the immediate value can only be between -2^15 and 2^15 - 1. This is the range of valid values for 16-bit 2C.

            The assembler must translate base 10 representation to 2C representation when translating addi from assembly to machine code.

            Some instructions encode the immediate in 2C, while other instructions encode it in UB.
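
The same 16 bits give different values depending on which representation is assumed. The Python sketch below shows both readings; the function names are made up for this example.

```python
def as_unsigned(bits16):
    """Read a 16-bit pattern as an unsigned binary (UB) number."""
    return bits16 & 0xFFFF


def as_twos_complement(bits16):
    """Read a 16-bit pattern as a two's complement (2C) number."""
    bits16 &= 0xFFFF
    if bits16 & 0x8000:          # sign bit set, so the value is negative
        return bits16 - 0x10000  # subtract 2^16
    return bits16


immediate = 0b1111111111111101
print(as_twos_complement(immediate))  # -3
print(as_unsigned(immediate))         # 65533
```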

            Summary

            This section on instructions is not trying to teach you how to program in MIPS assembly. Instead, it's to briefly introduce you to what an instruction is, and how it is encoded.

            While it's useful to know how to program in MIPS assembly, it isn't essential for understanding how a CPU works. To understand how a CPU works, at least initially, all you need to know is what an instruction looks like in binary, and what that individual instruction is supposed to do.

            posted on 2007-01-23 15:42 by Charles