Assembler




System Software

• components
     – translator
         • assembler
         • compiler
         • interpreter
     – system manager
         • operating system
     – other utilities
         • loader
         • linker
         • DBMS, editor, debugger, ...
•   purpose of this course
     – understand how to build system software
     – understand how these components work




Issues in System Software

•   not many open issues in this area
     – mature area
•   advanced architectures complicate system software
     – superscalar CPU
     – memory model
     – multiprocessor
•   new applications
     – embedded systems
     – mobile/ubiquitous computing




Assembler Overview

•   functions
     – translate programs written in assembly language to machine code
          • mnemonic code to machine code
          • symbols to addresses
     – handles
          • constants
          • literals
          • addressing
•   32 bit constant or address
•   32 bit offset




Assembler Overview (cont’d)
•   pass 1: loop until the end of the program
     1. read in a line of assembly code
     2. assign an address to this line
           • increment the location counter by N (in words or bytes, depending on the addressing unit)
     3. save address values assigned to labels
           • in the symbol table
     4. process assembler directives
           • constant declaration
           • space reservation
•   pass 2: same loop
     1. read in a line of code
     2. translate the op code
           • using the op code table
     3. change labels to addresses
           • using the symbol table
     4. process assembler directives
     5. produce the object program

Data Structures for Assembler

                               add $t0, $t1, $t2  =>  000000 01001 01010 01000 00000 100000

•   op code table
     – looked up for the translation of mnemonic code
         • key: mnemonic code
         • result: bits
     – hashing is usually used
         • once prepared, the table is not changed
         • efficient lookup is desired
         • since mnemonic code is predefined, the hashing function can
            be tuned a priori
     – the table may have the instruction format and length
         • to decide where to put op code bits, operands bits, offset bits
         • for variable instruction size
         • used to calculate the address
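
The op code table lookup can be sketched in Python as follows. This is a minimal illustration, assuming a tiny table holding only the MIPS R-format instructions add and sub; the names OPTAB, REGS, encode_r, and fields are hypothetical and do not come from the slides. It reproduces the bit pattern shown on this slide for add $t0, $t1, $t2.

```python
# Hypothetical op code table: mnemonic -> (format, opcode bits, funct bits, length in bytes).
# Only two MIPS R-format instructions are listed; a real table covers the whole ISA.
OPTAB = {
    "add": ("R", 0b000000, 0b100000, 4),
    "sub": ("R", 0b000000, 0b100010, 4),
}

# Register numbers for the temporaries used in the slides ($t0-$t3 are registers 8-11).
REGS = {"$t0": 8, "$t1": 9, "$t2": 10, "$t3": 11}


def encode_r(mnemonic, rd, rs, rt):
    """Assemble an R-format instruction: op | rs | rt | rd | shamt | funct."""
    fmt, opcode, funct, _length = OPTAB[mnemonic]  # a dict lookup is a hash lookup
    if fmt != "R":
        raise ValueError(f"{mnemonic} is not an R-format instruction")
    return (opcode << 26) | (REGS[rs] << 21) | (REGS[rt] << 16) | (REGS[rd] << 11) | funct


def fields(word):
    """Format a 32-bit word as the 6/5/5/5/5/6 bit groups used on the slide."""
    bits = format(word, "032b")
    cuts = [0, 6, 11, 16, 21, 26, 32]
    return " ".join(bits[a:b] for a, b in zip(cuts, cuts[1:]))


print(fields(encode_r("add", "$t0", "$t1", "$t2")))
```

Storing the format and length next to the op code bits is what lets the same table drive both address calculation in pass 1 and bit placement in pass 2.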



Data Structures for Assembler (cont’d)
•   symbol table
     – stored and looked up to assign addresses to labels
         • efficient insertion and retrieval are needed
         • deletion does not occur
     – difficulties in hashing
         • non-random keys
     – problem
         • the size varies widely

        .text
        .globl main
main:
        la      $t0, array
        lw      $t1, count
        lw      $t2, ($t0)
loop:
        lw      $t3, 4($t0)
        ble     $t3, $t2, loop2
        move    $t2, $t3
loop2:  add     $t1, $t1, -1
        add     $t0, $t0, 4
        bnez    $t1, loop
        …
        .data
array:          .word 3, 5, 5, 1, 6, 7, …
count:          .word 15
string1:        .asciiz "nmax = "




Symbol Table Construction


        .text
        .globl main
main:
        la      $t0, array
        lw      $t1, count
        lw      $t2, ($t0)
loop:
        lw      $t3, 4($t0)
        ble     $t3, $t2, loop2
        move    $t2, $t3
loop2:  add     $t1, $t1, -1
        add     $t0, $t0, 4
        bnez    $t1, loop
        …
        .data
array:          .word 3, 5, 5, 1, 6, 7, …
count:          .word 15
string1:        .asciiz "nmax = "
bad:            .word 7

symbol name       value
main              0
loop              12
loop2             24
…
array             408
count             468
string1           472
bad               478


Assembler Algorithm: pass1
begin
   if starting address is given
       LOCCTR = starting address;
   else
       LOCCTR = 0;
   read the first line from the code;
   while OPCODE != END do                ;; or EOF
       begin
       if there is a label
              if this label is in SYMTAB, then error
              else insert (label, LOCCTR) into SYMTAB;
       search OPTAB for the op code;
       if found
              LOCCTR += N;               ;; N is the length of this instruction (4 for MIPS)
       else if this is an assembler directive
              update LOCCTR as directed;
       else error;
       write line to intermediate file;
       read the next line from the code;
       end
   program size = LOCCTR - starting address;
end
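
A runnable sketch of pass 1 in Python, applied to the text segment of the example program from the earlier slides. It assumes every instruction is 4 bytes long, as the symbol table slide does (a real MIPS assembler may expand pseudo-instructions such as la into more than one word). The names SOURCE, OPTAB, DIRECTIVES, and pass1 are hypothetical, and the intermediate file is modelled as a list.

```python
SOURCE = """\
        .text
        .globl main
main:
        la      $t0, array
        lw      $t1, count
        lw      $t2, ($t0)
loop:
        lw      $t3, 4($t0)
        ble     $t3, $t2, loop2
        move    $t2, $t3
loop2:  add     $t1, $t1, -1
        add     $t0, $t0, 4
        bnez    $t1, loop
"""

OPTAB = {"la", "lw", "ble", "move", "add", "bnez"}  # mnemonics only; length assumed to be 4
DIRECTIVES = {".text": 0, ".globl": 0}              # directives that reserve no space


def pass1(source, start=0):
    """Return (SYMTAB, intermediate lines, program size)."""
    symtab, intermediate = {}, []
    locctr = start
    for line in source.splitlines():
        label, _, rest = line.rpartition(":")       # label is '' when there is no colon
        label, rest = label.strip(), rest.strip()
        if label:
            if label in symtab:
                raise ValueError(f"duplicate label: {label}")
            symtab[label] = locctr
        if not rest:
            continue                                # blank line or label on its own line
        opcode = rest.split()[0]
        if opcode in OPTAB:
            length = 4
        elif opcode in DIRECTIVES:
            length = DIRECTIVES[opcode]
        else:
            raise ValueError(f"unknown op code: {opcode}")
        intermediate.append((locctr, rest))         # stands in for the intermediate file
        locctr += length
    return symtab, intermediate, locctr - start


symtab, intermediate, size = pass1(SOURCE)
print(symtab, size)
```

The resulting values for main, loop, and loop2 agree with the symbol table shown on the Symbol Table Construction slide.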



Assembler Algorithm: pass2
begin
   read a line;
   if op code = START then               ;; .globl xxx for MIPS
       write header record;
   while op code != END do               ;; or EOF
       begin
       search OPTAB for the op code;
       if found
              begin
              if the operand is a symbol then
                      replace it with an address using SYMTAB;
              assemble the object code;
              ;; e.g. add $t0, $t1, $t2 => 000000 01001 01010 01000 00000 100000
              end
       else if it is a defined directive
              convert it to object code;
       add object code to the text;
       read next line;
       end
   write End record to the text;
   output text;
end
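
A small Python sketch of the pass 2 loop, under simplifying assumptions: the intermediate file is a list of (address, op code, operand) triples, only the MIPS J-format instruction j and the .word directive are handled, and header and End records are left out. SYMTAB holds the values pass 1 would have produced for the example program; pass2 is a hypothetical name.

```python
SYMTAB = {"main": 0, "loop": 12, "loop2": 24}  # assumed output of pass 1


def pass2(intermediate, symtab):
    """Return the object text as a list of (address, 32-bit word) pairs."""
    text = []
    for address, opcode, operand in intermediate:
        if opcode == "j":
            # J-format: opcode 000010 followed by the 26-bit word address of the target.
            if operand not in symtab:
                raise ValueError(f"undefined symbol: {operand}")
            word = (0b000010 << 26) | ((symtab[operand] >> 2) & 0x3FFFFFF)
        elif opcode == ".word":
            # Directive: the constant itself becomes the object code.
            word = int(operand) & 0xFFFFFFFF
        else:
            raise ValueError(f"unknown op code: {opcode}")
        text.append((address, word))
    return text


object_text = pass2([(36, "j", "loop"), (40, ".word", "15")], SYMTAB)
for address, word in object_text:
    print(f"{address:4d}: {word:08x}")
```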




Program Relocation
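[Figure: the same program loaded at address 0 (occupying 0 to 1076) and at address 5000 (occupying 5000 to 6076). Both copies contain the instruction "jump to 1004"; the target 1004 lies inside the program only when it is loaded at 0.]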

•   motivations for relocation
     – a program may consist of several pieces of code that are assembled
       independently
     – when a program is assembled, the exact location where it will be
       loaded is generally not known




Program Relocation (cont’d)

•   distances from the origin of a program do not change
     – make the addresses relative to the origin
     – provide the loader with information about
          • which addresses need fixing
          • the length of each address field
     – the loader changes those addresses to
          • distance + start address of the program
     – only absolute addresses need to be changed
          • PC-relative and register-relative operands do not depend on the load address
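
The loader's fix-up step can be sketched in Python like this. It is a simplified model, assuming memory is a dict from address to value and that each modification entry names a whole cell to be fixed (field lengths are ignored). The offset 40 for the jump instruction is hypothetical; the slides give only its target, 1004.

```python
def relocate(memory, modifications, load_address):
    """Load a program assembled at origin 0 at load_address.

    memory        -- dict: offset from the origin -> assembled value
    modifications -- set of offsets whose values are absolute addresses
    """
    loaded = {}
    for offset, value in memory.items():
        if offset in modifications:
            value += load_address          # distance + start address of the program
        loaded[offset + load_address] = value
    return loaded


# The cell at offset 40 holds the jump target 1004; the cell at 44 is plain data.
program = {40: 1004, 44: 7}
print(relocate(program, {40}, 5000))
```

Loaded at 5000, the jump target becomes 6004 while the data value 7 is left alone, which is why the assembler has to tell the loader which fields are addresses.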




Literals

•   usage
     – encoded as an operand (similar to the immediate in MIPS, but different)
          • load $7, =X'0A7F'
     – simple way to declare a constant
     – the assembler
          • declares a constant with a label
          • uses the label to access the value
•   comparison with immediate
     – a literal is handled by the assembler
          • an immediate is data recognized by the machine
     – a full word can be used for a literal
          • immediate: only the bits of the word left over after the opcode and register fields
     – literal values are fetched from data memory, which is slower
          • immediate data is within the instruction itself




Literals (cont’d)

•   literal pool
      – assembler collects all the literals into one or more literal pools
      – default location is at the end of the program
           • for better code reading
      – programmer can declare a place (LTORG)
           • to use PC-relative addressing
           • to keep data close to instruction
•   optimization
      – make one literal for the same value
           • compare character strings or values?
               – X'454F46' = C'EOF'
           • value comparison needs evaluation
•   literal table
      – name (label), operand value, operand length, address in the table
      – both name and value are used as keys
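
One possible answer to the "string or value?" question, sketched in Python: evaluate each literal to its bytes and key the literal table on that value, so X'454F46' and C'EOF' share one entry. This is an illustrative choice, not necessarily what the slides intend; literal_value, enter_literal, and the entry layout are hypothetical, and only the X and C forms are handled.

```python
def literal_value(literal):
    """Evaluate a literal such as =X'454F46' or =C'EOF' to its bytes."""
    kind, body = literal[1].upper(), literal[3:-1]
    if kind == "X":
        return bytes.fromhex(body)
    if kind == "C":
        return body.encode("ascii")
    raise ValueError(f"unsupported literal: {literal}")


LITTAB = {}  # evaluated value -> table entry


def enter_literal(literal):
    """Return the LITTAB entry for this literal, creating it on first use."""
    value = literal_value(literal)
    entry = {"name": literal, "value": value, "length": len(value), "address": None}
    return LITTAB.setdefault(value, entry)


first = enter_literal("=X'454F46'")
second = enter_literal("=C'EOF'")
print(first is second, len(LITTAB))
```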


Literal Handling Algorithm

pass 1
   on recognizing a literal
       search LITTAB by name
       if found with a different value, error
       else if found with the same value, no action
       else (not found) insert a new literal (no address yet)
   if the op code is LTORG or END
       allocate each unallocated literal, assigning it an address

pass 2
   replace each literal with its address in LITTAB
   if these addresses are absolute,
       prepare modification records for relocation
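
The pass 1 part of this algorithm can be sketched in Python as below. The sketch assumes statements are (op code, operand) pairs, every instruction is 4 bytes, literals are looked up by name only, and alignment of the pool is ignored. The function names and the sample statements are hypothetical.

```python
def literal_length(literal):
    """Length in bytes of =X'..' (two hex digits per byte) or =C'..' (one char per byte)."""
    kind, body = literal[1].upper(), literal[3:-1]
    return len(body) // 2 if kind == "X" else len(body)


def assign_literal_addresses(statements, instruction_length=4):
    """Pass 1: collect literals and place each pool at the next LTORG or END."""
    locctr = 0
    littab = {}                                   # literal name -> address (None until placed)
    for opcode, operand in statements:
        if operand is not None and operand.startswith("="):
            littab.setdefault(operand, None)      # already present: no action
        if opcode in ("LTORG", "END"):
            for name in list(littab):
                if littab[name] is None:          # allocate only literals still waiting
                    littab[name] = locctr
                    locctr += literal_length(name)
        else:
            locctr += instruction_length
    return littab


program = [
    ("load", "=X'0A7F'"),
    ("load", "=C'EOF'"),
    ("LTORG", None),
    ("load", "=X'0A7F'"),   # reuses the literal placed by LTORG
    ("END", None),
]
print(assign_literal_addresses(program))
```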




Symbol Defining Statement

•   MAXLEN            EQU 4096
     – makes program structure better
     – easier to modify a single location
     – easier to remember than numbers
     – registers can be given meaningful names
     – written as (maxlen = 4096) in MIPS assembly
•   assembler
     – searches SYMTAB and replaces the symbol with the value in the table
     – the resulting object code is the same as if the value had been written instead of the symbol
     – remember that a two-pass assembler imposes a restriction
               X    EQU Y
               Y    EQU 100
          • X cannot be defined in pass 1 because Y is not yet defined when X is read
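
The restriction can be demonstrated with a short Python sketch. It assumes each EQU operand is either a decimal constant or a single symbol, and define_equs is a hypothetical helper that processes the statements in one forward scan, as pass 1 does.

```python
def define_equs(statements):
    """Process (name, operand) EQU statements in source order.

    Returns (symtab, undefined), where undefined lists the names whose
    operand was not yet in SYMTAB when the statement was read.
    """
    symtab, undefined = {}, []
    for name, operand in statements:
        if operand.isdigit():
            symtab[name] = int(operand)
        elif operand in symtab:
            symtab[name] = symtab[operand]
        else:
            undefined.append(name)        # forward reference: cannot be defined here
    return symtab, undefined


print(define_equs([("X", "Y"), ("Y", "100")]))   # X precedes Y: X stays undefined
print(define_equs([("Y", "100"), ("X", "Y")]))   # Y precedes X: both are defined
```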




Expressions

BUFFER: .space 4096            ; reserve 4096 bytes here
BUFEND:                        ; label the location just past the buffer
(MAXLEN = BUFEND - BUFFER)     ; calculate the size of the buffer



•   allows simple arithmetic operations in symbol definition
•   operands may have relative values for relocation
     – relative values should be modified by the loader later
          • we need to know which is relative
     – symbol table needs a type field to discern absolute symbols from relative
        symbols




Expression Rules


•   basic
     – constant is absolute
     – address is relative
•   using expressions
     – an expression with only absolute terms is absolute
     – an expression that uses multiplication or division must have only absolute terms, so it is absolute
     – relative ± constant is relative
     – relative_1 - relative_2 is absolute
          • dependencies on the starting address cancel out
     – all other expressions with relative terms are neither relative nor
        absolute (error?)
          • constant - relative
          • relative_1 + relative_2
          • 3 * relative_1
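
For expressions built only from addition and subtraction, these rules reduce to counting relative terms with their signs. The Python sketch below assumes an expression is given as a list of (sign, kind) pairs; classify is a hypothetical name, and multiplication is modelled by repeating a term.

```python
def classify(terms):
    """Classify a sum of signed terms as 'absolute', 'relative', or 'error'.

    terms -- list of (sign, kind) with sign in {+1, -1} and
             kind in {'absolute', 'relative'}
    """
    # Each relative term carries one copy of the (unknown) starting address.
    net = sum(sign for sign, kind in terms if kind == "relative")
    if net == 0:
        return "absolute"      # starting addresses cancel out
    if net == 1:
        return "relative"      # exactly one starting address remains
    return "error"             # neither relative nor absolute


REL, ABS = "relative", "absolute"
print(classify([(+1, REL), (-1, REL)]))   # BUFEND - BUFFER
print(classify([(+1, ABS), (-1, REL)]))   # constant - relative
print(classify([(+1, REL), (+1, REL)]))   # relative_1 + relative_2
print(classify([(+1, REL)] * 3))          # 3 * relative_1
```

The net count is exactly what the loader would have to add: zero copies of the start address for an absolute value, one copy for a relative value, and anything else cannot be relocated with a simple add.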




Program Blocks

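[Figure: the source interleaves pieces of block 0, block 1, and block 2, then block 0, block 1, and block 2 again. In the assembled object code each block is contiguous, in the order block 0, block 1, block 2.]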




Program Blocks (cont’d)

•   motivation
     – programmer’s view may be different from machine’s view
          • affects only efficiency not functionality
     – addressing can be simplified
          • large data area can be moved to the end of code while source code places
            it close to the instructions that use this data
•   data structure and algorithm
     – block table (name, block number, address, length)
     – pass 1
          • maintain a separate LOCCTR for each block
              – each label is assigned an address relative to the start of the block that contains it
          • SYMTAB stores the block number for each symbol
          • store the starting address of each block in the block table
     – pass 2
          • assign the final address to each symbol by adding its relative address to the
            starting address of its block
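
The block table bookkeeping can be sketched in Python as follows. The sketch assumes each statement is a (block name, label, length in bytes) triple and that blocks are laid out in order of first appearance; layout_blocks, the block names, and the sample lengths are hypothetical.

```python
def layout_blocks(statements):
    """Return (block start addresses, final symbol addresses)."""
    locctr = {}   # block name -> current LOCCTR of that block (insertion-ordered)
    symtab = {}   # label -> (block name, address relative to the block start)

    # Pass 1: one location counter per block.
    for block, label, length in statements:
        offset = locctr.setdefault(block, 0)
        if label is not None:
            symtab[label] = (block, offset)
        locctr[block] = offset + length

    # End of pass 1: each block starts where the previous one ends.
    start, address = {}, 0
    for block, length in locctr.items():
        start[block] = address
        address += length

    # Pass 2: final address = block start + relative address.
    final = {label: start[block] + rel for label, (block, rel) in symtab.items()}
    return start, final


source = [
    ("code", "main", 8),
    ("data", "count", 4),
    ("code", "loop", 12),
    ("data", "array", 60),
]
print(layout_blocks(source))
```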

Control Sections

•   a control section is a part of a program that can be assembled independently of
    other parts
     – a large program can be divided into many control sections
     – each control section can be developed independently
     – each control section can be modified independently
•   symbols defined in other control sections
     – called external
     – the assembler records those symbols in the object program
     – the loader and linker resolve the values of external symbols




Control Sections (cont’d)

•   records prepared by the assembler
     – define record
          • name of symbol defined in this control section
          • relative address of the symbol
     – refer record
          • name of external symbols
     – modification record
          • starting address of field to be modified
          • length of this field
          • name of external symbol
•   loader
     – for every external symbol
          • find the relative address from the define record
          • add the starting address of the control section where the symbol is defined
          • modify the field
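
A Python sketch of how a loader might use these records. It assumes each control section is a dict with a "define" map (symbol to relative address), a "modify" list of (field offset, external symbol) pairs, and a "text" map (offset to value); the refer record and field lengths are omitted for brevity. The section names, symbols, and load addresses are hypothetical.

```python
def link(sections, load_addresses):
    """Resolve external symbols and return the loaded memory image."""
    # Step 1: build an external symbol table from the define records.
    estab = {}
    for name, section in sections.items():
        for symbol, relative in section["define"].items():
            estab[symbol] = load_addresses[name] + relative

    # Step 2: load the text and apply the modification records.
    memory = {}
    for name, section in sections.items():
        base = load_addresses[name]
        for offset, value in section["text"].items():
            memory[base + offset] = value
        for offset, symbol in section["modify"]:
            memory[base + offset] += estab[symbol]   # modify the field
    return memory


sections = {
    "A": {"define": {"main": 0}, "modify": [(4, "buffer")], "text": {0: 111, 4: 0}},
    "B": {"define": {"buffer": 8}, "modify": [], "text": {8: 0}},
}
print(link(sections, {"A": 1000, "B": 2000}))
```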


One-Pass Assembler

•   problem
     – forward reference: reference to symbols that are not defined yet
•   why do we need one-pass assembler?
     – fast
          • useful for program development and testing
          • university computing environment
•   load-and-go assembler
     – writes the object code into memory rather than to a disk file
     – since the code is in memory, it is easy to modify a part of it




One-Pass Assembler (cont’d)

•   one-pass assembler for load-and-go
     – stores undefined symbols in the SYMTAB with the address of the field that
        references this symbol
     – when the symbol is defined later, looks up the SYMTAB and modifies the field
        with the correct address
          • there may be many places to be modified
•   what if the object code is written to disk?
     – bring the text back into memory
          • the efficiency of a one-pass assembler is lost
     – have the loader modify the address at load time
          • modification records again
•   optimization
     – require all data declarations to be placed at the beginning of the program
          • reduces the number of forward references to resolve
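
The load-and-go scheme can be sketched in Python as below. This toy model assumes each statement occupies one memory cell whose content is simply the address of its operand symbol; one_pass and the sample labels are hypothetical. Undefined symbols are kept in a separate pending map rather than inside SYMTAB, which is a simplification of the slide's description.

```python
def one_pass(statements):
    """Assemble (label, operand symbol) pairs in a single pass.

    Returns (memory, symtab, pending); pending is empty when every
    forward reference has been resolved.
    """
    memory = []      # object code, kept in memory (load-and-go)
    symtab = {}      # defined symbol -> address
    pending = {}     # undefined symbol -> cells that reference it
    for label, operand in statements:
        address = len(memory)
        if label is not None:
            symtab[label] = address
            for cell in pending.pop(label, []):
                memory[cell] = address            # patch earlier forward references
        if operand in symtab:
            memory.append(symtab[operand])
        else:
            pending.setdefault(operand, []).append(address)
            memory.append(None)                   # placeholder until the symbol is defined
    return memory, symtab, pending


program = [(None, "end"), ("top", "end"), (None, "top"), ("end", "top")]
memory, symtab, pending = one_pass(program)
print(memory, symtab, pending)
```

Any symbol still in pending after the last statement was never defined and would be reported as an error.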




Multi-Pass Assembler
  •   supports forward references in symbol definitions, even though they hurt program readability

  example source
     1. (A = B/2)
     2. (B = C-D)
        ....
     8. C .....
     9. D .....

  at 1, store two tuples in a table
     (A, 1, B/2, 0)
        1: one symbol is missing
        0: no other symbol depends on A
     (B, *, , &LB)
        *: don’t know how many symbols are missing yet
        LB: list of symbols that depend on B (for now, only A is in this list)
  at 2,
     insert (C, *, , &LC), (D, *, , &LD)
        LC and LD contain only B
     modify (B, *, , &LB) to (B, 2, C-D, &LB)
  after 8
     from LC, B is found
     change 2 to 1 in the B tuple, meaning one symbol remains to be defined
  after 9
     from LD, B is found
     now evaluate B with the defined C and D values
  since B is done
     from LB, A is found
     now A can be evaluated
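
The bookkeeping above can be sketched in Python. The sketch keeps, for each unresolved symbol, a count of missing operands and its expression, plus a list of dependents per symbol, which mirrors the tuples on the slide without copying their exact layout. The values C = 10 and D = 4 are hypothetical, since the slide does not show how C and D are defined; resolve and the expression encoding are also assumptions.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.floordiv}


def resolve(definitions):
    """definitions: ordered (name, expr); expr is an int or (op, left, right),
    where left and right are symbol names or ints. Returns (values, waiting)."""
    values = {}       # symbol -> value, once known
    waiting = {}      # symbol -> (number of missing operands, expression)
    dependents = {}   # symbol -> symbols whose expressions use it (LB, LC, LD)

    def evaluate(expr):
        op, left, right = expr
        get = lambda x: values[x] if isinstance(x, str) else x
        return OPS[op](get(left), get(right))

    def define(name, value):
        values[name] = value
        for dep in dependents.pop(name, []):
            count, expr = waiting[dep]
            if count == 1:                       # last missing symbol just arrived
                del waiting[dep]
                define(dep, evaluate(expr))
            else:
                waiting[dep] = (count - 1, expr)

    for name, expr in definitions:
        if isinstance(expr, int):
            define(name, expr)
            continue
        missing = [x for x in expr[1:] if isinstance(x, str) and x not in values]
        if not missing:
            define(name, evaluate(expr))
        else:
            waiting[name] = (len(missing), expr)
            for symbol in missing:
                dependents.setdefault(symbol, []).append(name)
    return values, waiting


values, waiting = resolve([
    ("A", ("/", "B", 2)),
    ("B", ("-", "C", "D")),
    ("C", 10),   # hypothetical value
    ("D", 4),    # hypothetical value
])
print(values, waiting)
```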
                                                                       25

More Related Content

PPT
Assembler
manpreetgrewal
 
PDF
Introduction to systems programming
Mukesh Tekwani
 
PPT
Introduction to Compiler Construction
Sarmad Ali
 
PPT
Symbol table management and error handling in compiler design
Swati Chauhan
 
PPTX
System software - macro expansion,nested macro calls
SARASWATHI S
 
PPTX
Unit 4 sp macro
Deepmala Sharma
 
PPTX
Exception handling c++
Jayant Dalvi
 
PPTX
Language processing activity
Dhruv Sabalpara
 
Assembler
manpreetgrewal
 
Introduction to systems programming
Mukesh Tekwani
 
Introduction to Compiler Construction
Sarmad Ali
 
Symbol table management and error handling in compiler design
Swati Chauhan
 
System software - macro expansion,nested macro calls
SARASWATHI S
 
Unit 4 sp macro
Deepmala Sharma
 
Exception handling c++
Jayant Dalvi
 
Language processing activity
Dhruv Sabalpara
 

What's hot (20)

PDF
Learning Python with PyCharm EDU
Sergey Aganezov
 
PPTX
Directed Acyclic Graph Representation of basic blocks
Mohammad Vaseem Akaram
 
PPT
Interpreters & Debuggers
Malek Sumaiya
 
PDF
Introduction to the LLVM Compiler System
zionsaint
 
PPTX
Different types of Symmetric key Cryptography
subhradeep mitra
 
PPTX
Generating code from dags
indhu mathi
 
PPT
Intermediate code generation (Compiler Design)
Tasif Tanzim
 
PPTX
COCOMO Model in software project management
Syed Hassan Ali
 
PDF
Type conversion in Compiler Construction
Muhammad Haroon
 
PPTX
Introduction to HTML5 and CSS3 (revised)
Joseph Lewis
 
PPT
Message Authentication Code & HMAC
Krishna Gehlot
 
PPT
Assembler
Temesgen Molla
 
PPTX
Three address code In Compiler Design
Shine Raj
 
PPT
Methods in C#
Prasanna Kumar SM
 
PPTX
Coding standards and guidelines
brijraj_singh
 
PDF
Lexical Analysis - Compiler design
Aman Sharma
 
PDF
Operator precedence
Akshaya Arunan
 
PPT
Assembler design option
Mohd Arif
 
PDF
Principles of-programming-languages-lecture-notes-
Krishna Sai
 
PDF
Xml schema
Prabhakaran V M
 
Learning Python with PyCharm EDU
Sergey Aganezov
 
Directed Acyclic Graph Representation of basic blocks
Mohammad Vaseem Akaram
 
Interpreters & Debuggers
Malek Sumaiya
 
Introduction to the LLVM Compiler System
zionsaint
 
Different types of Symmetric key Cryptography
subhradeep mitra
 
Generating code from dags
indhu mathi
 
Intermediate code generation (Compiler Design)
Tasif Tanzim
 
COCOMO Model in software project management
Syed Hassan Ali
 
Type conversion in Compiler Construction
Muhammad Haroon
 
Introduction to HTML5 and CSS3 (revised)
Joseph Lewis
 
Message Authentication Code & HMAC
Krishna Gehlot
 
Assembler
Temesgen Molla
 
Three address code In Compiler Design
Shine Raj
 
Methods in C#
Prasanna Kumar SM
 
Coding standards and guidelines
brijraj_singh
 
Lexical Analysis - Compiler design
Aman Sharma
 
Operator precedence
Akshaya Arunan
 
Assembler design option
Mohd Arif
 
Principles of-programming-languages-lecture-notes-
Krishna Sai
 
Xml schema
Prabhakaran V M
 
Ad

Similar to Assembler (20)

PPTX
Dr.C S Prasanth-Physics ppt.pptx computer
kavitamittal18
 
PPTX
MIPS Architecture
Dr. Balaji Ganesh Rajagopal
 
PPTX
Lecture 2 coal sping12
Rabia Khalid
 
PDF
Creating a Fibonacci Generator in Assembly - by Willem van Ketwich
Willem van Ketwich
 
PDF
lec6_mips-instructions-III.pdf21rewrwaef
TheBreaker8
 
PDF
Fuzzing - Part 1
UTD Computer Security Group
 
PDF
Return Oriented Programming
UTD Computer Security Group
 
PDF
Cs4hs2008 track a-programming
Rashi Agarwal
 
PDF
02 isa
marangburu42
 
PPT
MIPS instruction set microprocessor lecture notes
RevathiSoundiran1
 
PDF
Python
AyushRawat160694
 
PPT
Symbol Table, Error Handler & Code Generation
Akhil Kaushik
 
PPTX
C language
Robo India
 
PPT
Assembly language
Piyush Jain
 
PDF
Online Analytics with Hadoop and Cassandra
Robbie Strickland
 
PDF
Theperlreview
Casiano Rodriguez-leon
 
PDF
Hash Functions FTW
sunnygleason
 
PDF
Cache aware hybrid sorter
Manchor Ko
 
PPT
Advance ROP Attacks
n|u - The Open Security Community
 
Dr.C S Prasanth-Physics ppt.pptx computer
kavitamittal18
 
MIPS Architecture
Dr. Balaji Ganesh Rajagopal
 
Lecture 2 coal sping12
Rabia Khalid
 
Creating a Fibonacci Generator in Assembly - by Willem van Ketwich
Willem van Ketwich
 
lec6_mips-instructions-III.pdf21rewrwaef
TheBreaker8
 
Fuzzing - Part 1
UTD Computer Security Group
 
Return Oriented Programming
UTD Computer Security Group
 
Cs4hs2008 track a-programming
Rashi Agarwal
 
02 isa
marangburu42
 
MIPS instruction set microprocessor lecture notes
RevathiSoundiran1
 
Symbol Table, Error Handler & Code Generation
Akhil Kaushik
 
C language
Robo India
 
Assembly language
Piyush Jain
 
Online Analytics with Hadoop and Cassandra
Robbie Strickland
 
Theperlreview
Casiano Rodriguez-leon
 
Hash Functions FTW
sunnygleason
 
Cache aware hybrid sorter
Manchor Ko
 
Ad

More from Mohd Arif (20)

PPT
Bootp and dhcp
Mohd Arif
 
PPT
Arp and rarp
Mohd Arif
 
PPT
User datagram protocol
Mohd Arif
 
PPT
Project identification
Mohd Arif
 
PPT
Project evalaution techniques
Mohd Arif
 
PPT
Presentation
Mohd Arif
 
PPT
Pointers in c
Mohd Arif
 
PPT
Peer to-peer
Mohd Arif
 
PPT
Overview of current communications systems
Mohd Arif
 
PPT
Overall 23 11_2007_hdp
Mohd Arif
 
PPT
Objectives of budgeting
Mohd Arif
 
PPT
Network management
Mohd Arif
 
PPT
Networing basics
Mohd Arif
 
PPT
Loaders
Mohd Arif
 
PPT
Lists
Mohd Arif
 
PPT
Iris ngx next generation ip based switching platform
Mohd Arif
 
PPT
Ip sec and ssl
Mohd Arif
 
PPT
Ip security in i psec
Mohd Arif
 
PPT
Intro to comp. hardware
Mohd Arif
 
PPT
Heap sort
Mohd Arif
 
Bootp and dhcp
Mohd Arif
 
Arp and rarp
Mohd Arif
 
User datagram protocol
Mohd Arif
 
Project identification
Mohd Arif
 
Project evalaution techniques
Mohd Arif
 
Presentation
Mohd Arif
 
Pointers in c
Mohd Arif
 
Peer to-peer
Mohd Arif
 
Overview of current communications systems
Mohd Arif
 
Overall 23 11_2007_hdp
Mohd Arif
 
Objectives of budgeting
Mohd Arif
 
Network management
Mohd Arif
 
Networing basics
Mohd Arif
 
Loaders
Mohd Arif
 
Lists
Mohd Arif
 
Iris ngx next generation ip based switching platform
Mohd Arif
 
Ip sec and ssl
Mohd Arif
 
Ip security in i psec
Mohd Arif
 
Intro to comp. hardware
Mohd Arif
 
Heap sort
Mohd Arif
 

Recently uploaded (20)

PDF
Research-Fundamentals-and-Topic-Development.pdf
ayesha butalia
 
PDF
Using Anchore and DefectDojo to Stand Up Your DevSecOps Function
Anchore
 
PPTX
Applied-Statistics-Mastering-Data-Driven-Decisions.pptx
parmaryashparmaryash
 
PPTX
What-is-the-World-Wide-Web -- Introduction
tonifi9488
 
PDF
Automating ArcGIS Content Discovery with FME: A Real World Use Case
Safe Software
 
PDF
AI Unleashed - Shaping the Future -Starting Today - AIOUG Yatra 2025 - For Co...
Sandesh Rao
 
PDF
How ETL Control Logic Keeps Your Pipelines Safe and Reliable.pdf
Stryv Solutions Pvt. Ltd.
 
PDF
Accelerating Oracle Database 23ai Troubleshooting with Oracle AHF Fleet Insig...
Sandesh Rao
 
PDF
AI-Cloud-Business-Management-Platforms-The-Key-to-Efficiency-Growth.pdf
Artjoker Software Development Company
 
PDF
Unlocking the Future- AI Agents Meet Oracle Database 23ai - AIOUG Yatra 2025.pdf
Sandesh Rao
 
PDF
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
PDF
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 
PPTX
Simple and concise overview about Quantum computing..pptx
mughal641
 
PDF
The Future of Artificial Intelligence (AI)
Mukul
 
PPTX
AI in Daily Life: How Artificial Intelligence Helps Us Every Day
vanshrpatil7
 
PDF
A Strategic Analysis of the MVNO Wave in Emerging Markets.pdf
IPLOOK Networks
 
PPTX
Agile Chennai 18-19 July 2025 Ideathon | AI Powered Microfinance Literacy Gui...
AgileNetwork
 
PPTX
cloud computing vai.pptx for the project
vaibhavdobariyal79
 
PPTX
Agile Chennai 18-19 July 2025 | Emerging patterns in Agentic AI by Bharani Su...
AgileNetwork
 
PDF
Software Development Methodologies in 2025
KodekX
 
Research-Fundamentals-and-Topic-Development.pdf
ayesha butalia
 
Using Anchore and DefectDojo to Stand Up Your DevSecOps Function
Anchore
 
Applied-Statistics-Mastering-Data-Driven-Decisions.pptx
parmaryashparmaryash
 
What-is-the-World-Wide-Web -- Introduction
tonifi9488
 
Automating ArcGIS Content Discovery with FME: A Real World Use Case
Safe Software
 
AI Unleashed - Shaping the Future -Starting Today - AIOUG Yatra 2025 - For Co...
Sandesh Rao
 
How ETL Control Logic Keeps Your Pipelines Safe and Reliable.pdf
Stryv Solutions Pvt. Ltd.
 
Accelerating Oracle Database 23ai Troubleshooting with Oracle AHF Fleet Insig...
Sandesh Rao
 
AI-Cloud-Business-Management-Platforms-The-Key-to-Efficiency-Growth.pdf
Artjoker Software Development Company
 
Unlocking the Future- AI Agents Meet Oracle Database 23ai - AIOUG Yatra 2025.pdf
Sandesh Rao
 
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 
Simple and concise overview about Quantum computing..pptx
mughal641
 
The Future of Artificial Intelligence (AI)
Mukul
 
AI in Daily Life: How Artificial Intelligence Helps Us Every Day
vanshrpatil7
 
A Strategic Analysis of the MVNO Wave in Emerging Markets.pdf
IPLOOK Networks
 
Agile Chennai 18-19 July 2025 Ideathon | AI Powered Microfinance Literacy Gui...
AgileNetwork
 
cloud computing vai.pptx for the project
vaibhavdobariyal79
 
Agile Chennai 18-19 July 2025 | Emerging patterns in Agentic AI by Bharani Su...
AgileNetwork
 
Software Development Methodologies in 2025
KodekX
 

Assembler

  • 2. System Software • components – translator • assembler • compiler • interpreter – system manager • operating system – other utilities • loader • linker • DBMS, editor, debugger, ... • purpose of this course – understand how to build system software – understand how these components work 2
  • 3. Issues in System Software • not many in this area – mature area • advanced architectures complicates system software – superscalar CPU – memory model – multiprocessor • new applications – embedded systems – mobile/ubiquitous computing 3
  • 4. Assembler Overview • functions – translate programs written in assembly language to machine code • mnemonic code to machine code • symbols to addresses – handles • constants • literals • addressing • 32 bit constant or address • 32 bit offset 4
  • 5. Assembler Overview (cont’d) • pass 1: loop until the end of the program 1. read in a line of assembly code 2. assign an address to this line • increment N (word addressing or byte addressing) 3. save address values assigned to labels • in symbol tables 4. process assembler directives • constant declaration • space reservation • pass2: same loop 1. read in a line of code 2. translate op code using op code table 3. change labels to address using the symbol table 4. process assembler directives 5. produce object program 5
  • 6. Data Structures for Assembler add $t0, $t1, $t2 000000 01001 01010 01000 00000 100000 • op code table – looked up for the translation of mnemonic code • key: mnemonic code • result: bits – hashing is usually used • once prepared, the table is not changed • efficient lookup is desired • since mnemonic code is predefined, the hashing function can be tuned a priori – the table may have the instruction format and length • to decide where to put op code bits, operands bits, offset bits • for variable instruction size • used to calculate the address 6
  • 7. Data Structures for Assembler (cont’d) .text .globl main • symbol table main: la $t0, array – stored and looked up to assign lw $t1, count address to labels lw $t2, ($t0) loop: • efficient insertion and retrieval lw $t3, 4($t0) is needed ble $t3, $t2, loop2 • deletion does not occur move $t2, $t3 – difficulties in hashing loop2: add $t1, $t1, -1 add $t0, $t0, 4 • non random keys bnez $t1, loop – problem … • the size varies widely …. .data array: .word 3, 5, 5, 1, 6, 7, ….. count: .word 15 string1: .asciiz “nmax = “ 7
  • 8. Symbol Table Construction .text .globl main symbol name value main: main 0 la $t0, array lw $t1, count loop 12 lw $t2, ($t0) loop: loop2 24 lw $t3, 4($t0) ble $t3, $t2, loop2 … move $t2, $t3 array 408 loop2: add $t1, $t1, -1 count 468 add $t0, $t0, 4 bnez $t1, loop string1 472 … bad 478 …. .data array: .word 3, 5, 5, 1, 6, 7, ….. count: .word 15 string1: .asciiz “nmax = “ bad: .word 7 8
  • 9. Assembler Algorithm: pass1 begin if starting address is given LOCCTR = starting address; else LOCCTR = 0; while OPCODE != END do ;; or EOF begin read a line from the code if there is a label if this label is in SYMTAB, then error else insert (label, LOCCTR) into SYMTAB search OPTAB for the op code if found LOCCTR += N ;; N is the length of this instruction (4 for MIPS) else if this is an assembly directive update LOCCTR as directed else error write line to intermediate file end program size = LOCCTR - starting address; end 9
  • 10. Assembler Algorithm: pass2 begin read a line; if op code = START then ;; .globl xxx for MIPS write header record; while op code != END do ;; or EOF begin search OPTAB for the op code; if found if the operand is a symbol then replace it with an address using SYMTAB; assemble the object code; else if is a defined directive add $t0, $t1, $t2 => convert it to object code; 000000 01001 01010 01000 00000 100000 add object code to the text; read next line; end write End record to the text; output text; end 10
  • 11. Program Relocation 0 . . . jump to 1004 1004 . . 1076 5000 . jump to 1004 . . jump to 1004 . 6076 program is loaded at 0 program is loaded at 5000 • motivations for relocation – a program may consists of several pieces of codes that are assembled independently – when a program is assembled, it is impossible to know the exact location where the program starts 11
  • 12. Program Relocation (cont’d) • distances from the origin of a program do not change – make the address relative to the origin – provides loader with information about • which address needs fixing • length of address field – the loader change those addresses as • distance + start address of a program – only absolute addresses need to be changed 12
  • 13. Literals • usage – encoded as an operand (similar to the immediate in MIPS, but different) • load $7, =X’0A7F’ – simple way to declare a constant – assembler does • declare a constant with a label • use the label to use the value • comparison with immediate – literal is an assembler directive • immediate is a machine recognizable data – full word can be used for literals • immediate: full word – (opcode, registers) – values are obtained from data memory - slow • immediate data is within the instruction itself 13
  • 14. Literals (cont’d) • literal pool – assembler collects all the literals into one or more literal pools – default location is at the end of the program • for better code reading – programmer can declare a place (LTORG) • to use PC-relative addressing • to keep data close to instruction • optimization – make one literal for the same value • compare character string or value? – x’454F46’ = c’EOF’ • value comparison needs evaluation • literal table – name(label), operand value, operand length, address in the table – name and value are all used as a key 14
  • 15. Literal Handling Algorithm pass 1 at a recognition of a literal search LITTAB by name if found but different value, error else if the same value, no action else if not found insert a new literal (no address yet) if the code is LTORG or END allocate each literal assigning an address pass 2 replace each literal with the address in the LITTAB if these addresses are absolute, prepare modification for relocation 15
  • 16. Symbol Defining Statement
      • MAXLEN EQU 4096
          – makes program structure better
          – easier to modify a single location
          – easier to remember than numbers
          – registers can be given meaningful names
          – (maxlen = 4096) in MIPS
      • assembler
          – searches SYMTAB and replaces the symbol with the value in the table
          – resulting object code is the same as if the value had been written instead of the symbol
          – with 2 passes there is a restriction: the right-hand side of EQU must already be defined
                X EQU Y
                Y EQU 100
              • X cannot be defined in pass 1
  • 17. Expressions
          BUFFER: .space 4096            ; reserve 4096 bytes here
          BUFEND:                        ; label BUFEND marks the current location
          (MAXLEN = BUFEND - BUFFER)     ; calculate the size of the buffer
      • allow simple arithmetic operations in symbol definitions
      • operands may have relative values for relocation
          – relative values should be modified by the loader later
      • we need to know which values are relative
          – symbol table needs a type field to distinguish absolute symbols from relative symbols
  • 18. Expression Rules
      • basic
          – a constant is absolute
          – an address is relative
      • using expressions
          – an expression with only absolute arguments is absolute
          – multiplication and division are allowed only on absolute operands, so such an expression is absolute
          – relative_1 - relative_2 is absolute
              • dependencies on the starting address cancel out
          – relative + constant and relative - constant are relative
          – all other expressions with relative terms are neither relative nor absolute (error?)
              • constant - relative
              • relative_1 + relative_2
              • 3 * relative_1
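One way to check these rules for expressions built from + and - only is to count relative terms by sign. The sketch below is an assumption-laden illustration: the `classify` function and its `(sign, kind)` term format are hypothetical, and it additionally assumes that an expression with exactly one unpaired positive relative term (for example relative + constant) is relative.

```python
def classify(terms):
    """Classify an expression made of added and subtracted terms.

    terms: list of (sign, kind), sign is +1 or -1, kind is "abs" or "rel".
    Returns "absolute", "relative", or "error".
    """
    # Each relative term contributes +1 or -1 copies of the start address.
    net_relative = sum(sign for sign, kind in terms if kind == "rel")
    if net_relative == 0:
        return "absolute"   # start-address dependencies cancel out
    if net_relative == 1:
        return "relative"   # exactly one start address remains
    return "error"          # neither relative nor absolute


# MAXLEN = BUFEND - BUFFER
maxlen_type = classify([(+1, "rel"), (-1, "rel")])
# constant - relative
bad_type = classify([(+1, "abs"), (-1, "rel")])
```

Multiplication and division are left out on purpose: under the rules above a relative term may not appear in them, so an assembler would reject such an expression before reaching this check.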
  • 19. Program Blocks
      [Figure: source code in which pieces of block 0, block 1, and block 2 are interleaved is assembled into object code in which each block is contiguous]
  • 20. Program Blocks (cont'd)
      • motivation
          – programmer's view may be different from the machine's view
              • affects only efficiency, not functionality
          – addressing can be simplified
              • a large data area can be moved to the end of the code, while the source code places it close to the instructions that use this data
      • data structure and algorithm
          – block table (name, block number, address, length)
          – pass 1
              • maintain a separate LOCCTR for each block
                  – each label is assigned an address relative to the start of the block that contains it
              • SYMTAB stores the block number for each symbol
              • store the starting address of each block in the block table
          – pass 2
              • assign an address to each symbol by adding its relative address to the block starting address
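The block table and the two passes can be sketched as below. This is a simplified illustration, not the deck's implementation: the function `assemble_blocks` and the `(block, label, size)` statement format are hypothetical, and blocks are laid out consecutively in order of first appearance.

```python
def assemble_blocks(statements):
    """statements: list of (block_name, label_or_None, size_in_bytes)."""
    locctr = {}    # block name -> location counter for that block
    order = []     # block names in order of first appearance
    symtab = {}    # label -> (block name, address relative to block start)

    # Pass 1: a separate LOCCTR per block; labels get block-relative addresses.
    for block, label, size in statements:
        if block not in locctr:
            locctr[block] = 0
            order.append(block)
        if label is not None:
            symtab[label] = (block, locctr[block])
        locctr[block] += size

    # End of pass 1: fill the block table with starting addresses and lengths.
    block_table = {}
    start = 0
    for number, block in enumerate(order):
        block_table[block] = {"number": number, "address": start, "length": locctr[block]}
        start += locctr[block]

    # Pass 2: final address = block starting address + relative address.
    addresses = {}
    for label, (block, offset) in symtab.items():
        addresses[label] = block_table[block]["address"] + offset
    return block_table, addresses


source = [
    ("code", "START", 8),
    ("data", "BUF", 4096),   # large buffer declared next to its user
    ("code", "LOOP", 4),
    ("data", "LEN", 4),
]
block_table, addresses = assemble_blocks(source)
```

Although `BUF` is declared between two pieces of code in the source, it ends up after all the code in the final layout.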
  • 21. Control Sections
      • a control section is a part of a program that can be assembled independently of other parts
          – a large program can be divided into many control sections
          – each control section can be developed independently
          – each control section can be modified independently
      • symbols defined in other control sections
          – called external
          – the assembler records those symbols for the loader
          – the loader and linker resolve the values of external symbols
  • 22. Control Sections (cont'd)
      • tables prepared by the assembler
          – define record
              • name of a symbol defined in this control section
              • relative address of the symbol
          – refer record
              • names of external symbols
          – modification record
              • starting address of the field to be modified
              • length of this field
              • name of the external symbol
      • loader
          – for every external symbol
              • find the relative address from the define record
              • add the starting address of the control section where the symbol is defined
              • modify the field
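The loader steps above can be sketched as a small linking routine. Everything here is a hypothetical simplification: each control section is a dictionary, memory is word-addressed (one list element per address), a modification record is just `(index, symbol)`, and the field length is fixed at one word.

```python
def link(sections, load_address=0):
    """Lay out control sections consecutively and resolve external symbols."""
    # Step 1: external symbol table from the define records.
    estab = {}
    address = load_address
    for section in sections:
        for symbol, relative in section["defines"].items():
            # absolute address = section start + relative address
            estab[symbol] = address + relative
        address += len(section["code"])

    # Step 2: apply the modification records.
    image = []
    for section in sections:
        code = list(section["code"])
        for index, symbol in section["modifications"]:
            code[index] += estab[symbol]
        image.extend(code)
    return image, estab


main = {
    "name": "MAIN",
    "code": [10, 0, 20],             # word 1 refers to external symbol BUF
    "defines": {},
    "modifications": [(1, "BUF")],
}
data = {
    "name": "DATA",
    "code": [0, 0, 99],
    "defines": {"BUF": 2},           # BUF is at relative address 2 in DATA
    "modifications": [],
}
image, estab = link([main, data], load_address=100)
```

With `MAIN` loaded at 100 and `DATA` at 103, `BUF` resolves to 105 and the field in `MAIN` is patched accordingly.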
  • 23. One-Pass Assembler
      • problem
          – forward reference: a reference to a symbol that is not defined yet
      • why do we need a one-pass assembler?
          – fast
              • useful for program development and testing
              • university computing environment
      • load-and-go assembler
          – writes the object code to memory, not to a disk file
          – since the object code is in memory, it is easy to modify a part of it
  • 24. One-Pass Assembler (cont'd)
      • one-pass assembler for load-and-go
          – stores undefined symbols in the SYMTAB with the address of the field that references the symbol
          – when the symbol is defined later, looks up the SYMTAB and modifies the field with the correct address
              • there may be many places to be modified
      • what if object code is written on disk?
          – bring the text back to memory
              • the efficiency of a one-pass assembler can no longer be justified
          – make the loader modify the address at loading time
              • modification record again
      • optimization
          – require all data declarations to be placed at the beginning of the program
              • reduces the number of forward references to resolve
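The load-and-go scheme can be sketched with a list of fields waiting for each undefined symbol. This is a minimal illustration under assumptions: the `one_pass` function, the `(label, op, operand)` line format, and the one-word-per-line memory model are all hypothetical, and the waiting lists are kept in a separate dictionary rather than inside SYMTAB entries.

```python
def one_pass(lines):
    """lines: list of (label_or_None, op, operand_or_None); one word per line."""
    memory = []     # object code written directly to "memory"
    symtab = {}     # defined label -> address
    pending = {}    # undefined label -> addresses of fields that reference it

    for label, op, operand in lines:
        address = len(memory)
        if label is not None:
            symtab[label] = address
            # The symbol is now defined: patch every field that was waiting.
            for field in pending.pop(label, []):
                memory[field] = (memory[field][0], address)
        if operand is None:
            memory.append((op, None))
        elif operand in symtab:
            memory.append((op, symtab[operand]))
        else:
            # Forward reference: remember the field, leave the address empty.
            pending.setdefault(operand, []).append(address)
            memory.append((op, None))

    if pending:
        raise ValueError("undefined symbols: " + ", ".join(sorted(pending)))
    return memory, symtab


program = [
    (None, "JUMP", "END"),     # forward reference
    ("LOOP", "LOAD", "X"),     # forward reference
    (None, "JUMP", "LOOP"),    # backward reference
    ("X", "WORD", None),
    ("END", "HALT", None),
]
memory, symtab = one_pass(program)
```

If the object code were written to disk instead, the patch step would have to be replaced by emitting a modification record for the loader.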
  • 25. Multi-Pass Assembler
      • supports forward references even though they are bad for program readability
      example program
          1. (A = B/2)
          2. (B = C-D)
          ....
          8. C .....
          9. D .....
      at 1, store two tuples in a table
          (A, 1, B/2, 0)
              1: one symbol is missing
              0: no other symbol depends on A
          (B, *, , &LB)
              *: don't know how many symbols are missing yet
              LB: list of symbols that depend on B (now, there is only A in this list)
      at 2, insert (C,*, ,&LC), (D,*, ,&LD)
          LC and LD contain only B
          modify (B,*, ,&LB) as (B,2,C-D,&LB)
      after 8
          from LC, B is found
          change 2 to 1 in the B tuple, meaning one symbol remains to be defined
      after 9
          from LD, B is found
          now evaluate B with the defined C, D values
          since B is done, from LB, A is found
          now A can be evaluated
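The table walk-through above can be sketched with a missing-symbol count and a dependents list per symbol. This is an illustration only: the `resolve` function and the `(op, left, right)` expression format are hypothetical, the values chosen for C and D are made up, and unlike the slide it creates a table entry for a symbol only when its definition is seen (so there is no "*" state).

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}


def resolve(definitions):
    """definitions: list of (symbol, expr) in source order.

    expr is an int, or (op, left, right) whose operands are names or ints.
    """
    values = {}       # symbol -> evaluated value
    waiting = {}      # symbol -> [number of missing symbols, expr]
    dependents = {}   # symbol -> symbols whose expression needs it

    def term_value(term):
        return values[term] if isinstance(term, str) else term

    def evaluate(expr):
        op, left, right = expr
        return OPS[op](term_value(left), term_value(right))

    def define(symbol, value):
        values[symbol] = value
        # Walk the dependents list and decrement each missing count.
        for user in dependents.pop(symbol, []):
            entry = waiting[user]
            entry[0] -= 1
            if entry[0] == 0:
                del waiting[user]
                define(user, evaluate(entry[1]))

    for symbol, expr in definitions:
        if isinstance(expr, int):
            define(symbol, expr)
            continue
        missing = [t for t in expr[1:] if isinstance(t, str) and t not in values]
        if missing:
            waiting[symbol] = [len(missing), expr]
            for term in missing:
                dependents.setdefault(term, []).append(symbol)
        else:
            define(symbol, evaluate(expr))

    if waiting:
        raise ValueError("unresolved symbols: " + ", ".join(sorted(waiting)))
    return values


# A = B/2, B = C-D, with made-up values for C and D
table = resolve([
    ("A", ("/", "B", 2)),
    ("B", ("-", "C", "D")),
    ("C", 100),
    ("D", 40),
])
```

Defining D triggers the evaluation of B, and B in turn triggers A, which mirrors the "after 9" step of the walk-through.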