The ARM Architecture
(with focus on Cortex-M3)
Joe Bungo
Applications Engineer
ARM University Program
Agenda
→ Introduction to ARM Ltd
  ARM Architecture/Programmer's Model
  Data Path and Pipelines
  System Design
  Development Tools
ARM Ltd
• Founded in November 1990
• Spun out of Acorn Computers
• Initial funding from Apple, Acorn and VLSI
• Designs the ARM range of RISC processor cores
• Licenses ARM core designs to semiconductor partners who fabricate and sell to their customers
• ARM does not fabricate silicon itself
• Also develops technologies to assist with the design-in of the ARM architecture
  • Software tools, boards, debug hardware
  • Application software
  • Bus architectures
  • Peripherals, etc.
ARM’s Activities
[Diagram labels: Processors; System Level IP (Data Engines, Fabric, 3D Graphics); Physical IP; Software IP; Development Tools; Connected Community; memory; SoC]
ARM Connected Community – 700+
Huge Range of Applications
Equipment adopting 32-bit ARM microcontrollers:
• Energy efficient appliances
• IR fire detector
• Intelligent vending
• Tele-parking
• Utility meters
• Exercise machines
• Intelligent toys
World’s Smallest ARM Computer?
University of Michigan wireless sensor network node, wirelessly networked into large scale sensor arrays
• Battery, solar cells
• Processor, SRAM and PMU
• Sensors, timers
• Cortex-M0 + 16KB RAM, 65nm
• UWB radio antenna
• 10 kB storage memory
• ~3fW/bit
• 12µAh Li-ion battery
• Cortex-M0; 65¢
World’s Largest ARM Computer?
• 4200 ARM powered neutrino detectors
• 70 bore holes, 2.5 km deep
• 60 detectors per string, starting 1.5 km down
• 1 km³ of active telescope
Work supported by the National Science Foundation and University of Wisconsin-Madison
From 1 mm³ to 1 km³
[Figure: scale running from 1 mm³ to 1 km³ and from 10¢ to $1000; market labels: Embedded, Consumer, Mobile, Mobile Computing, Home, PC, Enterprise, Server, HPC]
Agenda
  Introduction to ARM Ltd
→ ARM Architecture/Programmer's Model
  Data Path and Pipelines
  System Design
  Development Tools
ARM Cortex Processors (v7)
• ARM Cortex-A family (v7-A): applications processors for full OS and 3rd party applications
• ARM Cortex-R family (v7-R): embedded processors for real-time signal processing, control applications
• ARM Cortex-M family (v7-M): microcontroller-oriented processors for MCU and SoC applications
[Figure: processor range from "12k gates..." to "...2.5GHz"; labels: Cortex-M0, Cortex-M1, Cortex™-M3, Cortex-M4, SC300™, Cortex-R4, HeronR (1-2), Cortex-A5 (x1-4), Cortex-A8, Cortex-A9 (x1-4), Cortex-A15 (x1-4)]
Cortex family
Cortex-A8
• Architecture v7A
• MMU
• AXI
• VFP & NEON support
Cortex-R4
• Architecture v7R
• MPU (optional)
• AXI
• Dual Issue
Cortex-M3
• Architecture v7M
• MPU (optional)
• AHB Lite & APB
Relative Performance*
Processor             Max Freq (MHz)   Min Power (mW/MHz)
Cortex-M0             50               0.012
Cortex-M3             150              0.06
ARM7                  184              0.35
ARM926                470              0.235
ARM1026               540              0.36
ARM1136               610              0.335
ARM1176               750              0.568
Cortex-A8             1100             0.43
Cortex-A9 Dual-core   2000             0.5
*Represents attainable speeds in 130, 90, 65, or 45nm processes
[Bar chart: Max Frequency (MHz) per processor, axis 0 to 2500]
Data Sizes and Instruction Sets
• The ARM is a 32-bit architecture.
• When used in relation to the ARM:
  • Byte means 8 bits
  • Halfword means 16 bits (two bytes)
  • Word means 32 bits (four bytes)
• Most ARM cores implement two instruction sets
  • 32-bit ARM Instruction Set
  • 16-bit Thumb Instruction Set
• Jazelle cores can also execute Java bytecode
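The byte, halfword and word sizes from the Data Sizes slide can be checked with a small host-side sketch. This is only an illustration, not target code: the value `0x12345678` and the helper names are hypothetical, and little-endian byte order is an assumption made here.

```python
import struct

# Hypothetical 32-bit value, chosen only for illustration.
WORD = 0x12345678

def split_word(word):
    """Split a 32-bit word into halfwords and bytes (little-endian assumed)."""
    raw = struct.pack("<I", word)            # one word = 4 bytes
    halfwords = struct.unpack("<2H", raw)    # two halfwords of 16 bits each
    octets = struct.unpack("<4B", raw)       # four bytes of 8 bits each
    return halfwords, octets

def size_in_bits(fmt):
    """Size in bits of a struct format code, using standard sizes."""
    return struct.calcsize("<" + fmt) * 8

halfwords, octets = split_word(WORD)
print([hex(h) for h in halfwords])   # low halfword first
print([hex(b) for b in octets])      # least significant byte first
print(size_in_bits("B"), size_in_bits("H"), size_in_bits("I"))
```

One word therefore holds two halfwords or four bytes, which is the relationship the slide states.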
ARM and Thumb Performance
[Bar chart: Dhrystone 2.1/sec @ 20MHz (axis 0 to 30000) for ARM and Thumb code, by memory width (zero wait state): 32-bit, 16-bit, 16-bit with 32-bit stack]
The Thumb-2 instruction set
• Variable-length instructions
  • ARM instructions are a fixed length of 32 bits
  • Thumb instructions are a fixed length of 16 bits
  • Thumb-2 instructions can be either 16-bit or 32-bit
• Thumb-2 gives approximately 26% improvement in code density over ARM
• Thumb-2 gives approximately 25% improvement in performance over Thumb
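Because Thumb-2 mixes 16-bit and 32-bit encodings, a decoder has to work out the length from the first halfword. The sketch below is based on my reading of the ARMv7-M encoding rules (not on the slides): a first halfword whose top five bits are 0b11101, 0b11110 or 0b11111 begins a 32-bit instruction. The function name is hypothetical.

```python
def thumb_instruction_length(first_halfword):
    """Return the instruction length in bytes (2 or 4) from its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    # These three prefixes mark the first half of a 32-bit Thumb-2 encoding.
    if top5 in (0b11101, 0b11110, 0b11111):
        return 4
    return 2

# A few first halfwords used as examples:
for hw in (0xBF00, 0x4770, 0xE000, 0xF000, 0xE92D):
    print(hex(hw), thumb_instruction_length(hw))
```

Here `0xBF00` (NOP) and `0x4770` (BX LR) are 16-bit instructions, while `0xF000` begins a 32-bit BL.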
Cortex-M Programmer’s Model
• Fully programmable in C
• Stack-based exception model
• Only two processor modes
  • Thread Mode for User tasks
  • Handler Mode for OS tasks and exceptions
• Vector table contains addresses
Registers: r0-r12, sp (banked: Main sp and Process sp), lr, r15 (pc), xPSR
Cortex-M3 Processor Privilege
[Diagram labels: Application code, OS; Thread Mode, Handler Mode; Privileged (Supervisor), Non-Privileged (User); exception sources: System Call (SVCall), Undefined Instruction, Memory Aborts (Instructions & Data), Interrupts, Reset]
Cortex-M3 Interrupt Handling
• One Non-Maskable Interrupt (INTNMI) supported
• 1-240 prioritizable interrupts supported
  • Interrupts can be masked
  • Implementation option selects number of interrupts supported
• Nested Vectored Interrupt Controller (NVIC) is tightly coupled with processor core
• Interrupt inputs are active HIGH
[Diagram labels: Cortex-M3, Cortex-M3 Processor Core, NVIC, INTNMI, 1-240 Interrupts, INTISR[239:0]]
Cortex-M3 Exception Handling
• Reset: power-on or system reset
• NMI: cannot be stopped or preempted by any exception other than reset
• Faults
  • Hard Fault: default fault, or any fault unable to activate
  • Memory Manage: MPU violations
  • Bus Fault: prefetch and memory access violations
  • Usage Fault: undefined instructions, divide by zero, etc.
• SVCall: privileged OS requests
• Debug Monitor: debug monitor program
• PendSV: pendable request for system service (commonly used by an OS for context switching)
• SysTick Interrupt: internal system timer, e.g., used by an RTOS to periodically check resources or peripherals
• External Interrupt: e.g., external peripherals
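Since the vector table contains addresses, each exception listed on the Exception Handling slide maps to one 4-byte table entry. The sketch below uses the exception numbers as I understand them from the ARMv7-M architecture (they are not given on the slides), with external interrupt n taken as exception 16 + n. The dictionary and function names are hypothetical.

```python
# Exception numbers per my reading of ARMv7-M (an assumption, not slide content).
SYSTEM_EXCEPTIONS = {
    "Reset": 1, "NMI": 2, "HardFault": 3, "MemManage": 4, "BusFault": 5,
    "UsageFault": 6, "SVCall": 11, "DebugMonitor": 12, "PendSV": 14, "SysTick": 15,
}

def exception_number(name=None, irq=None):
    """Exception number for a named system exception or external interrupt."""
    if irq is not None:
        if not 0 <= irq <= 239:          # the slides state up to 240 interrupts
            raise ValueError("irq must be in 0..239")
        return 16 + irq
    return SYSTEM_EXCEPTIONS[name]

def vector_offset(exc_num):
    """Byte offset of the handler address within the vector table."""
    return 4 * exc_num                   # one 32-bit address per entry

print(hex(vector_offset(exception_number("SysTick"))))
print(hex(vector_offset(exception_number(irq=0))))
```

Entry 0 of the table is not a handler: on ARMv7-M it holds the initial main stack pointer value.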
Cortex-M3 Program Status Register
• One Status Register consisting of
  • APSR - Application Program Status Register – ALU flags
  • IPSR - Interrupt Program Status Register – Interrupt/Exception No.
  • EPSR - Execution Program Status Register
    • IT field – If/Then block information
    • ICI field – Interruptible-Continuable Instruction information
• xPSR
  • Composite of the 3 PSRs
  • Stored on the stack on exception entry
xPSR bit layout (bit labels on the slide: 31, 28, 27, 26, 25, 24, 23, 16, 15, 10, 7, 0):
N Z C V (31:28) | Q (27) | ICI/IT (26:25) | T (24) | ICI/IT (15:10) | ISR Number (low-order bits)
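The composite xPSR can be unpacked with shifts and masks. The sketch below assumes the ARMv7-M field positions as I understand them: N, Z, C, V, Q in bits 31 to 27, T in bit 24, the exception number in bits 8:0, and the IT/ICI bits split across 26:25 and 15:10. The function name and the sample value are hypothetical.

```python
def decode_xpsr(xpsr):
    """Split a 32-bit xPSR value into its APSR, EPSR and IPSR fields."""
    return {
        "N": (xpsr >> 31) & 1,
        "Z": (xpsr >> 30) & 1,
        "C": (xpsr >> 29) & 1,
        "V": (xpsr >> 28) & 1,
        "Q": (xpsr >> 27) & 1,
        "T": (xpsr >> 24) & 1,
        # IT/ICI: low two bits live in 26:25, upper six bits in 15:10.
        "IT_ICI": ((xpsr >> 25) & 0x3) | (((xpsr >> 10) & 0x3F) << 2),
        "exception_number": xpsr & 0x1FF,
    }

# Hypothetical stacked xPSR: Z and C set, Thumb bit set, exception 15.
fields = decode_xpsr(0x6100000F)
print(fields)
```

A handler could apply the same masks to the xPSR value stacked on exception entry to see which exception was active.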
Conditional Execution
• If – Then (IT) instruction added (16 bit)
  • Up to 3 additional “then” or “else” conditions may be specified (T or E)
  • Makes up to 4 following instructions conditional
• Any normal ARM condition code can be used
• 16-bit instructions in block do not affect condition code flags
  • Apart from comparison instructions
  • 32-bit instructions may affect flags (normal rules apply)
• Current “if-then status” stored in the EPSR IT field (CPSR on ARM cores that have one)
  • Conditional block may be safely interrupted and returned to
• Must NOT branch into or out of ‘if-then’ block
Example:
  ITTET EQ
  MOVEQ   (Inst 1)
  ADDEQ   (Inst 2)
  SUBNE   (Inst 3)
  ORREQ   (Inst 4)
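The way an IT mnemonic maps to per-instruction conditions can be sketched as follows. The helper `expand_it` and the `INVERSE` table are hypothetical names. The AL condition is left out here for simplicity, since it has no opposite to use for an "else" slot.

```python
# Each condition code paired with its logical opposite.
INVERSE = {
    "EQ": "NE", "NE": "EQ", "CS": "CC", "CC": "CS", "MI": "PL", "PL": "MI",
    "VS": "VC", "VC": "VS", "HI": "LS", "LS": "HI", "GE": "LT", "LT": "GE",
    "GT": "LE", "LE": "GT",
}

def expand_it(mnemonic, cond):
    """Return the condition applied to each instruction in the IT block."""
    mnemonic, cond = mnemonic.upper(), cond.upper()
    pattern = mnemonic[2:]               # up to three extra T/E letters
    if not mnemonic.startswith("IT") or len(pattern) > 3 or set(pattern) - {"T", "E"}:
        raise ValueError("expected IT followed by up to three of T or E")
    if cond not in INVERSE:
        raise ValueError("unsupported condition code")
    # The first instruction always takes the base condition.
    return [cond] + [cond if c == "T" else INVERSE[cond] for c in pattern]

print(expand_it("ITTET", "EQ"))
```

For `ITTET EQ` this gives EQ, EQ, NE, EQ, matching the MOVEQ, ADDEQ, SUBNE, ORREQ sequence on the slide.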
Classes of Instructions (v4T)
• Load/Store
• Miscellaneous
• Data Operations
• Change of Flow: MOV PC, Rm; Bcc; BL; BLX
Data processing Instructions
• Consist of:
  • Arithmetic: ADD ADC SUB SBC RSB RSC
  • Logical: AND ORR EOR BIC
  • Comparisons: CMP CMN TST TEQ
  • Data movement: MOV MVN
• These instructions only work on registers, NOT memory.
• Syntax:
  <Operation>{<cond>}{S} Rd, Rn, Operand2
  • Comparisons set flags only - they do not specify Rd
  • Data movement does not specify Rn
• Second operand is sent to the ALU via barrel shifter.
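The effect of the optional S suffix in the data processing syntax can be illustrated by modelling a 32-bit ADDS on the host. This is a minimal sketch of the usual N, Z, C, V flag rules for addition. The function name `adds` is hypothetical and registers are modelled as plain integers.

```python
MASK32 = 0xFFFFFFFF

def adds(rn, operand2):
    """Model ADDS Rd, Rn, Operand2: return (Rd, flags) for 32-bit operands."""
    rn &= MASK32
    operand2 &= MASK32
    full = rn + operand2
    result = full & MASK32
    flags = {
        "N": result >> 31,                        # sign bit of the result
        "Z": int(result == 0),                    # result is zero
        "C": int(full > MASK32),                  # unsigned carry out
        # Signed overflow: both operands differ in sign from the result.
        "V": (((rn ^ result) & (operand2 ^ result)) >> 31) & 1,
    }
    return result, flags

print(adds(0x7FFFFFFF, 1))   # signed overflow case
print(adds(0xFFFFFFFF, 1))   # unsigned carry case
```

A comparison such as CMN performs the same addition but keeps only the flags, which is why comparisons do not specify Rd.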
Using a Barrel Shifter: The 2nd Operand
Register, optionally with shift operation
• Shift value can be either:
  • 5-bit unsigned integer
  • Specified in bottom byte of another register
• Used for multiplication by constant
Immediate value
• 8-bit number, with a range of 0-255
  • Rotated right through even number of positions
• Allows increased range of 32-bit constants to be loaded directly into registers
[Diagram: Operand 1 feeds the ALU directly; Operand 2 passes through the Barrel Shifter into the ALU; the ALU produces the Result]
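The immediate rule (an 8-bit value rotated right through an even number of positions) decides which 32-bit constants fit in a single instruction. The sketch below models the classic 32-bit ARM encoding described on the slide; Thumb-2 on Cortex-M3 uses a different modified-immediate scheme, which is not modelled here. All function names are hypothetical, and `times_five` illustrates multiplication by a constant through a shifted register operand.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def encode_arm_immediate(constant):
    """Return (imm8, rot) with ror32(imm8, 2 * rot) == constant, or None."""
    constant &= MASK32
    for rot in range(16):                             # 4-bit field, steps of 2
        imm8 = ror32(constant, (32 - 2 * rot) % 32)   # undo the rotate right
        if imm8 <= 0xFF:
            return imm8, rot
    return None

def times_five(rm):
    """Model ADD Rd, Rm, Rm, LSL #2: Rd = Rm + (Rm << 2) = 5 * Rm (mod 2**32)."""
    return (rm + ((rm << 2) & MASK32)) & MASK32

print(encode_arm_immediate(0xFF000000))   # encodable
print(encode_arm_immediate(0x101))        # not encodable: spans nine bits
print(times_five(7))
```

Constants that return `None` would need to be built in more than one instruction or loaded from memory.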
Single register data transfer
  LDR    STR    Word
  LDRB   STRB   Byte
  LDRH   STRH   Halfword
  LDRSB         Signed byte load
  LDRSH         Signed halfword load
• Memory system must support all access sizes
• Syntax:
  • LDR{<cond>}{<size>} Rd, <address>
  • STR{<cond>}{<size>} Rd, <address>
  e.g. LDREQB
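The difference between the plain and signed loads is whether the loaded value is zero-extended or sign-extended to 32 bits. The sketch below models this on the host. The memory contents are hypothetical, little-endian byte order is assumed, and the function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical memory contents (little-endian byte order assumed).
MEMORY = bytes([0x80, 0xFF, 0x34, 0x12])

def load(mem, addr, size, signed=False):
    """Load `size` bytes and extend the value to a 32-bit register image."""
    value = int.from_bytes(mem[addr:addr + size], "little", signed=signed)
    return value & MASK32                # sign extension shows up as leading 1s

def ldr(mem, addr):   return load(mem, addr, 4)
def ldrb(mem, addr):  return load(mem, addr, 1)
def ldrh(mem, addr):  return load(mem, addr, 2)
def ldrsb(mem, addr): return load(mem, addr, 1, signed=True)
def ldrsh(mem, addr): return load(mem, addr, 2, signed=True)

print(hex(ldrb(MEMORY, 0)), hex(ldrsb(MEMORY, 0)))
print(hex(ldrh(MEMORY, 0)), hex(ldrsh(MEMORY, 0)))
print(hex(ldr(MEMORY, 0)))
```

No signed store variants are listed on the slide; a store only writes the low byte or halfword of the register, so sign does not matter.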
Agenda
  Introduction to ARM Ltd
  ARM Architecture/Programmer's Model
→ Data Path and Pipelines
  System Design
  Development Tools
Cortex-M3 Datapath
[Diagram labels: Register Bank, Mul/Div, Barrel Shifter, ALU (inputs A and B), Writeback, Instruction Decode, Read Data Register, Write Data Register, two Address Register / Address Incrementer pairs; signals: INTADDR, I_HADDR, I_HRDATA, D_HADDR, D_HWDATA, D_HRDATA]
Cortex-M3 Pipeline
• Cortex-M3 has 3-stage fetch-decode-execute pipeline
  • Similar to ARM7
  • Cortex-M3 does more in each stage to increase overall performance
[Diagram: 1st Stage - Fetch (Prefetch); 2nd Stage - Decode (Instruction Decode & Register Read, AGU, Branch); 3rd Stage - Execute (Address Phase & Write Back, Data Phase Load/Store & Branch, Multiply & Divide, Shift, ALU & Branch, Write). Also labelled: Branch forwarding & speculation; Execute stage branch (ALU branch & Load Store Branch)]
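How a 3-stage fetch-decode-execute pipeline overlaps instructions can be shown with an idealised schedule. The sketch assumes one instruction enters the pipeline per cycle with no stalls, branches or wait states, which real code will not always achieve. The function names and instruction labels are hypothetical.

```python
STAGES = ("Fetch", "Decode", "Execute")

def pipeline_schedule(instructions):
    """Map each cycle (starting at 1) to the instruction held in each stage."""
    schedule = {}
    for i, instr in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i reaches stage s in cycle i + s + 1.
            schedule.setdefault(i + s + 1, {})[stage] = instr
    return schedule

def total_cycles(n):
    """Cycles to complete n instructions in the ideal case."""
    return n + len(STAGES) - 1 if n else 0

for cycle, stages in sorted(pipeline_schedule(["I1", "I2", "I3", "I4"]).items()):
    print(cycle, stages)
```

Once the pipeline is full, one instruction completes per cycle, so n instructions take n + 2 cycles in this idealised model.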
ARM10 vs. ARM11 Pipelines
[Diagram, ARM10: stages FETCH, ISSUE, DECODE, EXECUTE, MEMORY, WRITE; labels: Branch Prediction, Instruction Fetch, ARM or Thumb Instruction Decode, Reg Read, Shift + ALU, Multiply, Multiply Add, Memory Access, Reg Write]
[Diagram, ARM11: Fetch 1, Fetch 2, Decode, Issue, then three parallel paths (Shift / ALU / Saturate; MAC 1 / MAC 2 / MAC 3; Address / Data Cache 1 / Data Cache 2), then Write back]
Full Cortex-A8 Pipeline Diagram
13-Stage Integer Pipeline, 10-Stage NEON Pipeline
[Diagram labels:
 Instruction Fetch: F0 F1 F2
 Instruction Decode: D0 D1 D2 D3 D4
 Instruction Execute and Load/Store: E0 E1 E2 E3 E4 E5; ALU pipe 0, ALU pipe 1, MUL pipe 0, LS pipe 0 or 1; Architectural register file
 NEON: M0 M1 M2 M3, N1 N2 N3 N4 N5 N6; NEON Instruction Decode, Load queue, NEON register file, Integer ALU pipe, Integer MUL pipe, Integer shift pipe, Non-IEEE FP ADD pipe, Non-IEEE FP MUL pipe, IEEE FP engine, LS permute pipe, NEON store data
 Memory system: L1 L2 L3 L4 L5 L6 L7 L8; BIU pipeline, L2 Tag Array, L2 Data Array, L3 memory system, L1 data, L2 data
 Embedded Trace Macrocell: T0 to T13; External trace port
 Annotated paths: L1 instruction cache miss, L1 data cache miss, Branch mispredict penalty, Replay penalty, Integer register writeback, NEON register writeback]
Agenda
  Introduction to ARM Ltd
  ARM Architecture/Programmer's Model
  Data Path and Pipelines
→ System Design
  Development Tools
An Example AMBA System
[Block diagram labels: High Performance ARM processor, High-bandwidth on-chip RAM, High Bandwidth External Memory Interface, DMA Bus Master, APB Bridge, Keypad, UART, PIO, Timer, AHB, APB]
AHB
• High Performance
• Pipelined
• Burst Support
• Multiple Bus Masters
APB
• Low Power
• Non-pipelined
• Simple Interface
Agenda
  Introduction to ARM Ltd
  ARM Architecture/Programmer's Model
  Data Path and Pipelines
  System Design
→ Development Tools
ARM Debug Architecture
• EmbeddedICE Logic
  • Provides breakpoints and processor/system access
• JTAG interface (ICE)
  • Converts debugger commands to JTAG signals
• Embedded Trace Macrocell (ETM)
  • Compresses real-time instruction and data access trace
  • Contains ICE features (trigger & filter logic)
• Trace port analyzer (TPA)
  • Captures trace in a deep buffer
[Diagram labels: ARM core, EmbeddedICE Logic, ETM, TAP controller, JTAG port, Trace Port, Ethernet, Debugger (+ optional trace tools)]
Keil Development Tools for ARM
• Includes ARM macro assembler, compilers (ARM RealView C/C++ Compiler, Keil CARM Compiler, or GNU compiler), ARM linker, Keil uVision Debugger and Keil uVision IDE
• Keil uVision Debugger accurately simulates on-chip peripherals (I2C, CAN, UART, SPI, Interrupts, I/O Ports, A/D and D/A converters, PWM, etc.)
• Evaluation Limitations
  • 16K byte object code + 16K data limitation
  • Some linker restrictions such as base addresses for code/constants
  • GNU tools provided are not restricted in any way
• http://www.keil.com/demo/
University Resources
http://www.arm.com/support/university/
University@arm.com
Your Future at ARM…
• Graduate and Internship/Co-op Opportunities
  • Engineering: Memory, Validation, Performance, DFT, R&D, GPU and more!
  • Sales and Marketing: Corporate and Technical
  • Corporate: IT, Patents, Services (Training and Support), and Human Resources
• Incredible Culture and Comprehensive Benefit Package
  • Competitive Reward
  • Work/Life Balance
  • Personal Development
  • Brilliant Minds and Innovative Solutions
• Keep in Touch!
  • www.arm.com/about/careers
TI Panda Board
OMAP4430 Processor
• 1 GHz Dual-core ARM Cortex-A9 (NEON+VFP)
• C64x+ DSP
• PowerVR SGX 3D GPU
• 1080p Video Support
POP Memory
• 1 GB LPDDR2 RAM
USB Powered
• < 4W max consumption (OMAP small % of that)
• Many adapter options (Car, wall, battery, solar, ...)
Project Ideas Using Panda
• OS Projects
  • OS porting to ARM/Cortex (TI OMAP)
  • MythTV system
  • “Super-Panda” – stack of Pandas as compute engine and task distribution
  • Linux applications
• NEON Optimization Projects
  • Codec optimization in ffmpeg (pick your favorite codec)
  • Voice and image recognition
  • Open-source Flash player optimizations (swfdec)
Fin
Nokia N95 Multimedia Computer
Connect. Collaborate. Create.
• Symbian OS™ v9.2: operating system supporting ARM processor-based mobile devices, developed using ARM® RealView® Compilation Tools
• OMAP™ 2420 Applications Processor: ARM1136™ processor-based SoC, developed using Magma® Blast® family and winner of 2005 INSIGHT Award for ‘Most Innovative SoC’
• Mobiclip™ Video Codec: software video codec for ARM processor-based mobile devices
• ST WLAN Solution: ultra-low power 802.11b/g WLAN chip with ARM9™ processor-based MAC
• S60™ 3rd Edition: S60 Platform supporting ARM processor-based mobile devices
Beagle Board
Targeting community development
• $149
• > 1000 participants and growing
• Open access to hardware documentation
• Wikis, blogs, promotion of community activity
• Free software
• Instant access to >10 million lines of code
Addressing open source community needs
• Freedom to innovate
• Personally affordable
• Active & technical community
• Opportunity to tinker and learn
Fast, low power, flexible expansion
OMAP3530 Processor
• 600MHz Cortex-A8
  • NEON+VFPv3
  • 16KB/16KB L1$
  • 256KB L2$
• 430MHz C64x+ DSP
  • 32K/32K L1$
  • 48K L1D
  • 32K L2
• PowerVR SGX GPU
• 64K on-chip RAM
POP Memory
• 128MB LPDDR RAM
• 256MB NAND flash
USB Powered
• 2W maximum consumption
  • OMAP is small % of that
• Many adapter options
  • Car, wall, battery, solar, …
Peripheral I/O
• DVI-D video out
• SD/MMC+
• S-Video out
• USB 2.0 HS OTG
• I2C, I2S, SPI, MMC/SD
• JTAG
• Stereo in/out
• Alternate power
• RS-232 serial
Board dimension label: 3”
And more…
Peripheral I/O
• DVI-D video out
• SD/MMC+
• S-Video out
• USB HS OTG
• I2C, I2S, SPI, MMC/SD
• JTAG
• Stereo in/out
• Alternate power
• RS-232 serial
Board dimension label: 3”
Other Features
• 4 LEDs: USR0, USR1, PMU_STAT, PWR
• 2 buttons: USER, RESET
• 4 boot sources: SD/MMC, NAND flash, USB, Serial
On-going collaboration at BeagleBoard.org
• Live chat via IRC for 24/7 community support
• Links to software projects to download
Project Ideas Using Beagle
• OS Projects
  • OS porting to ARM/Cortex (TI OMAP)
  • MythTV system
  • “Super-Beagle” – stack of Beagles as compute engine and task distribution
  • Linux applications
• NEON Optimization Projects
  • Codec optimization in ffmpeg (pick your favorite codec)
  • Voice and image recognition
  • Open-source Flash player optimizations (swfdec)
More Related Content

What's hot (20)

PPT
ARM - Advance RISC Machine
EdutechLearners
 
PPSX
LECT 1: ARM PROCESSORS
Dr.YNM
 
PPT
Arm architecture
Pantech ProLabs India Pvt Ltd
 
PPSX
Lect 2 ARM processor architecture
Dr.YNM
 
PPTX
CISC & RISC Architecture
Suvendu Kumar Dash
 
PPT
Introduction to embedded systems
Amr Ali (ISTQB CTAL Full, CSM, ITIL Foundation)
 
PPTX
Unit 4 _ ARM Processors .pptx
VijayKumar201823
 
PDF
Unit II Arm7 Thumb Instruction
Dr. Pankaj Zope
 
PDF
ARM Architecture
Dwight Sabio
 
PPTX
ARM- Programmer's Model
Ravikumar Tiwari
 
PPTX
AVR ATmega32
Prashant Tiwari
 
PPTX
LPC 2148 ARM MICROCONTROLLER
sravannunna24
 
PPT
DDR2 SDRAM
Subash John
 
PDF
Embedded Systems (18EC62) - ARM Cortex-M3 Instruction Set and Programming (Mo...
Shrishail Bhat
 
PDF
8086 memory segmentation
mahalakshmimalini
 
PPTX
Unit vi (1)
Siva Nageswararao
 
PPTX
Arm Processors Architectures
Mohammed Hilal
 
PDF
Arm instruction set
Mathivanan Natarajan
 
PDF
Unit II Arm 7 Introduction
Dr. Pankaj Zope
 
PPTX
Introduction to AVR Microcontroller
Mahmoud Sadat
 
ARM - Advance RISC Machine
EdutechLearners
 
LECT 1: ARM PROCESSORS
Dr.YNM
 
Lect 2 ARM processor architecture
Dr.YNM
 
CISC & RISC Architecture
Suvendu Kumar Dash
 
Introduction to embedded systems
Amr Ali (ISTQB CTAL Full, CSM, ITIL Foundation)
 
Unit 4 _ ARM Processors .pptx
VijayKumar201823
 
Unit II Arm7 Thumb Instruction
Dr. Pankaj Zope
 
ARM Architecture
Dwight Sabio
 
ARM- Programmer's Model
Ravikumar Tiwari
 
AVR ATmega32
Prashant Tiwari
 
LPC 2148 ARM MICROCONTROLLER
sravannunna24
 
DDR2 SDRAM
Subash John
 
Embedded Systems (18EC62) - ARM Cortex-M3 Instruction Set and Programming (Mo...
Shrishail Bhat
 
8086 memory segmentation
mahalakshmimalini
 
Unit vi (1)
Siva Nageswararao
 
Arm Processors Architectures
Mohammed Hilal
 
Arm instruction set
Mathivanan Natarajan
 
Unit II Arm 7 Introduction
Dr. Pankaj Zope
 
Introduction to AVR Microcontroller
Mahmoud Sadat
 

Similar to Arm cortex-m3 by-joe_bungo_arm (20)

PDF
Arm architecture overview
Sathish Arumugasamy
 
PPTX
UNIT 2.pptx
lalithamani sampath
 
PPT
ARM_2.ppt
MostafaParvin1
 
PPT
ARM Architecture
Kshitij Gorde
 
PPTX
EC8791 ARM Processor and Peripherals.pptx
deviifet2015
 
PDF
ARM Microcontrollers and Embedded Systems-Module 1_VTU
Girish M
 
PPT
ARM Introduction
Ramasubbu .P
 
PPTX
Arm arc-2016
Mohammed Gomaa
 
PDF
Unit II arm 7 Instruction Set
Dr. Pankaj Zope
 
PPTX
UNIT 3.pptx
BLACKSPAROW
 
PPTX
ARM introduction registers architectures
KNaveenKumarECE
 
PPTX
MPU Chp2.pptx
EE2k2016YasirJavaid
 
PDF
Arm cm3 architecture_and_programmer_model
Ganesh Naik
 
PPT
ARM-Introduction, registers and processor states.ppt
ECEHITS
 
PPTX
Introduction to arm processor
RAMPRAKASHT1
 
PDF
02 : ARM Cortex M4 Specs || IEEE SSCS AlexSC
IEEE SSCS AlexSC
 
PPTX
unit 1ARM INTRODUCTION.pptx
KandavelEee
 
PPTX
ARM-7 ADDRESSING MODES INSTRUCTION SET
SasiBhushan22
 
PPTX
Introduction to Processor Design and ARM Processor
Darling Jemima
 
Arm architecture overview
Sathish Arumugasamy
 
UNIT 2.pptx
lalithamani sampath
 
ARM_2.ppt
MostafaParvin1
 
ARM Architecture
Kshitij Gorde
 
EC8791 ARM Processor and Peripherals.pptx
deviifet2015
 
ARM Microcontrollers and Embedded Systems-Module 1_VTU
Girish M
 
ARM Introduction
Ramasubbu .P
 
Arm arc-2016
Mohammed Gomaa
 
Unit II arm 7 Instruction Set
Dr. Pankaj Zope
 
UNIT 3.pptx
BLACKSPAROW
 
ARM introduction registers architectures
KNaveenKumarECE
 
MPU Chp2.pptx
EE2k2016YasirJavaid
 
Arm cm3 architecture_and_programmer_model
Ganesh Naik
 
ARM-Introduction, registers and processor states.ppt
ECEHITS
 
Introduction to arm processor
RAMPRAKASHT1
 
02 : ARM Cortex M4 Specs || IEEE SSCS AlexSC
IEEE SSCS AlexSC
 
unit 1ARM INTRODUCTION.pptx
KandavelEee
 
ARM-7 ADDRESSING MODES INSTRUCTION SET
SasiBhushan22
 
Introduction to Processor Design and ARM Processor
Darling Jemima
 
Ad

Recently uploaded (20)

PPTX
Public_Speaking_Skills_Themed_Presentation.pptx
sohail890880
 
PDF
Left Holding the Bag sequence 2 Storyboard by Mark G
MarkGalez
 
PPTX
Tags_of_Chaman_Lifestyle Balochistan.pptx
MuhammadAkramKhan9
 
PPTX
Role & Etiquette of a Medical Representative – Do’s & Don’ts Inside the Docto...
Sujoy Dasgupta
 
PPTX
TLE WEEK 2lessonpara sa mga estufyante nga diba
angelagyanpiol
 
PDF
Meatball of Canyon Valley sequence 2 storyboard by Mark G.
MarkGalez
 
PPTX
Presentation.pptxjjjnjnnnnnnnnnnnnnnnnnnnn
simajameel01
 
PPTX
Campus Deck_All catrerr prospect park on the best
VaishnaviChitale
 
PDF
Student Visa vs Work Visa: Which Is Right for You? | Amit Kakkar Easy Visa
Amit Kakkar
 
PPTX
unit2_cdunit2_cdunit2_cdunit2_cdunit2_cd.pptx
shella20221
 
PPTX
FARZ ACADEMY MRCP EXAM PREPARATION-GUIDE & TIPS.pptx
dawnmarketingmaveric
 
PPTX
9e3e3981-1864-438b-93b4-ebabcb5090d0.pptx
SureshKumar565390
 
PDF
【2nd】Explanatory material of DTU(230207).pdf
kewalsinghpuriya
 
PPTX
Marketplace for AI-Powered Freelancers - Botpool
Botpool
 
PPTX
tech vs soft skill .pptxhgdvnhygnuufcbnbg
spnr2427
 
PPTX
How To Write A ResumeCV - Resume Writing Tips
yeasinArafath6
 
PPTX
Mastering-Communication-Your-Essential-Skills-Toolkit.pptx.pptx
rahulkesharwani642
 
PDF
Digital Marketing Success Case Study presentation.
shamshanashefeer
 
PPTX
FSS seminar-cours-work the future of material surfaces.pptx
sanjaychief112
 
PDF
A Guide To Why Doing Nothing Is Powerful
Lokesh Agrawal
 
Public_Speaking_Skills_Themed_Presentation.pptx
sohail890880
 
Left Holding the Bag sequence 2 Storyboard by Mark G
MarkGalez
 
Tags_of_Chaman_Lifestyle Balochistan.pptx
MuhammadAkramKhan9
 
Role & Etiquette of a Medical Representative – Do’s & Don’ts Inside the Docto...
Sujoy Dasgupta
 
TLE WEEK 2lessonpara sa mga estufyante nga diba
angelagyanpiol
 
Meatball of Canyon Valley sequence 2 storyboard by Mark G.
MarkGalez
 
Presentation.pptxjjjnjnnnnnnnnnnnnnnnnnnnn
simajameel01
 
Campus Deck_All catrerr prospect park on the best
VaishnaviChitale
 
Student Visa vs Work Visa: Which Is Right for You? | Amit Kakkar Easy Visa
Amit Kakkar
 
unit2_cdunit2_cdunit2_cdunit2_cdunit2_cd.pptx
shella20221
 
FARZ ACADEMY MRCP EXAM PREPARATION-GUIDE & TIPS.pptx
dawnmarketingmaveric
 
9e3e3981-1864-438b-93b4-ebabcb5090d0.pptx
SureshKumar565390
 
【2nd】Explanatory material of DTU(230207).pdf
kewalsinghpuriya
 
Marketplace for AI-Powered Freelancers - Botpool
Botpool
 
tech vs soft skill .pptxhgdvnhygnuufcbnbg
spnr2427
 
How To Write A ResumeCV - Resume Writing Tips
yeasinArafath6
 
Mastering-Communication-Your-Essential-Skills-Toolkit.pptx.pptx
rahulkesharwani642
 
Digital Marketing Success Case Study presentation.
shamshanashefeer
 
FSS seminar-cours-work the future of material surfaces.pptx
sanjaychief112
 
A Guide To Why Doing Nothing Is Powerful
Lokesh Agrawal
 
Ad

Arm cortex-m3 by-joe_bungo_arm

  • 1. 1 The ARM Architecture (with focus on Cortex-M3) Joe Bungo Applications Engineer ARM University Program
  • 2. 2 Agenda  Introduction to ARM Ltd ARM Architecture/Programmers Model Data Path and Pipelines System Design Development Tools
  • 3. 3 ARM Ltd  Founded in November 1990  Spun out of Acorn Computers  Initial funding from Apple, Acorn and VLSI  Designs the ARM range of RISC processor cores  Licenses ARM core designs to semiconductor partners who fabricate and sell to their customers  ARM does not fabricate silicon itself  Also develop technologies to assist with the design- in of the ARM architecture  Software tools, boards, debug hardware  Application software  Bus architectures  Peripherals, etc
  • 4. 4 ARM’s Activities memory SoC Processors System Level IP: Data Engines Fabric 3D Graphics Physical IP Software IP Development Tools Connected Community
  • 6. 6 Huge Range of Applications Energy Efficient Appliances IR Fire Detector Intelligent Vending Tele-parking Utility Meters Exercise MachinesIntelligent toys Equipment Adopting 32-bit ARM Microcontrollers
  • 7. 7 World’s Smallest ARM Computer? A CB Wirelessly networked into large scale sensor arrays Battery Solar Cells Processor, SRAM and PMU University of Michigan Sensors, timers Cortex-M0 +16KB RAM 65nm UWB Radio antenna 10 kB Storage memory ~3fW/bit 12µAh Li-ion Battery Wireless Sensor Network Cortex-M0; 65¢
  • 8. 8 World’s Largest ARM Computer? 4200 ARM powered Neutrino Detectors Work supported by the National Science Foundation and University of Wisconsin-Madison 70 bore holes 2.5km deep 60 detectors per string starting 1.5km down 1km3 of active telescope
  • 9. 9 From 1mm3 to 1km3 1mm3 1km3 10¢ $1000 Mobile Embedded Consumer Mobile Computing Server Enterprise PC Home HPC
  • 10. 10 Agenda Introduction to ARM Ltd  ARM Architecture/Programmers Model Data Path and Pipelines System Design Development Tools
  • 11. 11 ARM Cortex Processors (v7) ARM Cortex-A family (v7-A):  Applications processors for full OS and 3rd party applications ARM Cortex-R family (v7-R):  Embedded processors for real-time signal processing, control applications ARM Cortex-M family (v7-M):  Microcontroller-oriented processors for MCU and SoC applications Cortex-R4 Cortex-A8 SC300™ Cortex-M1 Cortex™-M3 ...2.5GHz x1-4 Cortex-A9 12k gates... Cortex-M0 Cortex-M4 x1-4 Cortex-A5 1-2 HeronR x1-4 Cortex-A15
  • 12. 12 Cortex family Cortex-A8  Architecture v7A  MMU  AXI  VFP & NEON support Cortex-R4  Architecture v7R  MPU (optional)  AXI  Dual Issue Cortex-M3  Architecture v7M  MPU (optional)  AHB Lite & APB
  • 13. 13 Relative Performance* *Represents attainable speeds in 130, 90, 65, or 45nm processes Cortex- M0 Cortex- M3 ARM7 ARM926 ARM1026 ARM1136 ARM1176 Cortex-A8 Cortex-A9 Dual-core Max Freq (MHz) 50 150 184 470 540 610 750 1100 2000 Min Power (mW/MHz) 0.012 0.06 0.35 0.235 0.36 0.335 0.568 0.43 0.5 0 500 1000 1500 2000 2500 MaxFrequency(Mhz)
  • 14. 14 Data Sizes and Instruction Sets  The ARM is a 32-bit architecture.  When used in relation to the ARM:  Byte means 8 bits  Halfword means 16 bits (two bytes)  Word means 32 bits (four bytes)  Most ARM’s implement two instruction sets  32-bit ARM Instruction Set  16-bit Thumb Instruction Set  Jazelle cores can also execute Java bytecode
  • 15. 15 ARM and Thumb Performance Memory width (zero wait state) 0 5000 10000 15000 20000 25000 30000 32-bit 16-bit 16-bit with 32-bit stack ARM Thumb Dhrystone 2.1/sec @ 20MHz
  • 16. 16 The Thumb-2 instruction set  Variable-length instructions  ARM instructions are a fixed length of 32 bits  Thumb instructions are a fixed length of 16 bits  Thumb-2 instructions can be either 16-bit or 32-bit  Thumb-2 gives approximately 26% improvement in code density over ARM  Thumb-2 gives approximately 25% improvement in performance over Thumb
  • 17. 17 Cortex-M Programmer’s Model  Fully programmable in C  Stack-based exception model  Only two processor modes  Thread Mode for User tasks  Handler Mode for OS tasks and exceptions  Vector table contains addresses Process r8 r9 r10 r11 r12 sp lr r15 (pc) xPSR r0 r1 r2 r3 r4 r5 r6 r7 Main sp
  • 18. 18 ARM Cortex-M3 Application code OS System Call (SVCall) Undefined Instruction Privileged Cortex-M3 Processor Privilege Memory Instructions & Data Aborts Interrupts Reset Non-Privileged Supervisor User Handler Mode Thread Mode
  • 19. 19 Cortex-M3 Interrupt Handling  One Non-Maskable Interrupt (INTNMI) supported  1-240 prioritizable interrupts supported  Interrupts can be masked  Implementation option selects number of interrupts supported  Nested Vectored Interrupt Controller (NVIC) is tightly coupled with processor core  Interrupt inputs are active HIGH Cortex-M3 Processor Core INTNMI NVIC Cortex-M3 1-240 Interrupts INTISR[239:0] …
  • 20. 20 Cortex-M3 Exception Handling  Reset : power-on or system reset  NMI : cannot be stopped or preempted by any exception other than reset  Faults  Hard Fault : default Fault or any fault unable to activate  Memory Manage : MPU violations  Bus Fault : prefetch and memory access violations  Usage Fault : undef instructions, divide by zero, etc.  SVCall : privileged OS requests  Debug Monitor : debug monitor program  PendSV : pending SVCalls  SysTick Interrupt : internal sys timer, i.e., used by RTOS to periodically check resources or peripherals  External Interrupt : i.e., external peripherals
  • 21. 21 Cortex-M3 Program Status Register  One Status Register consisting of  APSR - Application Program Status Register – ALU flags  IPSR - Interrupt Program Status Register – Interrupt/Exception No.  EPSR - Execution Program Status Register  IT field – If/Then block information  ICI field – Interruptible-Continuable Instruction information  xPSR  Composite of the 3 PSRs  Stored on the stack on exception entry IT/ICIIT 2731 N Z C V Q 28 7 ISR Number 1623 15 0242526 10 T
  • 22. 22 Conditional Execution ITTET EQ Inst 1 Inst 2 Inst 3 Inst 4  If – Then (IT) instruction added (16 bit)  Up to 3 additional “then” or “else” conditions maybe specified (T or E)  Makes up to 4 following instructions conditional  Any normal ARM condition code can be used  16-bit instructions in block do not affect condition code flags  Apart from comparison instruction  32 bit instructions may affect flags (normal rules apply)  Current “if-then status” stored in CPSR  Conditional block maybe safely interrupted and returned to  Must NOT branch into or out of ‘if-then’ block MOVEQ ADDEQ SUBNE ORREQ
  • 23. 23 Load/Store Miscellaneous Classes of Instructions (v4T) Data Operations MOV PC, Rm Bcc BL BLX Change of Flow
  • 24. 24 Data processing Instructions  Consist of :  Arithmetic: ADD ADC SUB SBC RSB RSC  Logical: AND ORR EOR BIC  Comparisons: CMP CMN TST TEQ  Data movement: MOV MVN  These instructions only work on registers, NOT memory.  Syntax: <Operation>{<cond>}{S} Rd, Rn, Operand2  Comparisons set flags only - they do not specify Rd  Data movement does not specify Rn  Second operand is sent to the ALU via barrel shifter.
  • 25. 25 Register, optionally with shift operation  Shift value can be either be:  5 bit unsigned integer  Specified in bottom byte of another register.  Used for multiplication by constant Immediate value  8 bit number, with a range of 0-255.  Rotated right through even number of positions  Allows increased range of 32-bit constants to be loaded directly into registers Result Operand 1 Barrel Shifter Operand 2 ALU Using a Barrel Shifter:The 2nd Operand
  • 26. 26 Single register data transfer LDR STR Word LDRB STRB Byte LDRH STRH Halfword LDRSB Signed byte load LDRSH Signed halfword load  Memory system must support all access sizes  Syntax:  LDR{<cond>}{<size>} Rd, <address>  STR{<cond>}{<size>} Rd, <address> e.g. LDREQB
  • 27. 27 Agenda Introduction to ARM Ltd ARM Architecture/Programmers Model  Data Path and Pipelines System Design Development Tools
  • 28. 28 Cortex-M3 Datapath Register Bank Mul/Div Address Incrementer ALU B A INTADDR I_HADDR Address Register Barrel Shifter Writeback ALU Read Data Register Write Data Register Instruction Decode I_HRDATA D_HWDATA D_HRDATA Address Incrementer D_HADDR Address Register
  • 29. 29  Cortex-M3 has 3-stage fetch-decode-execute pipeline  Similar to ARM7  Cortex-M3 does more in each stage to increase overall performance Cortex-M3 Pipeline Branch forwarding & speculation 1st Stage - Fetch 2nd Stage - Decode 3rd Stage - Execute Execute stage branch (ALU branch & Load Store Branch) Fetch (Prefetch) AGU Instruction Decode & Register Read Branch Address Phase & Write Back Data Phase Load/Store & Branch Multiply & Divide Shift ALU & Branch Write
  • 30. 30 ARM10 vs. ARM11 Pipelines ARM11 Fetch 1 Fetch 2 Decode Issue Shift ALU Saturate Write back MAC 1 MAC 2 MAC 3 Address Data Cache 1 Data Cache 2 Shift + ALU Memory Access Reg Write FETCH DECODE EXECUTE MEMORY WRITE Reg Read Multiply Branch Prediction Instruction Fetch ISSUE ARM or Thumb Instruction Decode Multiply Add ARM10
  • 31. 31 Full Cortex-A8 Pipeline Diagram 13-Stage Integer Pipeline 10-Stage NEON Pipeline NEON Load queue NEON Instruction Decode Instruction Execute and Load/Store E1 E3 E4 M1E2 M2 M3 N1 N6N2 N3 N4 N5E5 LS pipe 0 or 1 Instruction Fetch F1 F2F0 D1 D2 D3 D4 Instruction Decode L3 memory system BIU pipeline L2 Data ArrayL2 Tag Array L1 L2 L3 L4 L5 L6 L8 L1 data cache miss L1 instruction cache miss Branch mispredict penalty NEON store data Integer register writeback NEON register writebackReplay penalty Architecturalregisterfile D0 E0 L7 Embedded Trace Macrocell T10T3T0 T4 T5 T6 T7 T8 T9T2T1 T11 M0 T13T12 MUL pipe 0 ALU pipe 0 ALU pipe 1 Integer ALU pipe Integer MUL pipe Integer shift pipe Non-IEEE FP ADD pipe Non-IEEE FP MUL pipe IEEE FP engine LS permute pipe NEONregisterfile L2 data External trace port L1 data
  • 32. 32 Agenda Introduction to ARM Ltd ARM Architecture/Programmers Model Data Path and Pipelines  System Design Development Tools
  • 33. 33 High Performance ARM processor High-bandwidth on-chip RAM High Bandwidth External Memory Interface DMA Bus Master APB Bridge Keypad UART PIO TimerAHB APB High Performance Pipelined Burst Support Multiple Bus Masters Low Power Non-pipelined Simple Interface An Example AMBA System
  • 34. Agenda: Introduction to ARM Ltd; ARM Architecture/Programmers Model; Data Path and Pipelines; System Design; Development Tools (current section)
  • 35. ARM Debug Architecture. EmbeddedICE Logic: provides breakpoints and processor/system access. JTAG interface (ICE): converts debugger commands to JTAG signals. Embedded Trace Macrocell (ETM): compresses real-time instruction and data access trace; contains ICE features (trigger & filter logic). Trace port analyzer (TPA): captures trace in a deep buffer. [Figure: ARM core with EmbeddedICE Logic, ETM and TAP controller; JTAG port and trace port; Ethernet link to the debugger (+ optional trace tools).]
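The roles of filter logic, trigger logic and a fixed-depth capture buffer can be illustrated with a conceptual sketch. It assumes events are simple `(address, kind)` tuples and that capture stops at the trigger; the class `TraceCapture` and its parameters are hypothetical, and real ETM and TPA hardware is considerably more elaborate (compression, trigger positioning, timestamps).

```python
from collections import deque


class TraceCapture:
    """Conceptual trace capture: filter events, keep the newest `depth`, stop on trigger."""

    def __init__(self, depth, keep=lambda event: True, trigger=lambda event: False):
        self.buffer = deque(maxlen=depth)  # oldest entries are discarded when full
        self.keep = keep                   # filter: which events are worth storing
        self.trigger = trigger             # trigger: which event ends the capture
        self.triggered = False

    def record(self, event):
        if self.triggered:
            return  # capture has stopped; later events are ignored
        if self.keep(event):
            self.buffer.append(event)
        if self.trigger(event):
            self.triggered = True


if __name__ == "__main__":
    # Hypothetical event stream of (address, kind) tuples.
    events = [(0x10, "data"), (0x14, "insn"), (0x18, "data"),
              (0x20, "data"), (0x24, "data"), (0x30, "insn"), (0x34, "data")]
    capture = TraceCapture(
        depth=3,
        keep=lambda e: e[1] == "data",    # store data accesses only
        trigger=lambda e: e[0] == 0x30,   # stop when address 0x30 is reached
    )
    for event in events:
        capture.record(event)
    print([hex(addr) for addr, _ in capture.buffer])
```

Filtering before storage is what makes a limited buffer useful: only the events of interest leading up to the trigger occupy space.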
  • 36. Keil Development Tools for ARM. Includes the ARM macro assembler, compilers (ARM RealView C/C++ Compiler, Keil CARM Compiler, or GNU compiler), ARM linker, Keil uVision Debugger and Keil uVision IDE. The Keil uVision Debugger accurately simulates on-chip peripherals (I2C, CAN, UART, SPI, interrupts, I/O ports, A/D and D/A converters, PWM, etc.). Evaluation limitations: 16K bytes of object code + 16K of data; some linker restrictions, such as base addresses for code/constants; the GNU tools provided are not restricted in any way. http://www.keil.com/demo/
  • 39. Your Future at ARM… Graduate and Internship/Co-op Opportunities: Engineering (Memory, Validation, Performance, DFT, R&D, GPU and more!); Sales and Marketing (Corporate and Technical); Corporate (IT, Patents, Services (Training and Support), and Human Resources). Incredible Culture and Comprehensive Benefit Package: Competitive Reward; Work/Life Balance; Personal Development; Brilliant Minds and Innovative Solutions. Keep in Touch! www.arm.com/about/careers
  • 40. TI Panda Board. OMAP4430 Processor: 1 GHz dual-core ARM Cortex-A9 (NEON+VFP); C64x+ DSP; PowerVR SGX 3D GPU; 1080p video support. POP Memory: 1 GB LPDDR2 RAM. USB Powered: < 4W max consumption (OMAP is a small % of that); many adapter options (car, wall, battery, solar, ...).
  • 41. Project Ideas Using Panda. OS Projects: OS porting to ARM/Cortex (TI OMAP); MythTV system; “Super-Panda”: a stack of Pandas as compute engine and task distribution; Linux applications. NEON Optimization Projects: codec optimization in ffmpeg (pick your favorite codec); voice and image recognition; open-source Flash player optimizations (swfdec).
  • 43. Nokia N95 Multimedia Computer. Symbian OS™ v9.2: operating system supporting ARM processor-based mobile devices, developed using ARM® RealView® Compilation Tools. OMAP™ 2420 Applications Processor: ARM1136™ processor-based SoC, developed using the Magma® Blast® family and winner of the 2005 INSIGHT Award for ‘Most Innovative SoC’. Mobiclip™ Video Codec: software video codec for ARM processor-based mobile devices. ST WLAN Solution: ultra-low power 802.11b/g WLAN chip with ARM9™ processor-based MAC. S60™ 3rd Edition: S60 Platform supporting ARM processor-based mobile devices. Connect. Collaborate. Create.
  • 45. $149. > 1000 participants and growing. Open access to hardware documentation. Wikis, blogs, promotion of community activity. Free software. Freedom to innovate. Personally affordable. Active & technical community. Opportunity to tinker and learn. Instant access to >10 million lines of code. Addressing open source community needs. Targeting community development.
  • 46. OMAP3530 Processor: 600MHz Cortex-A8 (NEON+VFPv3; 16KB/16KB L1$; 256KB L2$); 430MHz C64x+ DSP (32K/32K L1$; 48K L1D; 32K L2); PowerVR SGX GPU; 64K on-chip RAM. POP Memory: 128MB LPDDR RAM; 256MB NAND flash. USB Powered: 2W maximum consumption (OMAP is a small % of that); many adapter options (car, wall, battery, solar, …). Peripheral I/O: DVI-D video out; SD/MMC+; S-Video out; USB 2.0 HS OTG; I2C, I2S, SPI, MMC/SD; JTAG; stereo in/out; alternate power; RS-232 serial. [Figure label: 3”.] Fast, low power, flexible expansion.
  • 47. Peripheral I/O: DVI-D video out; SD/MMC+; S-Video out; USB HS OTG; I2C, I2S, SPI, MMC/SD; JTAG; stereo in/out; alternate power; RS-232 serial. [Figure label: 3”.] Other Features: 4 LEDs (USR0, USR1, PMU_STAT, PWR); 2 buttons (USER, RESET); 4 boot sources (SD/MMC, NAND flash, USB, serial). On-going collaboration at BeagleBoard.org: live chat via IRC for 24/7 community support; links to software projects to download. And more…
  • 48. Project Ideas Using Beagle. OS Projects: OS porting to ARM/Cortex (TI OMAP); MythTV system; “Super-Beagle”: a stack of Beagles as compute engine and task distribution; Linux applications. NEON Optimization Projects: codec optimization in ffmpeg (pick your favorite codec); voice and image recognition; open-source Flash player optimizations (swfdec).