Lecture 2 – Distributed Filesystems 922EU3870 – Cloud Computing and Mobile Platforms, Autumn 2009 2009/9/21 Ping Yeh ( 葉平 ), Google, Inc.
Outline
Get to know the numbers
Filesystems overview
Distributed file systems
  Basic (example: NFS)
  Shared storage (example: Global FS)
  Wide-area (example: AFS)
  Fault-tolerant (example: Coda)
  Parallel (example: Lustre)
  Fault-tolerant and Parallel (example: dCache)
The Google File System
Homework
Numbers real-world engineers should know
Operation                                           Time (ns)
L1 cache reference                                        0.5
Branch mispredict                                           5
L2 cache reference                                          7
Mutex lock/unlock                                         100
Main memory reference                                     100
Compress 1 KB with Zippy                               10,000
Send 2 KB through 1 Gbps network                       20,000
Read 1 MB sequentially from memory                    250,000
Round trip within the same data center                500,000
Disk seek                                          10,000,000
Read 1 MB sequentially from network                10,000,000
Read 1 MB sequentially from disk                   30,000,000
Round trip between California and Netherlands     150,000,000
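Three of the entries above can be reproduced with back-of-envelope arithmetic. The sketch below assumes the bandwidth figures used in the editor's note for this slide (a 4 GB/s memory bus and a 1 Gbps link treated as roughly 100 MB/s); the function and variable names are illustrative only.

```python
# Back-of-envelope check of three table entries, under assumed bandwidths.
NS_PER_S = 1_000_000_000
MB = 1_000_000  # decimal megabyte, in bytes


def transfer_time_ns(num_bytes, bytes_per_second):
    """Time to move num_bytes at a sustained rate, ignoring latency."""
    return num_bytes * NS_PER_S // bytes_per_second


memory_bw = 4_000 * MB   # assumed: 1 GHz bus x 32 bits = 4 GB/s
network_bw = 100 * MB    # assumed: 1 Gbps taken as roughly 100 MB/s

read_1mb_memory_ns = transfer_time_ns(1 * MB, memory_bw)
read_1mb_network_ns = transfer_time_ns(1 * MB, network_bw)
send_2kb_network_ns = transfer_time_ns(2_000, network_bw)

print(read_1mb_memory_ns)    # 250000
print(read_1mb_network_ns)   # 10000000
print(send_2kb_network_ns)   # 20000
```

The disk figures do not follow from bandwidth alone, since a seek is dominated by mechanical latency, so they are left out of this sketch.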
The Joys of Real Hardware
Typical first year for a new cluster:
  ~0.5 overheating (power down most machines in <5 mins, ~1-2 days to recover)
  ~1 PDU failure (~500-1000 machines suddenly disappear, ~6 hours to come back)
  ~1 rack-move (plenty of warning, ~500-1000 machines powered down, ~6 hours)
  ~1 network rewiring (rolling ~5% of machines down over 2-day span)
  ~20 rack failures (40-80 machines instantly disappear, 1-6 hours to get back)
  ~5 racks go wonky (40-80 machines see 50% packet loss)
  ~8 network maintenances (4 might cause ~30-minute random connectivity losses)
  ~12 router reloads (takes out DNS and external VIPs for a couple minutes)
  ~3 router failures (have to immediately pull traffic for an hour)
  ~dozens of minor 30-second blips for DNS
  ~1000 individual machine failures
  ~thousands of hard drive failures
  slow disks, bad memory, misconfigured machines, flaky machines, etc.
File Systems Overview
System that permanently stores data
Usually layered on top of a lower-level physical storage medium
Divided into logical units called “files”
  Addressable by a filename (“foo.txt”)
Files are often organized into directories
  Usually supports hierarchical nesting (directories within directories)
A path is the expression that joins directories and a filename to form a unique “full name” for a file.
  Directories may further belong to a volume
The set of valid paths forms the namespace of the file system.
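As a small illustration of how a path joins a volume, directories and a filename into one full name, here is a sketch using Python's `pathlib`. The mount point `/mnt/vol1` and the directory `docs` are made-up names; only `foo.txt` comes from the slide.

```python
from pathlib import PurePosixPath

# Hypothetical volume mount point and directory names, for illustration only.
volume_root = PurePosixPath("/mnt/vol1")
path = volume_root / "docs" / "foo.txt"   # join directories and filename

print(path)          # /mnt/vol1/docs/foo.txt
print(path.parent)   # /mnt/vol1/docs
print(path.name)     # foo.txt
print(path.parts)    # ('/', 'mnt', 'vol1', 'docs', 'foo.txt')
```

`PurePosixPath` only manipulates the name; it never touches a disk, which mirrors the point that the namespace is a set of valid names rather than the data itself.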
What Gets Stored
User data itself is the bulk of the file system's contents
Also includes meta-data on a volume-wide and per-file basis:
  Volume-wide: available space, formatting info, character set, ...
  Per-file: name, owner, modification date, physical layout, ...
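Both kinds of meta-data can be inspected from user space. The following is a minimal sketch using the Python standard library, run against a throwaway temporary directory; which fields are meaningful depends on the operating system (for example, `st_uid` is always 0 on Windows).

```python
import os
import shutil
import tempfile
import time

with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "foo.txt")
    with open(file_path, "w") as f:
        f.write("hello")

    # Per-file meta-data, as reported by the file system.
    info = os.stat(file_path)
    file_size = info.st_size                 # bytes of user data
    owner_uid = info.st_uid                  # numeric owner id
    modified = time.ctime(info.st_mtime)     # modification date

    # Volume-wide meta-data for the volume holding the temp directory.
    usage = shutil.disk_usage(tmp)
    free_bytes = usage.free

print(file_size, owner_uid, modified, free_bytes)
```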
High-Level Organization
Files are typically organized in a “tree” structure made of nested directories
One directory acts as the “root”
“Links” (symlinks, shortcuts, etc.) provide a simple means of offering multiple access paths to one file
Other file systems can be “mounted” and dropped in as sub-hierarchies (other drives, network shares)
Typical operations on a file: create, delete, rename, open, close, read, write, append. Multi-user systems also need lock.
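The operations listed above map directly onto standard library calls. Here is a hedged sketch on a Unix-like system (creating symlinks may need extra privileges on Windows); the file names are arbitrary, and locking is left out because its API is platform-specific.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "foo.txt")
    new = os.path.join(tmp, "bar.txt")
    link = os.path.join(tmp, "alias.txt")

    with open(old, "w") as f:      # create, open, write, close
        f.write("first\n")
    with open(old, "a") as f:      # append
        f.write("second\n")

    os.rename(old, new)            # rename
    os.symlink(new, link)          # a second access path to the same file

    with open(link) as f:          # read through the link
        contents = f.read()

    os.remove(link)                # delete the link
    os.remove(new)                 # delete the file itself
    remaining = os.listdir(tmp)

print(repr(contents))   # 'first\nsecond\n'
print(remaining)        # []
```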
Low-Level Organization (1/2)
File data and meta-data stored separately
File descriptors + meta-data stored in inodes (Un*x)
  Large tree or table at designated location on disk
  Tells how to look up file contents
Meta-data may be replicated to increase system reliability
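One way to see that names and inodes are separate things is to give a file a second name with a hard link and compare inode numbers. This sketch assumes a Unix-like file system that supports hard links; the file names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "foo.txt")
    alias = os.path.join(tmp, "foo-hardlink.txt")

    with open(original, "w") as f:
        f.write("data")

    os.link(original, alias)   # second directory entry, same inode

    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    link_count = os.stat(original).st_nlink   # names pointing at the inode

print(same_inode)   # True
print(link_count)   # 2
```

Both names resolve to one inode, so the file's contents and meta-data exist once no matter how many directory entries refer to them.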
Low-Level Organization (2/2)
“Standard” read-write medium is a hard drive (other media: CDROM, tape, ...)
Viewed as a sequential array of blocks
Usually address ~1 KB chunk at a time
Tree structure is “flattened” into blocks
Overlapping writes/deletes can cause fragmentation: files are often not stored with a linear layout
  inodes store all block numbers related to a file
Fragmentation
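A toy model can show how deletes followed by writes scatter a file across non-adjacent blocks. The sketch below is not how any real file system allocates space: `ToyDisk`, its first-free allocation policy and the 1 KB block size are assumptions chosen to match the slide's description.

```python
BLOCK_SIZE = 1024  # bytes; stands in for the slide's "~1 KB chunk"


def blocks_needed(size_bytes):
    """Number of whole blocks required to hold size_bytes."""
    return -(-size_bytes // BLOCK_SIZE)  # ceiling division


class ToyDisk:
    """Disk as a sequential array of blocks with a naive first-free allocator."""

    def __init__(self, num_blocks):
        self.owner = [None] * num_blocks   # which file occupies each block
        self.inodes = {}                   # file name -> list of block numbers

    def write(self, name, size_bytes):
        need = blocks_needed(size_bytes)
        free = [i for i, o in enumerate(self.owner) if o is None][:need]
        if len(free) < need:
            raise OSError("disk full")
        for i in free:
            self.owner[i] = name
        self.inodes[name] = free           # the "inode" records every block

    def delete(self, name):
        for i in self.inodes.pop(name):
            self.owner[i] = None


disk = ToyDisk(8)
disk.write("a", 2048)   # takes blocks 0, 1
disk.write("b", 2048)   # takes blocks 2, 3
disk.write("c", 1024)   # takes block 4
disk.delete("a")        # frees blocks 0, 1
disk.delete("c")        # frees block 4
disk.write("d", 3000)   # needs 3 blocks and lands around file "b"

print(disk.inodes["d"])   # [0, 1, 4]
```

File "d" ends up in blocks 0, 1 and 4, split by file "b", which is why the inode has to list every block number rather than just a start and a length.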
Filesystem Design Considerations
Namespace: physical, logical
Consistency: what to do when more than one user reads/writes the same file?
Security: who can do what to a file? Authentication/ACL
Reliability: can files survive a power outage or other hardware failure without damage?
Local Filesystems on Unix-like Systems
Many different designs
Namespace: root directory “/”, followed by directories and files.
Consistency: “sequential consistency”, newly written data are immediately visible to open reads (if...)
Security: uid/gid, mode of files
  Kerberos: tickets
Reliability: superblocks, journaling, snapshot
  more reliable filesystem on top of existing filesystem: RAID
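The uid/gid and mode model can be inspected with the `stat` module. The following is a minimal sketch for a Unix-like system; the permission value `0o640` is an arbitrary example, not something taken from the lecture.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.txt")
    with open(path, "w") as f:
        f.write("secret")

    os.chmod(path, 0o640)   # owner: read/write, group: read, others: nothing

    info = os.stat(path)
    mode_string = stat.filemode(info.st_mode)
    owner_can_write = bool(info.st_mode & stat.S_IWUSR)
    others_can_read = bool(info.st_mode & stat.S_IROTH)
    owner_uid, owner_gid = info.st_uid, info.st_gid

print(mode_string)       # -rw-r-----
print(owner_can_write)   # True
print(others_can_read)   # False
```

On each access the kernel compares the caller's uid and gid with the file's owner and group and then consults the matching three mode bits.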
Namespace
Physical mapping: a directory and all of its subdirectories are stored on the same physical media.
  /mnt/cdrom
  /mnt/disk1, /mnt/disk2, … when you have multiple disks
Logical volume: a logical namespace that can contain multiple physical media or a partition of a physical media
  still mounted like /mnt/vol1
  dynamic resizing by adding/removing disks without reboot
  splitting/merging volumes as long as no data spans the split
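From a program's point of view, the boundary between volumes shows up as a change of device id. The sketch below compares `st_dev` values to tell whether two paths live on the same mounted file system; `same_filesystem` is an illustrative helper. With a logical volume, `st_dev` identifies the volume as mounted, not the physical disks behind it.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """True if both paths live on the same mounted file system (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


with tempfile.TemporaryDirectory() as tmp:
    subdir = os.path.join(tmp, "sub")
    os.mkdir(subdir)
    # A freshly created subdirectory shares its parent's volume.
    inside_same_volume = same_filesystem(tmp, subdir)

# The root directory is always a mount point on Unix-like systems.
root_is_mount_point = os.path.ismount("/")

print(inside_same_volume)   # True
```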


Editor's Notes

  • #4: memory: 1 GHz bus × 32 bit × 1 B/8 b = 4 GB/s; 1 MB / 4 GB/s = 1/4 ms = 250 microseconds. network: 1 Gbps ≈ 100 MB/s; 1 MB / 100 MB/s = 0.01 s = 10 ms.
  • #41: 1. NAL: Network Abstraction Layers. 2. Drivers: ext2 OBD; the OBD filter makes other file systems such as XFS, JFS, and Ext3 recognizable. 3. The NIO portal API provides for interoperation with a variety of network transports through NAL. 4. When a Lustre inode represents a file, the metadata merely holds a reference to the file data objects stored on the OSTs.