CRAMM: Virtual Memory Support for Garbage-Collected Applications
Ting Yang, Emery Berger, Scott Kaplan†, Eliot Moss
Department of Computer Science, University of Massachusetts, {tingy,emery,moss}@cs.umass.edu
†Dept. of Math and Computer Science, Amherst College, [email_address]
Motivation: Heap Size Matters. GC languages (Java, C#, Python, Ruby, etc.) are increasingly popular, and heap size is critical. Too large: paging (10-100x slower). Too small: excessive # of collections hurts throughput. [Diagram: a JVM with a 120MB heap in 100MB of memory pages to disk; a 60MB heap fits in 100MB of memory.]
What is the right heap size? Find the sweet spot: large enough to minimize collections, small enough to avoid paging. BUT: the sweet spot changes constantly (multiprogramming). CRAMM: Cooperative Robust Automatic Memory Management. Goal: through cooperation between the OS and the GC, keep garbage-collected applications running at their sweet spot.
CRAMM Overview Cooperative approach: Collector-neutral  heap sizing model  (GC) suitable for broad range of collectors Statistics-gathering  VM (OS) Automatically resizes heap in response to memory pressure Grows to maximize space utilization Shrinks to eliminate paging Improves performance by up to 20x Overhead on non-GC apps: 1-2.5%
Outline Motivation CRAMM overview Automatic heap sizing Information gathering Experimental results Conclusion
GC: How do we choose a good heap size?
GC: Collector-neutral model. WSS ≈ a × heapSize + b. heapUtilFactor (a): a constant dependent on the GC algorithm. Fixed overhead (b): libraries, code, and copy space (app's live size). SemiSpace (copying): a ≈ ½, b ≈ JVM, code + app's live size.
GC: a collector-neutral WSS model. WSS ≈ a × heapSize + b. heapUtilFactor (a): a constant dependent on the GC algorithm; fixed overhead (b): libraries, code, and copy space (app's live size). SemiSpace (copying): a ≈ ½, b ≈ JVM, code + app's live size. MS (non-copying): a ≈ 1, b ≈ JVM, code.
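The collector-neutral model on this slide can be sketched as a one-liner, assuming the linear form WSS ≈ a × heapSize + b described by the a/b entries (the function name and units are illustrative, not CRAMM's code):

```python
def wss_estimate(heap_size, a, b):
    """Collector-neutral model: WSS ~= a * heapSize + b (sizes in MB).

    a (heapUtilFactor) depends on the collector: ~1/2 for a copying
    SemiSpace collector (only one semispace is live at a time),
    ~1 for a non-copying MarkSweep collector.
    b is fixed overhead: JVM, code, and (for copying GCs) live data."""
    return a * heap_size + b
```

For example, a SemiSpace collector (a = 0.5) with 30MB of fixed overhead and a 100MB heap would have an estimated WSS of 80MB.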
GC: Selecting a new heap size. Inputs from the GC: heapUtilFactor (a) & cur_heapSize. Inputs from the VMM: WSS & available memory. Set the heap size so that the working set just fits in currently available memory.
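The selection rule above might be sketched like this under the linear WSS model (a hypothetical illustration; the function and its signature are assumptions, not the Jikes RVM implementation):

```python
def select_heap_size(cur_heap_size, a, wss, available):
    """Pick a new heap size so the working set just fits in available memory.

    Under WSS ~= a * heapSize + b, changing the heap by delta changes the
    WSS by roughly a * delta, so we can solve for the heap size whose WSS
    equals available memory without knowing the fixed overhead b."""
    slack = available - wss   # > 0: room to grow; < 0: paging risk, shrink
    return cur_heap_size + slack / a
```

E.g. with a = 0.5, a 120MB heap, an 80MB working set, and 100MB available, the heap can grow to 160MB; if the WSS were 110MB, the heap would shrink to 100MB.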
Heap Size vs. Execution Time, WSS. [Charts: execution time vs. heap size has a 1/x shape; WSS vs. heap size is linear (fit: Y = 0.99X + 32.56).]
VM: How do we collect the information needed for heap size selection (WSS, available memory) with low overhead?
Calculating WSS w.r.t. 5%. [Animation: a memory reference sequence drives an LRU queue (pages kept in least-recently-used order); a hit counter associated with each LRU position builds the hit histogram; summing histogram entries beyond a given memory size yields the fault curve.]
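The hit histogram sketched above can be built with Mattson's classic LRU stack algorithm; this is a userspace sketch of the idea, not CRAMM's in-kernel implementation:

```python
def hit_histogram(refs):
    """Simulate an LRU queue over a page reference string and count, for
    each LRU stack position, how many hits land there (Mattson's stack
    algorithm).  Position 0 is most recently used; a reference to a
    never-seen page is recorded separately as a cold fault."""
    stack, hist, cold = [], {}, 0
    for page in refs:
        if page in stack:
            d = stack.index(page)        # LRU stack distance of this hit
            hist[d] = hist.get(d, 0) + 1
            stack.pop(d)
        else:
            cold += 1
        stack.insert(0, page)            # move/insert at the MRU position
    return hist, cold

def faults_with_memory(hist, cold, m):
    """Fault curve: with m resident pages, hits at positions >= m become faults."""
    return cold + sum(c for d, c in hist.items() if d >= m)
```

For the cyclic reference string a b c a b c, every re-reference hits at stack distance 2, so 3 pages of memory incur only the 3 cold faults while 2 pages fault on every re-reference.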
Computing the hit histogram. Not possible in a standard VM: global LRU queues; no per-process/file information or control; difficult to estimate an app's WSS or available memory. CRAMM VM: per-process/file page management (page lists: Active, Inactive, Evicted) plus adding & maintaining the histogram.
Managing pages for a process. Three lists: Active (CLOCK), Inactive (LRU), Evicted (LRU). Inactive pages are protected by turning off permissions, so touching one raises a minor fault; Evicted pages are on disk, so touching one raises a major fault. Faults update the histogram, and pages are refilled & adjusted between lists. [Diagram: per-page state (header, page descriptor, AVL node).]
Controlling overhead. A buffer sits between the Active list and the Inactive/Evicted lists; the Active/Inactive boundary is controlled so that histogram collection costs about 1% of execution time. [Same diagram as the previous slide.]
Calculating available memory. What's "available"? The page cache: are pages from closed files "free"? Policy decision: yes. Easy to distinguish in CRAMM: they sit on a separate list. Available memory = resident application pages + free pages in the system + pages from closed files.
Experimental Results
Experimental Evaluation Experimental setup: CRAMM (Jikes RVM + Linux), unmodified Jikes RVM, JRockit, HotSpot GC:  GenCopy, CopyMS,  MS, SemiSpace, GenMS SPECjvm98, DaCapo, SPECjbb, ipsixql + SPEC2000 Experiments: Dynamic  memory pressure Overhead w/o memory pressure
Dynamic Memory Pressure (1). GenMS, SPECjbb (modified) with 160MB memory; initial heap size: 120MB. [Chart: # transactions finished (thousands) vs. elapsed time (seconds).] Stock w/o pressure: 296.67 secs, 1136 majflts. CRAMM w/ pressure: 302.53 secs, 1613 majflts, 98% CPU. Stock w/ pressure: 720.11 secs, 39944 majflts, 48% CPU.
Dynamic Memory Pressure (2). SPECjbb (modified). [Charts: normalized elapsed time and # transactions finished (thousands) for JRockit, HotSpot, CRAMM-GenMS, CRAMM-MS, CRAMM-SS.]
CRAMM VM: Efficiency. Overhead: on average, 1%-2.5%. [Bar chart: % overhead, split into histogram collection and additional overhead, for SPEC2Kint, SPEC2Kfp, and Java under GenCopy, SemiSpace, MarkSweep, GenMS, and CopyMS.]
Conclusion. Cooperative Robust Automatic Memory Management (CRAMM). GC: collector-neutral WSS model. VM: statistics-gathering virtual memory manager. Dynamically chooses a nearly-optimal heap size for garbage-collected applications; maximizes use of memory without paging; minimal overhead (1%-2.5%); quickly adapts to memory pressure changes. http://www.cs.umass.edu/~tingy/CRAMM
Backup Slides: Example of paging problem (javac); Understanding fault curves; Characterizing paging behavior; Using fault curves / LRU; SegQ design; Collecting fault curves on the fly; Calculating WSS of GCed applications.
Characterizing Paging Behavior. [Animation, as in the WSS slide: a memory reference sequence drives an LRU queue (pages in least-recently-used order); per-position hit counts form the hit histogram, from which the fault curve is derived; e.g. 12 pages vs. 5 pages of memory yield different fault counts.]
Fault curve: relationship of heap size, real memory & page faults. [Surface plot with three regions: fits into memory; substantial paging ("looping" behavior); extreme paging. Example point: heap size = 240MB, memory = 145MB, # of faults ≈ 1000, ≈50 seconds.]
VMM design: SegQ. The LRU queue is split into Active (CLOCK algorithm), Inactive (strict LRU), and Evicted segments, with adaptive control of the Inactive list size. Minor fault: page still in memory (Inactive); major fault: page on disk (Evicted). [Diagram: a hit histogram entry is kept for each LRU position past the Active/Inactive boundary.]
VMM design: SegQ. What is the WSS w.r.t. 5%? [Diagram: Active (CLOCK) / Inactive (strict LRU) / Evicted lists; walk the histogram from the head to find the LRU position at which the remaining fault count drops below the 5% threshold; that position is the WSS.]
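One way to read the WSS off the histogram, as this slide asks, is to find the smallest memory size whose fault cost fits within the 5% budget. This sketch assumes fault cost is expressed in reference-time units; the function and its parameters are illustrative, not CRAMM's kernel code:

```python
def wss_at_threshold(hist, cold, total_refs, fault_cost_refs, pct=5.0):
    """Smallest memory size m (in pages) such that time lost to page
    faults stays under pct% of execution time: with m pages resident,
    hits at LRU positions >= m (plus cold faults) become faults.

    hist: {lru_position: hit_count}; fault_cost_refs: cost of one fault
    measured in reference-times (an assumed unit for this sketch)."""
    budget = total_refs * pct / 100.0 / fault_cost_refs  # affordable faults
    max_pos = max(hist) if hist else 0
    for m in range(max_pos + 2):
        faults = cold + sum(c for d, c in hist.items() if d >= m)
        if faults <= budget:
            return m
    return max_pos + 1
```

E.g. with 90 hits at position 0 and 10 hits at position 5 over 100 references (fault cost = 1 reference-time, budget = 5 faults), the WSS is 6 pages: any smaller memory turns the 10 deep hits into faults.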
CRAMM System: Demo. A JVM with a 120MB heap runs in 150MB of memory; other apps arrive needing 60MB. The GC polls memory status; when memory is exhausted, it triggers a full collection. When the collection finishes, memory is released, the VM calculates the app's WSS and available memory, and the GC chooses a new heap size using the WSS model, shrinking the heap (120MB to 100MB, then to 90MB). When the other apps finish, the GC grows the heap to 150MB to make better use of memory.
CRAMM VM: Controlling overhead. Goal: 1% of execution time. When: every 1/16-second interval, adjust if # of minflts > (interval * 2%) / minflt_cost. If overhead < 0.5%: grow the inactive list; if > 1.5%: shrink it. How: grow by min(active, inactive)/32; shrink by min(active, inactive)/8; refill by min(active, inactive)/16.
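The grow/shrink policy above can be sketched as follows; the thresholds and step divisors are the slide's, while the function itself and its units (pages, percent) are assumptions:

```python
def adjust_inactive(overhead_pct, active, inactive):
    """Adaptive control of the inactive list size, targeting ~1% of
    execution time spent on histogram collection (minor faults).

    overhead_pct: measured minor-fault overhead over the last interval,
    as a percentage of execution time; active/inactive: list sizes in pages."""
    step = min(active, inactive)
    if overhead_pct < 0.5:
        return inactive + step // 32   # cheap enough: grow for better accuracy
    if overhead_pct > 1.5:
        return inactive - step // 8    # too costly: shrink aggressively
    return inactive                    # within the target band: leave as-is
```

Note the asymmetry: shrinking (step/8) is four times larger than growing (step/32), so the controller backs off quickly when overhead spikes.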
CRAMM vs. Bookmarking Collector Two different approaches CRAMM: A new VMM Moderate modifications in collectors Heap level control (coarse granularity) BC:  A new collector Moderate modifications in VMM Page level control (fine granularity)
Static Memory Pressure. [Chart: performance across heap sizes, with the optimal point marked.]
Dynamic Memory Pressure (1). Initial heap size: 120MB. Stock w/o pressure: 336.17 secs, 1136 majflts. CRAMM w/ pressure: 386.88 secs, 1179 majflts, 98% CPU. Stock w/ pressure: 928.49 secs, 47941 majflts, 36% CPU.
Dynamic Memory Pressure (1). [Chart: available memory and heap size over time; the heap size, sampled after every collection, adapts to available memory.]
Dynamic Memory Pressure (3)
Problem & Motivation: heap size vs. running time. [Chart: Appel collector, _213_javac, 60MB real memory. Heap too small: GC a lot; too large: page a lot; optimal in between.]
Managing processes/files. mem_info structure organization: Unused list (closed files); Normal list (running processes, files in use). Handling files: on close, deactivate all pages and move them to the Unused list; on open, move to the Normal list and rebuild its active list. Eviction policy: scan the Unused list first, then select from the Normal list in round-robin manner.
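The eviction order described above might look like this in miniature (a sketch only; the list contents stand in for the slide's mem_info objects, and the function is hypothetical):

```python
from collections import deque

def pick_victim(unused, normal):
    """Choose the next mem_info object to reclaim pages from.

    Policy from the slide: scan the Unused list (closed files) first;
    only then pick from the Normal list (running processes, open files)
    in round-robin order."""
    if unused:
        return unused[0]       # closed-file pages are reclaimed first
    if normal:
        victim = normal[0]
        normal.rotate(-1)      # advance the round-robin cursor
        return victim
    return None
```

Repeated calls with an empty Unused list cycle fairly through the Normal list, matching the slide's round-robin selection.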
Behind the WSS model? Stop-the-world, tracing collectors have two phases: mutator and collector. The mutator runs the app (allocation, references to existing objects); the collector traces pointers to find live objects. GC behavior dominates: there are no "infrequently" used pages, since every heap page is touched at least once per GC cycle. Working Set Size (WSS): the amount of memory needed so that the time spent on page faults is lower than a certain percentage of total execution time (5%).
GC gives more choices! Working set W(k, t): at time t, the set of all pages used in the k most recent references. Non-GCed application: as memory pressure rises, scan frequency rises, k shrinks, the WSS shrinks, more pages can be evicted, page faults rise, and running time rises. GCed application: a larger search space; by changing the heap size, we change the WSS, avoid page faults, and lessen the impact on running time. Hmm... a search problem! Search criterion, the Working Set Size: the amount of memory needed so that the time spent on page faults is lower than a certain percentage of total execution time (typical value: 5%).
