Ram Lakshmanan | architect: yCrash
‘16 artifacts’ to capture when there is a production problem
Open-source script to capture all 16 artifacts:
https://docs.ycrash.io/ycrash-features/ycrash-faq/only-capture-artifacts.html
1. GC Log
360-degree data
What is GC Log?
Sample GC log: https://tier1app.com/dist/sample/g1-repeatedGC.txt
How to capture GC Log?
Till Java 8: -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<file-path>
From Java 9: -Xlog:gc*:file=<file-path>
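A minimal sketch of picking the right GC logging flags for a given Java major version. The function name gc_log_flags and the example path are illustrative assumptions; the flags themselves are the ones listed above.

```python
def gc_log_flags(java_major, log_path):
    """Return the GC logging flags for the given Java major version."""
    if java_major <= 8:
        # Legacy flags, valid up to and including Java 8.
        return [
            "-XX:+PrintGCDetails",
            "-XX:+PrintGCDateStamps",
            "-Xloggc:" + log_path,
        ]
    # Unified logging syntax, available from Java 9 onward.
    return ["-Xlog:gc*:file=" + log_path]


# Hypothetical log path, used only for illustration.
print(" ".join(gc_log_flags(8, "/tmp/gc.log")))
print(" ".join(gc_log_flags(17, "/tmp/gc.log")))
```

The returned list can be appended to the java command line of the application being launched.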
Why is the GC log important?
https://blog.ycrash.io/2021/10/15/interesting-garbage-collection-patterns/
Fig: Healthy Saw-tooth GC pattern
Fig: Heavy caching GC Pattern
Fig: Acute Memory Leak Pattern
Fig: Consecutive Full GC pattern
Fig: Memory Leak GC pattern
How to analyze GC Log?
IBM GC & Memory visualizer: https://developer.ibm.com/javasdk/tools/
GC Viewer: https://github.com/chewiebug/GCViewer
GCeasy: https://gceasy.io/
Google Garbage cat (cms): https://code.google.com/archive/a/eclipselabs.org/p/garbagecat
HP Jmeter: https://h20392.www2.hpe.com/portal/swdepot/displayProductInfo.do?productNumber=HPJMETER
2. Thread Dump
What is Thread dump?
Sample thread dump: https://tier1app.com/dist/sample/threaddump_QC1-031214.txt
2019-12-26 17:13:23
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.7-b01 mixed mode):
"Reconnection-1" prio=10 tid=0x00007f0442e10800 nid=0x112a waiting on condition [0x00007f042f719000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x007b3953a98> (a java.util.concurrent.locks.AbstractQueuedSynchr)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.lang.Thread.run(Thread.java:722)
:
:
1 Timestamp at which thread dump was triggered (first line)
2 JVM version info (second line)
3 Thread details (one entry per thread; each entry has the parts described in the next section)
Anatomy of thread dump
"InvoiceThread-A996" prio=10 tid=0x00002b7cfc6fb000 nid=0x4479 runnable [0x00002b7d17ab8000]
java.lang.Thread.State: RUNNABLE
at com.buggycompany.rt.util.ItinerarySegmentProcessor.setConnectingFlight(ItinerarySegmentProcessor.java:380)
at com.buggycompany.rt.util.ItinerarySegmentProcessor.processTripType0(ItinerarySegmentProcessor.java:366)
at com.buggycompany.rt.util.ItinerarySegmentProcessor.processItineraryByTripType(ItinerarySegmentProcessor.java:254)
at com.buggycompany.rt.util.ItinerarySegmentProcessor.templateMethod(ItinerarySegmentProcessor.java:399)
at com.buggycompany.qc.gds.InvoiceGeneratedFacade.readTicketImage(InvoiceGeneratedFacade.java:252)
at com.buggycompany.qc.gds.InvoiceGeneratedFacade.doOrchestrate(InvoiceGeneratedFacade.java:151)
at com.buggycompany.framework.gdstask.BaseGDSFacade.orchestrate(BaseGDSFacade.java:32)
at com.buggycompany.framework.gdstask.BaseGDSFacade.doWork(BaseGDSFacade.java:22)
at com.buggycompany.framework.concurrent.BuggycompanyCallable.call(buggycompanyCallable.java:80)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
1 Thread Name - InvoiceThread-A996
2 Priority - prio=10. Can have values from 1 to 10.
3 Thread Id - 0x00002b7cfc6fb000 - Address of the JVM's internal thread structure, unique within the process. It is not the value returned by the Thread.getId() method.
4 Native Id - 0x4479 - This ID is highly platform dependent. On Linux, it is the ID of the OS-level thread (light-weight process), printed in hexadecimal. On Windows, it is the OS-level thread ID within the process. On Mac OS X, it is said to be the native pthread_t value.
5 Address space - 0x00002b7d17ab8000
6 Thread State - RUNNABLE
7 Stack trace - the "at ..." lines, most recent call first
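A small sketch of splitting a thread header line into the fields listed above. It assumes the older HotSpot header layout shown in this deck (name, prio, tid, nid, status, address); newer JDKs print extra fields, so the pattern would need adjusting. The names HEADER and parse_thread_header are illustrative.

```python
import re

# Matches the header layout used in the sample dump in this deck.
HEADER = re.compile(
    r'^"(?P<name>[^"]+)"\s+prio=(?P<prio>\d+)\s+tid=(?P<tid>0x[0-9a-f]+)'
    r'\s+nid=(?P<nid>0x[0-9a-f]+)\s+(?P<status>.*?)\s*\[(?P<addr>0x[0-9a-f]+)\]$'
)


def parse_thread_header(line):
    """Return the header fields as a dict, or None if the line does not match."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["prio"] = int(fields["prio"])
    return fields


SAMPLE_HEADER = (
    '"InvoiceThread-A996" prio=10 tid=0x00002b7cfc6fb000 '
    'nid=0x4479 runnable [0x00002b7d17ab8000]'
)
print(parse_thread_header(SAMPLE_HEADER))
```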
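6 Thread states
1. NEW
2. RUNNABLE
3. BLOCKED: waiting to acquire a lock held by another thread. Example: while Thread 1 (RUNNABLE) is inside the synchronized method below, Thread 2 calling the same method is BLOCKED.
public synchronized void getData() {
    makeDBCall();
}
4. WAITING: e.g. after calling wait();
5. TIMED_WAITING: e.g. during Thread.sleep(10);
6. TERMINATED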
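One quick check on a captured dump is to count how many threads are in each state; a large BLOCKED count, for instance, is worth a closer look. The sketch below assumes the "java.lang.Thread.State:" line format shown in the sample dump; the thread names and addresses in SAMPLE_DUMP are made up for illustration.

```python
import re
from collections import Counter

STATE_LINE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def count_thread_states(dump_text):
    """Count threads per state in the text of a thread dump."""
    return Counter(STATE_LINE.findall(dump_text))


# Hypothetical mini dump, not taken from a real application.
SAMPLE_DUMP = """\
"worker-1" prio=5 tid=0x01 nid=0x10 runnable [0x0a]
   java.lang.Thread.State: RUNNABLE
"worker-2" prio=5 tid=0x02 nid=0x11 waiting for monitor entry [0x0b]
   java.lang.Thread.State: BLOCKED (on object monitor)
"worker-3" prio=5 tid=0x03 nid=0x12 waiting for monitor entry [0x0c]
   java.lang.Thread.State: BLOCKED (on object monitor)
"timer" prio=5 tid=0x04 nid=0x13 waiting on condition [0x0d]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
"""
print(count_thread_states(SAMPLE_DUMP))
```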
How to analyze Thread dump?
IBM TMDA: https://www.ibm.com/support/pages/ibm-thread-and-monitor-dump-analyzer-java-tmda
yCrash: https://ycrash.io/
FastThread: https://fastthread.io/
Sample thread report: https://tinyurl.com/wq95weo
3. Heap Dump
What is Heap dump?
Sample heap dump: https://tier1app.com/dist/sample/small-hd.bin
How to analyze Heap dump?
jhat (oracle.com)
Eclipse MAT: https://www.eclipse.org/mat
HeapHero: https://heaphero.io/
yCrash: https://ycrash.io/
Sample heap report: https://tinyurl.com/5sxz7dsr
4. Heap Substitute
What is ‘heap substitute’ data?
GC.class_histogram:
 num     #instances         #bytes  class name
----------------------------------------------
   1:         82990    21331670776  [B
   2:         93138     7188159200  [I
   3:         10054      139610768  [Ljava.lang.Object;
   4:       1053236      127593568  [C
   5:       1052271       33672672  java.lang.String
   6:        825643       33025720  com.tier1app.heaphero.superpower.analysis.DupStringFinder$UniqueString
   7:        252122       14118832  com.tier1app.heaphero.superpower.dump.objects.JPrimitiveArray
   8:        281097       13492656  com.tier1app.heaphero.superpower.dump.objects.JObject
   9:        250384       10015360  com.tier1app.heaphero.superpower.analysis.DupPrimitiveArrayFinder$UniqueArray
  10:        262144        8388608  org.apache.logging.log4j.core.async.AsyncLoggerConfigDisruptor$Log4jEventWrapper
:
:
Total       5765291    28986068816
VM.system_properties:
#Tue Feb 22 13:46:41 UTC 2022
app=yc
java.runtime.name=Java(TM) SE Runtime Environment
sun.boot.library.path=/home/ec2-user/java8/jdk1.8.0_131/jre/lib/amd
java.vm.version=25.131-b11
java.vm.vendor=Oracle Corporation
java.vendor.url=http://java.oracle.com/
path.separator=:
java.vm.name=Java HotSpot(TM) 64-Bit Server VM
file.encoding.pkg=sun.io
user.country=US
sun.java.launcher=SUN_STANDARD
:
GC.heap_info:
VM.flags:
-XX:CICompilerCount=12 -XX:InitialHeapSize=2147483648 -XX:MaxHeapSize=118111600640 -XX:MaxNewSize=39370358784 -XX:MinHeapDeltaBytes=524288 -XX:NewSize=715653120 -XX:OldSize=1431830528 -XX:+UseParallelGC
How to capture Heap Substitute?
jcmd <pid> GC.class_histogram
jcmd <pid> VM.system_properties
jcmd <pid> GC.heap_info
jcmd <pid> VM.flags
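Once the class histogram is captured, the rows can be ranked by bytes to see which classes dominate the heap. The sketch below parses rows in the "num: #instances #bytes class name" layout of the GC.class_histogram sample shown earlier; the function names are illustrative, and SAMPLE_HISTOGRAM copies the first five rows of that sample.

```python
def parse_histogram(text):
    """Return (class_name, instances, bytes) tuples from class histogram text."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "1: 82990 21331670776 [B"; skip headers and totals.
        if len(parts) >= 4 and parts[0].endswith(":") and parts[0][:-1].isdigit():
            rows.append((parts[3], int(parts[1]), int(parts[2])))
    return rows


def top_by_bytes(rows, n=3):
    """Return the n rows that occupy the most bytes."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


SAMPLE_HISTOGRAM = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:         82990    21331670776  [B
   2:         93138     7188159200  [I
   3:         10054      139610768  [Ljava.lang.Object;
   4:       1053236      127593568  [C
   5:       1052271       33672672  java.lang.String
"""
for name, instances, size in top_by_bytes(parse_histogram(SAMPLE_HISTOGRAM)):
    print(name, instances, size)
```

In the sample, byte arrays ([B) account for most of the reported bytes, which is the kind of lead a histogram can give when a full heap dump is too large to capture.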
5. top
What is ‘top’ data?
How to capture ‘top’ data?
Command: top
Important sections in ‘top’
1. Processes CPU & Memory
2. Load Average
3. CPU utilization
Real problem in trading app
Load Avg of an ec2 instance
Load Avg of another ec2 instance
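A rough sketch of reading the load average from the first line of ‘top’ output and comparing the 1-minute value with the number of CPUs. The sample line is hypothetical, and "load above CPU count" is only a common rule of thumb, not a hard threshold.

```python
import re

LOAD_AVG = re.compile(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)")


def parse_load_average(line):
    """Return the (1 min, 5 min, 15 min) load averages, or None if absent."""
    match = LOAD_AVG.search(line)
    if match is None:
        return None
    return tuple(float(value) for value in match.groups())


def is_overloaded(line, cpu_count):
    """True when the 1-minute load average exceeds the CPU count."""
    loads = parse_load_average(line)
    return loads is not None and loads[0] > cpu_count


# Hypothetical first line of 'top' output.
SAMPLE_TOP = "top - 13:46:41 up 10 days,  3:02,  1 user,  load average: 12.50, 8.20, 4.10"
print(parse_load_average(SAMPLE_TOP))
print(is_overloaded(SAMPLE_TOP, cpu_count=4))
```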
6. ps
What is ‘ps’ data?
How to capture ‘ps’ data?
Command: ps -ef
‘ps’ ‘top’
7. top -H
What is ‘top -H’ data?
How to capture ‘top -H’ data?
Command: top -H -p {PID}
Best strategy to troubleshoot ‘CPU spikes’: combine ‘top -H’ with a ‘thread dump’
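On Linux, ‘top -H’ lists thread IDs in decimal, while the thread dump prints the same ID as a hexadecimal nid. A minimal sketch of the conversion and lookup follows; the function names are illustrative, and the header line is the InvoiceThread sample from the thread dump section.

```python
def top_pid_to_nid(pid):
    """Convert a decimal thread ID from 'top -H' to the dump's hex nid form."""
    return hex(pid)


def find_thread_header(dump_text, pid):
    """Return the thread header line whose nid matches the given thread ID."""
    # The trailing space avoids matching a longer nid that shares the prefix.
    marker = "nid=" + top_pid_to_nid(pid) + " "
    for line in dump_text.splitlines():
        if marker in line:
            return line
    return None


DUMP_LINE = (
    '"InvoiceThread-A996" prio=10 tid=0x00002b7cfc6fb000 '
    'nid=0x4479 runnable [0x00002b7d17ab8000]'
)
print(top_pid_to_nid(17529))
print(find_thread_header(DUMP_LINE, 17529))
```

With the matching header found, the stack trace under it shows the code the CPU-consuming thread was executing.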
8. Disk Usage
What is ‘disk usage’ data?
How to capture ‘disk usage’ data?
Command : df -h
Analyze ‘disk’ usage data
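A sketch of flagging nearly full filesystems from ‘df -h’ output. It assumes the usual six-column Linux layout (Filesystem, Size, Used, Avail, Use%, Mounted on); the sample output and the 90% threshold are hypothetical.

```python
def nearly_full(df_output, threshold=90):
    """Return (mount_point, used_percent) for filesystems at or above threshold."""
    flagged = []
    # Skip the header row, then read the Use% and mount point columns.
    for line in df_output.strip().splitlines()[1:]:
        parts = line.split()
        if len(parts) < 6 or not parts[4].endswith("%"):
            continue
        used_percent = int(parts[4].rstrip("%"))
        if used_percent >= threshold:
            flagged.append((" ".join(parts[5:]), used_percent))
    return flagged


# Hypothetical 'df -h' output.
SAMPLE_DF = """\
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       30G   28G  2.0G  94% /
tmpfs           3.9G     0  3.9G   0% /dev/shm
"""
print(nearly_full(SAMPLE_DF))
```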
9. dmesg
What is ‘dmesg’ data?
How to capture ‘dmesg’ data?
Command : dmesg -T
Real world problems
HR Cloud app: CPU spike, puppet password expiration
OutOfMemoryError: https://answers.ycrash.io/question/jvm-restarts?q=526
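When a JVM disappears without leaving a Java-level error, the kernel log may show that the Linux OOM killer ended the process. The sketch below scans ‘dmesg -T’ output for such entries. The message wording varies across kernel versions, so the pattern covers only the "Out of memory: Kill process" and "Out of memory: Killed process" forms, and the sample lines are hypothetical.

```python
import re

OOM_KILL = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(dmesg_output):
    """Return (pid, process_name) for each OOM-killer entry found."""
    return [(int(pid), name) for pid, name in OOM_KILL.findall(dmesg_output)]


# Hypothetical 'dmesg -T' excerpt.
SAMPLE_DMESG = """\
[Tue Feb 22 13:40:01 2022] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[Tue Feb 22 13:40:01 2022] Out of memory: Killed process 5013 (java) total-vm:8123456kB, anon-rss:7000000kB
"""
print(find_oom_kills(SAMPLE_DMESG))
```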
10. netstat
What is ‘netstat’ data?
How to capture ‘netstat’ data?
Command : netstat -an
Analyze ‘netstat’ data
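One way to read ‘netstat -an’ output is to count TCP connections per state; a growing number of CLOSE_WAIT or TIME_WAIT entries, for example, can point to connections that are not being closed properly. The sketch assumes the Linux column layout with the state in the sixth column, and the sample output is hypothetical.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in 'netstat -an' output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # TCP rows: Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State.
        if len(parts) >= 6 and parts[0].startswith("tcp"):
            counts[parts[5]] += 1
    return counts


# Hypothetical 'netstat -an' excerpt.
SAMPLE_NETSTAT = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN
tcp        0      0 172.31.4.96:8080        10.0.0.7:51234          ESTABLISHED
tcp        0      0 172.31.4.96:8080        10.0.0.8:51300          CLOSE_WAIT
tcp        0      0 172.31.4.96:8080        10.0.0.9:51301          CLOSE_WAIT
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""
print(count_tcp_states(SAMPLE_NETSTAT))
```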
11. ping
What is ‘ping’ data?
How to capture ‘ping’ data?
Command : ping <host>
Analyze ‘ping’ data
12. vmstat
What is ‘vmstat’ data?
How to capture ‘vmstat’ data?
Command : vmstat <interval> <count>
Analyze ‘vmstat’ data
13. iostat
14. Kernel Params
What are ‘kernel params’?
How to capture ‘kernel params’?
Command : sysctl -a
Analyze ‘kernel params’
15. App Logs
Application log
Capture last 1000 lines
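A minimal sketch of keeping only the last 1000 lines of an application log without loading the whole file into memory. The function names and the log path passed to tail_file are assumptions for illustration.

```python
from collections import deque


def last_lines(lines, n=1000):
    """Return the last n items of any iterable of lines."""
    # A bounded deque discards older lines as newer ones arrive.
    return list(deque(lines, maxlen=n))


def tail_file(path, n=1000):
    """Return the last n lines of the file at the given (hypothetical) path."""
    with open(path, "r", encoding="utf-8", errors="replace") as log_file:
        return last_lines(log_file, n)


demo = last_lines(("line %d" % i for i in range(1500)), n=1000)
print(len(demo), demo[0], demo[-1])
```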
16. metadata
What is metadata?
hostName=ip-172-31-4-96.us-west-1.compute.internal
processId=5013
appName=aps
whoami=ec2-user
javaVersion=java version "1.8.0_131", Java(TM) SE Runtime Environment (build 1.8.0_131-b11), Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode),
osVersion=Linux ip-172-31-4-96.us-west-1.compute.internal 4.14.243-185.433.amzn2.x86_64 #1 SMP Mon Aug 9 05:55:52 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux,
tags=manualCapture, release 2.01
Thank You my Friends!
Ram Lakshmanan
ram@tier1app.com
@tier1app
linkedin.com/company/ycrash
This deck will be published in: https://blog.ycrash.io
Script to capture: https://docs.ycrash.io/ycrash-features/ycrash-faq/only-capture-artifacts.html

16 artifacts to capture when there is a production problem

  • 1. Ram Lakshmanan | architect: yCrash ‘16 artifacts’ to capture when there is a production problem
  • 4. What is GC Log? https://tier1app.com/dist/sample/g1-repeatedGC.txt
  • 5. How to capture GC Log? -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<file-path> Till Java 8 -Xlog:gc*:file=<file-path> From Java 9
  • 6. Why GC log is important? https://blog.ycrash.io/2021/10/15/interesting-garbage-collection-patterns/
  • 8. Fig: Heavy caching GC Pattern
  • 9. Fig: Acute Memory Leak Pattern
  • 10. Fig: Consecutive Full GC pattern
  • 11. Fig: Memory Leak GC pattern
  • 12. How to analyze GC Log? https://developer.ibm.com/javas dk/tools/ IBM GC & Memory visualizer GC Viewer https://github.com/chewiebug/G CViewer GCeasy https://gceasy.io/ Google Garbage cat (cms) https://code.google.com/archiv e/a/eclipselabs.org/p/garbagec at HP Jmeter https://h20392.www2.hpe.com/ portal/swdepot/displayProductI nfo.do?productNumber=HPJM ETER 03 02 01 05 04
  • 13. 1. GC Log 2. Thread Dump 360-degree data
  • 14. What is Thread dump? https://tier1app.com/dist/sample/threaddump_QC1-031214.txt
  • 15. 2019-12-26 17:13:23 Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.7-b01 mixed mode): "Reconnection-1" prio=10 tid=0x00007f0442e10800 nid=0x112a waiting on condition [0x00007f042f719000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x007b3953a98> (a java.util.concurrent.locks.AbstractQueuedSynchr) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.lang.Thread.run(Thread.java:722) : : 1 2 3 1 Timestamp at which thread dump was triggered 2 JVM Version info 3 Thread Details - <<details in following slides>> Anatomy of thread dump "InvoiceThread-A996" prio=10 tid=0x00002b7cfc6fb000 nid=0x4479 runnable [0x00002b7d17ab8000] java.lang.Thread.State: RUNNABLE at com.buggycompany.rt.util.ItinerarySegmentProcessor.setConnectingFlight(ItinerarySegmentProcessor.java:380) at com.buggycompany.rt.util.ItinerarySegmentProcessor.processTripType0(ItinerarySegmentProcessor.java:366) at com.buggycompany.rt.util.ItinerarySegmentProcessor.processItineraryByTripType(ItinerarySegmentProcessor.java:254) at com.buggycompany.rt.util.ItinerarySegmentProcessor.templateMethod(ItinerarySegmentProcessor.java:399) at com.buggycompany.qc.gds.InvoiceGeneratedFacade.readTicketImage(InvoiceGeneratedFacade.java:252) at com.buggycompany.qc.gds.InvoiceGeneratedFacade.doOrchestrate(InvoiceGeneratedFacade.java:151) at com.buggycompany.framework.gdstask.BaseGDSFacade.orchestrate(BaseGDSFacade.java:32) at com.buggycompany.framework.gdstask.BaseGDSFacade.doWork(BaseGDSFacade.java:22) at com.buggycompany.framework.concurrent.BuggycompanyCallable.call(buggycompanyCallable.java:80) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:722)
  • 16. "InvoiceThread-A996" prio=10 tid=0x00002b7cfc6fb000 nid=0x4479 runnable [0x00002b7d17ab8000] java.lang.Thread.State: RUNNABLE at com.buggycompany.rt.util.ItinerarySegmentProcessor.setConnectingFlight(ItinerarySegmentProcessor.java:380) at com.buggycompany.rt.util.ItinerarySegmentProcessor.processTripType0(ItinerarySegmentProcessor.java:366) at com.buggycompany.rt.util.ItinerarySegmentProcessor.processItineraryByTripType(ItinerarySegmentProcessor.java:254) at com.buggycompany.rt.util.ItinerarySegmentProcessor.templateMethod(ItinerarySegmentProcessor.java:399) at com.buggycompany.qc.gds.InvoiceGeneratedFacade.readTicketImage(InvoiceGeneratedFacade.java:252) at com.buggycompany.qc.gds.InvoiceGeneratedFacade.doOrchestrate(InvoiceGeneratedFacade.java:151) at com.buggycompany.framework.gdstask.BaseGDSFacade.orchestrate(BaseGDSFacade.java:32) at com.buggycompany.framework.gdstask.BaseGDSFacade.doWork(BaseGDSFacade.java:22) at com.buggycompany.framework.concurrent.BuggycompanyCallable.call(buggycompanyCallable.java:80) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:722) 1 2 3 4 5 6 7 1 Thread Name - InvoiceThread-A996 2 Priority - Can have values from 1 to 10 3 Thread Id - 0x00002b7cfc6fb000 – Unique ID assigned by JVM. It's returned by calling the Thread.getId() method. 4 Native Id - 0x4479 - This ID is highly platform dependent. On Linux, it's the pid of the thread. On Windows, it's simply the OS-level thread ID within a process. On Mac OS X, it is said to be the native pthread_t value. 5 Address space - 0x00002b7d17ab8000 - 6 Thread State - RUNNABLE 7 Stack trace -
  • 17. 6 Thread states RUNNABLE TERMINATED NEW TIMED_WAITING Thread.sleep(10); WAITING 03 02 01 06 05 public void synchronized getData() { makeDBCall(); } BLOCKED 04 Thread 1: Runnable Thread 2: BLOCKED wait(); Thread 1: Runnable
  • 18. How to analyze Thread dump? https://www.ibm.com/support/pages/ibm-thread-and- monitor-dump-analyzer-java-tmda IBM TDMA yCrash https://ycrash.io/ FastThread https://fastthread.io/ 03 02 01 https://tinyurl.com/wq95weo Sample thread report
  • 19. 1. GC Log 2. Thread Dump 3. Heap Dump 360-degree data
  • 20. What is Heap dump? https://tier1app.com/dist/sample/small-hd.bin
  • 21. How to analyze Heap dump? jhat (oracle.com) Jhat Eclipse MAT https://www.eclipse.org/mat HeapHero https://heaphero.io/ 03 02 01 https://tinyurl.com/5sxz7dsr Sample heap report yCrash https://ycrash.io/ 04
  • 22. 1. GC Log 2. Thread Dump 3. Heap Dump 4. Heap Substitute 360-degree data
  • 23. What is ‘heap substitute’ data? GC.class_histogram: num #instances #bytes class name ---------------------------------------------- 1: 82990 21331670776 [B 2: 93138 7188159200 [I 3: 10054 139610768 [Ljava.lang.Object; 4: 1053236 127593568 [C 5: 1052271 33672672 java.lang.String 6: 825643 33025720 com.tier1app.heaphero.superpower.analysis.DupStringFinder$UniqueStrin g 7: 252122 14118832 com.tier1app.heaphero.superpower.dump.objects.JPrimitiveArray 8: 281097 13492656 com.tier1app.heaphero.superpower.dump.objects.JObject 9: 250384 10015360 com.tier1app.heaphero.superpower.analysis.DupPrimitiveArrayFinder$Uni queArray 10: 262144 8388608 org.apache.logging.log4j.core.async.AsyncLoggerConfigDisruptor$Log4jEv entWrapper : : Total 5765291 28986068816 VM.system_properties: #Tue Feb 22 13:46:41 UTC 2022 app=yc java.runtime.name=Java(TM) SE Runtime Environment sun.boot.library.path=/home/ec2-user/java8/jdk1.8.0_131/jre/lib/amd java.vm.version=25.131-b11 java.vm.vendor=Oracle Corporation java.vendor.url=http://java.oracle.com/ path.separator=: java.vm.name=Java HotSpot(TM) 64-Bit Server VM file.encoding.pkg=sun.io user.country=US sun.java.launcher=SUN_STANDARD : GC.heap_info: VM.flags: -XX:CICompilerCount=12 -XX:InitialHeapSize=2147483648 - XX:MaxHeapSize=118111600640 -XX:MaxNewSize=39370358784 - XX:MinHeapDeltaBytes=524288 -XX:NewSize=715653120 - XX:OldSize=1431830528 -XX:+UseParallelGC
  • 24. How to capture Heap Substitute? jcmd GC.class_histogram jcmd VM.system_properties jcmd GC.heap_info
  • 25. 1. GC Log 2. Thread Dump 3. Heap Dump 360-degree data 5. top 4. Heap Substitute
  • 27. How to capture ‘top’ data? Command: top
  • 28. Important sections in ‘top’ 1. Processes CPU & Memory 2. Load Average 3. CPU utilization
  • 29. Real problem in trading app Load Avg of an ec2 instance Load Avg of another ec2 instance
  • 30. 1. GC Log 2. Thread Dump 3. Heap Dump 360-degree data 6. ps 5. top 4. Heap Substitute
  • 32. How to capture ‘ps’ data? Command: ps -ef
  • 34. 1. GC Log 2. Thread Dump 3. Heap Dump 360-degree data 6. ps 5. top 4. Heap Substitute 7. top -H
  • 35. What is ‘top -H’ data?
  • 36. How to capture ‘top -H’ data? Command : top –H –p {PID} Best strategy to troubleshoot ‘‘CPU spikes“
  • 37. ‘top –H ’ ‘thread dump’
  • 38. 1. GC Log 2. Thread Dump 3. Heap Dump 360-degree data 6. ps 8. Disk Usage 5. top 4. Heap Substitute 7. top -H
  • 39. What is ‘disk usage’ data?
  • 40. How to capture ‘disk usage’ data? Command : df -h
  • 42. 1. GC Log 2. Thread Dump 9. dmesg 3. Heap Dump 360-degree data 6. ps 8. Disk Usage 5. top 4. Heap Substitute 7. top -H
  • 44. How to capture ‘dmesg’ data? Command: dmesg -T
  • 45. Real-world problems: HR Cloud app: CPU spike, Puppet password expiration. OutOfMemoryError: https://answers.ycrash.io/question/jvm-restarts?q=526
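When a JVM restarts without leaving an error in the application log, one thing to check in dmesg is whether the kernel's OOM killer ended the process. The filter below is a sketch. The search patterns are assumptions based on commonly seen kernel messages, and the exact wording varies by kernel version. The helper name `oom_events` is hypothetical.

```shell
#!/bin/sh
# Filter kernel log lines on stdin that look like OOM-killer activity.
# The patterns are assumptions; adjust them to your kernel's wording.
oom_events() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# Example on a live host (may require root on some systems):
#   dmesg -T | oom_events
```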
  • 46. 360-degree data: 1. GC Log, 2. Thread Dump, 3. Heap Dump, 4. Heap Substitute, 5. top, 6. ps, 7. top -H, 8. Disk Usage, 9. dmesg, 10. netstat
  • 48. How to capture ‘netstat’ data? Command: netstat -an
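Raw `netstat -an` output can run to thousands of lines, so a per-state count is a useful first summary. This sketch assumes the Linux output layout, where the TCP state is the last field of each `tcp` line. The helper name `connections_by_state` is hypothetical.

```shell
#!/bin/sh
# Count TCP connections per state from 'netstat -an' output on stdin.
# Assumes Linux net-tools layout: the state is the last field of tcp lines.
connections_by_state() {
    awk '$1 ~ /^tcp/ { count[$NF]++ }
         END { for (state in count) print count[state], state }' | sort -rn
}

# Example on a live host:
#   netstat -an | connections_by_state
```

A large number of connections in CLOSE_WAIT or TIME_WAIT is a common sign of connections that are not being closed or reused properly.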
  • 50. 360-degree data: 1. GC Log, 2. Thread Dump, 3. Heap Dump, 4. Heap Substitute, 5. top, 6. ps, 7. top -H, 8. Disk Usage, 9. dmesg, 10. netstat, 11. ping
  • 52. How to capture ‘ping’ data? Command: ping <host>
  • 54. 360-degree data: 1. GC Log, 2. Thread Dump, 3. Heap Dump, 4. Heap Substitute, 5. top, 6. ps, 7. top -H, 8. Disk Usage, 9. dmesg, 10. netstat, 11. ping, 12. vmstat
  • 56. How to capture ‘vmstat’ data? Command: vmstat <interval> <count>
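One signal worth pulling from vmstat output is swapping, since it often explains sudden latency on a JVM host. The sketch below prints only the samples where the swap-in (`si`) or swap-out (`so`) column is non-zero. It finds the columns by name from the header row, assuming the Linux procps layout. The helper name `swapping_samples` is hypothetical.

```shell
#!/bin/sh
# Print vmstat samples (read from stdin) in which swap-in or swap-out
# activity is non-zero. Column positions are read from the header row.
swapping_samples() {
    awk '$1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
         ("si" in col) && $1 ~ /^[0-9]+$/ &&
         ($(col["si"]) + 0 > 0 || $(col["so"]) + 0 > 0)'
}

# Example on a live host: five samples, one second apart.
#   vmstat 1 5 | swapping_samples
```

The first sample vmstat prints reports averages since boot, so later samples are the ones that reflect current behaviour.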
  • 58. 360-degree data: 1. GC Log, 2. Thread Dump, 3. Heap Dump, 4. Heap Substitute, 5. top, 6. ps, 7. top -H, 8. Disk Usage, 9. dmesg, 10. netstat, 11. ping, 12. vmstat, 13. iostat, 14. Kernel Params
  • 60. How to capture ‘kernel params’? Command: sysctl -a
  • 62. 360-degree data: 1. GC Log, 2. Thread Dump, 3. Heap Dump, 4. Heap Substitute, 5. top, 6. ps, 7. top -H, 8. Disk Usage, 9. dmesg, 10. netstat, 11. ping, 12. vmstat, 13. iostat, 14. Kernel Params, 15. App Logs
  • 64. 360-degree data: 1. GC Log, 2. Thread Dump, 3. Heap Dump, 4. Heap Substitute, 5. top, 6. ps, 7. top -H, 8. Disk Usage, 9. dmesg, 10. netstat, 11. ping, 12. vmstat, 13. iostat, 14. Kernel Params, 15. App Logs, 16. metadata
  • 65. What is metadata?
      hostName=ip-172-31-4-96.us-west-1.compute.internal
      processId=5013
      appName=aps
      whoami=ec2-user
      javaVersion=java version "1.8.0_131", Java(TM) SE Runtime Environment (build 1.8.0_131-b11), Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode),
      osVersion=Linux ip-172-31-4-96.us-west-1.compute.internal 4.14.243-185.433.amzn2.x86_64 #1 SMP Mon Aug 9 05:55:52 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux,
      tags=manualCapture, release 2.01
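A metadata record like the one on this slide can be assembled from standard commands. The sketch below covers only the host-level fields. It is not the yCrash script, and the function name `collect_metadata` is hypothetical. The key names are copied from the slide.

```shell
#!/bin/sh
# Print a key=value metadata record for a capture.
# Hypothetical helper; key names follow the slide, values come from POSIX tools.
collect_metadata() {
    pid="$1"        # process id of the application being diagnosed
    app_name="$2"   # free-form application name
    echo "hostName=$(uname -n)"
    echo "processId=$pid"
    echo "appName=$app_name"
    echo "whoami=$(id -un)"
    echo "osVersion=$(uname -a)"
}

# Example:
#   collect_metadata 5013 aps > metadata.txt
```

Storing this record with the other artifacts makes it possible to tell later which host, process and user a capture came from.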
  • 66. Thank you, my friends! Ram Lakshmanan, @tier1app, linkedin.com/company/ycrash. This deck will be published at: https://blog.ycrash.io Script to capture: https://docs.ycrash.io/ycrash-features/ycrash-faq/only-capture-artifacts.html

Editor's Notes

  • #25: https://ycrash.io/yc-report-trimmed-heap.jsp?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T13-58-24
  • #30: High Load Average: https://ycrash.io/yc-report-top.jsp?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T14-19-35 Low Load Average: https://ycrash.io/yc-report-top.jsp?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T14-24-13
  • #34: https://ycrash.io/yc-report-top.jsp?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T13-58-24
  • #38: https://ycrash.io/yc-load-report-ft?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T13-58-24
  • #42: https://ycrash.io/yc-load-report-ft?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T13-58-24
  • #50: https://ycrash.io/yc-report-netStat.jsp?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T14-19-35
  • #54: https://ycrash.io/yc-report-netStat.jsp?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T13-58-24
  • #58: https://ycrash.io/yc-report-vmstat.jsp?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T13-58-24
  • #62: https://ycrash.io/yc-report-vmstat.jsp?ou=ram-tier1app-com&de=host&app=yc&ts=2022-02-22T13-58-24