RAC: Administering Parallel Execution
By Riyaj Shamsudeen
©OraInternals Riyaj Shamsudeen
Who am I?
• 19 years using Oracle products/DBA
• OakTable member
• Oracle ACE Director
• Certified DBA, versions 7.0, 7.3, 8, 8i, 9i and 10g
• Specializes in RAC, performance tuning, internals and E-Business Suite
• Chief DBA with OraInternals
• Co-author of "Expert Oracle Practices" (2009)
• Co-author of "Pro Oracle SQL" (2010)
• Email: rshamsud@orainternals.com
• Blog: orainternals.wordpress.com
• URL: www.orainternals.com
Parallel Execution
• Parallel query uses PX slave processes to perform work.
• A slave set performs a specific task at a given point in the execution plan. After completing that task, the slave set can be reassigned to perform a different task.
• Proper configuration of the private interconnect and an optimal execution plan are essential to scaling a PX operation.
• Placement of PX slaves can be controlled by services or by the parallel_instance_group configuration.
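While a parallel statement is running, the slave sets and the instances they landed on can be observed from the PX session view. The query below is a minimal sketch, assuming the session has access to gv$px_session and that rows with a null server_set belong to the query coordinator itself.

```sql
-- Count PX slaves per query coordinator, slave set and instance.
SELECT qcsid,
       server_set,
       inst_id,
       COUNT(*) AS slaves
FROM   gv$px_session
WHERE  server_set IS NOT NULL      -- assumption: skips the coordinator's own row
GROUP  BY qcsid, server_set, inst_id
ORDER  BY qcsid, server_set, inst_id;
```

If slaves for one qcsid show up under more than one inst_id, the operation is running across instances and its slave-to-slave traffic crosses the interconnect.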
Example 
create index mut_t1 on mut (transaction_id) 
parallel (degree 8) nologging; 
---------------------------------------------------------------------------------------------------------------------- 
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib | 
---------------------------------------------------------------------------------------------------------------------- 
| 0 | CREATE INDEX STATEMENT | | | | 117K(100)| | | | | 
| 1 | PX COORDINATOR | | | | | | | | | 
| 2 | PX SEND QC (ORDER) | :TQ10001 | 13M| 166M| | | Q1,01 | P->S | QC (ORDER) | 
| 3 | INDEX BUILD NON UNIQUE| MUT_T1 | | | | | Q1,01 | PCWP | | 
| 4 | SORT CREATE INDEX | | 13M| 166M| | | Q1,01 | PCWP | | 
| 5 | PX RECEIVE | | 13M| 166M| 116K (1)| 00:05:50 | Q1,01 | PCWP | | 
| 6 | PX SEND RANGE | :TQ10000 | 13M| 166M| 116K (1)| 00:05:50 | Q1,00 | P->P | RANGE | 
| 7 | PX BLOCK ITERATOR | | 13M| 166M| 116K (1)| 00:05:50 | Q1,00 | PCWC | | 
|* 8 | TABLE ACCESS FULL| MUT | 13M| 166M| 116K (1)| 00:05:50 | Q1,00 | PCWP | | 
---------------------------------------------------------------------------------------------------------------------- 
One set of server processes reads the data (table queue Q1,00).
Another set of server processes sorts the rows and creates the index segments (table queue Q1,01).
Rows move between these two slave sets by P->P transfer.
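A plan of this shape can be produced without actually building the index. This is a sketch, assuming a plan table is available to the session; the TQ, IN-OUT and PQ Distrib columns only appear when the statement is planned in parallel.

```sql
-- Explain the parallel index build from the slide instead of executing it.
EXPLAIN PLAN FOR
CREATE INDEX mut_t1 ON mut (transaction_id)
PARALLEL (DEGREE 8) NOLOGGING;

-- Display the plan just explained, including the parallel columns.
SELECT * FROM TABLE(dbms_xplan.display);
```

Each distinct value in the TQ column marks a table queue; steps that share a TQ value are executed by the same slave set.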
PX: intra vs inter-instance 
• Intra-instance PX operation: all slaves are allocated in the current instance.
• Inter-instance PX operation: slaves are allocated from more than one instance.
• In an intra-instance PX operation, slaves communicate by passing buffers to each other and do not use the interconnect.
• In an inter-instance PX operation, slaves use the interconnect to exchange buffers and messages between the slave sets and with the coordinator process.
Architecture
The distribution method of rows between producers and consumers is key to avoiding a flooded interconnect.
[Diagram: the QC, with producer slaves P1 … P8 and consumer slaves P9 … P16 on each of Inst 1 and Inst 2; rows flow from producers to consumers by P->P transfer.]
Controlling placement 
• Allocation of PX slaves can be controlled using a few techniques:
1) Services
2) Instance groups/parallel_instance_group
3) Combination of services and parallel_instance_group
• An instance_groups based setup is widely used in database versions 10g and below. By default, all instances can participate in a PX operation.
• From 11g onwards, you should control placement of slaves using services.
• Placement control using services is much more elegant and dynamic: when a service fails over, PX placement follows it automatically.
Placement: Two instances 
• Sessions starting PX operations in inst1 can allocate slaves in both inst1 and inst2.
inst1.instance_groups='inst1','inst12'
inst2.instance_groups='inst2','inst12'
inst3.instance_groups='inst3'
inst1.parallel_instance_group='inst12'
[Diagram: the QC runs on inst1; PX slaves (P1 …, P9 …) are allocated on inst1 and inst2, none on inst3.]
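The parameter settings on this slide could be applied with statements along these lines. This is a sketch, assuming an spfile is in use and that the instance SIDs are literally inst1, inst2 and inst3; instance_groups is a static parameter, so it only takes effect after the instance is restarted.

```sql
-- Static parameter: written to the spfile, effective after restart.
ALTER SYSTEM SET instance_groups = 'inst1', 'inst12' SCOPE = SPFILE SID = 'inst1';
ALTER SYSTEM SET instance_groups = 'inst2', 'inst12' SCOPE = SPFILE SID = 'inst2';
ALTER SYSTEM SET instance_groups = 'inst3'           SCOPE = SPFILE SID = 'inst3';

-- Dynamic parameter: sessions on inst1 now draw slaves from group inst12.
ALTER SYSTEM SET parallel_instance_group = 'inst12' SCOPE = BOTH SID = 'inst1';
```

The same parallel_instance_group value can also be set for one session with ALTER SESSION when only a single job needs the wider placement.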
Placement: Services: two instances 
• Service FIN is located in two instances.
• Slaves are allocated from inst1 and inst2.
srvctl add service -d solrac -s FIN -r inst1,inst2 -a inst3
[Diagram: service FIN runs on inst1 and inst2; the QC and PX slaves (P1 …, P9 …) are allocated on those two instances, none on inst3.]
Demo: pq_query_nopart.sql with po service
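Before relying on a service for placement, it helps to confirm where the service is actually running. A minimal check, assuming the FIN service from the slide and access to gv$active_services:

```sql
-- List the instances on which service FIN is currently active.
SELECT inst_id,
       name
FROM   gv$active_services
WHERE  UPPER(name) = 'FIN'
ORDER  BY inst_id;
```

After a failover to the available instance (inst3 in the srvctl example), the same query would be expected to list inst3, which is how PX placement follows the service.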
Placement: Combination: PIG 
• A service name can also be assigned to parallel_instance_group to control the placement.
• Slaves will be allocated from two instances: inst1 and inst2.
srvctl add service -d solrac -s FIN -r inst1,inst2,inst3
srvctl add service -d solrac -s FIN2 -r inst1,inst2
alter session set parallel_instance_group='FIN2';
[Diagram: service FIN runs on inst1, inst2 and inst3; service FIN2 runs on inst1 and inst2. With parallel_instance_group='FIN2', the QC and PX slaves are allocated on inst1 and inst2 only.]
Recommendations 
• Both the instance_groups/parallel_instance_group method and the services method can be used concurrently in a database.
• That might complicate debugging. Use services from 11g onwards; use instance_groups for 10g and below.
• For more granular control, set parallel_instance_group to point to a more appropriate service, as an exception.
Measuring PQ traffic
• Until 11g, measuring PQ traffic was not easy.
• In 10g, if AWR does not report the PQ interconnect traffic but the device statistics report high interconnect traffic, the application might be generating a high PQ load.
• From 11g onwards, IPQ statistics are visible at the client level in x$ksxpclient.
• AWR snapshots store these statistics in dba_hist_ic_client_stats, and the AWR report prints the PQ traffic in a more readable format.
AWR report 
• The AWR report in 11g prints the statistics as a per-second rate.
Interconnect Throughput by Client DB/Inst: SOLRAC/solrac1 Snaps: 1012-1013
-> Throughput of interconnect usage by major consumers
-> All throughput numbers are megabytes per second
                        Send     Receive
Used By           Mbytes/sec  Mbytes/sec
---------------- ----------- -----------
Global Cache             .02         .04
Parallel Query           .36         .15
DB Locks                 .01         .01
DB Streams               .00         .00
Other                    .00         .00
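The same figures can be pulled straight from the AWR history table named earlier. The query below is a sketch under two assumptions that should be checked on your release: that the parallel query client is recorded under the name 'ipq', and that bytes_sent and bytes_received are cumulative counters, so the difference between consecutive snapshots gives the traffic for that interval.

```sql
-- Parallel query interconnect traffic per AWR snapshot interval, in MB.
SELECT snap_id,
       instance_number,
       ROUND((bytes_sent
              - LAG(bytes_sent) OVER (PARTITION BY dbid, instance_number
                                      ORDER BY snap_id)) / 1024 / 1024, 2) AS mb_sent,
       ROUND((bytes_received
              - LAG(bytes_received) OVER (PARTITION BY dbid, instance_number
                                          ORDER BY snap_id)) / 1024 / 1024, 2) AS mb_received
FROM   dba_hist_ic_client_stats
WHERE  name = 'ipq'                -- assumption: client name used for PQ traffic
ORDER  BY instance_number, snap_id;
```

The first snapshot of each instance shows NULL, and a negative value would indicate an instance restart between two snapshots. Dividing by the snapshot interval in seconds gives a rate comparable to the report above.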
PQ and cache fusion 
• Parallel query slave processes read blocks directly into their PGA using direct reads (the exception is the 11gR2 new feature, in-memory parallelism).
• With direct reads into the PGA, there is no need for global cache grants.
• For an intra-instance parallel operation, the performance difference between a single instance and RAC is minimal.
• Still, in RAC the objects need to be checkpointed from all nodes at the start of a PQ operation.
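Both behaviours leave traces in the wait interface. As a rough check, assuming the direct reads surface as the 'direct path read' event and the object checkpoint as 'enq: KO - fast object checkpoint' on your release, the cumulative waits per instance can be compared like this:

```sql
-- Cumulative waits since instance startup for direct reads and object checkpoints.
SELECT inst_id,
       event,
       total_waits,
       ROUND(time_waited_micro / 1000000, 1) AS seconds_waited
FROM   gv$system_event
WHERE  event IN ('direct path read',
                 'enq: KO - fast object checkpoint')
ORDER  BY event, inst_id;
```

These are instance-wide totals, so they are only a starting point; a noticeable share of time in the checkpoint event would suggest looking at how often short parallel operations are started.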
Parallel_execution_message_size 
• The size of the buffer transferred between slaves is determined by the parameter parallel_execution_message_size (PEMS).
• The default value is too small, which makes the traffic chatty.
• The default value is 2K or 4K, depending upon the version.
• Increase the value of this parameter to at least 16K. The downside is an increase in shared pool usage (PX msg buffers).
• In recognition of the performance implications of this parameter, the default value was increased to 16K in 11gR2 (compatible must be set to 11.2.0.2; the default is OS specific).
• Jumbo frames plus a PEMS of 16K or more are a good starting point for inter-instance PQ intensive environments.
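Raising PEMS and then checking the memory cost could look like the following. This is a sketch, assuming an spfile and a planned restart, since the parameter is static; the SGA component name is matched loosely because its exact spelling is an assumption here.

```sql
-- Static parameter: set for all instances, effective after restart.
ALTER SYSTEM SET parallel_execution_message_size = 16384
  SCOPE = SPFILE SID = '*';

-- After the restart, see how much memory the PX message buffers occupy.
SELECT inst_id,
       pool,
       name,
       ROUND(bytes / 1024 / 1024, 1) AS mb
FROM   gv$sgastat
WHERE  UPPER(name) LIKE 'PX MSG%'  -- assumption: component name starts with "PX msg"
ORDER  BY inst_id;
```

Comparing the reported size before and after the change shows the shared pool price paid for the larger messages.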
New features (11.2)
• Parallel Statement Queueing: a query is queued until a sufficient number of PQ servers is available.
• In-memory parallelism: PQ can read blocks into the buffer cache.
• Auto DOP: the optimizer chooses an optimal degree of parallelism even if you don't specify a parallel hint or a parallel degree at the object level.
• The PQ queueing feature is useful in PQ intensive data warehousing environments.
Demo: pq_queueing.sql if time permits
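In 11.2 these three features are switched on together by one parameter. The sketch below is not the demo script referenced on the slide; it assumes that queued statements wait on the 'resmgr:pq queued' event, which is worth confirming on your version.

```sql
-- Enable Auto DOP, statement queueing and in-memory parallelism (11.2).
ALTER SYSTEM SET parallel_degree_policy = AUTO SCOPE = BOTH SID = '*';

-- Sessions whose statements are currently held in the parallel statement queue.
SELECT inst_id,
       sid,
       sql_id,
       seconds_in_wait
FROM   gv$session
WHERE  event = 'resmgr:pq queued'
AND    state = 'WAITING'
ORDER  BY seconds_in_wait DESC;
```

The point at which statements start to queue is governed by parallel_servers_target, so that parameter is the one to review if queueing begins earlier or later than expected.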
In-memory parallelism (11gR2) 
• With today's mammoth servers, it is not uncommon to see an SGA of 100GB.
• 11gR2 introduced in-memory parallelism. Essentially, PQ servers can read buffers into the SGA.
[Diagram: three instances (inst1, inst2, inst3), each with its own SGA, and PQ servers P1 … P6.]
Parallel_force_local (11.2) 
• This parameter controls whether parallel execution is forced to stay within a single instance.
• The default is FALSE and there is no reason to change it. The Parallel Statement Queueing feature (11.2) interacts badly when this parameter is TRUE.
• Use services if you want to keep parallel executions on a single node, rather than adjusting this parameter.
Parallel DML
• Parallel DML (changes) works in the same way as parallel query.
• Parallel slaves can be allocated in one instance or in multiple instances, depending upon the configuration.
• In a DML operation there are two distinct groups of parallel operations:
  - Parallelism for the scan only.
  - Parallelism for both the scan and the changes.
Parallel DML – scan only: execution plan
In the execution plan below, the UPDATE step is above the PX COORDINATOR step. Only the scan is done in parallel; the changes are done in serial.
------------------------------------------------------------------------------------------ ...
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| ...
------------------------------------------------------------------------------------------ ...
| 0 | UPDATE STATEMENT | | 1 | 1028 | 2600 (1)| ...
| 1 | UPDATE | HUGETABLE_HASH | | | | ...
| 2 | PX COORDINATOR | | | | | ...
| 3 | PX SEND QC (RANDOM) | :TQ10003 | 1 | 1028 | 2600 (1)| ...
| 4 | NESTED LOOPS | | 1 | 1028 | 2600 (1)| ...
| 5 | BUFFER SORT | | | | | ...
| 6 | PX RECEIVE | | | | | ...
| 7 | PX SEND BROADCAST | :TQ10002 | | | | ...
| 8 | VIEW | VW_NSO_1 | 32186 | 408K| 2445 (1)| ...
| 9 | HASH UNIQUE | | 1 | 817K| | ...
| 10 | PX RECEIVE | | 1 | 817K| | ...
| 11 | PX SEND HASH | :TQ10001 | 1 | 817K| | ...
| 12 | HASH UNIQUE | | 1 | 817K| | ...
|* 13 | HASH JOIN | | 32186 | 817K| 2445 (1)| ...
| 14 | PX BLOCK ITERATOR | | 32186 | 408K| 1222 (0)| ...
| 15 | TABLE ACCESS FULL | HUGETABLE_HASH | 32186 | 408K| 1222 (0)| ...
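Whether the changes themselves run in parallel depends on parallel DML being enabled in the session. The following sketch uses the HUGETABLE_HASH table from the plan; the status and batch_id columns are hypothetical, and documented parallel DML restrictions may still force the change step to run serially.

```sql
-- Without this, only the scan side of a DML statement can be parallelized.
ALTER SESSION ENABLE PARALLEL DML;

UPDATE /*+ PARALLEL(h 8) */ hugetable_hash h
SET    h.status   = 'DONE'         -- hypothetical column
WHERE  h.batch_id = 42;            -- hypothetical predicate

-- Check whether the last statement was parallelized as a query or as DML.
SELECT statistic,
       last_query
FROM   v$pq_sesstat
WHERE  statistic IN ('Queries Parallelized', 'DML Parallelized');

COMMIT;
```

When the changes are parallelized too, the UPDATE step is expected to appear below PX COORDINATOR in the plan rather than above it.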
Should you use inter- or intra-instance parallel operations?
• This is a thorny question in a production environment.
• There is no single correct answer to this question; it depends upon two factors:
1. Node resources such as CPU, I/O bandwidth and interconnect (IC) bandwidth
2. The size of the segment, whether it is partitioned or not, etc.
• If the index creation can be completed with just one node, meaning the nodes are beefed up, then you should use one node.
• If the nodes are not big enough, then it is okay to use all nodes; just realize that interconnect latency might affect performance.
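To see which way a given operation actually went, the session's own PX statistics can be checked immediately after the statement finishes. A minimal sketch, assuming the statistic names below are present on your release:

```sql
-- Run in the same session, right after the parallel statement completes.
SELECT statistic,
       last_query
FROM   v$pq_sesstat
WHERE  statistic IN ('Server Threads',
                     'Allocation Height',   -- servers requested per instance
                     'Allocation Width');   -- instances requested
```

An Allocation Width of 1 indicates the operation stayed within one instance; a larger value means slaves were requested from several instances, so interconnect bandwidth and latency become part of the run time.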
Thank you for attending! 
If you like this presentation, you will love my upcoming 
intensive Advanced RAC Troubleshooting class. Watch 
for updates in: 
www.tanelpoder.com 
Orainternals.wordpress.com 
Contact info: 
Email: rshamsud@gmail.com 
Blog : orainternals.wordpress.com 
URL : www.orainternals.com 
MARK YOUR CALENDARS! 
COLLABORATE 12 
April 22-26, 2012 
Mandalay Bay Convention Center 
Las Vegas, Nevada 
http://events.ioug.org/p/cm/ld/fid=15
