Wim Godden
Cu.be Solutions
@wimgtr
Beyond PHP :
It's not (just) about the code
Who am I ?
Wim Godden (@wimgtr)
Where I'm from
My town
Belgium – the traffic
Who am I ?
Wim Godden (@wimgtr)
Founder of Cu.be Solutions (http://cu.be)
Open Source developer since 1997
Developer of PHPCompatibility, PHPConsistent, Nginx SLIC, ...
Speaker at PHP and Open Source conferences
Cu.be Solutions ?
Open source consultancy
PHP-centered (ZF2, Symfony2, Magento, Pimcore, ...)
Training courses
High-speed redundant network (BGP, OSPF, VRRP)
High scalability development
Nginx + extensions
MySQL Cluster
Projects :
mostly IT & Telecom companies
lots of public-facing apps/sites
Who are you ?
Developers ?
Anyone set up MySQL master-slave replication ?
Anyone set up a site/app on separate web and database servers ?
→ How much traffic between them ?
The topic
Things we take for granted
Famous last words : "It should work just fine"
Works fine today
→ might fail tomorrow
Most common mistakes
PHP code ↔ PHP ecosystem
It starts with...
… code !
First up : database
Database queries – complexity
SELECT DISTINCT n.nid, n.uid, n.title, n.type, e.event_start, e.event_start AS event_start_orig,
e.event_end, e.event_end AS event_end_orig, e.timezone, e.has_time, e.has_end_date, tz.offset
AS offset, tz.offset_dst AS offset_dst, tz.dst_region, tz.is_dst, e.event_start - INTERVAL IF(tz.is_dst,
tz.offset_dst, tz.offset) HOUR_SECOND AS event_start_utc, e.event_end - INTERVAL IF(tz.is_dst,
tz.offset_dst, tz.offset) HOUR_SECOND AS event_end_utc, e.event_start - INTERVAL IF(tz.is_dst,
tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_user,
e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0
SECOND AS event_end_user, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset)
HOUR_SECOND + INTERVAL 0 SECOND AS event_start_site, e.event_end - INTERVAL
IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_site,
tz.name as timezone_name FROM node n INNER JOIN event e ON n.nid = e.nid INNER JOIN
event_timezones tz ON tz.timezone = e.timezone INNER JOIN node_access na ON na.nid = n.nid
LEFT JOIN domain_access da ON n.nid = da.nid LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid =
i18n.tnid AND i18n.language = 'en' WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm =
'all'))) AND ((da.realm = "domain_id" AND da.gid = 4) OR (da.realm = "domain_site" AND da.gid =
0)) AND (n.language ='en' OR n.language ='' OR n.language IS NULL OR n.language = 'is' AND
i18n.nid IS NULL) AND ( n.status = 1 AND ((e.event_start >= '2010-01-31 00:00:00' AND
e.event_start <= '2010-03-01 23:59:59') OR (e.event_end >= '2010-01-31 00:00:00' AND
e.event_end <= '2010-03-01 23:59:59') OR (e.event_start <= '2010-01-31 00:00:00' AND
e.event_end >= '2010-03-01 23:59:59')) ) GROUP BY n.nid HAVING (event_start >= '2010-02-01
00:00:00' AND event_start <= '2010-02-28 23:59:59') OR (event_end >= '2010-02-01 00:00:00' AND
event_end <= '2010-02-28 23:59:59') OR (event_start <= '2010-02-01 00:00:00' AND event_end >=
'2010-02-28 23:59:59') ORDER BY event_start ASC;
Database - indexing
'select id from stock where status = 2 order by qty'
→ composite index on (status, qty)
'select id from stock where status > 2 order by qty'
→ composite index on (status, qty) ?
→ Depends :
- Btree : yes for the range on status, but the sort on qty then needs a filesort
- Hash : range selection stops use of the composite index
→ separate index on status and qty (since recent versions)
Database - indexing
Indexes make database faster
→ Let's index everything !
→ DON'T :
Insert/update/delete → Index modification
Each select → evaluation of all indexes
"Relational schema design is based on data
but index design is based on queries"
- Bill Karwin,
author of “SQL Antipatterns”
Databases – detecting problematic queries
Slow query log
→ SET GLOBAL slow_query_log = ON;
Queries not using indexes
→ In my.cnf/my.ini : 'log_queries_not_using_indexes'
General query log
→ SET GLOBAL general_log = ON;
→ Turn it off quickly !
Percona Toolkit
pt-query-digest
Databases - pt-query-digest
# Profile
# Rank Response time    Calls R/Call  Item
# ==== ================ ===== ======= ======
#    1 16526.2542 98.2%  1208 13.6806 SELECT output_option
#    2     0.8312  0.0%  6412  0.0001 SELECT poller_output poller_item
#    3     0.6811  0.0%  6416  0.0001 SELECT poller_time
#    4     0.2805  0.0%   149  0.0019 SELECT wp_terms wp_term_taxonomy wp_term_relationships
#    5     0.1999  0.0%    51  0.0039 SELECT UNION wp_pp_daily_summary wp_pp_hourly_summary
#    6     0.1956  0.0%    89  0.0022 UPDATE wp_options
# MISC   302.8137  1.8%  3853  0.0002 <147 ITEMS>
# Query 2: 0.26 QPS, 0.00x concurrency, ID 0x92F3B1B361FB0E5B at byte 14081299
# This item is included in the report because it matches --limit.
# Scores: Apdex = 1.00 [1.0], V/M = 0.00
# Query_time sparkline: | _^ |
# Time range: 2011-12-28 18:42:47 to 19:03:10
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 1 312
# Exec time 50 4s 5ms 25ms 13ms 20ms 4ms 12ms
# Lock time 3 32ms 43us 163us 103us 131us 19us 98us
# Rows sent 59 62.41k 203 231 204.82 202.40 3.99 202.40
# Rows examine 13 73.63k 238 296 241.67 246.02 10.15 234.30
# Rows affecte 0 0 0 0 0 0 0 0
# Rows read 59 62.41k 203 231 204.82 202.40 3.99 202.40
# Bytes sent 53 24.85M 46.52k 84.36k 81.56k 83.83k 7.31k 79.83k
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 21.63k 71 71 71 71 0 71
# InnoDB:
# IO r bytes 0 0 0 0 0 0 0 0
# IO r ops 0 0 0 0 0 0 0 0
# IO r wait 0 0 0 0 0 0 0 0
# pages distin 40 11.77k 34 44 38.62 38.53 1.87 38.53
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# Boolean:
# Full scan 100% yes, 0% no
# String:
# Databases wp_blog_one (264/84%), wp_blog_tw… (36/11%)... 1 more
# Hosts
# InnoDB trxID 86B40B (1/0%), 86B430 (1/0%), 86B44A (1/0%)... 309 more
# Last errno 0
# Users wp_blog_one (264/84%), wp_blog_two (36/11%)... 1 more
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms ################################################################
# 100ms
# 1s
# 10s+
# Tables
# SHOW TABLE STATUS FROM `wp_blog_one` LIKE 'wp_options'\G
# SHOW CREATE TABLE `wp_blog_one`.`wp_options`\G
Databases – next step : explain
explain <query>
"How will MySQL execute the query"
Databases – next step : explain
+-----------+------+---------------+------+---------+------+--------+-------------+
| TABLE | TYPE | possible_keys | KEY | key_len | REF | ROWS | Extra |
+-----------+------+---------------+------+---------+------+--------+-------------+
| employees | ALL | NULL | NULL | NULL | NULL | 299809 | USING WHERE |
+-----------+------+---------------+------+---------+------+--------+-------------+
+------------+-------+-------------------------------+---------+---------+-------+------+-------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+------------+-------+-------------------------------+---------+---------+-------+------+-------+
| itdevice | const | PRIMARY,fk_device_devicetype1 | PRIMARY | 4 | const | 1 | |
| devicetype | const | PRIMARY | PRIMARY | 4 | const | 1 | |
+------------+-------+-------------------------------+---------+---------+-------+------+-------+
Databases – next step : explain
Type of lookup
'system', 'const' and 'ref' = good
'ALL' = bad
Extra info
Using index = good
Using filesort = usually bad
Databases – covering indexes
mysql> explain select * from product where category=5 and stock=1;
+----+-------+---------------+---------------+---------+------+------------+
| id | TYPE | possible_keys | KEY | key_len | ROWS | Extra |
+----+-------+---------------+---------------+---------+------+------------+
| 1 | ref | categorystock | categorystock | 8 | 1 | |
+----+-------+---------------+---------------+---------+------+------------+
+--------------+---------------+------+-----+------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------------+------+-----+------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| category | int(11) | YES | MUL | NULL | |
| stock | int(11) | YES | MUL | NULL | |
| description | varchar(255) | YES | | NULL | |
...
mysql> show index from product;
+----------+------------+---------------------+--------------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name |
+----------+------------+---------------------+--------------+---------------+
| product | 0 | PRIMARY | 1 | id |
| product | 1 | categorystock | 1 | category |
| product | 1 | categorystock | 2 | stock |
...
Databases – covering indexes
mysql>explain select category, stock, id from product where category=5 and stock=1;
+----+-------+---------------+---------------+---------+-----+-------------+
| id | TYPE | possible_keys | KEY | key_len | ROWS| Extra |
+----+-------+---------------+---------------+---------+-----+-------------+
| 1 | ref | categorystock | categorystock | 8 | 1 | Using index |
+----+-------+---------------+---------------+---------+-----+-------------+
Databases – when to use / not to use
Good at :
Fetching data
Storing data
Searching through data
Bad at :
select `name` from `room` where ceiling(`avgNoOfPeople`) = 8
→ full table scan
→ creates temporary table
select `name` from `room` where avgNoOfPeople > 7 and avgNoOfPeople <= 8
→ Avoid functions that run across every row
→ Avoid functions in where statement
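The rewrite above works because ceil(x) = n holds exactly when n - 1 < x <= n. A minimal sketch of that translation follows; the helper name `ceilingBounds` is hypothetical, and the table and column names are the ones from the slide.

```php
<?php
// Hypothetical helper, not from the slides.

/**
 * Returns the bounds of the range whose ceiling is $target:
 * ceil(x) = n  <=>  n - 1 < x <= n
 */
function ceilingBounds(int $target): array
{
    return ['gt' => $target - 1, 'lte' => $target];
}

$bounds = ceilingBounds(8);

// The WHERE clause compares the bare column, so an index on
// avgNoOfPeople stays usable instead of forcing a full table scan.
$sql = sprintf(
    'select `name` from `room` where `avgNoOfPeople` > %d and `avgNoOfPeople` <= %d',
    $bounds['gt'],
    $bounds['lte']
);

echo $sql, PHP_EOL;
```

The function call moves out of the query and into PHP, where it runs once instead of once per row.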
For / foreach (N+1 problem)
$customers = CustomerQuery::create()
    ->filterByState('MN')
    ->find();
foreach ($customers as $customer) {
    // One extra query per customer : N customers → N+1 queries in total
    $contacts = ContactsQuery::create()
        ->filterByCustomerid($customer->getId())
        ->find();
    foreach ($contacts as $contact) {
        doSomeStuffWith($contact);
    }
}
Joins
// $db is a mysqli connection ; the mysql_* functions were removed in PHP 7
$contacts = $db->query("
    select
        contact.*
    from
        customer
        join contact
            on contact.customerid = customer.id
    where
        customer.state = 'MN'
");
while ($contact = $contacts->fetch_assoc()) {
    doSomeStuffWith($contact);
}
(or the ORM equivalent)
Better...
10001 → 1 query
Sadly : people still produce code with query loops
Usually :
Growth not anticipated
Internal app → Public app
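When the nested per-customer loop is still needed after switching to one join, the flat result can be regrouped in PHP without further queries. This is a minimal sketch with made-up rows; the column names and the `groupByCustomer` helper are assumptions, not part of the talk.

```php
<?php
// Illustrative rows, shaped like the result of a single customer/contact join.
// Column names and values are assumptions for the example.
$rows = [
    ['customerid' => 1, 'contactid' => 10, 'name' => 'Ann'],
    ['customerid' => 1, 'contactid' => 11, 'name' => 'Bob'],
    ['customerid' => 2, 'contactid' => 12, 'name' => 'Cis'],
];

/**
 * Groups flat join rows by customer id, so a "per customer,
 * per contact" loop needs no extra queries.
 */
function groupByCustomer(array $rows): array
{
    $grouped = [];
    foreach ($rows as $row) {
        $grouped[$row['customerid']][] = $row;
    }
    return $grouped;
}

$contactsByCustomer = groupByCustomer($rows);

foreach ($contactsByCustomer as $customerId => $contacts) {
    echo $customerId, ': ', count($contacts), ' contact(s)', PHP_EOL;
}
```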
The origins of this talk
Customers :
Projects we built
Projects we didn't build, but got pulled into
Fixes
Changes
Infrastructure migration
15 years of 'how to cause mayhem with a few lines of code'
Client X
Jobs search site
Monitor job views :
Daily hits
Weekly hits
Monthly hits
Which user saw which job
Client X
Originally : when user viewed job details
Now : when job is in search result
Search for 'php' → 50 jobs = 50 jobs to be updated
→ 50 updates for shown_today
→ 50 updates for shown_week
→ 50 updates for shown_month
→ 50 inserts for shown_user
= 200 queries for 1 search !
Client X : the code
foreach ($jobs as $job) {
    $db->query("
        insert into shown_today(
            jobId,
            number
        ) values(
            " . $job['id'] . ",
            1
        )
        on duplicate key
        update
            number = number + 1
    ");
    $db->query("
        insert into shown_week(
            jobId,
            number
        ) values(
            " . $job['id'] . ",
            1
        )
        on duplicate key
        update
            number = number + 1
    ");
    $db->query("
        insert into shown_month(
            jobId,
            number
        ) values(
            " . $job['id'] . ",
            1
        )
        on duplicate key
        update
            number = number + 1
    ");
    // `when` is a reserved word in MySQL, so it must be quoted
    $db->query("
        insert into shown_user(
            jobId,
            userId,
            `when`
        ) values (
            " . $job['id'] . ",
            " . $user['id'] . ",
            now()
        )
    ");
}
Client X : the graph
Client X : the numbers
600-1000 inserts/sec (peaks up to 1600)
400-1000 updates/sec (peaks up to 2600)
16 core machine
Client X : panic !
Mail : "MySQL slave is more than 5 minutes behind master"
We set it up → who did they blame ?
Wait a second !
Client X : what's causing those peaks ?
Client X : possible cause ?
Code changes ?
→ According to developers : none
Action : turn on general log, analyze with pt-query-digest
→ 50+-fold increase in 4 queries
→ Developers : 'Oops we did make a change'
After 3 days : 2.5 days behind
Every hour : 50 min extra lag
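The two figures above are consistent with each other, as a quick calculation shows. The variable names are illustrative only.

```php
<?php
// Figures from the slide: the slave falls 50 minutes further behind every hour.
$extraLagMinutesPerHour = 50;
$elapsedHours = 3 * 24;

$lagMinutes = $extraLagMinutesPerHour * $elapsedHours; // 3600 minutes
$lagDays = $lagMinutes / 60 / 24;                      // 2.5 days

printf("Lag after 3 days: %.1f days\n", $lagDays);
```

Put differently, the slave was applying only about 10 minutes of the master's writes per hour.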
Client X : But why is the slave lagging ?
[Diagram : Master (file master-bin-xxxx.log, binlog dump thread) → Slave (slave I/O thread, file master-bin-xxxx.log, slave SQL thread)]
Client X : Master
Client X : Slave
Client X : fix ?
Client X : the code change
insert into shown_today values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_week values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_month values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_user values (5, 23, "2015-10-12 12:01:00"), (8, 23, "2015-10-12 12:01:00"), … ;
Client X : the code change
$todayQuery = "
    insert into shown_today(
        jobId,
        number
    ) values ";
foreach ($jobs as $job) {
    $todayQuery .= "(" . $job['id'] . ", 1),";
}
$todayQuery = substr($todayQuery, 0, strlen($todayQuery) - 1);
$todayQuery .= "
    on duplicate key
    update
        number = number + 1
";
$db->query($todayQuery);
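The same batching idea can be wrapped in a small helper that avoids the trailing-comma trimming and casts every id to an integer. This is a sketch under assumptions: the function name `buildCounterUpsert` is hypothetical, and it expects the table name to come from a fixed list in the code, never from user input.

```php
<?php
/**
 * Builds one multi-row "insert ... on duplicate key update" statement.
 * Ids are cast to int so nothing unescaped ends up in the SQL string.
 * $table is assumed to come from a fixed list in the code.
 */
function buildCounterUpsert(string $table, array $jobs): ?string
{
    if ($jobs === []) {
        return null; // nothing to record, so no query at all
    }

    $values = [];
    foreach ($jobs as $job) {
        $values[] = '(' . (int) $job['id'] . ', 1)';
    }

    return 'insert into ' . $table . ' (jobId, number) values '
        . implode(', ', $values)
        . ' on duplicate key update number = number + 1';
}

$jobs = [['id' => 5], ['id' => 8], ['id' => 12]];
$todayQuery = buildCounterUpsert('shown_today', $jobs);
echo $todayQuery, PHP_EOL;
```

Calling it once each for shown_today, shown_week and shown_month replaces 150 counter queries per 50-job search with 3.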
Client X : the chosen solution
$db->autocommit(false);
foreach ($jobs as $job) {
    $db->query("
        insert into shown_today(
            jobId,
            number
        ) values(
            " . $job['id'] . ",
            1
        )
        on duplicate key
        update
            number = number + 1
    ");
    $db->query("
        insert into shown_week(
            jobId,
            number
        ) values(
            " . $job['id'] . ",
            1
        )
        on duplicate key
        update
            number = number + 1
    ");
    $db->query("
        insert into shown_month(
            jobId,
            number
        ) values(
            " . $job['id'] . ",
            1
        )
        on duplicate key
        update
            number = number + 1
    ");
    // `when` is a reserved word in MySQL, so it must be quoted
    $db->query("
        insert into shown_user(
            jobId,
            userId,
            `when`
        ) values (
            " . $job['id'] . ",
            " . $user['id'] . ",
            now()
        )
    ");
}
$db->commit();
Client X : conclusion
For loops are bad (we already knew that)
Add master/slave and it gets much worse
Use transactions : one commit per batch instead of one per query can give a huge performance increase
Better yet : use MariaDB 10 or higher → slave_parallel_threads
Result : slave caught up 5 days later
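A slightly more defensive variant of the transaction approach could use prepared statements and roll back on failure. This is a sketch, not the code the client ran: the function name `recordJobViews` is hypothetical, table and column names follow the slides, and it assumes a working mysqli connection.

```php
<?php
/**
 * Records views for all jobs of one search result in a single transaction.
 * Table and column names follow the slides; the function name is hypothetical.
 */
function recordJobViews(mysqli $db, array $jobIds, int $userId): void
{
    // Make mysqli throw on errors so a failed statement reaches the catch block.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $db->begin_transaction();
    try {
        foreach (['shown_today', 'shown_week', 'shown_month'] as $table) {
            $stmt = $db->prepare(
                "insert into $table (jobId, number) values (?, 1)
                 on duplicate key update number = number + 1"
            );
            foreach ($jobIds as $jobId) {
                $stmt->bind_param('i', $jobId);
                $stmt->execute();
            }
            $stmt->close();
        }

        $stmt = $db->prepare(
            'insert into shown_user (jobId, userId, `when`) values (?, ?, now())'
        );
        foreach ($jobIds as $jobId) {
            $stmt->bind_param('ii', $jobId, $userId);
            $stmt->execute();
        }
        $stmt->close();

        $db->commit(); // one commit for the whole batch
    } catch (Throwable $e) {
        $db->rollback(); // leave the counters untouched on any failure
        throw $e;
    }
}
```

A call would look like `recordJobViews($db, array_column($jobs, 'id'), (int) $user['id']);`, with `$jobs` and `$user` as in the slides.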
Database → Network
Customer Y
Top 10 site in Belgium
Growing rapidly
At peak traffic :
Inexplicable latency on database
Load on webservers : minimal
Load on database servers : acceptable
Client Y : the network
[Network diagram ; traffic figures shown : 60GB, 700GB, 700GB]
Client Y : network overload
Cause : Drupal hooks → retrieving data that was not needed
Only load data you actually need
Don't know at the start ? → Use lazy loading
Caching :
Same story
Memcached/Redis are fast
But : data still needs to cross the network
Network trouble : more than just traffic
Customer Z
150,000 visits/day
News ticker :
XML feed from other site (owned by same customer)
Cached for 15 min
Customer Z – fetching the feed
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    unlink(APP_DIR . '/tmp/cacheFile.xml');
    file_put_contents(
        APP_DIR . '/tmp/cacheFile.xml',
        file_get_contents('http://www.scrambledsitename.be/xml/feed.xml')
    );
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
What's wrong with this code ?
Customer Z – no feed without the source
Feed source
Customer Z : timeout
default_socket_timeout : 60 sec by default
Each visitor : 60 sec wait time
People keep hitting refresh → more load
More active connections → more load
Apache hits maximum connections → entire site down
Customer Z : timeout fix
$context = stream_context_create(
    array(
        'http' => array(
            'timeout' => 5
        )
    )
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    unlink(APP_DIR . '/tmp/cacheFile.xml');
    file_put_contents(
        APP_DIR . '/tmp/cacheFile.xml',
        file_get_contents(
            'http://www.scrambledsitename.be/xml/feed.xml',
            false,
            $context
        )
    );
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
Customer Z : don't delete from cache
$context = stream_context_create(
    array(
        'http' => array(
            'timeout' => 5
        )
    )
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    file_put_contents(
        APP_DIR . '/tmp/cacheFile.xml',
        file_get_contents(
            'http://www.scrambledsitename.be/xml/feed.xml',
            false,
            $context
        )
    );
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
Customer Z : don't delete from cache
$context = stream_context_create(
    array(
        'http' => array(
            'timeout' => 5
        )
    )
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    $feed = file_get_contents(
        'http://www.scrambledsitename.be/xml/feed.xml',
        false,
        $context
    );
    if ($feed !== false) {
        file_put_contents(
            APP_DIR . '/tmp/cacheFile.xml',
            $feed
        );
    }
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
Customer Z : process early
$context = stream_context_create(
    array(
        'http' => array(
            'timeout' => 5
        )
    )
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    $feed = file_get_contents(
        'http://www.scrambledsitename.be/xml/feed.xml',
        false,
        $context
    );
    if ($feed !== false) {
        file_put_contents(
            APP_DIR . '/tmp/cacheFile.xml',
            ParseXmlFeed($feed)
        );
    }
}
Customer Z : file_[get|put]_contents atomicity
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
    $feed = file_get_contents(
        'http://www.scrambledsitename.be/xml/feed.xml',
        false,
        $context
    );
    if ($feed !== false) {
        file_put_contents(
            APP_DIR . '/tmp/cacheFile.xml',
            ParseXmlFeed($feed)
        );
    }
}
Refresh triggered by user requests → concurrent requests → possible data corruption
Better : run every 15 min through a cron job
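Whether the refresh runs from cron or from a request, readers should never see a half-written cache file. One common technique, which the slides do not show, is to write to a temporary file and rename it into place; on POSIX file systems a rename within the same directory replaces the target in one step. The helper name `writeFileAtomically` and the demo path are assumptions.

```php
<?php
/**
 * Writes $data to a temporary file in the same directory, then renames it
 * over $path. Returns false (leaving any old file untouched) on failure.
 */
function writeFileAtomically(string $path, string $data): bool
{
    $tmp = tempnam(dirname($path), 'feed');
    if ($tmp === false) {
        return false;
    }

    if (file_put_contents($tmp, $data) !== strlen($data)) {
        unlink($tmp);
        return false;
    }

    if (!rename($tmp, $path)) {
        unlink($tmp);
        return false;
    }
    return true;
}

// Hypothetical cache location for the demo; the slides use APP_DIR . '/tmp/cacheFile.xml'.
$cacheFile = sys_get_temp_dir() . '/cacheFile-demo.xml';
$ok = writeFileAtomically($cacheFile, '<feed><item>demo</item></feed>');
```

Note that tempnam() creates the file with restrictive permissions, so a real deployment may need a chmod() before the rename if another user reads the cache.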
Network resources
Use timeouts for all :
fopen
curl
SOAP
…
Data source trusted ?
→ set up a webservice
→ let them push updates when their feed changes
→ less load on data source
→ no timeout issues
Add logging → early detection
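As a rough illustration of "timeouts for all", the sketch below sets a limit for each of the three mechanisms listed. The timeout values and the function names `fetchWithCurl` and `createSoapClient` are assumptions; the cURL and SOAP parts need the respective PHP extensions.

```php
<?php
// Timeout values are illustrative assumptions; tune them per data source.
const CONNECT_TIMEOUT = 2; // seconds to establish the connection
const TOTAL_TIMEOUT = 5;   // seconds for the whole request

// fopen() / file_get_contents() over HTTP: pass a stream context.
$context = stream_context_create([
    'http' => ['timeout' => TOTAL_TIMEOUT],
]);

// cURL: separate limits for connecting and for the full transfer.
function fetchWithCurl(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => CONNECT_TIMEOUT,
        CURLOPT_TIMEOUT        => TOTAL_TIMEOUT,
    ]);
    return curl_exec($ch); // false on failure or timeout
}

// SOAP: connection_timeout covers connecting; reading the response
// is governed by default_socket_timeout.
function createSoapClient(string $wsdl): SoapClient
{
    ini_set('default_socket_timeout', (string) TOTAL_TIMEOUT);
    return new SoapClient($wsdl, ['connection_timeout' => CONNECT_TIMEOUT]);
}
```

The stream context is passed as the third argument to file_get_contents() or the fourth to fopen(), as in the Customer Z fix.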
Logging
Logging = good
Logging in PHP using fopen
→ bad idea : locking issues
→ Use Monolog : file, syslog, mail, Pushover, HipChat, Graylog, Rollbar, Elasticsearch (and 50 more)
For Firefox : FirePHP (add-on for Firebug)
Debug logging = bad on production
Watch your logs !
Don't log on slow disks → I/O bottlenecks
File system : I/O bottlenecks
Causes :
Excessive writes (database updates, logfiles, swapping, …)
Excessive reads (non-indexed database queries, swapping, small file system cache, …)
How to detect ?
top
iostat
See iowait ? Stop worrying about php, fix the I/O problem !
Cpu(s): 0.2%us, 3.0%sy, 0.0%ni, 61.4%id, 35.5%wa, 0.0%hi
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.10    0.00    0.96   53.70    0.00   45.24
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             120.40         0.00    123289.60          0     616448
sdb               2.10         0.00      4378.10          0      18215
dm-0              4.20         0.00        36.80          0        184
dm-1              0.00         0.00         0.00          0          0
File system
Worst of all : NFS
PHP files → lstat calls
Templates → same
Sessions
→ locking issues
→ corrupt data
→ store sessions in database, Memcached, Redis, ...
Step-by-step : most common issues
Using NFS ? Get rid of it ;-)
iowait on database server
I/O reads (use iostat)
→ missing/wrong indexes
→ too many queries
→ select *
I/O writes
→ no transactions
→ too many queries
→ bad DB engine settings
iowait on webserver (logs ? static files ?)
CPU on database server (missing/wrong/too many indexes)
CPU on webserver (PHP)
Much more than code
DB
server
Webserver
User
Network
XML feed
Look beyond PHP (or Perl, Ruby, Python, ...) !
Questions ?
Questions ?
Contact
Twitter @wimgtr
Slides http://www.slideshare.net/wimg
E-mail wim@cu.be
Thanks !
Please provide feedback through Joind.in :
https://joind.in/15535

More Related Content

What's hot (20)

ODP
Beyond php it's not (just) about the code
Wim Godden
 
ODP
Caching and tuning fun for high scalability @ PHPTour
Wim Godden
 
ODP
Caching and tuning fun for high scalability @ LOAD2012
Wim Godden
 
ODP
Remove php calls and scale your site like crazy !
Wim Godden
 
ODP
Nginx and friends - putting a turbo button on your site
Wim Godden
 
ODP
When dynamic becomes static: the next step in web caching techniques
Wim Godden
 
PDF
Top Node.js Metrics to Watch
Sematext Group, Inc.
 
ODP
Caching and tuning fun for high scalability
Wim Godden
 
ODP
Beyond php - it's not (just) about the code
Wim Godden
 
ODP
Caching and tuning fun for high scalability
Wim Godden
 
ODP
Caching and tuning fun for high scalability @ 4Developers
Wim Godden
 
PPTX
MongoDB: tips, trick and hacks
Scott Hernandez
 
PDF
MySQL 5.5 Guide to InnoDB Status
Karwin Software Solutions LLC
 
PPTX
Varnish, the high performance valhalla?
Jeroen van Dijk
 
PDF
Static Typing in Vault
GlynnForrest
 
PDF
MySQL under the siege
Source Ministry
 
PPTX
Replication and replica sets
Randall Hunt
 
PDF
Survey of Percona Toolkit
Karwin Software Solutions LLC
 
PDF
Saving The World From Guaranteed APOCALYPSE* Using Varnish and Memcached
georgepenkov
 
PDF
Load Data Fast!
Karwin Software Solutions LLC
 
Beyond php it's not (just) about the code
Wim Godden
 
Caching and tuning fun for high scalability @ PHPTour
Wim Godden
 
Caching and tuning fun for high scalability @ LOAD2012
Wim Godden
 
Remove php calls and scale your site like crazy !
Wim Godden
 
Nginx and friends - putting a turbo button on your site
Wim Godden
 
When dynamic becomes static: the next step in web caching techniques
Wim Godden
 
Top Node.js Metrics to Watch
Sematext Group, Inc.
 
Caching and tuning fun for high scalability
Wim Godden
 
Beyond php - it's not (just) about the code
Wim Godden
 
Caching and tuning fun for high scalability
Wim Godden
 
Caching and tuning fun for high scalability @ 4Developers
Wim Godden
 
MongoDB: tips, trick and hacks
Scott Hernandez
 
MySQL 5.5 Guide to InnoDB Status
Karwin Software Solutions LLC
 
Varnish, the high performance valhalla?
Jeroen van Dijk
 
Static Typing in Vault
GlynnForrest
 
MySQL under the siege
Source Ministry
 
Replication and replica sets
Randall Hunt
 
Survey of Percona Toolkit
Karwin Software Solutions LLC
 
Saving The World From Guaranteed APOCALYPSE* Using Varnish and Memcached
georgepenkov
 

Viewers also liked (20)

PPTX
Zendcon zray
Mathew Beane
 
PPT
Digital Literacy
mscuttle
 
PPT
Corporations
mscuttle
 
PDF
Tools aren't just about tech: Applying the Open Decision Framework to DevOps
Rebecca Fernandez
 
PPT
Bringing Civil War Preservation to the Classroom
civanoff
 
PPT
Структура на Презентација
Ognena Kostova
 
PPT
Monetary Policy
mscuttle
 
PPTX
69 Cara Closing Ratio
Barlian winarta
 
PPTX
Why need software testing
Vaibhav Dash
 
PDF
Finwin2016‬
БКС
 
PPTX
Acceptance testing
Vaibhav Dash
 
PDF
CFSSL 1.1: The Evolution of a PKI toolkit - DEF CON 23
Nick Sullivan
 
PPT
Literacy in Every Classroom
civanoff
 
PDF
Introduction To Bootstrap
Rand Graham
 
PDF
DevOps 101
Somkiat Puisungnoen
 
PPS
ISTQB Foundation - Chapter 3
Chandukar
 
PDF
2008-03-06 Harris Corp Security Seminar
Shawn Wells
 
PPTX
SenchaCon 2016: How to Give your Sencha App Real-time Web Performance - James...
Sencha
 
PPTX
SenchaCon 2016: Handling Undo-Redo in Sencha Applications - Nickolay Platonov
Sencha
 
PPT
SenchaCon 2016: Expect the Unexpected - Dealing with Errors in Web Apps
Sencha
 
Zendcon zray
Mathew Beane
 
Digital Literacy
mscuttle
 
Corporations
mscuttle
 
Tools aren't just about tech: Applying the Open Decision Framework to DevOps
Rebecca Fernandez
 
Bringing Civil War Preservation to the Classroom
civanoff
 
Структура на Презентација
Ognena Kostova
 
Monetary Policy
mscuttle
 
69 Cara Closing Ratio
Barlian winarta
 
Why need software testing
Vaibhav Dash
 
Finwin2016‬
БКС
 
Acceptance testing
Vaibhav Dash
 
CFSSL 1.1: The Evolution of a PKI toolkit - DEF CON 23
Nick Sullivan
 
Literacy in Every Classroom
civanoff
 
Introduction To Bootstrap
Rand Graham
 
ISTQB Foundation - Chapter 3
Chandukar
 
2008-03-06 Harris Corp Security Seminar
Shawn Wells
 
SenchaCon 2016: How to Give your Sencha App Real-time Web Performance - James...
Sencha
 
SenchaCon 2016: Handling Undo-Redo in Sencha Applications - Nickolay Platonov
Sencha
 
SenchaCon 2016: Expect the Unexpected - Dealing with Errors in Web Apps
Sencha
 
Ad

Similar to Beyond php - it's not (just) about the code (20)

PDF
Beyond php - it's not (just) about the code
Wim Godden
 
KEY
10x Performance Improvements
Ronald Bradford
 
KEY
10x improvement-mysql-100419105218-phpapp02
promethius
 
PDF
U C2007 My S Q L Performance Cookbook
guestae36d0
 
PDF
Scaling MySQL Strategies for Developers
Jonathan Levin
 
PDF
Highload Perf Tuning
HighLoad2009
 
PDF
Quick Wins
HighLoad2009
 
PPTX
MySQL performance tuning
Anurag Srivastava
 
PDF
MySQL Query Optimisation 101
Federico Razzoli
 
PPT
15 protips for mysql users pfz
Joshua Thijssen
 
KEY
Tek tutorial
Ligaya Turmelle
 
PPTX
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
Dave Stokes
 
PDF
How to Design Indexes, Really
Karwin Software Solutions LLC
 
PDF
How to Design Indexes, Really
MYXPLAIN
 
PDF
Maximizing SQL Reviews and Tuning with pt-query-digest
Pythian
 
PDF
Database Design most common pitfalls
Federico Razzoli
 
PDF
MariaDB and Clickhouse Percona Live 2019 talk
Alexander Rubin
 
PPTX
MySQL Indexing - Best practices for MySQL 5.6
MYXPLAIN
 
PDF
Percona Live 2012PPT: MySQL Query optimization
mysqlops
 
PDF
Performance Tuning Best Practices
webhostingguy
 
Beyond php - it's not (just) about the code
Wim Godden
 
10x Performance Improvements
Ronald Bradford
 
10x improvement-mysql-100419105218-phpapp02
promethius
 
U C2007 My S Q L Performance Cookbook
guestae36d0
 
Scaling MySQL Strategies for Developers
Jonathan Levin
 
Highload Perf Tuning
HighLoad2009
 
Quick Wins
HighLoad2009
 
MySQL performance tuning
Anurag Srivastava
 
MySQL Query Optimisation 101
Federico Razzoli
 
15 protips for mysql users pfz
Joshua Thijssen
 
Tek tutorial
Ligaya Turmelle
 
7 Database Mistakes YOU Are Making -- Linuxfest Northwest 2019
Dave Stokes
 
How to Design Indexes, Really
Karwin Software Solutions LLC
 
How to Design Indexes, Really
MYXPLAIN
 
Maximizing SQL Reviews and Tuning with pt-query-digest
Pythian
 
Database Design most common pitfalls
Federico Razzoli
 
MariaDB and Clickhouse Percona Live 2019 talk
Alexander Rubin
 
MySQL Indexing - Best practices for MySQL 5.6
MYXPLAIN
 
Percona Live 2012PPT: MySQL Query optimization
mysqlops
 
Performance Tuning Best Practices
webhostingguy
 
Ad

More from Wim Godden (20)

PDF
Bringing bright ideas to life
Wim Godden
 
PDF
The why and how of moving to php 8
Wim Godden
 
PDF
The why and how of moving to php 7
Wim Godden
 
PDF
My app is secure... I think
Wim Godden
 
PDF
My app is secure... I think
Wim Godden
 
PDF
Building interactivity with websockets
Wim Godden
 
PDF
Bringing bright ideas to life
Wim Godden
 
ODP
Your app lives on the network - networking for web developers
Wim Godden
 
ODP
The why and how of moving to php 7.x
Wim Godden
 
ODP
The why and how of moving to php 7.x
Wim Godden
 
ODP
Beyond php - it's not (just) about the code
Wim Godden
 
ODP
My app is secure... I think
Wim Godden
 
ODP
Building interactivity with websockets
Wim Godden
 
ODP
Your app lives on the network - networking for web developers
Wim Godden
 
ODP
My app is secure... I think
Wim Godden
 
ODP
My app is secure... I think
Wim Godden
 
ODP
The promise of asynchronous php
Wim Godden
 
ODP
My app is secure... I think
Wim Godden
 
ODP
My app is secure... I think
Wim Godden
 
ODP
Practical git for developers
Wim Godden
 
Bringing bright ideas to life
Wim Godden
 
The why and how of moving to php 8
Wim Godden
 
The why and how of moving to php 7
Wim Godden
 
My app is secure... I think
Wim Godden
 
My app is secure... I think
Wim Godden
 
Building interactivity with websockets
Wim Godden
 
Bringing bright ideas to life
Wim Godden
 
Your app lives on the network - networking for web developers
Wim Godden
 
The why and how of moving to php 7.x
Wim Godden
 
The why and how of moving to php 7.x
Wim Godden
 
Beyond php - it's not (just) about the code
Wim Godden
 
My app is secure... I think
Wim Godden
 
Building interactivity with websockets
Wim Godden
 
Your app lives on the network - networking for web developers
Wim Godden
 
My app is secure... I think
Wim Godden
 
My app is secure... I think
Wim Godden
 
The promise of asynchronous php
Wim Godden
 
My app is secure... I think
Wim Godden
 
My app is secure... I think
Wim Godden
 
Practical git for developers
Wim Godden
 


Beyond php - it's not (just) about the code

  • 1. Wim Godden Cu.be Solutions @wimgtr Beyond PHP : It's not (just) about the code
  • 2. Who am I ? Wim Godden (@wimgtr)
  • 11. Belgium – the traffic
  • 12. Who am I ? Wim Godden (@wimgtr) Founder of Cu.be Solutions (http://cu.be) Open Source developer since 1997 Developer of PHPCompatibility, PHPConsistent, Nginx SLIC, ... Speaker at PHP and Open Source conferences
  • 13. Cu.be Solutions ? Open source consultancy PHP-centered (ZF2, Symfony2, Magento, Pimcore, ...) Training courses High-speed redundant network (BGP, OSPF, VRRP) High scalability development Nginx + extensions MySQL Cluster Projects : mostly IT & Telecom companies lots of public-facing apps/sites
  • 14. Who are you ? Developers ? Anyone setup a MySQL master-slave ? Anyone setup a site/app on separate web and database server ? → How much traffic between them ?
  • 15. The topic Things we take for granted Famous last words : "It should work just fine" Works fine today → might fail tomorrow Most common mistakes PHP code ↔ PHP ecosystem
  • 16. It starts with... … code ! First up : database
  • 17. Database queries – complexity SELECT DISTINCT n.nid, n.uid, n.title, n.type, e.event_start, e.event_start AS event_start_orig, e.event_end, e.event_end AS event_end_orig, e.timezone, e.has_time, e.has_end_date, tz.offset AS offset, tz.offset_dst AS offset_dst, tz.dst_region, tz.is_dst, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_start_utc, e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_end_utc, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_user, e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_user, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_site, e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_site, tz.name as timezone_name FROM node n INNER JOIN event e ON n.nid = e.nid INNER JOIN event_timezones tz ON tz.timezone = e.timezone INNER JOIN node_access na ON na.nid = n.nid LEFT JOIN domain_access da ON n.nid = da.nid LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'en' WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all'))) AND ((da.realm = "domain_id" AND da.gid = 4) OR (da.realm = "domain_site" AND da.gid = 0)) AND (n.language ='en' OR n.language ='' OR n.language IS NULL OR n.language = 'is' AND i18n.nid IS NULL) AND ( n.status = 1 AND ((e.event_start >= '2010-01-31 00:00:00' AND e.event_start <= '2010-03-01 23:59:59') OR (e.event_end >= '2010-01-31 00:00:00' AND e.event_end <= '2010-03-01 23:59:59') OR (e.event_start <= '2010-01-31 00:00:00' AND e.event_end >= '2010-03-01 23:59:59')) ) GROUP BY n.nid HAVING (event_start >= '2010-02-01 00:00:00' AND event_start <= '2010-02-28 23:59:59') OR (event_end >= '2010-02-01 00:00:00' AND event_end <= '2010-02-28 23:59:59') OR (event_start <= 
'2010-02-01 00:00:00' AND event_end >= '2010-02-28 23:59:59') ORDER BY event_start ASC;
  • 18. Database - indexing 'select id from stock where status = 2 order by qty' → aggregate index on (status, qty) 'select id from stock where status > 2 order by qty' → aggregate index on (status, qty) ? → Depends : - Btree : yes - Hash : range selection stops use of aggregate index → separate index on status and qty (since recent versions)
  • 19. Database - indexing Indexes make database faster → Let's index everything ! → DON'T : Insert/update/delete → Index modification Each select → evaluation of all indexes "Relational schema design is based on data but index design is based on queries" - Bill Karwin, author of “SQL Antipatterns”
  • 20. Databases – detecting problematic queries Slow query log → SET GLOBAL slow_query_log = ON; Queries not using indexes → In my.cnf/my.ini : 'log_queries_not_using_indexes' General query log → SET GLOBAL general_log = ON; → Turn it off quickly ! Percona Toolkit pt-query-digest
  • 21. Databases - pt-query-digest # Profile # Rank Response time Calls R/Call Item # ==== ================ ===== ======= ====== # 1 16526.2542 98.2% 1208 13.6806 SELECT output_option # 2 0.8312 0.0% 6412 0.0001 SELECT poller_output poller_item # 3 0.6811 0.0% 6416 0.0001 SELECT poller_time # 4 0.2805 0.0% 149 0.0019 SELECT wp_terms wp_term_taxonomy wp_term_relationships # 5 0.1999 0.0% 51 0.0039 SELECT UNION wp_pp_daily_summary wp_pp_hourly_summary # 6 0.1956 0.0% 89 0.0022 UPDATE wp_options # MISC 302.8137 1.8% 3853 0.0002 <147 ITEMS>
  • 22. # Query 2: 0.26 QPS, 0.00x concurrency, ID 0x92F3B1B361FB0E5B at byte 14081299 # This item is included in the report because it matches --limit. # Scores: Apdex = 1.00 [1.0], V/M = 0.00 # Query_time sparkline: | _^ | # Time range: 2011-12-28 18:42:47 to 19:03:10 # Attribute pct total min max avg 95% stddev median # ============ === ======= ======= ======= ======= ======= ======= ======= # Count 1 312 # Exec time 50 4s 5ms 25ms 13ms 20ms 4ms 12ms # Lock time 3 32ms 43us 163us 103us 131us 19us 98us # Rows sent 59 62.41k 203 231 204.82 202.40 3.99 202.40 # Rows examine 13 73.63k 238 296 241.67 246.02 10.15 234.30 # Rows affecte 0 0 0 0 0 0 0 0 # Rows read 59 62.41k 203 231 204.82 202.40 3.99 202.40 # Bytes sent 53 24.85M 46.52k 84.36k 81.56k 83.83k 7.31k 79.83k # Merge passes 0 0 0 0 0 0 0 0 # Tmp tables 0 0 0 0 0 0 0 0 # Tmp disk tbl 0 0 0 0 0 0 0 0 # Tmp tbl size 0 0 0 0 0 0 0 0 # Query size 0 21.63k 71 71 71 71 0 71 # InnoDB: # IO r bytes 0 0 0 0 0 0 0 0 # IO r ops 0 0 0 0 0 0 0 0 # IO r wait 0 0 0 0 0 0 0 0 # pages distin 40 11.77k 34 44 38.62 38.53 1.87 38.53 # queue wait 0 0 0 0 0 0 0 0 # rec lock wai 0 0 0 0 0 0 0 0 # Boolean: # Full scan 100% yes, 0% no # String: # Databases wp_blog_one (264/84%), wp_blog_tw… (36/11%)... 1 more # Hosts # InnoDB trxID 86B40B (1/0%), 86B430 (1/0%), 86B44A (1/0%)... 309 more # Last errno 0 # Users wp_blog_one (264/84%), wp_blog_two (36/11%)... 1 more # Query_time distribution # 1us # 10us # 100us # 1ms # 10ms ################################################################ # 100ms # 1s # 10s+ # Tables # SHOW TABLE STATUS FROM `wp_blog_one ` LIKE 'wp_options'\G # SHOW CREATE TABLE `wp_blog_one `.`wp_options`\G
  • 23. Databases – next step : explain explain <query> "How will MySQL execute the query"
  • 24. Databases – next step : explain +-----------+------+---------------+------+---------+------+--------+-------------+ | TABLE | TYPE | possible_keys | KEY | key_len | REF | ROWS | Extra | +-----------+------+---------------+------+---------+------+--------+-------------+ | employees | ALL | NULL | NULL | NULL | NULL | 299809 | USING WHERE | +-----------+------+---------------+------+---------+------+--------+-------------+ +------------+-------+-------------------------------+---------+---------+-------+------+-------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +------------+-------+-------------------------------+---------+---------+-------+------+-------+ | itdevice | const | PRIMARY,fk_device_devicetype1 | PRIMARY | 4 | const | 1 | | | devicetype | const | PRIMARY | PRIMARY | 4 | const | 1 | | +------------+-------+-------------------------------+---------+---------+-------+------+-------+
  • 25. Databases – next step : explain Type of lookup 'system', 'const' and 'ref' = good 'ALL' = bad Extra info Using index = good Using filesort = usually bad
  • 26. Databases – covering indexes mysql> explain select * from product where category=5 and stock=1; +----+-------+---------------+---------------+---------+------+------------+ | id | TYPE | possible_keys | KEY | key_len | ROWS | Extra | +----+-------+---------------+---------------+---------+------+------------+ | 1 | ref | categorystock | categorystock | 8 | 1 | | +----+-------+---------------+---------------+---------+------+------------+ +--------------+---------------+------+-----+------------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------------+---------------+------+-----+------------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | category | int(11) | YES | MUL | NULL | | | stock | int(11) | YES | MUL | NULL | | | description | varchar(255) | YES | | NULL | | ... mysql> show index from product; +----------+------------+---------------------+--------------+---------------| | Table | Non_unique | Key_name | Seq_in_index | Column_name | +----------+------------+---------------------+--------------+---------------+ | product | 0 | PRIMARY | 1 | id | | product | 1 | categorystock | 1 | category | | product | 1 | categorystock | 2 | stock | ...
  • 27. Databases – covering indexes mysql>explain select category, stock, id from product where category=5 and stock=1; +----+-------+---------------+---------------+---------+-----+-------------+ | id | TYPE | possible_keys | KEY | key_len | ROWS| Extra | +----+-------+---------------+---------------+---------+-----+-------------+ | 1 | ref | categorystock | categorystock | 8 | 1 | Using index | +----+-------+---------------+---------------+---------+-----+-------------+ +--------------+---------------+------+-----+------------+----------------+ | Field | Type | Null | Key | Default | Extra | +--------------+---------------+------+-----+------------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | category | int(11) | YES | MUL | NULL | | | stock | int(11) | YES | MUL | NULL | | | description | varchar(255) | YES | | NULL | | ... mysql> show index from product; +----------+------------+---------------------+--------------+---------------| | Table | Non_unique | Key_name | Seq_in_index | Column_name | +----------+------------+---------------------+--------------+---------------+ | product | 0 | PRIMARY | 1 | id | | product | 1 | categorystock | 1 | category | | product | 1 | categorystock | 2 | stock | ...
  • 28. Databases – when to use / not to use Good at : Fetching data Storing data Searching through data Bad at : select `name` from `room` where ceiling(`avgNoOfPeople`) = 8 → full table scan → creates temporary table select `name` from `room` where avgNoOfPeople >= 7 and avgNoOfPeople <= 8 → Avoid functions that run across every row → Avoid functions in where statement
  • 29. For / foreach (N+1 problem) $customers = CustomerQuery::create() ->filterByState('MN') ->find(); foreach ($customers as $customer) { $contacts = ContactsQuery::create() ->filterByCustomerid($customer->getId()) ->find(); foreach ($contacts as $contact) { doSomeStuffWith($contact); } }
  • 30. Joins $contacts = mysql_query(" select contact.* from customer join contact on contact.customerid = customer.id where state = 'MN' "); while ($contact = mysql_fetch_array($contacts)) { doSomeStuffWith($contact); } (or the ORM equivalent)
  • 31. Better... 10001 → 1 query Sadly : people still produce code with query loops Usually : Growth not anticipated Internal app → Public app
  • 32. The origins of this talk Customers : Projects we built Projects we didn't build, but got pulled into Fixes Changes Infrastructure migration 15 years of 'how to cause mayhem with a few lines of code'
  • 33. Client X Jobs search site Monitor job views : Daily hits Weekly hits Monthly hits Which user saw which job
  • 34. Client X Originally : when user viewed job details Now : when job is in search result Search for 'php' → 50 jobs = 50 jobs to be updated → 50 updates for shown_today → 50 updates for shown_week → 50 updates for shown_month → 50 inserts for shown_user = 200 queries for 1 search !
  • 35. Client X : the code foreach ($jobs as $job) { $db->query(" insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_user( jobId, userId, `when` ) values ( " . $job['id'] . ", " . $user['id'] . ", now() ) "); }
  • 36. Client X : the graph
  • 37. Client X : the numbers 600-1000 inserts/sec (peaks up to 1600) 400-1000 updates/sec (peaks up to 2600) 16 core machine
  • 38. Client X : panic ! Mail : "MySQL slave is more than 5 minutes behind master" We set it up → who did they blame ? Wait a second !
  • 39. Client X : what's causing those peaks ?
  • 40. Client X : possible cause ? Code changes ? → According to developers : none Action : turn on general log, analyze with pt-query-digest → 50+-fold increase in 4 queries → Developers : 'Oops we did make a change' After 3 days : 2,5 days behind Every hour : 50 min extra lag
  • 41. Client X : But why is the slave lagging ? Master : Binlog dump thread, File : master-bin-xxxx.log → Slave : Slave I/O thread, File : master-bin-xxxx.log, Slave SQL thread
  • 42. Client X : Master
  • 43. Client X : Slave
  • 44. Client X : fix ? foreach ($jobs as $job) { $db->query(" insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_user( jobId, userId, `when` ) values ( " . $job['id'] . ", " . $user['id'] . ", now() ) "); }
  • 45. Client X : the code change insert into shown_today values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ; insert into shown_week values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ; insert into shown_month values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ; insert into shown_user values (5, 23, "2015-10-12 12:01:00"), (8, 23, "2015-10-12 12:01:00"), … ;
  • 46. Client X : the code change $todayQuery = " insert into shown_today( jobId, number ) values "; foreach ($jobs as $job) { $todayQuery .= "(" . $job['id'] . ", 1),"; } $todayQuery = substr($todayQuery, 0, strlen($todayQuery) - 1); $todayQuery .= " on duplicate key update number = number + 1 "; $db->query($todayQuery);
  • 47. Client X : the chosen solution $db->autocommit(false); foreach ($jobs as $job) { $db->query(" insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_user( jobId, userId, `when` ) values ( " . $job['id'] . ", " . $user['id'] . ", now() ) "); } $db->commit();
  • 48. Client X : conclusion For loops are bad (we already knew that) Add master/slave and it gets much worse Use transactions : they provide a huge performance increase Better yet : use MariaDB 10 or higher → slave_parallel_threads Result : slave caught up 5 days later
  • 49. Database → Network Customer Y Top 10 site in Belgium Growing rapidly At peak traffic : Inexplicable latency on database Load on webservers : minimal Load on database servers : acceptable
  • 50. Client Y : the network
  • 51. Client Y : the network 60GB 700GB 700GB
  • 52. Client Y : network overload Cause : Drupal hooks → retrieving data that was not needed Only load data you actually need Don't know at the start ? → Use lazy loading Caching : Same story Memcached/Redis are fast But : data still needs to cross the network
  • 53. Network trouble : more than just traffic Customer Z 150.000 visits/day News ticker : XML feed from other site (owned by same customer) Cached for 15 min
  • 54. Customer Z – fetching the feed if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { unlink(APP_DIR . '/tmp/cacheFile.xml'); file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents('http://www.scrambledsitename.be/xml/feed.xml') ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml'); What's wrong with this code ?
  • 55. Customer Z – no feed without the source Feed source
  • 56. Customer Z – no feed without the source Feed source
  • 57. Customer Z : timeout default_socket_timeout : 60 sec by default Each visitor : 60 sec wait time People keep hitting refresh → more load More active connections → more load Apache hits maximum connections → entire site down
  • 58. Customer Z – fetching the feed if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { unlink(APP_DIR . '/tmp/cacheFile.xml'); file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents('http://www.scrambledsitename.be/xml/feed.xml') ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 59. Customer Z : timeout fix $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { unlink(APP_DIR . '/tmp/cacheFile.xml'); file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents( 'http://www.scrambledsitename.be/xml/feed.xml', false, $context ) ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 60. Customer Z : don't delete from cache $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { unlink(APP_DIR . '/tmp/cacheFile.xml'); file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents( 'http://www.scrambledsitename.be/xml/feed.xml', false, $context ) ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 61. Customer Z : don't delete from cache $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents( 'http://www.scrambledsitename.be/xml/feed.xml', false, $context ) ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 62. Customer Z : don't delete from cache $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { $feed = file_get_contents( 'http://www.scrambledsitename.be/xml/feed.xml', false, $context ); if ($feed !== false) { file_put_contents( APP_DIR . '/tmp/cacheFile.xml', $feed ); } } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 63. Customer Z : process early $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { $feed = file_get_contents( 'http://www.scrambledsitename.be/xml/feed.xml', false, $context ); if ($feed !== false) { file_put_contents( APP_DIR . '/tmp/cacheFile.xml', ParseXmlFeed($feed) ); } }
  • 64. Customer Z : file_[get|put]_contents atomicity if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { $feed = file_get_contents( 'http://www.scrambledsitename.be/xml/feed.xml', false, $context ); if ($feed !== false) { file_put_contents( APP_DIR . '/tmp/cacheFile.xml', ParseXmlFeed($feed) ); } } Relying on user → concurrent requests → possible data corruption Better : run every 15min through cronjob
  • 65. Network resources Use timeouts for all : fopen curl SOAP … Data source trusted ? → setup a webservice → let them push updates when their feed changes → less load on data source → no timeout issues Add logging → early detection
  • 66. Logging Logging = good Logging in PHP using fopen → bad idea : locking issues → Use monolog : file, syslog, mail, Pushover, HipChat, Graylog, Rollbar, ElasticSearch (and 50 more) For Firefox : FirePHP (add-on for Firebug) Debug logging = bad on production Watch your logs ! Don't log on slow disks → I/O bottlenecks
  • 67. File system : I/O bottlenecks Causes : Excessive writes (database updates, logfiles, swapping, …) Excessive reads (non-indexed database queries, swapping, small file system cache, …) How to detect ? top iostat See iowait ? Stop worrying about php, fix the I/O problem ! Cpu(s): 0.2%us, 3.0%sy, 0.0%ni, 61.4%id, 35.5%wa, 0.0%hi avg-cpu: %user %nice %system %iowait %steal %idle 0.10 0.00 0.96 53.70 0.00 45.24 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 120.40 0.00 123289.60 0 616448 sdb 2.10 0.00 4378.10 0 18215 dm-0 4.20 0.00 36.80 0 184 dm-1 0.00 0.00 0.00 0 0
  • 68. File system Worst of all : NFS PHP files → lstat calls Templates → same Sessions → locking issues → corrupt data → store sessions in database, Memcached, Redis, ...
  • 69. Step-by-step : most common issues Using NFS ? Get rid of it ;-) iowait on database server I/O reads (use iostat) → missing/wrong indexes → too many queries → select * I/O writes → no transactions → too many queries → bad DB engine settings iowait on webserver (logs ? static files ?) CPU on database server (missing/wrong/too many indexes) CPU on webserver (PHP)
  • 70. Much more than code DB server Webserver User Network XML feed
  • 71. Look beyond PHP (or Perl, Ruby, Python, ...) !
  • 74. Contact Twitter @wimgtr Slides http://www.slideshare.net/wimg E-mail [email protected] Thanks ! Please provide feedback through Joind.in : https://joind.in/15535
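
The grouped inserts from slides 45 and 46 can be sketched as follows. This is a minimal illustration, not the code used for Client X: the function name `buildCounterUpserts`, the whitelist of table names and the default chunk size of 500 are assumptions made for the example. It builds the statements with placeholders instead of concatenating ids into the SQL string, and splits long id lists into chunks so that a single statement is less likely to exceed MySQL's `max_allowed_packet`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not from the talk): builds one multi-row
 * "insert ... on duplicate key update" statement per chunk of job ids.
 * The ids are returned separately so they can be bound through a
 * prepared statement (PDO or mysqli) instead of being concatenated.
 *
 * @param string $table  counter table, e.g. shown_today
 * @param int[]  $jobIds ids of the jobs shown in the search result
 * @param int    $chunk  assumed maximum number of rows per statement
 * @return array<int, array{sql: string, params: int[]}>
 */
function buildCounterUpserts(string $table, array $jobIds, int $chunk = 500): array
{
    // Table names cannot be bound as parameters, so only accept known ones.
    $allowed = ['shown_today', 'shown_week', 'shown_month'];
    if (!in_array($table, $allowed, true)) {
        throw new InvalidArgumentException("Unknown table: $table");
    }
    if ($chunk < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1');
    }

    $statements = [];
    foreach (array_chunk(array_values($jobIds), $chunk) as $ids) {
        // One "(?, 1)" group per job id in this chunk.
        $rows = implode(', ', array_fill(0, count($ids), '(?, 1)'));
        $statements[] = [
            'sql' => "insert into $table (jobId, number) values $rows"
                . ' on duplicate key update number = number + 1',
            'params' => array_map('intval', $ids),
        ];
    }

    return $statements;
}

// Usage sketch: 4 jobs with at most 3 rows per statement gives 2 statements.
foreach (buildCounterUpserts('shown_today', [5, 8, 12, 18], 3) as $statement) {
    echo $statement['sql'], PHP_EOL;
}
```

Each returned entry can be executed inside the single transaction of slide 47, which combines both approaches.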

Editor's Notes

  • #15: 5kbit/sec or 100Mbit/sec ?
  • #17: Let's talk about code Without : we don't exist What are most common mistakes in ecosystem Let's start with the database
  • #22: time spent per query pattern how many queries of that query pattern
  • #30: Get back to what I said Lots of people use ORM - easier - don't need to write queries - object-oriented but people start doing this Imagine 10000 customers → 10001 queries
  • #31: Not best code Uses deprecated mysql extension no error handling
  • #42: Master : 16 CPU cores 12 cores for SQL 1 core for binlog dump rest for system Slave : 16 CPU cores 1 core for slave I/O 1 core for slave SQL
  • #46-#47: Grouping Works fine, but : maximum size of string ? PHP = no limit MySQL = max_allowed_packet
  • #48: All in a single commit Note : transaction has max. size Possible : combination with previous solution
  • #51: took few moments to figure out No network monitoring → iptraf → 100Mbit/sec limit → packets dropped → connections dropped Customer : upgrade switch Us : why 100Mbit/sec ?
  • #53: Databases → network What other network related issues ?
  • #57: Server on which feed located : crashed Fine for few minutes (cache) 15 minutes : file_get_contents uses default_socket_timeout
  • #60-#65: Better, not perfect. What else is wrong ? Multiple visitors hit expiring cache → file delete → xml feed hit a lot
  • #71: How do you treat your data : - where do you get it - how long did you have to wait to get it - how is it transported - how is it processed minimize the amount of data : retrieved transported processed, sent to db and users
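
The notes on the Customer Z cache can be illustrated with a small sketch of an atomic refresh. It is an assumption-laden example, not the code from the talk: the function name `refreshCache` and the injectable `$fetch` callable are made up for illustration, it checks `filemtime` rather than the `filectime` used on the slides, and it assumes the temporary file lands on the same filesystem as the cache file so that `rename()` replaces the target in one step on POSIX systems. Readers then see either the old file or the new one, never a missing or half-written file.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not from the talk): refreshes a cache file without
 * deleting it first and without exposing a partially written file.
 *
 * @param string   $cacheFile path of the cache file
 * @param callable $fetch     returns the new contents as a string, or false on failure
 * @param int      $maxAge    seconds before the cache counts as stale
 * @return bool true when the cache file was replaced
 */
function refreshCache(string $cacheFile, callable $fetch, int $maxAge = 900): bool
{
    clearstatcache(true, $cacheFile);

    // Still fresh: nothing to do.
    if (is_file($cacheFile) && filemtime($cacheFile) >= time() - $maxAge) {
        return false;
    }

    $data = $fetch();
    if ($data === false) {
        // Source is down or timed out: keep serving the stale copy.
        return false;
    }

    // Write next to the cache file, then rename over it.
    $tmp = tempnam(dirname($cacheFile), 'feed');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $data) === false) {
        unlink($tmp);
        return false;
    }

    return rename($tmp, $cacheFile);
}

// Usage sketch with the 5 second timeout from the slides.
// The URL is the scrambled placeholder used in the talk.
$fetchFeed = function () {
    $context = stream_context_create(['http' => ['timeout' => 5]]);
    return @file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context);
};
// refreshCache(APP_DIR . '/tmp/cacheFile.xml', $fetchFeed);
```

This does not stop several concurrent requests from fetching the feed at the same moment when the cache expires. Running the refresh from a cronjob, as slide 64 suggests, avoids that because only one process ever calls it.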