High Performance Mobile Web
Patrick Meenan
@PatMeenan
pmeenan@webpagetest.org
Slides

www.slideshare.net/patrickmeenan
Schedule
10 - 10:30 - Delivering a fast mobile web experience
10:30 - 11 - Break + Live Website testing/analysis (network permitting)
11 - 11:20 - Measuring Web Performance
11:20 - 11:30 - Q&A
Desktop
Median: ~2.7s
Mean: ~6.9s
Mobile *
Median: ~4.8s
Mean: ~10.2s

* optimistic

How Fast Are Websites Around The World? - Google Analytics Blog
Fiber-to-the-home services provided 18 ms round-trip latency on average, while cable-based
services averaged 26 ms, and DSL-based services averaged 43 ms. This compares to 2011 figures of
17 ms for fiber, 28 ms for cable and 44 ms for DSL.

Measuring Broadband America - July 2012 - FCC
"Users of the Sprint 4G network can expect to experience average speeds of 3
Mbps to 6 Mbps download and up to 1.5 Mbps upload with an average latency of
150 ms. On the Sprint 3G network, users can expect to experience average speeds
of 600 Kbps - 1.4 Mbps download and 350 Kbps - 500 Kbps upload with an
average latency of 400 ms."

          3G             4G
Sprint    400 ms         150 ms
AT&T      150 - 400 ms   100 - 200 ms

AT&T
● There is a one time cost for control-plane negotiation
● User-plane latency is the one-way latency between packet availability in the device and packet at the base station

                             LTE        HSPA+      3G
Idle to connected latency    < 100 ms   < 100 ms   < 2.5 s
User-plane one-way latency   < 5 ms     < 10 ms    < 50 ms
LTE power state transitions (AT&T)
● Idle to Active: 260 ms control-plane latency
● Dormant to Active: <50 ms control-plane latency (spec)
● Timeout driven state transitions back to idle
  ○ 100 ms > Dormant
  ○ 10 s > Idle
● Similar state machine for 3G devices
  ○ Except Control Plane latencies are much higher (1-2 seconds)
https://github.com/attdevsupport/ARO/blob/master/ARODataAnalyzer/src/lte.conf
Waterfall Basics
Waterfall Components
                 3G (200 ms RTT)   4G (80 ms RTT)
Control plane    (200-2500 ms)     (50-100 ms)
DNS lookup       200 ms            80 ms
TCP Connection   200 ms            80 ms
TLS handshake    (200-400 ms)      (80-160 ms)
HTTP request     200 ms            80 ms
Total            600-3500 ms       240-500 ms

Network overhead of one HTTP request!
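
The totals for one request can be reproduced by summing the per-step costs. The sketch below is only an illustration: the function and parameter names are made up for this example, and it assumes that the parenthesised rows (control plane, TLS) are optional and that a TLS handshake costs at most two round trips, which is what the 200-400 ms and 80-160 ms ranges imply.

```javascript
// Sum the per-step network costs of a single HTTP request.
// DNS lookup, TCP connection and the HTTP request each cost one round trip.
// The control-plane and TLS costs are treated as optional (assumption), so
// they only contribute to the upper bound.
function requestOverheadMs({ rttMs, controlPlaneMaxMs, tlsRoundTripsMax = 2 }) {
  const required = 3 * rttMs; // DNS + TCP + HTTP
  const optionalMax = controlPlaneMaxMs + tlsRoundTripsMax * rttMs;
  return { min: required, max: required + optionalMax };
}

const overhead3g = requestOverheadMs({ rttMs: 200, controlPlaneMaxMs: 2500 });
const overhead4g = requestOverheadMs({ rttMs: 80, controlPlaneMaxMs: 100 });
console.log(overhead3g, overhead4g);
```

With the RTT and control-plane figures listed above, this gives 600-3500 ms for 3G and 240-500 ms for 4G.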
Typical Mobile Network Performance
Country       Average RTT   Average Downlink Throughput   Average Uplink Throughput
South Korea   278 ms        1.8 Mbps                      723 Kbps
Vietnam       305 ms        1.9 Mbps                      543 Kbps
US            344 ms        1.6 Mbps                      658 Kbps
UK            372 ms        1.4 Mbps                      782 Kbps
Russia        518 ms        1.1 Mbps                      439 Kbps
India         654 ms        1.2 Mbps                      633 Kbps
Nigeria       892 ms        541 Kbps                      298 Kbps

Compare to typical desktop and WiFi performance: < 50 ms RTT, 5 Mbps throughput in the US
Source: Ookla/Speedtest.net
http://www.belshe.com/2010/05/24/more-bandwidth-doesnt-matter-much/
3G
WiFi/LTE
Smaller resources

Fewer resources

The Performance Golden Rule

* http://www.stevesouders.com/blog/2012/02/10/the-performance-golden-rule/
Torbit Insight Real-User Data

http://torbit.com/blog/2012/09/19/some-interesting-performance-statistics/
Front-End: 3.2s*

Back-End: 0.010s*

*WebPagetest performance as measured by New Relic
Back-End

Front-End
Content Type   Avg # of Requests   Avg size
HTML           6                   39 kB
Images         39                  490 kB
Javascript     10                  142 kB
CSS            3                   27 kB
HTTP Archive - Mobile Trends (Feb, 2013)
                             11 Requests, 300KB   784 Requests, 9,200KB
WiFi - Cable (5Mbps, 28ms)   1.5s                 28s
3G - Fast (1.6Mbps, 150ms)   2.8s                 52s
High Performance Websites
1. Make fewer HTTP requests
2. Use CDN
3. Add expires header
4. Gzip Components
5. Put stylesheets at the top
6. Put scripts at the bottom
7. Avoid CSS expressions
8. Make JS and CSS external
9. Reduce DNS lookups
10. Minify JS
11. Avoid redirects
12. Remove duplicate scripts
13. Configure Etags
14. Make Ajax cacheable
15. Sharding domains
basically…..
1 - Deliver smaller resources
2 - Send fewer resources
TCP Initial Congestion Window
This:

Is Really:

Linux 2.6.39+ (IW 10):
TCP Initial Congestion Window

Upgraded from Ubuntu 10.04 (2.6.32) to 12.04 (3.2)

Base Page Download Time
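
To see why a larger initial congestion window shortens the base page download, it helps to count round trips. The following is a rough sketch under stated assumptions, not a model of any real TCP stack: it assumes 1460-byte segments, a window that doubles every round trip, no packet loss and no receive-window limit. The initial window of 3 segments for the older kernel is also an assumption; the slides only state IW 10 for Linux 2.6.39+.

```javascript
// Count the round trips needed to deliver a response under idealized slow start.
// Assumptions: fixed segment size, window doubles each round trip, no loss.
function roundTripsToDeliver(bytes, initialWindowSegments, mssBytes = 1460) {
  if (initialWindowSegments < 1) {
    throw new RangeError('initialWindowSegments must be at least 1');
  }
  let remainingSegments = Math.ceil(bytes / mssBytes);
  let windowSegments = initialWindowSegments;
  let roundTrips = 0;
  while (remainingSegments > 0) {
    remainingSegments -= windowSegments; // send one full window
    windowSegments *= 2;                 // slow start doubles the window
    roundTrips += 1;
  }
  return roundTrips;
}

// A 39 kB HTML response (39,000 bytes here), with and without IW 10.
const tripsOldWindow = roundTripsToDeliver(39000, 3);
const tripsNewWindow = roundTripsToDeliver(39000, 10);
console.log(tripsOldWindow, tripsNewWindow);
```

On a high-latency mobile link each round trip saved is worth a full RTT, so the difference matters more there than on a desktop connection.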
WebP
● 40% smaller than JPEG for equivalent quality
● Supports alpha + lossy
● New lossless support (26% smaller than PNG in testing)
● Supported by Chrome and Opera
Browser Prefetcher
External JS Script Resources

Poor Bandwidth Utilization
Browser Prefetcher
External JS Script Resources

Improved Bandwidth Utilization
Watch out for Hidden Images (slideshows in particular)

Main gallery image competing with hidden images and background for bandwidth
4 Shards
2 Shards
● 50-80ms faster page load times for image heavy pages (e.g. search)
  ○ 30-50ms faster overall.
● Up to 500ms faster load times on mobile.
● 0.27% increase in pages per visit.

http://calendar.perfplanet.com/2013/reducing-domain-sharding/
Sync scripts block the parser...
Sync script will block the rendering of your page:
<script type="text/javascript"
src="https://apis.google.com/js/plusone.js"></script>

Async script will not block the rendering of your page:
<script type="text/javascript">
(function() {
var po = document.createElement('script'); po.type = 'text/javascript';
po.async = true; po.src = 'https://apis.google.com/js/plusone.js';
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(po, s);
})();
</script>
Performance rules to keep in mind...
(1) JavaScript can block the DOM construction
(2) JavaScript can block on CSS
(3) Rendering is blocked on CSS...

Which means...
(1) Get CSS down to the client as fast as you can
  ○ Unblocks paints, removes potential JS waiting on CSS scenario
(2) If you can, use async scripts + avoid document.write at all costs
  ○ Faster DOM construction, faster DCL (DOMContentLoaded) and paint!
  ○ Do you need scripts in your critical rendering path?
Less is more

Parsing JS can take 1ms per KB (uncompressed)
Lazy Image Loaders
● Hide images from preloader (good)
● Reduce data use (good)
● Trigger resource loads at arbitrary times (bad)
  ○ May wake up radio and require 2-3s delay
● Balance bandwidth with experience and battery
  ○ Maybe load all delayed images after onload
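
One way to act on the last suggestion is to start every deferred image in a single batch once the page has loaded, so the radio is woken once rather than at arbitrary times. This is a minimal sketch, not a complete lazy loader: the `data-src` attribute and the function name are hypothetical conventions chosen for the example.

```javascript
// Start loading every deferred image by moving its URL from a
// (hypothetical) data-src attribute into src. Returns how many were started.
function loadDeferredImages(images) {
  let started = 0;
  for (const img of images) {
    const deferredUrl = img.getAttribute('data-src');
    if (deferredUrl) {
      img.setAttribute('src', deferredUrl);
      img.removeAttribute('data-src');
      started += 1;
    }
  }
  return started;
}

// Browser wiring: load everything in one batch after onload.
// Skipped when no DOM is available (for example under Node).
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  window.addEventListener('load', () => {
    loadDeferredImages(document.querySelectorAll('img[data-src]'));
  });
}
```

The markup side would be an `<img data-src="photo.jpg">` with no `src`, which also keeps the image hidden from the preloader.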
brown.edu's mobile home page
125 KB, 1800x800 background image
Break
Measuring Web Performance
Navigation Timing (W3C)
IE 9+
Chrome
Firefox 9+
Android 4+

Front-End

Server
User’s Connectivity

Navigation Timing spec
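
Navigation Timing makes it possible to split a real page load into back-end and front-end time. The sketch below shows one common way to draw that line, with time to first byte counted as back-end and everything after it as front-end. The function name is made up for this example; the attribute names (`navigationStart`, `responseStart`, `loadEventEnd`) are those of the Navigation Timing interface.

```javascript
// Split a page load into back-end and front-end time (milliseconds).
// Expects an object shaped like window.performance.timing.
function splitPageLoad(timing) {
  const backEndMs = timing.responseStart - timing.navigationStart;
  const frontEndMs = timing.loadEventEnd - timing.responseStart;
  return { backEndMs, frontEndMs, totalMs: backEndMs + frontEndMs };
}

// In a browser, call splitPageLoad(window.performance.timing) after the
// load event has finished; loadEventEnd is 0 until then.
const sampleSplit = splitPageLoad({
  navigationStart: 1000,
  responseStart: 1400,
  loadEventEnd: 4600,
});
console.log(sampleSplit);
```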
When is “Done”?
(old) Twitter onLoad (1.9s)

(old) Twitter end of activity (6.8s)
0s

7s

9s

53 s

http://www.webpagetest.org/video/compare.php?tests=131209_GX_W8Z-r:1-c:0
Speed Index: 8592
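
Speed Index is the area above the visual-progress curve: the longer the page stays visually incomplete, the higher the number, in milliseconds. The sketch below approximates it from filmstrip-style samples. It is an illustration only; the sample format and function name are assumptions, and visual completeness is treated as constant between frames.

```javascript
// Approximate Speed Index from visual-progress samples.
// Each frame: { timeMs, visuallyCompletePct }, sorted by time.
// Between two frames the page is assumed to stay at the earlier
// frame's completeness.
function speedIndex(frames) {
  let area = 0;
  for (let i = 1; i < frames.length; i++) {
    const intervalMs = frames[i].timeMs - frames[i - 1].timeMs;
    const incompleteFraction = 1 - frames[i - 1].visuallyCompletePct / 100;
    area += intervalMs * incompleteFraction;
  }
  return area;
}

const sampleSpeedIndex = speedIndex([
  { timeMs: 0, visuallyCompletePct: 0 },
  { timeMs: 1000, visuallyCompletePct: 50 },
  { timeMs: 2000, visuallyCompletePct: 100 },
]);
console.log(sampleSpeedIndex);
```

A page that paints most of its content early scores lower than one that paints everything at the end, even if both finish at the same time.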

onload="performance.mark('aft.Image Loaded')"...

window.onload:
● performance.getEntriesByType("mark")
● Report latest aft.* as custom time

http://blog.patrickmeenan.com/2013/07/measuring-performance-of-user-experience.html
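
The "report latest aft.*" step could look roughly like the sketch below. The function name is made up for this example; it takes the array returned by `performance.getEntriesByType("mark")` and relies only on the `name` and `startTime` fields of each entry.

```javascript
// Pick the most recent User Timing mark whose name starts with "aft.".
// Returns null when no such mark exists.
function latestAftMark(marks) {
  let latest = null;
  for (const mark of marks) {
    const isAft = mark.name.startsWith('aft.');
    if (isAft && (latest === null || mark.startTime > latest.startTime)) {
      latest = mark;
    }
  }
  return latest;
}

// In a browser: latestAftMark(performance.getEntriesByType('mark'))
const sampleMark = latestAftMark([
  { name: 'aft.Logo Loaded', startTime: 410 },
  { name: 'aft.Image Loaded', startTime: 932 },
  { name: 'other.Widget Ready', startTime: 1500 },
]);
console.log(sampleMark);
```

The `startTime` of the returned mark is the value to report as the custom time.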
Roll Your Own

● Send http beacon to beacon server
○ All timings as query params
○ 204 or transparent 1px gif
○ Log requests to access log

● IP, User Agent and Timings all in each record
● access log -> logster -> statsd -> graphite
● Profit!
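
The client side of such a beacon can be very small. The sketch below only builds the URL with all timings as query parameters; the beacon host, path and parameter names are placeholders, not part of the original setup.

```javascript
// Build a beacon URL carrying every timing as a query parameter.
// Values are rounded to whole milliseconds to keep log lines short.
function buildBeaconUrl(beaconBase, timings) {
  const params = new URLSearchParams();
  for (const [name, valueMs] of Object.entries(timings)) {
    params.append(name, String(Math.round(valueMs)));
  }
  return `${beaconBase}?${params.toString()}`;
}

// Placeholder beacon endpoint; replace with your own beacon server.
const beaconUrl = buildBeaconUrl('https://beacon.example.com/b.gif', {
  aft: 932,
  onload: 2235.4,
});
console.log(beaconUrl);

// In a browser, firing the beacon is then a one-liner:
//   new Image().src = beaconUrl;
```

Because the server answers with a 204 or a 1px gif, the request costs almost nothing, and the access log line already contains the IP, user agent and timings.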
Google Analytics
_gaq.push(['_trackTiming', 'UserTimings', 'aft', measuredTime]);
Boomerang/Soasta mPulse
BOOMR.plugins.RT.setTimer('custom0', measuredTime);

https://gist.github.com/pmeenan/5902672
AFT vs onload

                  AFT     onLoad
Median            0.932   2.235
95th Percentile   4.141   11.787
98th Percentile   7.262   26.72

http://www.soasta.com/products/mpulse/
Custom timings in WebPagetest
Resource Timing
IE 10+
Chrome

Timing for every network-loaded resource

http://www.w3.org/TR/resource-timing/
Resource Timing
window.performance.getEntriesByType("resource");

Cross-origin restrictions.
Granular timing blanked out unless:

Timing-Allow-Origin: *
Timing-Allow-Origin: null
Timing-Allow-Origin: www.example.com
Experiment on webpagetest.org

● Set session cookie to identify new/existing session
● Set browser cookie and local cache value to track cache persistence
● Track performance for sitewide js (site.js)
  ○ From Cache: responseStart == 0 || responseStart == requestStart
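
The cache heuristic can be applied to Resource Timing entries as in the sketch below. The function names are made up for this example, and the check is only as reliable as the heuristic itself: a cross-origin resource served without Timing-Allow-Origin also reports 0 for `responseStart`, so it is best used on same-origin resources such as site.js.

```javascript
// Heuristic from the experiment: treat an entry as served from cache when
// responseStart is 0 or equal to requestStart.
function looksCached(entry) {
  return entry.responseStart === 0 || entry.responseStart === entry.requestStart;
}

// Find the entry for one script by the end of its URL and classify it.
function classifyResource(entries, fileName) {
  const entry = entries.find((e) => e.name.endsWith(fileName));
  if (!entry) return 'not found';
  return looksCached(entry) ? 'cache' : 'network';
}

// In a browser: classifyResource(performance.getEntriesByType('resource'), '/site.js')
const sampleEntries = [
  { name: 'https://www.example.com/site.js', requestStart: 120, responseStart: 120 },
  { name: 'https://www.example.com/hero.jpg', requestStart: 130, responseStart: 410 },
];
console.log(classifyResource(sampleEntries, '/site.js'));
```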
Resource from the Network
Resource from Memory Cache
Resource from Disk Cache
https://plus.google.com/u/0/+IlyaGrigorik/posts/EnoiF9PkeYb
Thank You!
Patrick Meenan
@PatMeenan
pmeenan@webpagetest.org

Mobile web performance - MoDev East