Scaling Nuxeo Applications
Building scalable content apps with Nuxeo

Benoit Delbosc / Thierry Delprat




                                            1
Performance questions ...

 How can I ensure a 0.1 s response time ?

 What CPU should I use to host 1 TB of data ?
 Will 2 servers be enough for 1000 users ?
 Can I run Nuxeo DM with 512 MB of heap ?

 Is DELL PExyz OK to host Nuxeo ?

                                             2
Why is this not that simple ?
●
    Nuxeo EP is an ECM platform
    ●
        there are several ways to use it
         –   hundreds of possible user actions
         –   lots of screens
    ●
        there are several distributions
    ●
        there are several possible configurations
        (security, filing plan, doc types ...)

●
    you cannot size without knowing
    ●
        the hypotheses
    ●
        the constraints
                                                   3
Agenda
●
    Approach for managing performance
●
    Nuxeo Architecture and performance
●
    Performance testing
●
    Performance tuning
●
    Benchmark figures




                                         4
Ensuring good performance
●
    Carefully define your hypotheses
    ➔
        what processing will be needed ?
        ➔
            batch processing vs interactive processing
    ➔
        what data will be processed ?
        ➔
            big image transformation vs simple page rendering
    ➔
        what are the expectations ?
        ➔
            how many concurrent users ?
        ➔
            are users hyper-active ?

    ➔
        be able to define a usage scenario
                                                                5
Ensuring good performance
●
    Define your architecture according to
    ●
        constraints
    ●
        initial benchmark
    ➔
        mono server, multi-servers, cluster …
●
    Organize periodic benchmarks
    ➔
        ideally integrated into the CI chain
●
    Plan some tuning
    ➔
        DB, memory, connections …
    ➔
        and code too
➔
    this is not a one-shot task
                                                6
Impacting factors
●
    Security policies
    ●
        ACL inheritance and custom security policies
●
    Web Layer
    ●
        Stateless vs Stateful
    ●
        JSF traps and screen design
●
    Document types
    ●
        complex document types impact DB schema
●
    Number of active documents
    ●
        number of rows
    ●
        size of indexes vs memory
                                                       7
Impacting factors
●
    Simultaneous access
    ●
        how many concurrent requests/s
    ●
        include batch processing
●
    Application server
    ●
        Tomcat 6 is significantly faster than JBoss 5
●
    DB choice
●
    OS
    ●
        a 32-bit OS is too limiting for JVM memory
    ●
        the JVM seems to run faster under 64-bit Linux
                                                       8
No (or low) impact factors
●
    Total volume of binary files
    ●
        only network and low level storage are impacted
●
    Average number of documents per folder
    ●
        VCS does not have the same limitations as Jackrabbit
●
    Number of parallel sessions
    ●
        can only have an impact in JSF
●
    Documents that are almost never accessed
    ●
        DB caches should do their job

                                                          9
Scaling Nuxeo Applications
Architecture considerations




                              10
Architecture solutions

●
    3 possible axes

    ●
        Simple Clustering


    ●
        Spreading services on multiple JVMs


    ●
        Multiple repositories

                                              11
VCS Cluster
●
    VCS Cluster is simple
    ●
        only one config parameter to turn on
    ●
        does not rely on app-server-level clustering
    ●
        “Nuxeo Boxes” are swappable
●
    VCS Cluster scales well
    ●
        as long as the backend DB scales !
➔
    VCS Cluster is a good solution for both
    ➔
        scaling out
    ➔
        providing redundancy
                                                  12
1-node deployment

[Diagram: a single Nuxeo instance, a DB and a file system (FS)]




                           13
2-node deployment

[Diagram: an NLB, two Nuxeo instances, a DB and a NAS]




                            14
3-node deployment

[Diagram: an NLB, three Nuxeo instances, a DB and a NAS]




                            15
Multi VM deployment
●
    Nuxeo Services can be spread across JVMs
    ●
        externalize batch processing (mass I/O)
    ●
        externalize heavy transformations
    ●
        externalize slow interactions with external applications
    ●
        …
    ➔
        build dedicated processing servers

●
    Nuxeo services can be coupled with a GRID
    ●
        integration with GridGain
                                                         16
Multi VM : mono node


[Diagram: batch processing and a Nuxeo instance on one node, with a DB and a file system (FS)]




                           17
Multi VM : 2 nodes
[Diagram: a batch Nuxeo instance and a Nuxeo instance on separate nodes, with a DB and a file system (FS)]




                                       18
Multi VM : GRID
[Diagram: a GRID of three batch Nuxeo instances alongside a Nuxeo instance, with a DB and a file system (FS)]




                                            19
Multi VM : perspectives
●
    Technically each service could be accessed
    remotely via RMI (on JEE server)
    ●
        rendering layer
    ●
        workflow engine
    ●
        …
●
    But it's a little bit more complex
    ●
        Tx management
    ●
        binding configuration
➔
    In most cases VCS Cluster is much simpler
                                                   20
Multi-Repositories
●
    A single Nuxeo Application can be bound to
    several repositories
●
    Each repository
    ●
        has its own database and cache
    ➔
        scale out solution if DB is the bottleneck
➔
    Useful for data partitioning
    ➔
        manage documents with different constraints
        (e.g. live documents and archives)
    ➔
        manage documents for different user groups
        (e.g. multi-tenant)
                                                      21
Multi-Repositories


[Diagram: one Nuxeo server bound to Repo 1 (Database 1, FS 1, live documents) and Repo 2 (Database 2, FS 2, archived documents)]




                                               22
Scaling Nuxeo Applications
Performance testing




                             23
Performance testing overview




                               24
Set up a realistic environment
●
    Replicate the production environment as closely as
    possible
●
    Use historical data when possible
●
    Populate the database
    ●
        Custom mass importer tools
        (nuxeo-platform-importer)
    ●
        Load generating tools
    ●
        Feed the database at the SQL level

                                                   25
Monitoring
●
    Performance testing without monitoring: don't bother
●
    Collecting general information:
    ●
        Hardware metrics: type and number of CPUs, memory, disk
        usage
    ●
        Software versions: OS, JVM, middleware, application
    ●
        Application configuration: nuxeo.conf
    ●
        Database configuration and database statistics




                                                                 26
Monitoring CPU
                 ●
                     Is the CPU a bottleneck ?
                 ●
                     Is the CPU waiting for IO ?
                 ●
                     Does the system use all
                     the available CPU ?




                                                   27
Monitoring Disk
                  ●
                      Is there a device
                      saturation ?
                  ●
                      Writing or reading
                      operations ?




                                           28
Monitoring GC
                ●
                    Does the JVM spend too
                    much time in the garbage
                    collector ?




                                               29
Monitoring web request processor
                     ●
                         How many requests ?
                     ●
                         How much input/output ?




                                                   30
Monitoring web thread pool
                 ●
                     Is the thread pool a
                     bottleneck ?




                                                   31
Monitoring datasources
               ●
                   Are there enough
                   connections in the pool ?




                                              32
Monitoring JVM
                 ●
                     How many JVM threads ?
                 ●
                     After a longevity test, is
                     there a memory leak ?




                                                     33
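One way to approach the memory-leak question after a longevity test is to look at the trend of heap usage measured right after full collections. The sketch below fits a least-squares line through such samples and reports the growth in MB per hour. The function name heap_growth_per_hour and the sample values are made up for illustration, and a positive slope is only a hint that deserves a heap dump, not proof of a leak.

```python
def heap_growth_per_hour(samples):
    """Least-squares slope of heap usage over time, in MB per hour.

    samples: list of (seconds since start, heap used after a full GC in MB).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    covariance = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("samples must span more than one point in time")
    return covariance / variance * 3600


# Hypothetical readings taken once per hour during a longevity run.
readings = [(0, 500), (3600, 520), (7200, 540), (10800, 560)]
print(f"{heap_growth_per_hour(readings):.1f} MB/hour")
```

Sampling only after full collections matters: heap usage between collections rises and falls constantly and would hide the trend.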
Monitoring database
               ●
                   How much time is spent
                   processing SQL ?
               ●
                   Which query took up the
                   most time ?




                                             34
Set up the monitoring
●
    Use your production monitoring
    (nagios, hyperic, sysstat …)
●
    Set up GC logging in the nuxeo.conf file:
      JAVA_OPTS=$JAVA_OPTS -Xloggc:$DIRNAME/../log/gc.log -verbose:gc -XX:+PrintGC

●
    Enable the monitor JBoss template in the
    nuxeo.conf file:
      nuxeo.templates=postgresql,monitor

●
    Use monitorctl.sh
      ./jboss/bin/monitorctl.sh
      Usage: monitorctl.sh (start|stop|status|heapdump [TAG]|info|vacuumdb|help)


                                                                                     35
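The GC log configured above can help answer whether the JVM spends too much time in the garbage collector. The sketch below is a minimal parser, assuming the classic HotSpot line format of that JVM generation ("timestamp: [GC before->after(total), pause secs]"); the sample lines are invented, gc_overhead is a hypothetical helper name, and lines in any other format (for example with extra collector details) are simply skipped.

```python
import re

# Assumed line shape: "9.500: [Full GC 20000K->4000K(32000K), 0.4700000 secs]"
GC_LINE = re.compile(
    r"^(?P<ts>\d+\.\d+): \[(?P<kind>Full GC|GC) "
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<total>\d+)K\), "
    r"(?P<pause>\d+\.\d+) secs\]"
)


def gc_overhead(lines):
    """Return (fraction of elapsed time spent in GC pauses, number of full GCs)."""
    events = [m for m in (GC_LINE.match(line) for line in lines) if m]
    if not events:
        return 0.0, 0
    # Timestamps are seconds since JVM start; the run ends after the last pause.
    elapsed = float(events[-1].group("ts")) + float(events[-1].group("pause"))
    paused = sum(float(m.group("pause")) for m in events)
    full_gcs = sum(1 for m in events if m.group("kind") == "Full GC")
    return paused / elapsed, full_gcs


# Invented sample lines, for illustration only.
sample = [
    "1.000: [GC 8000K->2000K(32000K), 0.0100000 secs]",
    "5.000: [GC 9000K->3000K(32000K), 0.0200000 secs]",
    "9.500: [Full GC 20000K->4000K(32000K), 0.4700000 secs]",
]
ratio, full_gcs = gc_overhead(sample)
print(f"GC overhead: {ratio:.1%}, full GCs: {full_gcs}")
```

For real logs, a dedicated viewer such as gcviewer is likely a better choice; the sketch only shows what the overhead figure means.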
Load generating tools
●
    The application has to cooperate a bit to ease
    test writing
●
    Tools
    ●
        In-house tools
    ●
        Vendor tools
    ●
        Open source tools
        –   JMeter (GUI script, Java)
        –   FunkLoad (Python)



                                                     36
Writing test scripts
 Try to simulate the expected user actions




                                             37
Performance report
 ●
     Speed (response time)
 ●
     Throughput (requests/s)
 ●
     User satisfaction (Apdex)
     http://www.apdex.org/




                                 38
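The Apdex score mentioned above condenses response times into a single number between 0 and 1. A minimal sketch of the standard formula follows: samples at or below a target threshold T count as satisfied, samples between T and 4T count as tolerating and are worth half, and slower samples count as frustrated. The sample response times and the 0.5 s threshold are made up for illustration.

```python
def apdex(response_times, threshold):
    """Apdex = (satisfied + tolerating / 2) / total number of samples."""
    if not response_times:
        raise ValueError("no samples")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical response times in seconds, scored against T = 0.5 s.
samples = [0.2, 0.4, 0.6, 0.9, 1.8, 2.5]
print(round(apdex(samples, threshold=0.5), 2))  # prints 0.58
```

Here two samples are satisfied, three are tolerating and one is frustrated, giving (2 + 1.5) / 6.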
Monitoring report
●
    All in one: logchart.py
    http://svn.nuxeo.org/nuxeo/tools/qa/logchart/trunk

●
    Other tools:
       kSar, pgfouine, gcviewer ...




                                                         39
Scaling Nuxeo Applications
Tuning




                             40
Tuning process




                 41
JBoss/Tomcat tuning
●
    JVM, heap size (nuxeo.conf)
    JAVA_OPTS=$JAVA_OPTS -Xms2g -Xmx2g

●
    In special cases you may turn off soft reference retention
    JAVA_OPTS=$JAVA_OPTS -XX:SoftRefLRUPolicyMSPerMB=0

●
    Datasource connection pool (nuxeo.conf)
    nuxeo.db.max-pool-size=40
    nuxeo.vcs.max-pool-size=40

●
    HTTP or AJP Thread pooling
    <Connector port="8080" … maxThreads="32" ... acceptCount="256" .. />



                                                                           42
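A rough way to check whether a thread pool setting such as maxThreads="32" is in the right range is Little's law: the average number of busy threads equals throughput times response time. The sketch below applies it to the 25 req/s and 0.6 s figures reported for a JSF view in the benchmark slide of this deck; whether that benchmark actually ran with 32 threads is an assumption, and the function names are hypothetical.

```python
def busy_threads(throughput_rps, response_time_s):
    """Average number of request threads in use (Little's law)."""
    return throughput_rps * response_time_s


def max_throughput(max_threads, response_time_s):
    """Upper bound on requests/s a pool can serve at a given response time."""
    return max_threads / response_time_s


in_use = busy_threads(throughput_rps=25, response_time_s=0.6)
ceiling = max_throughput(max_threads=32, response_time_s=0.6)
print(f"about {in_use:.0f} of 32 threads busy, ceiling near {ceiling:.0f} req/s")
```

If the computed number of busy threads approaches maxThreads, requests start waiting in the accept queue and the pool becomes the bottleneck described in the monitoring slides.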
Database tuning
●
    Read the Nuxeo KB
    https://doc.nuxeo.com/display/KB/Configuring+PostgreSQL

●
    EXPLAIN ANALYZE
    helper: http://explain.depesz.com
●
    Check for missing
    indexes on custom
    schemas
●
    PostgreSQL
    performance mailing list
    mailto:pgsql-performance@postgresql.org
                                                             43
Scaling Nuxeo Applications
Benchmarking results




                             44
Mass injection
●
    Using load generating tools:
    from 3 to 10 doc/s
    works fine up to 100k docs
    (otherwise it takes too much time)
●
    Using nuxeo-platform-importer:
    from 30 to 100 doc/s
    works fine up to 1m docs
    (otherwise it takes too much time)
●
    Using SQL-level injection:
    from 1000 to 3000 doc/s
                                        45
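The "takes too much time" limits above follow from simple arithmetic on the injection rates. The sketch below converts each rate range into hours for the volumes quoted on the slide; the 10 million document volume used for SQL-level injection is an assumption borrowed from the benchmark slide, since no limit is given for that method.

```python
def import_hours(doc_count, docs_per_second):
    """Hours needed to inject doc_count documents at a constant rate."""
    return doc_count / docs_per_second / 3600


# (method, document volume, slowest rate, fastest rate) in docs and doc/s.
scenarios = [
    ("load generating tools", 100_000, 3, 10),
    ("nuxeo-platform-importer", 1_000_000, 30, 100),
    ("SQL-level injection", 10_000_000, 1000, 3000),  # volume is an assumption
]

for method, docs, slowest, fastest in scenarios:
    best = import_hours(docs, fastest)
    worst = import_hours(docs, slowest)
    print(f"{method}: {best:.1f} to {worst:.1f} hours for {docs:,} docs")
```

At the quoted limits, the first two methods land in the same range of a few hours, which is consistent with the volumes where each is said to stop being practical.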
Document retrieval and
insertion operations
●
    Nuxeo DM 5.4 / Tomcat / Sun JVM 6 (3 GB heap)
●
    10M documents, 1 TB of data
●
    Dell PE 2900, 2 x quad-core, 20 GB RAM / Linux
●
    http://public.dev.nuxeo.com/~ben/bench-10m-tomcat/
Operation                         Speed (s)   Throughput (req/s)   Extrapolation VU
JSF View of a random document     0.6         25                   250
JSF View of a cached document     0.2         30                   300
Web Engine navigation             0.1         100                  1000
JSF Creating new file             0.8         16                   160

                                                                           46
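In every row of the table above, the extrapolated VU figure is ten times the throughput. That is consistent with assuming each virtual user sends roughly one request every 10 seconds; the slides do not state this interval, so it is inferred from the figures. A minimal sketch of that extrapolation, with a hypothetical function name:

```python
def extrapolated_users(throughput_rps, seconds_between_requests=10.0):
    """Users a server can sustain if each sends one request per interval."""
    return throughput_rps * seconds_between_requests


# Throughput figures (req/s) taken from the benchmark table.
benchmark = {
    "JSF View of a random document": 25,
    "JSF View of a cached document": 30,
    "Web Engine navigation": 100,
    "JSF Creating new file": 16,
}

for operation, throughput in benchmark.items():
    print(f"{operation}: {extrapolated_users(throughput):.0f} VU")
```

Changing the interval shows how sensitive sizing is to the usage hypothesis: with hyper-active users sending a request every 5 seconds, the same server sustains half as many users.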
Thank you!



             47

Nuxeo World Session: Scaling Nuxeo Applications
