Building the Best Computer Clusters for Mechanical Design Environments

Daniel Chan and Ed Huott

GE Corporate R&D Center
Key Challenges

•  A Highly Diverse Environment
  –  Many Different Businesses
  –  Engineering, Chemistry, Financial, Services
•  Short-Term Technical Support
  –  Quick Turnaround
•  Costs
  –  Downtime, System Admin., Software
     Licenses, Productivity
Design Requirements
•  Web-Centric
  –  Simplify Job Submission in a Heterogeneous
     Environment
  –  Access from Anywhere and Anytime
•  Minimize Complexity and Cost
•  Stable and Highly-Available
  –  Isolate Network, NFS and Insufficient Disk
     Space Problems, Software License
     Management
Reasons for Having a
Heterogeneous Environment

•  Performance

•  Costs

•  Legacy Applications


Let’s Do An Experiment
   Operating system   NT 4.0                    Unix/Solaris 2.6
   System             GE Corporate Standard     Sun Ultra 10
   Processor          PIII 450 MHz              UltraSPARC 440 MHz
   Memory             256 MB RAM                256 MB RAM
   Price              $1400                     $5000

•  Use Perl script to launch 1, 2 and 3 concurrent jobs, 30 times each
•  Test case: CFX 5, 23,862 Nodes, 102 MB Required
•  Extract wall clock times from output files, then use Minitab to analyze data
•  Test stability, multitasking and paging capabilities
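The Perl launch script itself is not reproduced on the slides. As a rough sketch of the same procedure in Python, the code below starts 1, 2 and 3 copies of a command at once, repeats each level a fixed number of times, and records one wall clock time per job. The function names and the placeholder command are assumptions made for illustration; in the experiment the command would be the CFX 5 run.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(command):
    """Run one job to completion and return its wall clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def run_batch(command, concurrency):
    """Start `concurrency` copies of the job together; return one time per job."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_run, [command] * concurrency))


def run_experiment(command, levels=(1, 2, 3), repeats=30):
    """Map each concurrency level to the wall clock times of all its jobs."""
    results = {}
    for level in levels:
        times = []
        for _ in range(repeats):
            times.extend(run_batch(command, level))
        results[level] = times
    return results


if __name__ == "__main__":
    # Placeholder job (hypothetical): a Python process that exits immediately.
    placeholder = [sys.executable, "-c", "pass"]
    for level, times in run_experiment(placeholder, repeats=3).items():
        print(level, "concurrent:", len(times), "samples")
```

Each job is timed in its own thread, so a job that finishes early is not charged for the slower jobs in the same batch.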
Results For Sun Ultra 10
[Figure: three Minitab panels, each titled "Histogram of C4, with Normal Curve". Vertical axis: Frequency; horizontal axis: C4.]
•  1 job: C4 axis from 1700 to 1760
•  2 jobs: C4 axis from 3320 to 3500
•  3 jobs: C4 axis from 4000 to 6000
Results for Dell OptiPlex GX1
[Figure: three Minitab histograms with normal curves. Vertical axis: Frequency; horizontal axis: TOTAL_CLOCK_TIME (labelled C4 in the 2 jobs panel).]
•  1 job: axis from 1924 to 1928
•  2 jobs: axis from 3900 to 4000
•  3 jobs: axis from 5800 to 5900
Scorecard for Sun Ultra 10
            $$ Z = \frac{\mathrm{USL} - \mu}{\sigma} = \frac{0.1\,\mu}{\sigma} $$

            Mean           Standard Deviation   Z
   1 job    1724           15.9                 10.8
   2 jobs   3442 (2X)      26.1                 13.2
   3 jobs   5144 (3X)      350.2                1.5

  Paging
Scorecard For Dell Optiplex GX1
            $$ Z = \frac{\mathrm{USL} - \mu}{\sigma} = \frac{0.1\,\mu}{\sigma} $$

            Mean           Standard Deviation   Z
   1 job    1926           1.2                  161
   2 jobs   3909 (2.1X)    15.6                 25
   3 jobs   5828 (3X)      26.4                 22
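Both scorecards apply the same rule: the upper specification limit (USL) sits 10% above the mean, so Z reduces to 0.1µ/σ. The short Python sketch below recomputes Z from the tabulated means and standard deviations. The function names are illustrative, and the use of the sample standard deviation in `z_from_samples` is an assumption, since the slides do not say which estimator was used.

```python
from statistics import mean, stdev


def z_score(mu, sigma, tolerance=0.10):
    """Z = (USL - mu) / sigma with USL set `tolerance` above the mean."""
    usl = (1.0 + tolerance) * mu
    return (usl - mu) / sigma


def z_from_samples(times, tolerance=0.10):
    """Same score computed from raw wall clock times (needs >= 2 samples)."""
    return z_score(mean(times), stdev(times), tolerance)


if __name__ == "__main__":
    # (mean, standard deviation) pairs copied from the two scorecards.
    scorecards = {
        "Sun Ultra 10": [(1724, 15.9), (3442, 26.1), (5144, 350.2)],
        "Dell Optiplex GX1": [(1926, 1.2), (3909, 15.6), (5828, 26.4)],
    }
    for machine, rows in scorecards.items():
        for jobs, (mu, sigma) in enumerate(rows, start=1):
            print(f"{machine}, {jobs} job(s): Z = {z_score(mu, sigma):.1f}")
```

A large Z means run times cluster tightly relative to the 10% allowance; the Sun result for 3 jobs drops to about 1.5 because its standard deviation jumps to 350.2.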
Summary of Results
•  The SUN workstation is about 10% faster, but
   nearly 5 times as expensive

•  Both systems were equally stable over a period of
   one week

•  NT can multitask

•  NT can page and appears to have a “better”
   memory management scheme
APNASA Parallel Performance
308,115 Grid Points
[Figure: TIME (SEC) versus No. of Processors, one curve per system.]
•  PII 333 MHz Cluster
•  SGI Origin 2000, R10K 195 MHz
•  PIII 550 MHz Xeon, 2 MB Cache, 8-Way SMP
APNASA Parallel Performance
308,115 Grid Points
[Figure: Speed Up versus No. of Processors, one curve per system, plotted against an IDEAL line.]
•  PII 333 MHz Cluster
•  SGI Origin 2000, R10K 195 MHz
•  PIII 550 MHz Xeon, 2 MB Cache, 8-Way SMP
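The slides do not define speed-up. Assuming the usual definition (single-processor time divided by p-processor time, with the IDEAL line at speed-up equal to p), a minimal Python sketch of the calculation looks like this. The timings in the example are hypothetical and are not read from the APNASA charts.

```python
def speedup(t_serial, t_parallel):
    """Speed-up = time on one processor / time on p processors."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Fraction of the ideal (linear) speed-up actually achieved."""
    return speedup(t_serial, t_parallel) / processors


if __name__ == "__main__":
    # Hypothetical timings in seconds, keyed by processor count.
    timings = {1: 1200.0, 2: 630.0, 4: 340.0, 8: 200.0}
    t1 = timings[1]
    for p, tp in sorted(timings.items()):
        print(f"p={p}: speed-up {speedup(t1, tp):.2f}, "
              f"efficiency {efficiency(t1, tp, p):.2f}")
```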
CFX PERFORMANCE ON A VARIETY OF COMPUTER PLATFORMS
[Figure: CPU TIME (SECS) versus NUMBER OF PROCESSORS on logarithmic axes (10^3 to 10^4 seconds, 1 to 20 processors). Curve labels as they appear on the slide:]
•  556323 VERTICES, SGI SMP
•  81118 VERTICES, PII 333 MHz NT
•  81118 VERTICES, WORKSTATIONS
•  81118 VERTICES, SGI SMP
•  23862 VERTICES, WORKSTATIONS
•  23862 VERTICES, SGI SMP
ANSYS Performance on a Variety of Computers
[Figure: horizontal bar chart of ELAPSED TIME and CPU TIME per machine. Horizontal axis: CPU TIME (HRS), 0 to 12.]
•  Sun Enterprise 450, Ultra II 400 MHz
•  HP J5000, PA8500 400 MHz
•  Compaq SP700, PIII 500 MHz
•  SGI 320, PIII 500 MHz
•  Dell Optiplex GX1, PII 450 MHz
•  Dell 6300, PIII 500 MHz Xeon
•  Compaq DS20, 500 MHz EV6
•  SGI O2K R10K 250 MHz

160,000 ELEMENTS WITH 2D CONTACT
2.1 GB OF OUTPUT AND 615 MB PAGE FILE
eComputing Architecture
[Diagram]
•  Web-Enabled Environment
  –  Quality Tools
  –  Remote Access from Anywhere Anytime (Globalization)
  –  Field and Customer Data (Services and eCommerce)
•  Load Sharing Facility
  –  user-transparent work load management
  –  SGI SC
  –  HP Cluster
  –  NT Cluster
  –  DM Cluster
Key LSF Challenges
•  Implicit Assumptions of A Homogeneous
   Environment
  –  Uniform View of NFS
  –  Uniform Mounting of User Home Directory
  –  Unix Bias
•  Wholesale Migration of User
   Environment Variables from Submit to
   Execution Hosts
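To make the last challenge concrete: a Unix submit host's PATH or HOME is meaningless on an NT execution host, so copying the whole environment across can break the job. One possible way to sidestep this, shown below as a Python sketch, is to start from the execution host's own environment and accept only an explicit allow-list of variables from the submit side. The function name and the allow-list contents are assumptions; the slides do not describe how the variables were actually filtered.

```python
def normalized_env(submitted, host_env, pass_through=("LM_LICENSE_FILE",)):
    """Build a job environment from the execution host's defaults.

    Only variables named in `pass_through` (a hypothetical allow-list) are
    copied from the submit host; everything else submitted is ignored.
    """
    env = dict(host_env)
    for name in pass_through:
        if name in submitted:
            env[name] = submitted[name]
    return env


if __name__ == "__main__":
    submit_side = {"PATH": "/usr/ucb:/usr/bin", "HOME": "/home/user",
                   "LM_LICENSE_FILE": "1700@licserver"}
    exec_side = {"PATH": r"C:\cygwin\bin"}
    print(normalized_env(submit_side, exec_side))
```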
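A Web-Centric Job Management System
[Diagram: components and the protocol on each link]
   Web Clients              --https--        Web Server (LSF Proxy)
   Web Server (LSF Proxy)   --NFS--          Job Spec. Files
   Job Spec. Files          --NFS or SMB--   Execution Hosts (LSF Cluster)
   Job Directories          --ftp--          Execution Hosts (LSF Cluster)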
Screen Shot of Web Interface
[Screen shot of the web job submission form, with callouts:]
•  Job Source Files
•  Job Output Destination
•  Command to Run on Execution Host
•  Files Sent to Output Destination
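Process, Data Flow and Security for Web-Based Jobs
[Diagram: components and the mechanism on each link]
   Web Client                 --https--   Apache (invoke_bsub.pl)
   Apache (invoke_bsub.pl)    --bsub--    Exec Host (lsf_exec.pl), LSF Cluster
   Apache (invoke_bsub.pl)    -->         Job Spec
   Job Dir                    --ftp--     Exec Host (lsf_exec.pl), LSF Cluster
•  Authenticated access to setuid script
•  File system permissions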
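The contents of invoke_bsub.pl are not shown on the slides. As a hedged illustration of the submission step, the Python sketch below only assembles the argument list for a `bsub` call that would run the wrapper on an execution host. The `-q` (queue), `-J` (job name) and `-o` (output file) options are standard `bsub` options; the way the job specification path is handed to lsf_exec.pl is an assumption.

```python
def build_bsub_command(spec_path, queue=None, job_name=None, log_file=None,
                       wrapper="lsf_exec.pl"):
    """Return the argv list for a bsub call that runs the wrapper on a job spec.

    Passing the spec path as the wrapper's only argument is hypothetical.
    """
    argv = ["bsub"]
    if queue:
        argv += ["-q", queue]
    if job_name:
        argv += ["-J", job_name]
    if log_file:
        argv += ["-o", log_file]
    argv += [wrapper, spec_path]
    return argv


if __name__ == "__main__":
    # Hypothetical queue name and paths, for illustration only.
    print(build_bsub_command("/jobs/spec/42.spec", queue="nt", job_name="cfx42"))
```

Building a list rather than a shell string keeps user-supplied values out of shell parsing, which is worth the care when the submitting script runs setuid.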
Normalized Job Execution Wrapper/Environment
•  Standard job execution wrapper that runs on all execution hosts.
•  Key "ingredients": Perl (with Net::FTP package), wget, subset of
   "standard" Unix utilities (e.g. /bin/sh, cp, mv, etc.)
    –  (Note: Makes use of Cygwin tools on Windows NT. See
       http://sources.redhat.com/cygwin/)
•  Reads Job Specification File to determine:
    –  Source of remote job directory
    –  Command to run
    –  Destination for job output file(s)
•  Pre-determined top level (local) directory for jobs on each execution
   host.
•  Remote jobs are migrated by FTP (wget) to unique sub-directories
   under top level.
•  Specified command is run in migrated job directory.
•  Output files are transferred by FTP to specified destination.
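The wrapper steps listed above can be sketched in Python as follows. This is not the authors' lsf_exec.pl, which was written in Perl and is not shown. The `key = value` job specification format and the key names `source`, `command` and `destination` are assumptions, and the two FTP transfers are passed in as `fetch` and `deliver` callables so that the control flow can be read (and tried locally) without a server.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path


def read_job_spec(path):
    """Parse a hypothetical 'key = value' job specification file."""
    spec = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        spec[key.strip()] = value.strip()
    return spec


def run_job(spec_path, top_level, fetch, deliver):
    """Migrate the job directory, run the command in it, then ship the outputs.

    fetch(source, job_dir) and deliver(job_dir, destination) stand in for the
    FTP (wget / Net::FTP) transfers named on the slide.
    """
    spec = read_job_spec(spec_path)
    Path(top_level).mkdir(parents=True, exist_ok=True)
    # Unique sub-directory under the pre-determined top level directory.
    job_dir = Path(tempfile.mkdtemp(prefix="job_", dir=top_level))
    fetch(spec["source"], job_dir)
    result = subprocess.run(shlex.split(spec["command"]), cwd=job_dir)
    deliver(job_dir, spec["destination"])
    return result.returncode, job_dir
```

For a local trial, both callables can be a plain directory copy such as `shutil.copytree(src, dst, dirs_exist_ok=True)`; a real deployment would replace them with FTP transfers.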
Summary
•  Leveraging Low-Cost NT Solution
•  Shield Users from Complexity
  –  productivity gain
  –  better centralized resource management
•  Enterprise Resource Planning
  –  compute cycles
  –  software licenses
