Go Native

Squeeze the juice out of your 64-bit
processor using C/C++
Who am I

-> Fernando Moreira ( @fpmore )
-> MSc student @ FEUP
-> Undergraduate Researcher @ Porto Interactive Center
-> Microsoft Student Partner Lead @ M$ PT
-> I’ve been doing C++ for over… 5 years
Who are you ?

-> Norte . Centro . Sul . Açores . Madeira . FMI


-> Who has experience with C? And with C++?


-> Who has experience with 64bit native dev?
Talk’s Schedule
int main( int argc, char **argv ) {
    try {
        introducing_x64();
        advantagesOver_x86();
        nativeDev_x64( const Topic &t );   // const: promising not to change the topic.
        codeAnalysis_and_DebugTools();
        costProspectionOn_x64Dev();

    } catch( Timeout &e ) { return -1; }

    return 0;
}
introducing_x64()

-> The names : x64, x86-64, AMD64, Intel 64, IA-64, etc…
-> Notice : IA-64 (Itanium) ≠ AMD64
-> AMD64 is backwards compatible with x86 (IA-64 isn’t)
-> Some hardware : Phenom, Athlon 64, Core iX, Core 2, …
-> Some OSes : Windows (XP, Vista, 7), OS X, several Linux distros.
introducing_x64()

This talk will focus on the AMD64 architecture.
advantagesOver_x86()

-> Address space : Theoretical limit of 16 exabytes (2^64 bytes)
-> More available registers (there’s one called RIP)
-> Larger instruction set with emphasis on SIMD
-> SSE and SSE2 are always there (SSE3 is not guaranteed on every x64 CPU)
-> Unified function calling convention (one per platform)
advantagesOver_x86()

-> Can run x86 environments
-> Can run x86 binaries under x64 environments

   On Windows: . 32-bit processes can’t load 64-bit DLLs for execution
               . 64-bit processes can’t load 32-bit DLLs for execution
nativeDev_x64( how_it_looks_like )

-> A valid, yet useless, 64bit application.
int main( int argc, char **argv ) {
    return 0;
}
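nativeDev_x64( how_it_looks_like )

-> A valid, yet useless and dangerous, 64bit application.
#include <stddef.h>   // size_t
#include <stdint.h>   // SIZE_MAX

int main( int argc, char **argv ) {

    size_t external_debt = SIZE_MAX;                 // 8 bytes on x64
    int *ptr             = (int *)&external_debt;    // int is still 4 bytes
    *ptr                 = 0;                        // writes only 4 of the 8 bytes

    return 0;
}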
nativeDev_x64( data_model )

-> On Microsoft Win64 : LLP64 model
-> On Linux : LP64 model
-> LLP64: short( 2 ), int( 4 ), long( 4 ), ptr( 8 ), long long( 8 )
-> LP64: short( 2 ), int( 4 ), long( 8 ), ptr( 8 ), long long( 8 )

Can you see the data portability problem? long is 4 bytes under LLP64 but 8 under LP64.

Suggestions: Use conditional compilation and type aliasing.
             Make conscious use of the sizeof operator.
nativeDev_x64( data_model )

-> On x86 : ptr( 4 ), size_t( 4 ), ptrdiff_t( 4 )
-> On x64 : ptr( 8 ), size_t( 8 ), ptrdiff_t( 8 )


          These larger types will increase memory usage…
          but they match the native word size, which tends to pay off in performance.
nativeDev_x64( common_pitfalls )

-> Usage of magic numbers & bit-wise ops: 0x7fffffff
-> Functions with variable number of arguments : printf
-> Virtual functions
-> Data exchange between x86 and x64 apps
-> Data misalignment : SSE requires 16-byte alignment
nativeDev_x64( optimization_tips )

-> Use native types for loops or tight data usage
-> Use 16-byte alignment for SSE loads and stores
-> Heap-allocs in Win64 and XBOX360 are 16-byte aligned
-> *Use* intrinsics : #include <immintrin.h>
-> Unroll loops and sort object’s member data by their size
nativeDev_x64( real-world_tips )

-> Don’t sacrifice your software architecture.
-> Don’t use it if you don’t know how to.
-> Don’t go into premature optimization.
-> Do it at lower levels and then hide it.
-> Trust your compiler to help you do the job.
codeAnalysis_and_DebugTools()

-> Your IDE : LEARN to fu**** use it!
-> Conditional break points, call-stack
-> Free tool : Cppcheck (command line, Eclipse, Code::Blocks, …)
-> State-of-the-art tool : PVS-Studio (Visual Studio 2005, 2008, 2010)
-> Do pair programming and peer-review if possible
costProspectionOn_x64Dev()

-> Hardware & Software (IDE + Plugins + Tools + Libs)
-> You’ll need to teach the developers (theory & practice)
-> A port takes time, adds bugs, and it’s not creative
-> … plus you’ll probably have to maintain two code paths
-> Full implementation adds creativity, but takes much
   more time and will add many more bugs.
Let’s go
state-of-the-art!
Questions?
