"Optimization of a .NET application- is it simple ! / ?",  Yevhen Tatarynov
Yevhen Tatarynov
Software developer with 15 years of experience in commercial
software and database development (.NET / MS SQL / Delphi)
PhD in math, specializing in the theoretical foundations of
computer science and cybernetics
I have worked on projects involving complex mathematical calculations and the processing of large
amounts of data. I am currently a senior software developer in the infrastructure team at Covent IT.
Professional interests:
application performance optimization and analysis
writing C# code similar in performance to C++
advanced debugging
Agenda
The first challenge?
Measurements
Too much…?
The second challenge?
Measurements
Is it a bottleneck?
Summary
Q&A
The first challenge?
Console .NET application
Read *.csv data files
Process data in multiple threads
Write results in MS Excel file
Use .NET Framework 4.6.2
Run on Windows
It works correctly
Measurements
Measurement Tools
BenchmarkDotNet
Visual Studio Performance Profiler
PerfView
R# dotTrace
R# dotMemory
Performance Monitor
dotMemoryScreen
dotMemory Snapshot
dotTrace Snapshot
Too much…?
# Too many code issues?
Unused variables
Unused properties
Unused objects not allocated
Unused data not loaded from files
Unused fields don’t increase object size
# Memory snapshot
Total memory old: 419,750 MB
Total memory new: 359,971 MB
Total memory diff: -55,599 MB
Percent: 14 %
# Too many keys
ConcurrentDictionary<MyKey,string>
public struct MyKey
{
string field1;
string field2;
/*.....*/
}
Boxing
The ConcurrentDictionary<TKey,TValue> class has the same
functionality as the Hashtable class. A ConcurrentDictionary<TKey,TValue>
of a specific type (other than Object) provides
better performance than a Hashtable for value types.
This is because the elements of Hashtable are of type Object;
therefore, boxing and unboxing typically occur when you store
or retrieve a value type.
StackOverflow
Since you only override Equals and don’t implement
IEquatable<T>, the dictionary is forced to box one of the two
instances whenever it compares two of them for equality,
because it passes an instance into an Equals method that
accepts an object.
If you implement IEquatable<T>, then the dictionary can (and
will) use the version of Equals that accepts the parameter as a
T, which won't require boxing.
IEquatable interface
public struct MyKey : IEquatable<MyKey>
{
    string field1;
    string field2;
    string field3;
    /*.....*/
    /* Typed overload: the dictionary calls it without boxing */
    public bool Equals(MyKey other) { /*.....*/ }
}
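A fuller sketch of such a key follows. The three string fields come from the slide; the constructor, the hash combination, and the KeyDemo helper are illustrative assumptions, since the slide elides the rest. GetHashCode is overridden too, because the dictionary hashes the key before it ever calls Equals.

```csharp
using System;
using System.Collections.Concurrent;

public struct MyKey : IEquatable<MyKey>
{
    private readonly string field1;
    private readonly string field2;
    private readonly string field3;

    // Hypothetical constructor, not shown on the slide.
    public MyKey(string f1, string f2, string f3)
    {
        field1 = f1;
        field2 = f2;
        field3 = f3;
    }

    // Typed overload: the default comparer calls this one, so no boxing.
    public bool Equals(MyKey other) =>
        field1 == other.field1 && field2 == other.field2 && field3 == other.field3;

    // Object overload kept consistent with the typed one.
    public override bool Equals(object obj) => obj is MyKey && Equals((MyKey)obj);

    // One common way to combine field hashes (an assumption, not the author's code).
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (field1 == null ? 0 : field1.GetHashCode());
            hash = hash * 31 + (field2 == null ? 0 : field2.GetHashCode());
            hash = hash * 31 + (field3 == null ? 0 : field3.GetHashCode());
            return hash;
        }
    }
}

public static class KeyDemo
{
    // Stores a value under one key instance and reads it back with an equal one.
    public static string Lookup()
    {
        var map = new ConcurrentDictionary<MyKey, string>();
        map.TryAdd(new MyKey("a", "b", "c"), "value");
        string found;
        return map.TryGetValue(new MyKey("a", "b", "c"), out found) ? found : null;
    }
}
```

Keeping both Equals overloads and GetHashCode in agreement matters: two keys that compare equal must also hash equal, otherwise lookups miss.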
# Memory snapshot
Total memory old: 359,971 MB
Total memory new: 137,294 MB
Total memory diff: -251,654 MB
Percent: 65 %
# Too much System.Double in heap?
class MyClass<T> where T : struct
{
T Add(T a, T b) =>
(dynamic)a+(dynamic)b;
}
A struct type parameter has no + operator defined, so to keep
MyClass<T> generic we need to cast to dynamic, which causes
boxing ☹
We only use double for T, so the class does not need to be generic:
class MyClass
{
    double Add(double a, double b)
        => a + b;
}
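If the class did have to stay generic, one possible alternative (not from the slides) is to let the caller supply the addition as a delegate, so neither dynamic nor boxing is involved. MyCalc and CalcDemo are hypothetical names used only for this sketch.

```csharp
using System;

// Hypothetical generic variant: the "+" operation is injected by the caller.
public sealed class MyCalc<T> where T : struct
{
    private readonly Func<T, T, T> _add;

    public MyCalc(Func<T, T, T> add)
    {
        _add = add;
    }

    // Plain delegate call on value types: no cast to dynamic, no boxing of a or b.
    public T Add(T a, T b) => _add(a, b);
}

public static class CalcDemo
{
    public static double Run()
    {
        var calc = new MyCalc<double>((x, y) => x + y);
        return calc.Add(1.5, 2.25);
    }
}
```

The delegate call has its own small cost, so when only double is ever used, the non-generic class from the slide remains the simpler choice.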
# Memory snapshot
Total memory old: 140,550 MB
Total memory new: 130,288 MB
Total memory diff: -10,262 MB
Percent: 7 %
# Still too much char[ ]?
/*
Collect single element of csv
string
*/
item=new List<char>();
/*.....*/
return item.ToArray();
# REUSE StringBuilder
_stringBuilder.Clear();
/*.....*/
/* StringBuilder has no Write method; CopyTo is assumed here */
var item = new char[_stringBuilder.Length];
_stringBuilder.CopyTo(0, item, 0, item.Length);
return item;
Cache StringBuilder in a private field
No new allocation
Can be used for different item lengths
Grows in chunks of up to 8,000 characters when the internal buffer has to expand
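A minimal sketch of the reuse pattern, assuming a simple comma-separated line without quoting. CsvFieldReader and Split are hypothetical names, and the sketch returns strings rather than the char[] used on the slide.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class CsvFieldReader
{
    // Created once and reused for every field instead of a new List<char> per field.
    private readonly StringBuilder _stringBuilder = new StringBuilder();

    // Splits one line on commas (no quoting or escaping, for illustration only).
    public List<string> Split(string line)
    {
        var fields = new List<string>();
        _stringBuilder.Clear(); // reset the length and reuse the same builder
        foreach (char c in line)
        {
            if (c == ',')
            {
                fields.Add(_stringBuilder.ToString());
                _stringBuilder.Clear();
            }
            else
            {
                _stringBuilder.Append(c);
            }
        }
        fields.Add(_stringBuilder.ToString()); // last field has no trailing comma
        return fields;
    }
}
```

Each field still costs one string, but the intermediate per-field collection and its repeated resizing go away. The instance is not thread-safe, so with multiple worker threads each one would need its own reader.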
# Memory snapshot
Total memory old: 59,565 MB
Total memory new: 43,338 MB
Total memory diff: -16,227 MB
Percent: 27 %
# TOO MUCH <>c__DisplayClass26_0?
void AddValue(string key, string val)
{
    /* Redundant lambda: all data is read-only */
    dict.TryAdd(key, () => new Data(val));
    /*.....*/
}
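Why the lambda costs memory can be sketched as follows. The slide does not show the type of dict, so ConcurrentDictionary.GetOrAdd stands in for it here; Data and ClosureDemo are illustrative. A lambda that captures the local val makes the compiler generate a closure class (the <>c__DisplayClass seen in the profiler), and one instance of it is allocated on every call.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class Data
{
    public string Value { get; private set; }

    public Data(string value)
    {
        Value = value;
    }
}

public static class ClosureDemo
{
    // Capturing lambda: a closure object holding val is allocated on each call,
    // even when the key already exists and the factory never runs.
    public static Data AddWithLambda(ConcurrentDictionary<string, Data> dict, string key, string val)
    {
        return dict.GetOrAdd(key, k => new Data(val));
    }

    // No lambda: the value is created and passed directly, so no closure class is involved.
    public static bool AddDirect(ConcurrentDictionary<string, Data> dict, string key, string val)
    {
        return dict.TryAdd(key, new Data(val));
    }
}
```

The direct form always constructs a Data instance, so it only pays off when construction is cheap or most keys are new.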
# Memory snapshot
Total memory old: 34,243 MB
Total memory new: 33,118 MB
Total memory diff: -1,125 MB
Percent: 3 %
The Second Challenge
WinForms .NET application
Read *.bin, *.txt data files
“Process bits”: extract useful data and pack it into a new format
Write results in text and binary files
Use .NET Framework 4.0
Run on Windows 10 x64
It works correctly
Measurements
dotMemory Snapshot
dotTrace TimeLine Sample & Snapshot Execution Time
Is it a bottleneck?
# Is LINQ a bottleneck?
Potential Improvements
.ToArray() is slow
Concat uses foreach and extra memory to iterate over its inputs
Each iteration produces a new byte array: redundant memory traffic.
var b = new byte[0];
for (int i = 0; i < N; i++)
{
    byte[] a = new byte[GetLen(i)];
    /* fill a with values */
    b = b.Concat(a).ToArray();
}
return b;
Buffer.BlockCopy
public static void BlockCopy(Array src, int srcOffset, Array dst, int dstOffset, int count);
Copies a specified number of bytes from a source array starting at a particular offset to a
destination array starting at a particular offset.
● src - Array. The source buffer.
● srcOffset - Int32. The zero-based byte offset into src.
● dst - Array. The destination buffer.
● dstOffset - Int32. The zero-based byte offset into dst.
● count - Int32. The number of bytes to copy.
Comparison
Before:
var b = new byte[0];
for (int i = 0; i < N; i++)
{
    byte[] a = new byte[GetLen(i)];
    /* fill a with values */
    b = b.Concat(a).ToArray();
}
return b;
After:
var b = new byte[maxN]; var bN = 0;
for (int i = 0; i < N; i++)
{
    var aN = GetLen(i);
    byte[] a = new byte[aN];
    /* fill a with values */
    Buffer.BlockCopy(a, 0, b, bN, aN); /* copy the chunk a, not its length */
    bN += aN;
}
return b;
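The two approaches can be checked against each other with a small self-contained sketch. GetLen and MakeChunk are hypothetical stand-ins for the slide's GetLen(i) and "fill a with values", and the destination is sized from the summed chunk lengths rather than a precomputed maxN.

```csharp
using System;
using System.Linq;

public static class ConcatDemo
{
    // Hypothetical chunk length: chunk i has i + 1 bytes.
    private static int GetLen(int i)
    {
        return i + 1;
    }

    // Hypothetical chunk content, just to have something to compare.
    private static byte[] MakeChunk(int i)
    {
        var a = new byte[GetLen(i)];
        for (int j = 0; j < a.Length; j++) a[j] = (byte)(i + j);
        return a;
    }

    // Original approach: every pass allocates a new, larger array and recopies everything.
    public static byte[] WithConcat(int n)
    {
        var b = new byte[0];
        for (int i = 0; i < n; i++)
            b = b.Concat(MakeChunk(i)).ToArray();
        return b;
    }

    // Optimized approach: one destination array, each chunk copied once.
    public static byte[] WithBlockCopy(int n)
    {
        int total = 0;
        for (int i = 0; i < n; i++) total += GetLen(i);

        var b = new byte[total];
        int bN = 0; // write offset into b, in bytes
        for (int i = 0; i < n; i++)
        {
            byte[] a = MakeChunk(i);
            Buffer.BlockCopy(a, 0, b, bN, a.Length);
            bN += a.Length;
        }
        return b;
    }
}
```

With the Concat version the total number of bytes copied grows quadratically with the number of chunks, while the BlockCopy version copies each byte once.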
#1 Performance Summary
         Memory (MB)    Execution time
1st      374,011        43 m 21 s 642 ms
2nd      120,333        4 m 56 s 414 ms
Diff     -253,678       -38 m 25 s 228 ms
%        68,83 %        88,61 %
Factor   x 3,11         x 9,25
# Is FileStream.get_Length a bottleneck?
In both cases a binary file is read, and the
number of binary chains is calculated from
the data read.
Potential Improvements
using(var br = new BinaryReader(…))
{
while (br.BaseStream.Position <= br.BaseStream.Length - 4)
{
counter++;
br.ReadUInt32();
br.ReadUInt32();
var n = br.ReadUInt32();
for (int i = 0; i < n; i++) br.ReadUInt32();
}
}
Redundant Length calls
Redundant subtraction
Redundant ReadUInt32 call
Solution
using(var br = new BinaryReader(…))
{
var length = br.BaseStream.Length - 4;
while (br.BaseStream.Position <= length)
{
counter++;
br.ReadUInt64();
var n = br.ReadUInt32();
for (int i = 0; i < n; i++) br.ReadUInt32();
}
}
Store Length in a local variable
Call ReadUInt64 once instead of ReadUInt32 twice
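A runnable sketch of the optimized loop over an in-memory stream. The record layout (two UInt32 header values, a UInt32 count n, then n UInt32 values) is inferred from the reads on the slide; ChainCounter and Sample are hypothetical names.

```csharp
using System;
using System.IO;

public static class ChainCounter
{
    // Counts records laid out as: 8 header bytes, a UInt32 count n, then n UInt32 values.
    public static int Count(Stream input)
    {
        int counter = 0;
        using (var br = new BinaryReader(input))
        {
            long length = br.BaseStream.Length - 4; // Length queried once, subtraction done once
            while (br.BaseStream.Position <= length)
            {
                counter++;
                br.ReadUInt64();                     // skips both header values in one call
                uint n = br.ReadUInt32();
                for (uint i = 0; i < n; i++) br.ReadUInt32();
            }
        }
        return counter;
    }

    // Builds a sample with two records: one with two values, one with none.
    public static MemoryStream Sample()
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write(1u); bw.Write(2u); bw.Write(2u); bw.Write(10u); bw.Write(20u);
            bw.Write(3u); bw.Write(4u); bw.Write(0u);
            bw.Flush();
            return new MemoryStream(ms.ToArray());
        }
    }
}
```

Caching Length is only valid when the file does not change size while it is being read, which is the case for a file opened for reading here.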
# Performance Summary
         Memory (MB)    Execution time
Old      103,775        2 m 11 s 091 ms
New      103,775        1 m 39 s 305 ms
Diff     0              -31 s 786 ms
%        0.00 %         24.25 %
Factor   x 1.00         x 1.32
# ScaleGrad. - Can It Be Faster?
/*
Return index of number x by ordered
scale
*/
int ScaleGrad(int x)
Potential Improvements
static double[] Scale;
…
/* 600+ lines of code */
…
int ScaleGrad(int x)
{
    int i = 0; /* declared outside the loop so it is still in scope for the return */
    for (; i < Scale.Length && Scale[i] <= x; i++) ;
    return i - 1;
}
Avoid comparing int and double values
Scale is a sorted array, so we can use binary search;
it’s more efficient and less dependent on input data
Comparison
Before:
static double[] Scale;
/* 600+ lines of code */
int ScaleGrad(int x)
{
    int i = 0;
    for (; (i < Scale.Length) && (Scale[i] <= x); i++) ;
    return i - 1;
}
After:
static int[] Scale;
/* 600+ lines of code */
int ScaleGrad(int x)
{
    var left = 0; var right = Scale.Length - 1;
    while (left <= right)
    {
        var mid = left + ((right - left) >> 1); /* (left + right) / 2 without overflow */
        if (x < Scale[mid]) right = mid - 1;
        else left = mid + 1;
    }
    return right; /* index of the last element <= x, or -1, same as the linear scan */
}
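As an alternative to a hand-written loop, the same lookup can be sketched with the built-in Array.BinarySearch. This is a swapped-in technique, not the code from the talk, and it assumes the scale is sorted and strictly increasing. The linear scan is included as a reference so the two can be compared; ScaleLookup is a hypothetical name.

```csharp
using System;

public static class ScaleLookup
{
    // Reference: linear scan, returns the index of the last element <= x, or -1.
    public static int Linear(int[] scale, int x)
    {
        int i = 0;
        while (i < scale.Length && scale[i] <= x) i++;
        return i - 1;
    }

    // Binary search via the framework; assumes scale is sorted and strictly increasing.
    public static int Binary(int[] scale, int x)
    {
        int idx = Array.BinarySearch(scale, x);
        // A negative result is the bitwise complement of the index of the
        // first element larger than x (or of scale.Length if there is none).
        return idx >= 0 ? idx : ~idx - 1;
    }
}
```

If the scale can contain duplicates, Array.BinarySearch may return any of the matching indexes, so the hand-written loop is the safer choice in that case.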
#9 Performance Summary
         Memory (MB)    Execution time
Old      9,754          32 s 551 ms
New      9,754          28 s 844 ms
Diff     0              -3 s 707 ms
%        0.00 %         11.39 %
Factor   x 1.00         x 1.13
Summary
PLEASE JOIN OUR WORKSHOP
TO SEE ALL OPTIMIZATION STEPS
Thank you!
Q&A
LINKS
Use dotTrace Command-Line Profiler
Hashtable and dictionary collection types
.NET Performance Optimization & Profiling with JetBrains dotTrace
Why GC run when using a struct as a generic dictionary
Matt Ellis. Writing Allocation Free Code in C#
Maarten Balliauw. Let’s refresh our memory! Memory management in .NET
Sasha Goldshtein. Pro .NET Performance: Optimize Your C# Applications
Ben Watson. Writing High-Performance .NET Code, 2nd Edition
Maarten Balliauw
Sasha Goldshtein
Yevhen Tatarynov GitHub
Writing Faster Managed Code: Know What Things Cost
Linq.Concat
Linq.Concat Implementation
Buffer.BlockCopy
Generic List implementation
Konrad Kokosa. High-performance code design patterns in C#
