The TclQuadcode Compiler
Status report on Tcl type analysis and code generation
Donal Fellows (orcid.org/0000-0002-9091-5938)
Kevin Kenny
What is going on?
- Want to Make Tcl Faster
  - Everyone benefits
- Lehenbauer Challenges
  - 2 times faster (“Perl” territory)
    - Better algorithms
    - Better buffer management
    - Bytecode optimization
  - 10 times faster (“C” territory)
    - Needs more radical approach
Generating Native Code is Hard
- Going to 10 times faster requires native code
  - Bytecode work simply won’t do it
- But Tcl is a very dynamic language
  - Even ignoring command renaming tricks
- Native code needs types
- Many platforms
Let’s Go to LLVM!
- Solves many problems
  - Optimization
  - Native code issuing
  - Runtime loading
- LLVM Intermediate Representation (IR)
  - Effectively a virtual assembly language target
- Existing Tcl package!
  - llvmtcl by Jos Decoster
- Introduces problems though
  - LLVM’s idea of “throw an error” is to panic with a gnostic error message
How to get to LLVM?
- Still need those pesky types
- Still need fixed semantics
- We need a new bytecode: Quadcode
  - Designed to help:
    - Simple translation from Tcl bytecode
    - More amenable to analysis
Tcl Analysis with Quadcode
Quadcode
- Based on three-address code assembly
- The Tcl code:
    set a [expr { $b + 1 }]
- Equivalent Tcl bytecode:
    loadScalar %b; push "1"; add; storeScalar %a
- Equivalent (optimized) quadcode:
    add {var a} {var b} {literal 1}
- No stack
- Temporary variables used as required
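To illustrate the difference between the stack-based bytecode and the three-address quadcode above, here is a minimal Python sketch. It is not the actual TclQuadcode translator. It symbolically executes a tiny, hypothetical four-opcode subset of bytecode, keeping operand names on a compile-time stack so that the emitted instructions need no run-time stack. The tuple encoding and the name `to_quadcode` are assumptions made for this example.

```python
def to_quadcode(bytecode):
    """Translate a tiny stack-bytecode subset into three-address tuples.

    Supported (hypothetical) opcodes: loadScalar, push, add, storeScalar.
    Each emitted instruction is (opcode, destination, *sources).
    """
    stack = []      # compile-time stack of operand descriptors
    quads = []      # emitted three-address instructions
    next_temp = 0

    for op, *args in bytecode:
        if op == "loadScalar":
            stack.append(("var", args[0]))
        elif op == "push":
            stack.append(("literal", args[0]))
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            dest = ("temp", next_temp)
            next_temp += 1
            quads.append(("add", dest, left, right))
            stack.append(dest)
        elif op == "storeScalar":
            # The store leaves its value on the stack (the bytecode
            # listings follow it with a pop), so peek rather than pop.
            quads.append(("copy", ("var", args[0]), stack[-1]))
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return quads


code = [("loadScalar", "b"), ("push", "1"), ("add",), ("storeScalar", "a")]
for quad in to_quadcode(code):
    print(quad)
```

The sketch emits an `add` into a temporary followed by a `copy` into `a`. A later copy-propagation pass could fold the two into the single optimized instruction shown on the slide.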
Example: Tcl code to bytecode
proc cos {x {n 16}} {
    set x [expr {double($x)}]
    set n [expr {int($n)}]
    set j 0
    set s 1.0
    set t 1.0
    set i 0
    while {[incr i] < $n} {
        set t [expr {
            -$t*$x*$x / [incr j] / [incr j]
        }]
        set s [expr {$s + $t}]
    }
    return $s
}
...
29: startCommand {pc 42} 1
38: push1 {literal 0}
40: storeScalar1 {scalar j}
42: pop
43: startCommand {pc 56} 1
52: push1 {literal 1.0}
54: storeScalar1 {scalar s}
56: pop
57: startCommand {pc 70} 1
66: push1 {literal 1.0}
68: storeScalar1 {scalar t}
70: pop
71: startCommand {pc 84} 1
80: push1 {literal 0}
82: storeScalar1 {scalar i}
84: pop
85: startCommand {pc 179} 1
94: jump1 {pc 160}
96: startCommand {pc 142} 2
105: loadScalar1 {scalar t}
107: uminus
108: loadScalar1 {{scalar arg} x}
110: mult
111: loadScalar1 {{scalar arg} x}
113: mult
114: startCommand {pc 126} 1
123: incrScalar1Imm {scalar j} 1
126: div
127: startCommand {pc 139} 1
136: incrScalar1Imm {scalar j} 1
139: div
140: storeScalar1 {scalar t}
142: pop
...
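For readers less familiar with Tcl, the following Python transcription of the `cos` procedure may help in following the bytecode. It is a line-by-line port written for this illustration (the name `tcl_cos` is an assumption), and it shows that the procedure sums the first `n` terms of the Taylor series for cosine.

```python
import math


def tcl_cos(x, n=16):
    """Python transcription of the Tcl `cos` proc (Taylor series)."""
    x = float(x)
    n = int(n)
    j = 0
    s = 1.0
    t = 1.0
    i = 0
    while True:
        i += 1                      # [incr i]
        if not i < n:
            break
        j += 1                      # first [incr j]
        first = j
        j += 1                      # second [incr j]
        t = -t * x * x / first / j  # next term of the series
        s = s + t
    return s


print(tcl_cos(1.0), math.cos(1.0))
```

Each `[incr j]` both updates `j` and yields its new value, which is why the quadcode later in the deck shows an `add` on `j` immediately before each `div`.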
Example: bytecode to quadcode
Bytecode:
29: startCommand {pc 42} 1
38: push1 {literal 0}
40: storeScalar1 {scalar j}
42: pop
43: startCommand {pc 56} 1
52: push1 {literal 1.0}
54: storeScalar1 {scalar s}
56: pop
57: startCommand {pc 70} 1
66: push1 {literal 1.0}
68: storeScalar1 {scalar t}
70: pop
71: startCommand {pc 84} 1
80: push1 {literal 0}
82: storeScalar1 {scalar i}
84: pop
85: startCommand {pc 179} 1
94: jump1 {pc 160}
96: startCommand {pc 142} 2
105: loadScalar1 {scalar t}
107: uminus
108: loadScalar1 {{scalar arg} x}
110: mult
111: loadScalar1 {{scalar arg} x}
113: mult
114: startCommand {pc 126} 1
123: incrScalar1Imm {scalar j} 1
126: div
127: startCommand {pc 139} 1
136: incrScalar1Imm {scalar j} 1
139: div
140: storeScalar1 {scalar t}
142: pop
Quadcode:
11: copy {temp 0} {literal 0}
12: copy {var j} {temp 0}
13: copy {temp 0} {literal 1.0}
14: copy {var s} {temp 0}
15: copy {temp 0} {literal 1.0}
16: copy {var t} {temp 0}
17: copy {temp 0} {literal 0}
18: copy {var i} {temp 0}
19: jump {pc 37}
20: copy {temp 0} {var t}
21: uminus {temp 0} {temp 0}
22: copy {temp 1} {var x}
23: mult {temp 0} {temp 0} {temp 1}
24: copy {temp 1} {var x}
25: mult {temp 0} {temp 0} {temp 1}
26: add {var j} {var j} {literal 1}
27: copy {temp 1} {var j}
28: div {temp 0} {temp 0} {temp 1}
29: add {var j} {var j} {literal 1}
30: copy {temp 1} {var j}
31: div {temp 0} {temp 0} {temp 1}
32: copy {var t} {temp 0}
Quadcode Analysis
- Code is converted to Static Single Assignment (SSA) form
  - Variables assigned only once
  - Phi (φ) instructions used to merge variables at convergences (after if-branches and in loops)
- Lifetime analysis
  - Corresponds to where to use Tcl_DecrRefCount
- Type analysis
  - What type of data actually goes in a variable?
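The "assigned only once" property can be sketched for the simplest case, straight-line code without branches, where no phi instructions are needed. The Python below is an illustration under that assumption and not the compiler's own algorithm. It gives each assignment a version equal to the index of the defining instruction, which appears to match the numbering in the SSA listing later in the deck. The tuple encoding and the name `to_ssa` are hypothetical.

```python
def to_ssa(quads):
    """Rename variables in straight-line three-address code into SSA form.

    Each instruction is (opcode, dest, *sources); operands are plain
    variable names (str) or literals (int/float). There are no branches,
    so no phi instructions are needed in this simplified setting.
    """
    current = {}   # variable name -> its latest SSA version
    result = []
    for pc, (op, dest, *sources) in enumerate(quads):
        # Sources refer to the versions that were live *before* this
        # instruction, so rename them before recording the new version.
        renamed = [
            (src, current[src]) if isinstance(src, str) else src
            for src in sources
        ]
        current[dest] = pc   # version = index of the defining instruction
        result.append((op, (dest, pc), *renamed))
    return result


program = [
    ("copy", "j", 0),
    ("add", "j", "j", 1),
    ("add", "j", "j", 1),
    ("copy", "t", "j"),
]
for quad in to_ssa(program):
    print(quad)
```

Handling loops and if-branches additionally requires placing phi instructions at the points where control flow merges, which this sketch deliberately leaves out.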
Example: Tcl code to cleaned-up quadcode
proc cos {x {n 16}} {
    set x [expr {double($x)}]
    set n [expr {int($n)}]
    set j 0
    set s 1.0
    set t 1.0
    set i 0
    while {[incr i] < $n} {
        set t [expr {
            -$t*$x*$x / [incr j] / [incr j]
        }]
        set s [expr {$s + $t}]
    }
    return $s
}
0: param {var x} {arg 0}
1: param {var n} {arg 1}
2: invoke {var x} {literal tcl::mathfunc::double} {var x}
3: invoke {var n} {literal tcl::mathfunc::int} {var n}
4: copy {var j} {literal 0}
5: copy {var s} {literal 1.0}
6: copy {var t} {literal 1.0}
7: copy {var i} {literal 0}
8: jump {pc 18}
9: uminus {temp 0} {var t}
10: mult {temp 0} {temp 0} {var x}
11: mult {temp 0} {temp 0} {var x}
12: add {var j} {var j} {literal 1}
13: div {temp 0} {temp 0} {var j}
14: add {var j} {var j} {literal 1}
15: div {temp 0} {temp 0} {var j}
16: copy {var t} {temp 0}
17: add {var s} {var s} {temp 0}
18: add {var i} {var i} {literal 1}
19: lt {temp 0} {var i} {var n}
20: jumpTrue {pc 9} {temp 0}
21: return {} {var s}
Note that this is before SSA analysis
Example: In SSA form
0: param {var x 0} {arg 0}
1: param {var n 1} {arg 1}
2: invoke {var x 2} {literal tcl::mathfunc::double} {var x 0}
3: invoke {var n 3} {literal tcl::mathfunc::int} {var n 1}
4: copy {var j 4} {literal 0}
5: copy {var s 5} {literal 1.0}
6: copy {var t 6} {literal 1.0}
7: copy {var i 7} {literal 0}
8: jump {pc 18}
9: uminus {temp 0 9} {var t 21}
10: mult {temp 0 10} {temp 0 9} {var x 2}
11: mult {temp 0 11} {temp 0 10} {var x 2}
12: add {var j 12} {var j 19} {literal 1}
13: div {temp 0 13} {temp 0 11} {var j 12}
14: add {var j 14} {var j 12} {literal 1}
15: div {temp 0 15} {temp 0 13} {var j 14}
16: copy {var t 16} {temp 0 15}
17: add {var s 17} {var s 20} {temp 0 15}
18: confluence
19: phi {var j 19} {var j 4} {pc 8} {var j 14} {pc 17}
20: phi {var s 20} {var s 5} {pc 8} {var s 17} {pc 17}
21: phi {var t 21} {var t 6} {pc 8} {var t 16} {pc 17}
22: phi {var i 22} {var i 7} {pc 8} {var i 23} {pc 17}
23: add {var i 23} {var i 22} {literal 1}
24: lt {temp 0 24} {var i 23} {var n 3}
25: jumpTrue {pc 9} {temp 0 24}
26: return {} {var s 20}
The Types of Tcl
- Tcl isn’t entirely typeless
  - Our values have types
  - String, Integer, Double-precision float, Boolean, List, Dictionary, etc.
- But everything is a string
  - All other types are formally subtypes of string
[Figure: type lattice diagram; nodes: string, list, dict, numeric, boolean, double, integer, bool int, BOTTOM]
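When two control-flow paths give a variable different types, a type analysis needs the most specific type that covers both. One common way to express this is a join over a subtype lattice, sketched below in Python. The exact edges of the diagram cannot be recovered from this transcript, so the `SUPERTYPES` table is an assumed reading of it, and the slides do not say that the analysis is implemented this way.

```python
# Hypothetical reading of the diagram: each type maps to its direct supertypes.
SUPERTYPES = {
    "string": [],
    "list": ["string"],
    "dict": ["string"],
    "numeric": ["string"],
    "boolean": ["string"],
    "double": ["numeric"],
    "integer": ["numeric"],
    "bool int": ["integer", "boolean"],
    "BOTTOM": ["double", "bool int", "list", "dict"],
}


def ancestors(t):
    """Return t together with all of its (transitive) supertypes."""
    seen = {t}
    for parent in SUPERTYPES[t]:
        seen |= ancestors(parent)
    return seen


def join(a, b):
    """Least common supertype of a and b in the assumed lattice."""
    common = ancestors(a) & ancestors(b)
    # The least upper bound is the common supertype that lies below
    # every other common supertype.
    for candidate in common:
        if common <= ancestors(candidate):
            return candidate
    raise ValueError("no least upper bound; hierarchy is not a lattice")


print(join("double", "integer"))
print(join("bool int", "integer"))
print(join("list", "double"))
```

Under these assumed edges, merging a double with an integer yields `numeric`, while merging a list with a double falls all the way back to `string`, in line with the slide's point that every other type is a subtype of string.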
Example: Determined Types
- Variable types inferred:
  - DOUBLE (i.e., proven to only ever contain a floating point number)
    - var x 0, var x 2, var t 8, var t 37, temp 0 16, …
  - INT (i.e., proven to only ever contain an integer of unknown width)
    - var n 1, var n 4, var j 10, var i 12, var j 35, var j 22, var j 26, …
  - INT BOOLEAN (i.e., proven to only ever contain the values 0 or 1)
    - var j 6, var i 9, temp 0 41, …
- Return type inferred:
  - DOUBLE (i.e., always succeeds, always produces a floating point number)
Neat Tech along the Way
- Uses TclBDD as Reasoning Engine
  - Datalog is a clean way to express complex programs
  - Good for computing properties
  - Stops us from going mad!
  - (presented last year)
- Might be possible to use quadcode itself as a bytecode-interpreted execution target
  - Totally not our aim, but it is quite a bit cleaner
  - Not yet studied
We’re at the Station…
Generating LLVM IR
Generating LLVM
- LLVM Intermediate Representation (IR) is very concrete
  - Lower level than C
  - Virtual Assembler
- Each Tcl procedure goes to two functions
  1. Body of procedure
  2. “Thunk” to connect body to Tcl
- Each quadcode instruction goes to a non-branching sequence of IR opcodes
  - Keep pattern of basic blocks
  - Except branches, which branch of course
Compiling Instructions: Add
- Adding two floats is trivial conversion
    %s = fadd double %phi_s, %tmp.08
- Adding two integers is not, as we don’t know the bit width
  - So call a helper function!
    %j = call %INT @tcl.add(%INT %phi_j, %INT %k)
- The INT type is really a discriminated union
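The idea of INT as a discriminated union can be sketched in Python. The flag convention (0 for a 32-bit value, 1 for a 64-bit value) is inferred from the `tcl.typecast.dbl` listing elsewhere in the deck. The class `TclInt`, the function `tcl_add`, and the widening and overflow behaviour are assumptions for illustration, since the slides do not show the body of `@tcl.add`.

```python
from dataclasses import dataclass

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


@dataclass(frozen=True)
class TclInt:
    """Sketch of a discriminated union: flag 0 -> 32-bit, flag 1 -> 64-bit."""
    flag: int
    value: int

    @staticmethod
    def of(value):
        """Tag a Python int with the narrowest width that can hold it."""
        fits32 = INT32_MIN <= value <= INT32_MAX
        return TclInt(0 if fits32 else 1, value)


def tcl_add(a, b):
    """Hypothetical helper: add two tagged integers, widening on demand."""
    total = a.value + b.value
    if not INT64_MIN <= total <= INT64_MAX:
        # What the real helper does here is not stated in the slides.
        raise OverflowError("result needs more than 64 bits")
    return TclInt.of(total)


print(tcl_add(TclInt.of(1), TclInt.of(2)))
print(tcl_add(TclInt.of(INT32_MAX), TclInt.of(1)))
```

Because the width is only known at run time, the generated code cannot pick a single machine `add` in advance, which is why the slide routes integer addition through a helper that the optimizer can later inline.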
Compiling Instructions: Add
- Many ways to add
  - Which to use in particular situation?
- How we do it:
  - Look at the argument types (guaranteed known)
  - Look up TclOO method in code issuer to actually get how to issue code
    - Add the types to the method name
  - Unknown method handler generates normal typecasts
    - Just need to specify the “interesting” cases
Example: Issuing an Add
- Want to issue an add:
    add {var a 1} {var b 2} {var c 3}
- Look up argument types:
    {var b 2} → DOUBLE
    {var c 3} → INT
- Call issuer method add(DOUBLE,INT)
  - Doesn’t exist, build from add(DOUBLE,DOUBLE) and typecaster
- End up with required instructions, perhaps:
    %45 = call double @tcl.typecast.dbl(%INT %c.3)
    %a.1 = fadd double %b.2, %45
  (Callout: @tcl.typecast.dbl is an Internal Standard Library function)
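The lookup-with-fallback scheme described above can be sketched in Python, with `__getattr__` standing in for the TclOO unknown-method handler. This is an illustration only: the class `Issuer`, its method names, and the emitted text format are assumptions, and the fallback shown here always converts to double, which is just the one case worked through on the slide.

```python
class Issuer:
    """Toy code issuer: method names encode the operand types."""

    def __init__(self):
        self.lines = []    # emitted IR-like text
        self.counter = 0

    def fresh(self):
        """Return a new temporary name."""
        self.counter += 1
        return f"%tmp{self.counter}"

    # The one "interesting" case that is written out by hand.
    def add_DOUBLE_DOUBLE(self, dest, left, right):
        self.lines.append(f"{dest} = fadd double {left}, {right}")

    def cast_to_double(self, operand, type_name):
        if type_name == "DOUBLE":
            return operand
        tmp = self.fresh()
        self.lines.append(
            f"{tmp} = call double @tcl.typecast.dbl(%{type_name} {operand})"
        )
        return tmp

    def __getattr__(self, name):
        # Plays the role of the unknown-method handler: Python only calls
        # this when no attribute with this exact name exists.
        parts = name.split("_")
        if len(parts) != 3 or parts[0] != "add":
            raise AttributeError(name)
        _, left_type, right_type = parts

        def issue(dest, left, right):
            left_value = self.cast_to_double(left, left_type)
            right_value = self.cast_to_double(right, right_type)
            self.add_DOUBLE_DOUBLE(dest, left_value, right_value)

        return issue

    def add(self, dest, left, right, types):
        """Dispatch on the operand types by building the method name."""
        method = getattr(self, f"add_{types[left]}_{types[right]}")
        method(dest, left, right)


issuer = Issuer()
issuer.add("%a.1", "%b.2", "%c.3", {"%b.2": "DOUBLE", "%c.3": "INT"})
print("\n".join(issuer.lines))
```

In a fuller version, integer addition would be one of the hand-written cases with its own method, so the generic fallback would never be asked to turn an integer add into a floating-point one.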
The Internal Standard Library
- Collection of Functions to be Inlined by LLVM Optimizer
- Implement many more complex operations
; casts from our structured INT to a double-precision float
define hidden double @tcl.typecast.dbl(%INT %x) #0 {
  ; extract the fields
  %x.flag = extractvalue %INT %x, 0
  %x.32 = extractvalue %INT %x, 1
  %x.64 = extractvalue %INT %x, 2
  ; determine what the 64-bit value is
  %is32bit = icmp eq i32 %x.flag, 0
  %value32bit = sext i32 %x.32 to i64
  %value = select i1 %is32bit, i64 %value32bit, i64 %x.64
  ; perform the cast and return it
  %casted = sitofp i64 %value to double
  ret double %casted
}
Optimization
- A critical step of IR generation is to run the optimizer
  - Cleans up the code hugely
  - Inlines functions
  - Removes dead code paths
- We have much of Tcl API annotated to help the optimizer understand it
  - Documents guarantees and assumptions
Example: Optimized COS body
%6 = fmul double %phi_t64, %x
%7 = fmul double %6, %x
%tmp.04 = fsub double -0.000000e+00, %7
%8 = extractvalue %INT %phi_j62, 0
%9 = icmp eq i32 %8, 0
%10 = extractvalue %INT %phi_j62, 1
%11 = sext i32 %10 to i64
%12 = extractvalue %INT %phi_j62, 2
%x.6425.i43 = select i1 %9, i64 %11, i64 %12
%z.643.i44 = add i64 %x.6425.i43, 1
%cast = sitofp i64 %z.643.i44 to double
%tmp.05 = fdiv double %tmp.04, %cast
%z.643.i = add i64 %x.6425.i43, 2
%13 = insertvalue %INT { i32 1, i32 undef, i64 undef }, i64 %z.643.i, 2
%cast7 = sitofp i64 %z.643.i to double
%tmp.08 = fdiv double %tmp.05, %cast7
%s = fadd double %phi_s63, %tmp.08
The Other Types
- Lists and Dictionaries are treated as Strings
  - Mapped to a Tcl_Obj* reference
  - Lifetime management used to control reference counting efficiently
- Failing operations become tagged derived types
  - Failures cause jump to exception handling code
- The BOTTOM type only occurs in functions that cannot return
  - If they return, they do so by an error
Neat Tech along the Way
- Closures
  - Callbacks which capture local variables
- Locally-scoped Variables
  - Easy way to stop variables from one place spreading elsewhere
  - Prevented many nasty bugs
Example from Standard Library Builder
# Create local LLVM function: tcl.int.32
set f [$m local "tcl.int.32" int<-INT ReadNone]
params x
build {
    my condBr [my isInt32 $x] $x32 $x64
label x32:
    my ret [my int.32 $x]
label x64:
    my ret [my cast(int) [my int.64 $x]]
}
# Make closure to create call to tcl.int.32
my closure getInt32 {arg {resultName ""}} {
    my call [$f ref] [list $arg] $resultName
}
Heading Out…
So, is this thing FAST?
Performance Categories
- Numeric Code
  - Test pure integer functions with iterative Fibonacci
  - Test floating point functions with cosines
- Reference-handling Code
  - Test string handling functions with complex string replacement
  - Test list handling functions with list joining
  - Test dictionary/array functions with counting words in a list
- Error-path Code
  - Test exception handling with non-trivial try usage
Performance
Category   Test         Uncompiled (µs)  Compiled (µs)  Acceleration (%)  Target reached?
Numeric    fib          12.15            0.4758         2453              ✓✓
Numeric    cos          6.277            0.3936         1495              ✓✓
Reference  replacing    1.233            0.8792         40                ✗
Reference  listjoin     2.300            0.6946         231               ✓
Reference  wordcount    18.67            5.660          230               ✓
Error      errortester  13.73            4.999          175               ✓-ish
Looking Great!
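The slide does not define the Acceleration column. The reported figures are consistent, to within about one percentage point, with (uncompiled / compiled − 1) × 100, which is an inference from the numbers rather than something the authors state. The Python sketch below recomputes the column under that assumption; the timings are copied from the table.

```python
# (test name, uncompiled µs, compiled µs, acceleration % as reported)
RESULTS = [
    ("fib", 12.15, 0.4758, 2453),
    ("cos", 6.277, 0.3936, 1495),
    ("replacing", 1.233, 0.8792, 40),
    ("listjoin", 2.300, 0.6946, 231),
    ("wordcount", 18.67, 5.660, 230),
    ("errortester", 13.73, 4.999, 175),
]


def acceleration_percent(uncompiled, compiled):
    """Percentage by which the compiled run beats the uncompiled one."""
    return (uncompiled / compiled - 1.0) * 100.0


for name, before, after, reported in RESULTS:
    computed = acceleration_percent(before, after)
    ratio = before / after
    print(f"{name:12s} {ratio:6.2f}x  {computed:7.1f}%  (reported {reported}%)")
```

Read this way, "2 times faster" corresponds to 100% and "10 times faster" to 900%, so both numeric tests run well over ten times faster, while the string replacement test gains only about 40%.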
Summary and Analysis
- Numeric code is hugely faster
  - Typically much more than 10 times faster!
- Reference management code is nicely faster
  - Often around 2–3 times faster
  - Automatically detecting how to unshare objects
- String code largely unaffected
  - Critically dependent on buffer management
  - Might also be due to code used for testing
- Error code mostly faster
  - Could still do better, but not usually critical path
Going Fast!
Looking to the Future
Where Next?
- Finish filling out translation from bytecode
  - Unset
  - Introspection
- Address slow speed of compilation
  - Resulting code is fast, but process to get to it…
- How to integrate into Tcl?
  - When to compile?
  - When to cache?
  - How to use LLVM practically?
  - What extensions to Tcl’s C API are needed?
Advanced Compilation
- Compilation between procedures
  - Can we use type info more extensively?
- Access to global variables
  - Currently local-variable only
  - Traces, variable scopes, etc.
- Other types of compilable things
  - Lambdas, methods, …
Longer-term Questions
- What changes should we make in Tcl in light of this?
  - Already some ideas:
    - Change incr to support floats
    - Some way to annotate suggested types on arguments
- If we bite the LLVM bullet, what other changes follow?
  - Need to link to C++ libraries to use LLVM
  - Implement official C++ API to Tcl?
