LEARNING INPUT TOKENS FOR EFFECTIVE FUZZING
BJÖRN MATHIS, RAHUL GOPINATH, ANDREAS ZELLER
FUZZING - THE ART OF AUTOMATIC BUG FINDING
FUZZER
7245
C4tscs
X + 0
PROGRAM UNDER TEST
COMPLEX INPUT STRUCTURES NEED SYNTACTIC FUZZING
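Why purely random input struggles with structured formats can be sketched in a few lines. The following is a minimal illustration and is not taken from the slides: it assumes a toy input language in which single letters or digits are joined by + or - (spaces omitted), and the names `is_valid`, `random_input`, and `acceptance_rate` are hypothetical helpers.

```python
import random
import string

OPERANDS = string.ascii_letters + string.digits
OPERATORS = "+-"
ALPHABET = OPERANDS + OPERATORS


def is_valid(text):
    """Accept operand (operator operand)*, e.g. "X+0" (toy grammar, an assumption)."""
    if len(text) % 2 == 0:
        return False  # empty input, or input that ends in an operator
    for i, c in enumerate(text):
        expected = OPERANDS if i % 2 == 0 else OPERATORS
        if c not in expected:
            return False
    return True


def random_input(rng, length):
    """Draw each character independently, as a purely random fuzzer would."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def acceptance_rate(trials, length, seed=0):
    """Fraction of random inputs of the given length that the toy checker accepts."""
    rng = random.Random(seed)
    accepted = sum(is_valid(random_input(rng, length)) for _ in range(trials))
    return accepted / trials


if __name__ == "__main__":
    print(is_valid("X+0"), is_valid("C4tscs"))
    print(acceptance_rate(trials=10_000, length=5))
```

Under these assumptions a random five-character string is valid with probability (62/64)^3 × (2/64)^2, roughly 0.09 %, so almost every random input is rejected before it reaches any deeper program logic.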
PFUZZER - SURVIVING THE PARSING STAGE
PFUZZER
def parse_exp(i):
    c = input[i]
    if isDigit(c):
        parse_op(i + 1)
    elif isAlpha(c):
        parse_op(i + 1)

def parse_op(i):
    c = input[i]
    if c == '-':
        parse_exp(i + 1)
    elif c == '+':
        parse_exp(i + 1)
    else:
        raise InvalidSyntax
INPUTS TRIED BY PFUZZER, IN ORDER: &, X, X @, X +, X + 0
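The character-by-character progression on the pFuzzer slide (&, X, X @, X +, X + 0) can be imitated with a small runnable sketch. The slides do not spell out how pFuzzer picks the next character, so this version simply tries candidates in a fixed order and keeps the first one the parser does not reject. The `Incomplete` exception, the rejecting `else` branch in `parse_exp`, the omission of spaces, and the `grow_input` helper are all assumptions added for illustration.

```python
class InvalidSyntax(Exception):
    """Raised when the parser rejects a character."""


class Incomplete(Exception):
    """Raised when the input ends where an operand is still expected (assumption)."""


def parse_exp(text, i):
    if i >= len(text):
        raise Incomplete()
    c = text[i]
    if c.isdigit() or c.isalpha():
        parse_op(text, i + 1)
    else:
        # The slide code has no else branch here; rejecting is an assumption.
        raise InvalidSyntax(c)


def parse_op(text, i):
    if i >= len(text):
        return  # an expression may end after an operand
    c = text[i]
    if c in "+-":
        parse_exp(text, i + 1)
    else:
        raise InvalidSyntax(c)


def grow_input(candidates, target_length):
    """Extend the input one character at a time, keeping only accepted characters."""
    text = ""
    while len(text) < target_length:
        for c in candidates:
            try:
                parse_exp(text + c, 0)
            except InvalidSyntax:
                continue  # rejected: try the next candidate character
            except Incomplete:
                pass  # parser ran out of input: the prefix is still viable
            text += c
            break
        else:
            raise RuntimeError("no candidate character was accepted")
    return text


if __name__ == "__main__":
    print(grow_input("&X@+0", 3))
```

With the candidate order `&X@+0` the search discards `&`, keeps `X`, discards `&`, `X`, and `@` in the operator position, keeps `+`, and finishes with a second operand. Note that the result may be an incomplete prefix (for example `X+`) if `target_length` stops right after an operator.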
TOKENIZATION - COMPLEX PARSERS
X + 0
TOKENIZER
T_ALPHA T_PLUS T_DIGIT
PARSER
def parse_exp(i):
    c = input[i]
    token = tokenize(c)
    if token == T_DIGIT:
        parse_op(i + 1)
    elif token == T_ALPHA:
        parse_op(i + 1)

def tokenize(c):
    if isDigit(c):
        return T_DIGIT
    elif isAlpha(c):
        return T_ALPHA
    elif c == '-':
        return T_MINUS
    elif c == '+':
        return T_PLUS
    else:
        raise InvalidToken

def parse_op(i):
    c = input[i]
    token = tokenize(c)
    if token == T_MINUS:
        parse_exp(i + 1)
    elif token == T_PLUS:
        parse_exp(i + 1)
    else:
        raise InvalidSyntax
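The two-stage pipeline from the slide (input, tokenizer, token sequence, parser) can be written as a small runnable sketch. This is an illustration under stated assumptions rather than the code of any real subject: whitespace is skipped, the `Token` enum replaces the bare `T_*` constants, and `token_stream` and `parse` are hypothetical helper names.

```python
from enum import Enum, auto


class Token(Enum):
    T_DIGIT = auto()
    T_ALPHA = auto()
    T_MINUS = auto()
    T_PLUS = auto()


class InvalidToken(Exception):
    """Raised by the tokenizer for characters it does not know."""


class InvalidSyntax(Exception):
    """Raised by the parser for token sequences it does not accept."""


def tokenize(c):
    if c.isdigit():
        return Token.T_DIGIT
    elif c.isalpha():
        return Token.T_ALPHA
    elif c == "-":
        return Token.T_MINUS
    elif c == "+":
        return Token.T_PLUS
    raise InvalidToken(c)


def token_stream(text):
    """Stage 1: turn characters into tokens (skipping spaces is an assumption)."""
    return [tokenize(c) for c in text if not c.isspace()]


def parse(tokens):
    """Stage 2: accept operand (operator operand)* purely on the token level."""
    expect_operand = True
    for token in tokens:
        if expect_operand and token not in (Token.T_DIGIT, Token.T_ALPHA):
            raise InvalidSyntax(token)
        if not expect_operand and token not in (Token.T_MINUS, Token.T_PLUS):
            raise InvalidSyntax(token)
        expect_operand = not expect_operand
    if expect_operand:
        raise InvalidSyntax("input ends where an operand is expected")


if __name__ == "__main__":
    tokens = token_stream("X + 0")
    print([t.name for t in tokens])
    parse(tokens)
```

The parser never sees a character, only token values. A tool that watches character comparisons alone therefore learns little about what the parser expects next.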
DYNAMIC TAINTING - LOOKING INTO A PROGRAM
INPUT: X + 0
TOKENS OBSERVED WHILE PROCESSING X + 0, IN ORDER: T_DIGIT, T_ALPHA, T_MINUS, T_PLUS, T_DIGIT
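The order in which token names appear on this slide for the input X + 0 can be reproduced with a lightweight stand-in for dynamic tainting. The sketch below does not track taint through arbitrary data flow. It assumes instead that every token value remembers the input index it came from and logs each comparison made against it. `TrackedToken`, `observed_comparisons`, the end-of-input handling, and the removal of spaces are hypothetical additions.

```python
T_DIGIT, T_ALPHA, T_MINUS, T_PLUS = "T_DIGIT", "T_ALPHA", "T_MINUS", "T_PLUS"


class InvalidToken(Exception):
    pass


class InvalidSyntax(Exception):
    pass


class TrackedToken:
    """Token that knows its input index and records every equality check."""

    def __init__(self, kind, index, log):
        self.kind = kind
        self.index = index
        self.log = log

    def __eq__(self, other):
        self.log.append((self.index, other))  # (input position, token compared against)
        return self.kind == other


def tokenize(text, i, log):
    c = text[i]
    if c.isdigit():
        kind = T_DIGIT
    elif c.isalpha():
        kind = T_ALPHA
    elif c == "-":
        kind = T_MINUS
    elif c == "+":
        kind = T_PLUS
    else:
        raise InvalidToken(c)
    return TrackedToken(kind, i, log)


def parse_exp(text, i, log):
    if i >= len(text):
        raise InvalidSyntax("operand expected")
    token = tokenize(text, i, log)
    if token == T_DIGIT:
        parse_op(text, i + 1, log)
    elif token == T_ALPHA:
        parse_op(text, i + 1, log)
    else:
        raise InvalidSyntax(text[i])


def parse_op(text, i, log):
    if i >= len(text):
        return  # an expression may end after an operand (assumption)
    token = tokenize(text, i, log)
    if token == T_MINUS:
        parse_exp(text, i + 1, log)
    elif token == T_PLUS:
        parse_exp(text, i + 1, log)
    else:
        raise InvalidSyntax(text[i])


def observed_comparisons(text):
    """Run the parser and return the token names it compared against, in order."""
    log = []
    parse_exp(text.replace(" ", ""), 0, log)
    return [name for _, name in log]


if __name__ == "__main__":
    print(observed_comparisons("X + 0"))
    # → ['T_DIGIT', 'T_ALPHA', 'T_MINUS', 'T_PLUS', 'T_DIGIT']
```

The log pairs each comparison with an input position, which is the kind of information needed to relate input characters to the tokens a parser checks for.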
LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE
LFUZZER
INPUTS TRIED BY LFUZZER, IN ORDER: &, X, X 3, X +, X + 0
TOKEN MAPPING
String            Token
A .. Z, a .. z    T_ALPHA
0 .. 9            T_DIGIT
-                 T_MINUS
+                 T_PLUS
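The token mapping table on this slide can be read as a lookup from token names to the strings that produce them. As a minimal sketch, assuming the mapping is already known, a sequence of token names can be turned back into a concrete input. `TOKEN_MAPPING`, `instantiate`, and `render` are hypothetical names, and joining with spaces merely mirrors how inputs are printed on the slides.

```python
import random
import string

# Token name -> characters that yield this token (taken from the slide's table).
TOKEN_MAPPING = {
    "T_ALPHA": string.ascii_letters,
    "T_DIGIT": string.digits,
    "T_MINUS": "-",
    "T_PLUS": "+",
}


def instantiate(token, rng):
    """Pick one concrete string for a token name."""
    return rng.choice(TOKEN_MAPPING[token])


def render(tokens, rng):
    """Turn a token sequence into an input string, one concrete string per token."""
    return " ".join(instantiate(token, rng) for token in tokens)


if __name__ == "__main__":
    rng = random.Random(0)
    print(render(["T_ALPHA", "T_PLUS", "T_DIGIT"], rng))
```

Working on token names instead of characters means that a rejected token can be swapped for a different token class in one step, rather than by guessing individual characters.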
LFUZZER - BOOSTING FUZZERS
TOKENS: 0 .. 9, A .. Z, a .. z, +, -
SAMPLE INPUTS: 0 + 5, a + 6
FUZZER: AFL, MIMID*, LIBFUZZER, …, YOUR FAVORITE FUZZER
INPUTS: A - K, 8 - I + P - q, R + y - 6 + u, …
PROGRAM UNDER TEST
* In: "Mining Input Grammars from Dynamic Control Flow" at FSE 2020
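One way learned tokens could be handed to an existing fuzzer is a dictionary file. The slides do not say how the tokens are passed on, so the following is only a sketch under that assumption. It writes tokens in the `name="value"` line format that AFL dictionaries use. The entry names `token_0`, `token_1`, and so on, and the helpers `escape` and `to_afl_dictionary`, are hypothetical.

```python
def escape(value):
    """Escape a token for a quoted dictionary value, using \\xNN where needed."""
    out = []
    for byte in value.encode("utf-8"):
        ch = chr(byte)
        # Quote, backslash, and non-printable bytes are written as hex escapes.
        if ch in '\\"' or not (32 <= byte < 127):
            out.append(f"\\x{byte:02x}")
        else:
            out.append(ch)
    return "".join(out)


def to_afl_dictionary(tokens):
    """Render one dictionary entry per token, one entry per line."""
    lines = [f'token_{n}="{escape(tok)}"' for n, tok in enumerate(tokens)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_afl_dictionary(["+", "-", "0", "a"]), end="")
```

AFL reads such a file through its `-x` option. Other fuzzers named on the slide would need their own input format for tokens and sample inputs.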
EVALUATION - TOKENS AND COVERAGE
[Bar chart] NUMBER OF VALID TOKENS EXTRACTED. X-axis: Subject (csv, ini, cjson, lisp, tinyc, mjs). Y-axis: Tokens Extracted (0 to 80). Series: String Extraction, lFuzzer.
[Bar chart] NUMBER OF INVALID TOKENS EXTRACTED. X-axis: Subject (csv, ini, cjson, lisp, tinyc, mjs). Y-axis: Tokens Extracted (0 to 200). Series: String Extraction, lFuzzer.
[Line chart] COVERAGE OVER TIME FOR MJS. X-axis: Time (h), 0 to 24. Y-axis: Coverage (%), 0 to 35. Series: AFL, AFL_Dict, pFuzzer, pFuzzer + AFL, lFuzzer + AFL.
GITHUB.COM/UDS-SE/LFUZZER

lFuzzer - Learning Input Tokens for Effective Fuzzing

  • 1. LEARNING INPUT TOKENS FOR EFFECTIVE FUZZING BJÖRN MATHIS, RAHUL GOPINATH, ANDREAS ZELLER
  • 2. FUZZING - THE ART OF AUTOMATIC BUG FINDING 2 PROGRAM UNDER TESTFUZZER
  • 3. FUZZING - THE ART OF AUTOMATIC BUG FINDING 2 PROGRAM UNDER TEST 7245 FUZZER
  • 4. FUZZING - THE ART OF AUTOMATIC BUG FINDING 2 PROGRAM UNDER TEST 7245 FUZZER
  • 5. FUZZING - THE ART OF AUTOMATIC BUG FINDING 2 PROGRAM UNDER TEST 7245 FUZZER
  • 6. FUZZING - THE ART OF AUTOMATIC BUG FINDING 2 PROGRAM UNDER TESTFUZZER
  • 7. FUZZING - THE ART OF AUTOMATIC BUG FINDING 2 PROGRAM UNDER TEST C4tscs FUZZER
  • 8. FUZZING - THE ART OF AUTOMATIC BUG FINDING 2 PROGRAM UNDER TEST C4tscs FUZZER
  • 9. FUZZING - THE ART OF AUTOMATIC BUG FINDING 2 PROGRAM UNDER TEST C4tscs FUZZER
  • 10. PROGRAM UNDER TEST FUZZING - THE ART OF AUTOMATIC BUG FINDING 3 FUZZER
  • 13. PROGRAM UNDER TEST FUZZING - THE ART OF AUTOMATIC BUG FINDING 3 FUZZER C4tscs
  • 15. PROGRAM UNDER TEST FUZZING - THE ART OF AUTOMATIC BUG FINDING 3 FUZZER
  • 16. PROGRAM UNDER TEST FUZZING - THE ART OF AUTOMATIC BUG FINDING 3 FUZZER X + 0
  • 19. PROGRAM UNDER TEST FUZZING - THE ART OF AUTOMATIC BUG FINDING 3 FUZZER X + 0 COMPLEX INPUT STRUCTURES NEED SYNTACTIC FUZZING
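The point of the slides above, that random inputs such as C4tscs are rejected while X + 0 is accepted, can be made concrete with a small experiment. This is only a sketch: the grammar (single-character operands joined by + or -, no whitespace) is an assumption modelled on the toy parser shown later in the deck, and `is_valid`, `random_input` and `acceptance_rate` are hypothetical names.

```python
import random
import re
import string

# Assumed toy grammar: operand (operator operand)*, written without whitespace.
VALID = re.compile(r"[A-Za-z0-9]([+-][A-Za-z0-9])*")


def is_valid(text):
    """Return True if the whole text matches the assumed toy grammar."""
    return VALID.fullmatch(text) is not None


def random_input(rng, length):
    """Build a random string of printable ASCII characters."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(rng.choice(alphabet) for _ in range(length))


def acceptance_rate(trials=10_000, length=5, seed=0):
    """Fraction of purely random inputs that the toy grammar accepts."""
    rng = random.Random(seed)
    accepted = sum(is_valid(random_input(rng, length)) for _ in range(trials))
    return accepted / trials
```

Calling `acceptance_rate()` gives the share of random five-character strings that would survive parsing; for this grammar it is expected to be very small, which is the motivation for syntax-aware fuzzing.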
  • 20. PFUZZER - SURVIVING THE PARSING STAGE 4 PFUZZER def parse_exp(i): c = input[i] if isDigit(c): parse_op(i + 1) elif isAlpha(c): parse_op(i + 1) def parse_op(i): c = input[i] if c == '-': parse_exp(i + 1) elif c == '+': parse_exp(i + 1) else: raise InvalidSyntax
  • 21. PFUZZER - SURVIVING THE PARSING STAGE 4 PFUZZER & def parse_exp(i): c = input[i] if isDigit(c): parse_op(i + 1) elif isAlpha(c): parse_op(i + 1) def parse_op(i): c = input[i] if c == '-': parse_exp(i + 1) elif c == '+': parse_exp(i + 1) else: raise InvalidSyntax
  • 24. PFUZZER - SURVIVING THE PARSING STAGE 4 PFUZZER X def parse_exp(i): c = input[i] if isDigit(c): parse_op(i + 1) elif isAlpha(c): parse_op(i + 1) def parse_op(i): c = input[i] if c == '-': parse_exp(i + 1) elif c == '+': parse_exp(i + 1) else: raise InvalidSyntax
  • 27. PFUZZER - SURVIVING THE PARSING STAGE 4 PFUZZER X @ def parse_exp(i): c = input[i] if isDigit(c): parse_op(i + 1) elif isAlpha(c): parse_op(i + 1) def parse_op(i): c = input[i] if c == '-': parse_exp(i + 1) elif c == '+': parse_exp(i + 1) else: raise InvalidSyntax
  • 30. PFUZZER - SURVIVING THE PARSING STAGE 4 PFUZZER X + def parse_exp(i): c = input[i] if isDigit(c): parse_op(i + 1) elif isAlpha(c): parse_op(i + 1) def parse_op(i): c = input[i] if c == '-': parse_exp(i + 1) elif c == '+': parse_exp(i + 1) else: raise InvalidSyntax
  • 33. PFUZZER - SURVIVING THE PARSING STAGE 4 PFUZZER X + 0 def parse_exp(i): c = input[i] if isDigit(c): parse_op(i + 1) elif isAlpha(c): parse_op(i + 1) def parse_op(i): c = input[i] if c == '-': parse_exp(i + 1) elif c == '+': parse_exp(i + 1) else: raise InvalidSyntax
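The pFuzzer slides above build an input one character at a time (&, X, X @, X +, X + 0), keeping a character only when the parser does not reject it. A heavily simplified sketch of that loop follows. It is not the tool's implementation: it retries random characters instead of using the comparisons the parser makes, and the `Incomplete` exception, the `else` branch in `parse_exp`, the end-of-input handling and the absence of whitespace are all assumptions added to make the toy parser runnable.

```python
import random
import string


class InvalidSyntax(Exception):
    """The character at the failing position is not accepted there."""


class Incomplete(Exception):
    """The input is a valid prefix, but the parser needs more characters."""


def parse_exp(text, i):
    if i >= len(text):
        raise Incomplete(i)  # assumption: an operand is required here
    c = text[i]
    if c.isdigit() or c.isalpha():
        parse_op(text, i + 1)
    else:
        raise InvalidSyntax(i)  # assumption: the slide code has no else branch


def parse_op(text, i):
    if i >= len(text):
        return  # assumption: an expression may end after an operand
    c = text[i]
    if c == "-" or c == "+":
        parse_exp(text, i + 1)
    else:
        raise InvalidSyntax(i)


def is_valid_prefix(text):
    """True if the parser accepts the text or merely runs out of input."""
    try:
        parse_exp(text, 0)
    except Incomplete:
        return True
    except InvalidSyntax:
        return False
    return True


def fuzz(length, rng):
    """Grow an input character by character, discarding rejected characters."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    prefix = ""
    while len(prefix) < length:
        candidate = prefix + rng.choice(alphabet)
        if is_valid_prefix(candidate):
            prefix = candidate
    return prefix
```

For example, `fuzz(5, random.Random(0))` returns a five-character string that the toy parser accepts in full, since an odd-length valid prefix always ends on an operand.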
  • 35. TOKENIZATION - COMPLEX PARSERS 5 X + 0
  • 36. TOKENIZATION - COMPLEX PARSERS 5 X + 0 TOKENIZER
  • 37. TOKENIZATION - COMPLEX PARSERS 5 X + 0 TOKENIZER T_ALPHA T_PLUS T_DIGIT
  • 38. TOKENIZATION - COMPLEX PARSERS 5 X + 0 TOKENIZER T_ALPHA T_PLUS T_DIGIT PARSER
  • 39. 6 TOKENIZATION - COMPLEX PARSERS X + 0 TOKENIZER T_ALPHA T_PLUS T_DIGIT PARSER
  • 40. 6 TOKENIZATION - COMPLEX PARSERS def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax X + 0 TOKENIZER T_ALPHA T_PLUS T_DIGIT PARSER
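A runnable variant of the tokenizer and parser pair from the slide above may help. It is a sketch under assumptions: the `Token` enum, the iterative `parse` function (the slide uses mutual recursion) and the rule that an expression must end on an operand are illustrative choices, not taken from the talk.

```python
from enum import Enum, auto


class Token(Enum):
    T_DIGIT = auto()
    T_ALPHA = auto()
    T_MINUS = auto()
    T_PLUS = auto()


class InvalidToken(Exception):
    pass


class InvalidSyntax(Exception):
    pass


def tokenize(c):
    """Map a single input character to its token, as on the slide."""
    if c.isdigit():
        return Token.T_DIGIT
    elif c.isalpha():
        return Token.T_ALPHA
    elif c == "-":
        return Token.T_MINUS
    elif c == "+":
        return Token.T_PLUS
    raise InvalidToken(c)


def parse(text):
    """Tokenize, then check the shape operand (operator operand)*."""
    tokens = [tokenize(c) for c in text]
    expect_operand = True
    for token in tokens:
        is_operand = token in (Token.T_DIGIT, Token.T_ALPHA)
        if is_operand != expect_operand:
            raise InvalidSyntax(token)
        expect_operand = not expect_operand
    if expect_operand:
        # Empty input, or input that ends with an operator.
        raise InvalidSyntax("unexpected end of input")
    return tokens
```

Note that the parser only ever compares tokens, never characters, so a fuzzer that watches character comparisons inside the parser learns nothing about which characters are acceptable.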
  • 41. DYNAMIC TAINTING - LOOKING INTO A PROGRAM 7 def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 42. DYNAMIC TAINTING - LOOKING INTO A PROGRAM 7 X + 0 def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 47. DYNAMIC TAINTING - LOOKING INTO A PROGRAM 7 T_DIGIT T_ALPHA X + 0 def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 50. DYNAMIC TAINTING - LOOKING INTO A PROGRAM 7 T_DIGIT T_ALPHA T_MINUS T_PLUS X + 0 def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 53. DYNAMIC TAINTING - LOOKING INTO A PROGRAM 7 T_DIGIT T_ALPHA T_MINUS T_PLUS T_DIGIT X + 0 def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
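The tainting slides above connect each token the program produces back to the input character it came from. A minimal sketch of that idea, assuming a hypothetical `Tainted` wrapper that carries the index of the originating input character, could look as follows. Real taint tracking instruments the program under test rather than wrapping values by hand, so this only illustrates the bookkeeping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value plus the index of the input character it was derived from."""
    value: str
    origin: int


class InvalidToken(Exception):
    pass


def tokenize(c):
    """Tokenizer from the slides; the returned token inherits the taint of c."""
    ch = c.value
    if ch.isdigit():
        name = "T_DIGIT"
    elif ch.isalpha():
        name = "T_ALPHA"
    elif ch == "-":
        name = "T_MINUS"
    elif ch == "+":
        name = "T_PLUS"
    else:
        raise InvalidToken(ch)
    return Tainted(name, c.origin)


def learn_token_mapping(text):
    """Run the tokenizer on tainted input and record character -> token."""
    mapping = {}
    for index, ch in enumerate(text):
        token = tokenize(Tainted(ch, index))
        # The taint on the token tells us which input character produced it.
        mapping[text[token.origin]] = token.value
    return mapping
```

Running `learn_token_mapping("X+0")` yields the three associations shown on the slide: X with T_ALPHA, + with T_PLUS and 0 with T_DIGIT.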
  • 54. LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE 8 LFUZZER def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 55. LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE 8 LFUZZER & def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 58. LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE 8 LFUZZER X def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 61. LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE 8 LFUZZER X Tokenmapping String Token A .. Z, a .. z T_ALPHA 0 .. 9 T_DIGIT - T_MINUS + T_PLUS def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 62. LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE 8 LFUZZER Tokenmapping String Token A .. Z, a .. z T_ALPHA 0 .. 9 T_DIGIT - T_MINUS + T_PLUS def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 63. LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE 8 LFUZZER X 3 Tokenmapping String Token A .. Z, a .. z T_ALPHA 0 .. 9 T_DIGIT - T_MINUS + T_PLUS def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 66. LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE 8 LFUZZER X + Tokenmapping String Token A .. Z, a .. z T_ALPHA 0 .. 9 T_DIGIT - T_MINUS + T_PLUS def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
  • 69. LFUZZER - SURVIVING THE TOKENIZATION AND PARSING STAGE 8 LFUZZER X + 0 Tokenmapping String Token A .. Z, a .. z T_ALPHA 0 .. 9 T_DIGIT - T_MINUS + T_PLUS def parse_exp(i): c = input[i] token = tokenize(c) if token == T_DIGIT: parse_op(i + 1) elif token == T_ALPHA: parse_op(i + 1) def parse_op(i): c = input[i] token = tokenize(c) if token == T_MINUS: parse_exp(i + 1) elif token == T_PLUS: parse_exp(i + 1) else: raise InvalidSyntax def tokenize(c): if isDigit(c): return T_DIGIT elif isAlpha(c): return T_ALPHA elif c == '-': return T_MINUS elif c == '+': return T_PLUS else: raise InvalidToken
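Once a token mapping like the one on the slides above is known, a fuzzer can try one representative per token class at each position instead of every character. The sketch below illustrates this under assumptions: `TOKEN_CLASSES` is written out by hand from the slide's mapping table, and `accepts_prefix` is a hypothetical stand-in for running the program under test on a whitespace-free input.

```python
import random
import string

# Token mapping as shown on the slide (string -> token), grouped by token.
TOKEN_CLASSES = {
    "T_ALPHA": string.ascii_letters,
    "T_DIGIT": string.digits,
    "T_MINUS": "-",
    "T_PLUS": "+",
}


def accepts_prefix(text):
    """Stand-in for the program under test: operands and operators alternate."""
    for i, c in enumerate(text):
        if i % 2 == 0 and not c.isalnum():
            return False
        if i % 2 == 1 and c not in "+-":
            return False
    return True


def extend(prefix, rng):
    """Try one representative per token class; return (new prefix, tries)."""
    classes = list(TOKEN_CLASSES)
    rng.shuffle(classes)
    for tries, name in enumerate(classes, start=1):
        candidate = prefix + rng.choice(TOKEN_CLASSES[name])
        if accepts_prefix(candidate):
            return candidate, tries
    raise RuntimeError("no token class extends this prefix")


def generate(length, rng):
    """Build an accepted input token by token, counting program executions."""
    prefix, total_tries = "", 0
    while len(prefix) < length:
        prefix, tries = extend(prefix, rng)
        total_tries += tries
    return prefix, total_tries
```

With four token classes, each position needs at most four executions, whereas a character-level search over printable ASCII may need dozens of attempts before it hits + or -.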
  • 70. LFUZZER - BOOSTING FUZZERS 9
  • 71. LFUZZER - BOOSTING FUZZERS 9 TOKENS: 0 .. 9, A .. Z, a .. z, +, -
  • 72. LFUZZER - BOOSTING FUZZERS 9 TOKENS: 0 .. 9, A .. Z, a .. z, +, -; SAMPLE INPUTS: 0 + 5, a + 6
  • 73. LFUZZER - BOOSTING FUZZERS 9 TOKENS: 0 .. 9, A .. Z, a .. z, +, -; SAMPLE INPUTS: 0 + 5, a + 6; FUZZER: AFL, MIMID*, LIBFUZZER, …, YOURFAVORITEFUZZER. * In: "Mining Input Grammars from Dynamic Control Flow" at FSE 2020
  • 74. LFUZZER - BOOSTING FUZZERS 9 TOKENS: 0 .. 9, A .. Z, a .. z, +, -; SAMPLE INPUTS: 0 + 5, a + 6; FUZZER: AFL, MIMID*, LIBFUZZER, …, YOURFAVORITEFUZZER; INPUTS: A - K 8 - I + P - q R + y - 6 + u …. * In: "Mining Input Grammars from Dynamic Control Flow" at FSE 2020
  • 76. LFUZZER - BOOSTING FUZZERS 9 TOKENS: 0 .. 9, A .. Z, a .. z, +, -; SAMPLE INPUTS: 0 + 5, a + 6; FUZZER: AFL, MIMID*, LIBFUZZER, …, YOURFAVORITEFUZZER; INPUTS: A - K 8 - I + P - q R + y - 6 + u …; PROGRAM UNDER TEST. * In: "Mining Input Grammars from Dynamic Control Flow" at FSE 2020
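The slides above show learned tokens and sample inputs being handed to another fuzzer such as AFL. How the hand-off is done is not stated in the transcript; one plausible route, assumed here, is to write the tokens as an AFL dictionary file (`name="value"` entries) and the sample inputs as seed files. The function names, file names and directory layout below are hypothetical.

```python
from pathlib import Path


def afl_escape(token):
    """Escape a token for an AFL dictionary entry, using \\xNN where needed."""
    out = []
    for byte in token.encode("utf-8"):
        ch = chr(byte)
        if ch in '\\"' or not (32 <= byte < 127):
            out.append(f"\\x{byte:02x}")
        else:
            out.append(ch)
    return "".join(out)


def export_for_afl(tokens, samples, out_dir):
    """Write tokens to a dictionary file and samples to a seed directory."""
    out = Path(out_dir)
    seeds = out / "seeds"
    seeds.mkdir(parents=True, exist_ok=True)

    lines = [f'token_{i}="{afl_escape(t)}"' for i, t in enumerate(tokens)]
    dict_path = out / "tokens.dict"
    dict_path.write_text("\n".join(lines) + "\n", encoding="ascii")

    for i, sample in enumerate(samples):
        (seeds / f"sample_{i}").write_bytes(sample.encode("utf-8"))
    return dict_path, seeds
```

The resulting files could then be passed to AFL with its `-i` (seed directory) and `-x` (dictionary) options, for example `afl-fuzz -i seeds -o findings -x tokens.dict -- ./program_under_test`, where the program path is a placeholder.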
  • 77. EVALUATION - TOKENS AND COVERAGE 10
  • 78. EVALUATION - TOKENS AND COVERAGE 10 [Bar chart: NUMBER OF VALID TOKENS EXTRACTED. Subjects: csv, ini, cjson, lisp, tinyc, mjs. Y-axis: Tokens Extracted. Series: String Extraction, lFuzzer.]
  • 79. EVALUATION - TOKENS AND COVERAGE 10 [Bar chart: NUMBER OF VALID TOKENS EXTRACTED. Subjects: csv, ini, cjson, lisp, tinyc, mjs. Y-axis: Tokens Extracted. Series: String Extraction, lFuzzer.] [Bar chart: NUMBER OF INVALID TOKENS EXTRACTED. Same subjects, y-axis and series.]
  • 80. EVALUATION - TOKENS AND COVERAGE 10 [Bar chart: NUMBER OF VALID TOKENS EXTRACTED. Subjects: csv, ini, cjson, lisp, tinyc, mjs. Y-axis: Tokens Extracted. Series: String Extraction, lFuzzer.] [Bar chart: NUMBER OF INVALID TOKENS EXTRACTED. Same subjects, y-axis and series.] [Line chart: COVERAGE OVER TIME FOR MJS. X-axis: Time (h). Y-axis: Coverage (%). Series: AFL, AFL_Dict, pFuzzer, pFuzzer + AFL, lFuzzer + AFL.]