How To Make The Fastest Router In Python
Makoto Kuwata
https://github.com/kwatch/
PloneConference 2018 Tokyo
Abstract
‣ TL;DR
  ‣ You can make a router much faster (up to 10x).
‣ Requirements
  ‣ Python 3
  ‣ Experience with a web application framework (Django, Flask, Plone, etc.)
‣ Sample Code
  ‣ https://github.com/kwatch/router-sample/
  ‣ https://github.com/kwatch/keight/tree/python/ (Framework)
Table of Contents
‣ What is a Router?
‣ Linear Search
  ‣ Naive / Prefix String / Fixed Path Dictionary
‣ Regular Expression
  ‣ Naive / Smart / Optimized
‣ State Machine
‣ Conclusion
What is a Router?
‣ A router is a component of a web application framework (WAF).
‣ The router determines the request handler according to the request method and request path.
[Diagram: HTTP requests and responses flow between the Client, the App Server, and the WSGI App on the server side. Inside the WSGI App, the router determines "which handler?" (Handler A, Handler B, or Handler C).]
Request Handler Example
class BooksAPI(RequestHandler):           # ← handler class

    with on.path('/api/books/'):          # ← URL path

        @on('GET')                        # ← request method
        def do_index(self):               # ← handler func
            return {"action": "index"}

        @on('POST')                       # ← request method
        def do_create(self):              # ← handler func
            return {"action": "create"}
    ....
Request Handler Example
    ....
    with on.path('/api/books/{id:int}'):  # ← URL path

        @on('GET')                        # ← request method
        def do_show(self, id):            # ← handler func
            return {"action": "show", "id": id}

        @on('PUT')                        # ← request method
        def do_update(self, id):          # ← handler func
            return {"action": "update", "id": id}

        @on('DELETE')                     # ← request method
        def do_delete(self, id):          # ← handler func
            return {"action": "delete", "id": id}
Mapping Example: Request to Handler
mapping_table = [
    ## method   path                  class      func
    ("GET"   , r"/api/books/"      , BooksAPI , do_index),
    ("POST"  , r"/api/books/"      , BooksAPI , do_create),
    ("GET"   , r"/api/books/(\d+)" , BooksAPI , do_show),
    ("PUT"   , r"/api/books/(\d+)" , BooksAPI , do_update),
    ("DELETE", r"/api/books/(\d+)" , BooksAPI , do_delete),
    ("GET"   , r"/api/orders/"     , OrdersAPI, do_index),
    ("POST"  , r"/api/orders/"     , OrdersAPI, do_create),
    ("GET"   , r"/api/orders/(\d+)", OrdersAPI, do_show),
    ("PUT"   , r"/api/orders/(\d+)", OrdersAPI, do_update),
    ("DELETE", r"/api/orders/(\d+)", OrdersAPI, do_delete),
    ....
]
Mapping Example: Request to Handler
mapping_list = [
    ## path                  class      {method: func}
    (r"/api/books/"      , BooksAPI , {"GET": do_index,
                                       "POST": do_create}),
    (r"/api/books/(\d+)" , BooksAPI , {"GET": do_show,
                                       "PUT": do_update,
                                       "DELETE": do_delete}),
    (r"/api/orders/"     , OrdersAPI, {"GET": do_index,
                                       "POST": do_create}),
    (r"/api/orders/(\d+)", OrdersAPI, {"GET": do_show,
                                       "PUT": do_update,
                                       "DELETE": do_delete}),
    ....
]
Same information as mapping_table, in a different format.
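The two mapping formats carry the same information, so one can be derived from the other. A minimal sketch of that conversion follows; `group_by_path` and the stand-in handler names are hypothetical and are not part of the sample code.

```python
from collections import OrderedDict

def group_by_path(mapping_table):
    # Collapse (method, path, class, func) rows into (path, class, {method: func}).
    grouped = OrderedDict()          # keeps the first-seen order of paths
    for method, path, klass, func in mapping_table:
        _, funcs = grouped.setdefault(path, (klass, {}))
        funcs[method] = func
    return [(path, klass, funcs) for path, (klass, funcs) in grouped.items()]


# Hypothetical stand-ins for the handler class and functions on the slides.
class BooksAPI: pass
def do_index(): pass
def do_create(): pass
def do_show(): pass

mapping_table = [
    ("GET",  r"/api/books/",      BooksAPI, do_index),
    ("POST", r"/api/books/",      BooksAPI, do_create),
    ("GET",  r"/api/books/(\d+)", BooksAPI, do_show),
]
mapping_list = group_by_path(mapping_table)
```

Grouping by path means the router matches each path once and then picks the function by request method with a dict lookup.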
Router Example
>>> router = Router(mapping_list)
>>> router.lookup("GET", "/api/books/")
(BooksAPI, do_index, [])
>>> router.lookup("GET", "/api/books/123")
(BooksAPI, do_show, [123])
Router Example
### 404 Not Found
>>> router.lookup("GET", "/api/books/123/comments")
(None, None, None)
### 405 Method Not Allowed
>>> router.lookup("POST", "/api/books/123")
(BooksAPI, None, [123])
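The three-element result is enough for the caller to tell 404 from 405. A minimal sketch of that decision, assuming the `(class, func, params)` convention shown above; `status_for` is a hypothetical helper, not part of the sample code.

```python
def status_for(lookup_result):
    # Map a router.lookup() result to an HTTP status code.
    klass, func, params = lookup_result
    if klass is None:
        return 404      # no path matched
    if func is None:
        return 405      # path matched, but not for this request method
    return 200          # handler found; the caller would invoke func with params
```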
Linear Search
(Naive)
Linear Search
mapping = [
    (r"/api/books/"      , BooksAPI , {"GET": do_index,
                                       "POST": do_create}),
    (r"/api/books/(\d+)" , BooksAPI , {"GET": do_show,
                                       "PUT": do_update,
                                       "DELETE": do_delete}),
    (r"/api/orders/"     , OrdersAPI, {"GET": do_index,
                                       "POST": do_create}),
    (r"/api/orders/(\d+)", OrdersAPI, {"GET": do_show,
                                       "PUT": do_update,
                                       "DELETE": do_delete}),
    ....
]
Router Class
class LinearNaiveRouter(Router):

    def __init__(self, mapping):
        self._mapping_list = [
            (compile_path(path), klass, funcs)
            for path, klass, funcs in mapping
        ]

    def lookup(self, req_meth, req_path):
        for rexp, klass, funcs in self._mapping_list:
            m = rexp.match(req_path)
            if m:
                params = [ int(v) for v in m.groups() ]
                func = funcs.get(req_meth)
                return klass, func, params
        return None, None, None
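The router classes call `compile_path()`, which the slides do not show. The following is only a sketch of what such a helper might look like, assuming `{name:type}` path parameters and two parameter types (`int`, `str`); the real helper in the sample code may differ.

```python
import re

# Assumed parameter types; the real framework may support more.
_TYPE_PATTERNS = {"int": r"\d+", "str": r"[^/]+"}
_PARAM = re.compile(r"\{(\w+)(?::(\w+))?\}")

def compile_path(path):
    # '/api/books/{id:int}' becomes a compiled regex equivalent to ^/api/books/(\d+)$
    parts = []
    pos = 0
    for m in _PARAM.finditer(path):
        parts.append(re.escape(path[pos:m.start()]))    # literal text before the parameter
        ptype = m.group(2) or "str"                      # '{id}' is treated as a str parameter
        parts.append("(%s)" % _TYPE_PATTERNS[ptype])     # one capture group per parameter
        pos = m.end()
    parts.append(re.escape(path[pos:]))                  # literal text after the last parameter
    return re.compile("^" + "".join(parts) + "$")
```

Anchoring with `^` and `$` makes `rexp.match()` reject longer paths such as `/api/books/123/comments`.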
Benchmark (Data)
mapping_list = [
    (r'/api/aaa'         , DummyAPI, {"GET": ...}),
    (r'/api/aaa/{id:int}', DummyAPI, {"GET": ...}),
    (r'/api/bbb'         , DummyAPI, {"GET": ...}),
    (r'/api/bbb/{id:int}', DummyAPI, {"GET": ...}),
    ....
    (r'/api/yyy'         , DummyAPI, {"GET": ...}),
    (r'/api/yyy/{id:int}', DummyAPI, {"GET": ...}),
    (r'/api/zzz'         , DummyAPI, {"GET": ...}),
    (r'/api/zzz/{id:int}', DummyAPI, {"GET": ...}),
]
### Benchmark environment:
###   AWS EC2 t3.nano, Ubuntu 18.04, Python 3.6.6
See the sample code for details.
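A rough sketch of how such benchmark data and a timing loop could be set up. It assumes one resource per letter (`aaa` through `zzz`, 52 entries in total), which the elided slide does not confirm. `build_mapping` and `bench` are hypothetical names, and the real benchmark script is in the sample code.

```python
import string
import time

class DummyAPI:            # placeholder handler class
    pass

def do_get():              # placeholder handler function
    pass

def build_mapping():
    # Assumption: /api/aaa, /api/aaa/{id:int}, ..., /api/zzz, /api/zzz/{id:int}
    mapping = []
    for c in string.ascii_lowercase:
        base = "/api/" + c * 3
        mapping.append((base, DummyAPI, {"GET": do_get}))
        mapping.append((base + "/{id:int}", DummyAPI, {"GET": do_get}))
    return mapping

def bench(lookup, req_path, n=1000000):
    # Seconds spent on n GET lookups of req_path.
    start = time.perf_counter()
    for _ in range(n):
        lookup("GET", req_path)
    return time.perf_counter() - start
```

With a router instance at hand, `bench(router.lookup, "/api/zzz/123")` would time the worst case for a linear search.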
Benchmark
[Bar chart: seconds per 1M requests (axis 0 to 50 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear Naive.]
Very fast at the top of the list (/api/aaa, /api/aaa/{id}).
Very slow at the bottom of the list (/api/zzz, /api/zzz/{id}).
Pros & Cons
Pros:
✓ Easy to understand and implement.
Cons:
✗ Very slow when many mapping entries exist.
Linear Search
(Prefix String)
Prefix String
mapping_list = [
    ## prefix     regexp             class      {method: func}
    ("/books"  , r"/books"       , BooksAPI , {"GET": ...}),
    ("/books/" , r"/books/(\d+)" , BooksAPI , {"GET": ...}),
    ("/orders" , r"/orders"      , OrdersAPI, {"GET": ...}),
    ("/orders/", r"/orders/(\d+)", OrdersAPI, {"GET": ...}),
]

for prefix, rexp, klass, funcs in mapping_list:
    if not "/orders/123".startswith(prefix):   # much faster than rexp.match()
        continue
    m = rexp.match("/orders/123")
    if m:
        ...

The first column holds the prefix strings. Checking startswith() before calling rexp.match() replaces an expensive operation with a cheap one.
Router Class
def prefix_str(s):
    return s.split('{', 1)[0]

class PrefixLinearRouter(Router):

    def __init__(self, mapping):
        self._mapping_list = []
        for path, klass, funcs in mapping:
            prefix = prefix_str(path)
            rexp = compile_path(path)
            t = (prefix, rexp, klass, funcs)
            self._mapping_list.append(t)
    ...
Router Class
    ....
    def lookup(self, req_meth, req_path):
        for prefix, rexp, klass, funcs in self._mapping_list:
            if not req_path.startswith(prefix):    # much faster than rexp.match()
                continue
            m = rexp.match(req_path)
            if m:
                params = [ int(v) for v in m.groups() ]
                func = funcs.get(req_meth)
                return klass, func, params
        return None, None, None
Benchmark
[Bar chart: seconds per 1M requests (axis 0 to 50 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear Naive, Prefix Str.]
About twice as fast as the naive implementation.
Pros & Cons
Pros:
✓ Makes linear search faster.
✓ Easy to understand and implement.
Cons:
✗ Still slow when many mapping entries exist.
Linear Search
(Fixed Path Dictionary)
Fixed Path Dictionary
## variable path (contains one or more path parameters)
mapping_list = [
    # ("/books" , r"/books"       , BooksAPI , {"GET": ...}),   # moved to mapping_dict
    ("/books/" , r"/books/(\d+)" , BooksAPI , {"GET": ...}),
    # ("/orders", r"/orders"      , OrdersAPI, {"GET": ...}),   # moved to mapping_dict
    ("/orders/", r"/orders/(\d+)", OrdersAPI, {"GET": ...}),
]

## fixed path (contains no path parameters)
mapping_dict = {
    r"/books" : (BooksAPI , {"GET": ...}, []),
    r"/orders": (OrdersAPI, {"GET": ...}, []),
}

Move fixed paths out of the list and use each fixed path as a dict key.
Router Class
class FixedLinearRouter(object):

    def __init__(self, mapping):
        self._mapping_dict = {}
        self._mapping_list = []
        for path, klass, funcs in mapping:
            if '{' not in path:
                self._mapping_dict[path] = (klass, funcs, [])
            else:
                prefix = prefix_str(path)
                rexp = compile_path(path)
                t = (prefix, rexp, klass, funcs)
                self._mapping_list.append(t)
    ....
Router Class
    ....
    def lookup(self, req_meth, req_path):
        t = self._mapping_dict.get(req_path)    # dict lookup: much faster than the for-loop
        if t:
            klass, funcs, params = t
            return klass, funcs.get(req_meth), params
        ## the number of entries left in the list is reduced
        for prefix, rexp, klass, funcs in self._mapping_list:
            if not req_path.startswith(prefix):
                continue
            m = rexp.match(req_path)
            if m:
                params = [ int(v) for v in m.groups() ]
                func = funcs.get(req_meth)
                return klass, func, params
        return None, None, None
Benchmark
[Bar chart: seconds per 1M requests (axis 0 to 50 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear Naive, Prefix Str, Fixed Path.]
Super fast on fixed paths!
Three times faster than the naive implementation.
Pros & Cons
Pros:
✓ Makes fixed path search extremely fast.
✓ Makes variable path search faster, because the number of entries is reduced.
✓ Easy to understand and implement.
Cons:
✗ Still slow when many mapping entries exist.
Notice
‣ Don't use r"/api/v{version:int}",
  ‣ because every API path is then treated as a variable path.
‣ Instead, use r"/api/v1", r"/api/v2", ...
  ‣ in order to increase the number of fixed paths.
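The effect of this advice can be seen by counting how many paths end up in the fixed path dict. A small sketch with illustrative paths follows; `split_paths` is a hypothetical helper that applies the same `'{' in path` test as the router class.

```python
def split_paths(paths):
    # Same test the router uses: a path with '{' is a variable path.
    fixed = [p for p in paths if "{" not in p]
    variable = [p for p in paths if "{" in p]
    return fixed, variable

# Version as a path parameter: every path becomes a variable path.
versioned = ["/api/v{version:int}/books",
             "/api/v{version:int}/books/{id:int}"]

# Version spelled out: the collection paths stay fixed.
explicit = ["/api/v1/books", "/api/v1/books/{id:int}",
            "/api/v2/books", "/api/v2/books/{id:int}"]
```

With `versioned`, no path qualifies for the dict. With `explicit`, half of them do.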
Regular Expression
(Naive)
Concatenate Regular Expressions
mapping_list = [
    (r"/api/books/(\d+)" , BooksAPI , {"GET": ...}),
    (r"/api/orders/(\d+)", OrdersAPI, {"GET": ...}),
    (r"/api/users/(\d+)" , UsersAPI , {"GET": ...}),
]

arr = [
    r"(?P<_0>^/api/books/(\d+)$)",      # named groups
    r"(?P<_1>^/api/orders/(\d+)$)",
    r"(?P<_2>^/api/users/(\d+)$)",
]
all_rexp = re.compile("|".join(arr))
Matching
m = all_rexp.match("/api/users/123")
d = m.groupdict()    #=> {"_0": None,
                     #    "_1": None,
                     #    "_2": "/api/users/123"}
for k, v in d.items():
    if v:
        i = int(k[1:])    # ex: "_2" -> 2
        break
## entry as stored by the router class: (klass, funcs, pos, nparams)
klass, funcs, pos, nparams = mapping_list[i]
arr = m.groups()     #=> (None, None, None, None,
                     #    "/api/users/123", "123")
params = arr[5:6]    #=> ("123",)
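For comparison, the same lookup can be written with the standard `re` attributes `Match.lastgroup` and `Pattern.groupindex`, which avoids building the `groupdict()` and looping over it. This is an alternative sketch, not the approach used in the talk, and it assumes exactly one parameter per pattern; `find_entry` is a hypothetical name.

```python
import re

patterns = [r"(?P<_0>^/api/books/(\d+)$)",
            r"(?P<_1>^/api/orders/(\d+)$)",
            r"(?P<_2>^/api/users/(\d+)$)"]
all_rexp = re.compile("|".join(patterns))

def find_entry(req_path):
    # Return (entry index, captured params), or None if nothing matched.
    m = all_rexp.match(req_path)
    if m is None:
        return None
    name = m.lastgroup                   # the outer named group closes last, e.g. "_2"
    i = int(name[1:])
    outer = all_rexp.groupindex[name]    # 1-based group number of the named group
    # Assumption of this sketch: one parameter per pattern, right after the named group.
    return i, (m.group(outer + 1),)
```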
Router Class
class NaiveRegexpRouter(Router):

    def __init__(self, mapping):
        self._mapping_dict = {}
        self._mapping_list = []
        arr = []; i = 0; pos = 0
        for path, klass, funcs in mapping:
            if '{' not in path:
                self._mapping_dict[path] = (klass, funcs, [])
            else:
                rexp = compile_path(path); pat = rexp.pattern
                arr.append("(?P<_%s>%s)" % (i, pat))
                t = (klass, funcs, pos, path.count('{'))
                self._mapping_list.append(t)
                i += 1; pos += 1 + path.count('{')
        self._all_rexp = re.compile("|".join(arr))
Router Class
    ....
    def lookup(self, req_meth, req_path):
        t = self._mapping_dict.get(req_path)
        if t:
            klass, funcs, params = t
            return klass, funcs.get(req_meth), params
        m = self._all_rexp.match(req_path)
        if m:
            for k, v in m.groupdict().items():    # find index in list
                if v:
                    i = int(k[1:])
                    break
            klass, funcs, pos, nparams = self._mapping_list[i]
            ## groups()[pos] is the named group itself; param values follow it
            params = m.groups()[pos+1:pos+1+nparams]    # find param values
            func = funcs.get(req_meth)
            return klass, func, params
        return None, None, None
Benchmark
[Bar chart: seconds per 1M requests (axis 0 to 50 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive).]
Slower than linear search :(
Pros & Cons
Pros:
✓ Nothing :(
Cons:
✗ Slower than linear search.
Notice
$ python3 --version
3.4.5
$ python3
>>> import re
>>> arr = [r'^/(\d+)$'] * 101
>>> re.compile("|".join(arr))
  File "/opt/vs/python/3.4.5/lib/python3.4/sre_compile.py", line 579, in compile
    "sorry, but this version only supports 100 named groups"
AssertionError: sorry, but this version only supports 100 named groups

Python <= 3.4 limits the number of groups in a regular expression, and there is no workaround :(
Regular Expression
(Smart)
Improved Regular Expression
mapping_list = [
    (r"/api/books/(\d+)" , BooksAPI , {"GET": ...}),
    (r"/api/orders/(\d+)", OrdersAPI, {"GET": ...}),
    (r"/api/users/(\d+)" , UsersAPI , {"GET": ...}),
]
arr = [ r"^/api/books/(?:\d+)($)",      # no more named groups
        r"^/api/orders/(?:\d+)($)",
        r"^/api/users/(?:\d+)($)", ]
all_rexp = re.compile("|".join(arr))

m = all_rexp.match("/api/users/123")
arr = m.groups()      #=> (None, None, "")   (a tuple is much more lightweight than a dict)
i = arr.index("")     #=> 2                  (index() is faster than a for-loop)
t = mapping_list[i]   #=> (r"/api/users/(\d+)",
                      #    UsersAPI, {"GET": ...})
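The `($)` trick works because `$` is zero-width: the group of the alternative that matched captures an empty string, while the groups of all other alternatives stay `None`. A runnable sketch of just this index lookup, using the three patterns above; `find_index` is a hypothetical name.

```python
import re

patterns = [r"^/api/books/(?:\d+)($)",
            r"^/api/orders/(?:\d+)($)",
            r"^/api/users/(?:\d+)($)"]
all_rexp = re.compile("|".join(patterns))

def find_index(req_path):
    # Index of the alternative that matched, or None if none did.
    m = all_rexp.match(req_path)
    if m is None:
        return None
    return m.groups().index("")    # exactly one group holds "", the rest are None
```

Since each alternative contributes exactly one capturing group, the position in `groups()` equals the position in the mapping list.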
Router Class
class SmartRegexpRouter(Router):

    def __init__(self, mapping):
        self._mapping_dict = {}
        self._mapping_list = []
        arr = []
        for path, klass, funcs in mapping:
            if '{' not in path:
                self._mapping_dict[path] = (klass, funcs, [])
            else:
                rexp = compile_path(path); pat = rexp.pattern
                arr.append(pat.replace("(", "(?:")
                              .replace("$", "($)"))
                t = (rexp, klass, funcs)
                self._mapping_list.append(t)
        self._all_rexp = re.compile("|".join(arr))
Router Class
    ...
    def lookup(self, req_meth, req_path):
        t = self._mapping_dict.get(req_path)
        if t:
            klass, funcs, params = t
            return klass, funcs.get(req_meth), params
        m = self._all_rexp.match(req_path)          # first match: find index in list
        if m:
            i = m.groups().index("")
            rexp, klass, funcs = self._mapping_list[i]
            m2 = rexp.match(req_path)               # second match: get param values
            params = [ int(v) for v in m2.groups() ]
            func = funcs.get(req_meth)
            return klass, func, params
        return None, None, None
Benchmark
[Bar chart: seconds per 1M requests (axis 0 to 50 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart).]
The difference between /api/aaa/{id} and /api/zzz/{id} is small.
Pros & Cons
Pros:
✓ Much faster than the previous approaches, especially when many mapping entries exist.
Cons:
✗ Slower when the number of entries is small (due to the overhead of matching twice).
✗ A large regular expression may be difficult to debug.
Regular Expression
(Optimized)
Optimize Regular Expression
## before
arr = [r"^/api/books/(?:\d+)($)",
       r"^/api/orders/(?:\d+)($)",
       r"^/api/users/(?:\d+)($)"]
all_rexp = re.compile("|".join(arr))

## after (the common prefix "/api" is factored out)
arr = [r"^/api",
       r"(?:",
       "|".join([r"/books/(?:\d+)($)",
                 r"/orders/(?:\d+)($)",
                 r"/users/(?:\d+)($)"]),
       r")"]
all_rexp = re.compile("".join(arr))
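The full optimizer in the sample code builds a nested pattern. The sketch below covers only the simplest case, factoring one common prefix out of all alternatives. It is not the author's implementation; `build_all_rexp` is a hypothetical name, and it assumes that parameters use only `(?:\d+)`, so the patterns contain no `/` inside a group.

```python
import os.path
import re

def build_all_rexp(bodies):
    # bodies: pattern strings without the leading "^", e.g. r"/api/books/(?:\d+)($)"
    prefix = os.path.commonprefix(bodies)       # character-wise common prefix
    # Cut the prefix back to the last "/" so it never splits a path segment.
    prefix = prefix[:prefix.rfind("/")] if "/" in prefix else ""
    tails = [b[len(prefix):] for b in bodies]
    return re.compile("^" + prefix + "(?:" + "|".join(tails) + ")")

bodies = [r"/api/books/(?:\d+)($)",
          r"/api/orders/(?:\d+)($)",
          r"/api/users/(?:\d+)($)"]
all_rexp = build_all_rexp(bodies)
```

The combined pattern still has one `($)` group per entry, so `m.groups().index("")` keeps working unchanged.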
Router Class
class OptimizedRegexpRouter(Router):

    def __init__(self, mapping):
        ## Code is too complicated to show here.
        ## Please download the sample code from GitHub.
        ## https://github.com/kwatch/router-sample/

    def lookup(self, req_meth, req_path):
        ## nothing changed; same as previous section
Benchmark
[Bar chart: seconds per 1M requests (axis 0 to 50 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart, Optimized).]
A little faster on /api/zzz/{id}.
Pros & Cons
Pros:
✓ A little faster than the smart regular expression when a lot of variable paths exist.
Cons:
✗ Performance benefit is very small (on Python).
✗ Rather difficult to implement and debug.
State Machine
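State Machine
[Diagram: a state machine over path segments. From the start state, "api" leads to a non-accepting state with transitions "books" and "orders". Each of those leads to an accepting state (/api/books, /api/orders), which has a /\d+/ transition (for example "123" or "456") to another accepting state (/api/books/{id:int}, /api/orders/{id:int}). Legend: start, not accepted, accepted.]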
State Machine: Definition
path = "/api/books"
transition = {
    "api": {
        "books": {
            None: (BooksAPI, {"GET": do_index, ...}),   # None is the terminator (marks an accepted state)
        },
    },
}

>>> transition["api"]["books"][None]
(BooksAPI, {"GET": do_index, ...})
>>> transition["api"]["users"][None]
KeyError: 'users'
State Machine: Definition
path = "/api/books/{id:int}"
transition = {
    "api": {
        "books": {
            None: (BooksAPI, {"GET": do_index, ...}),
            1: {
                None: (BooksAPI, {"GET": do_show, ...}),
            },
        },
    },
}

>>> transition["api"]["books"][1][None]
(BooksAPI, {"GET": do_show, ...})

1 represents an int parameter; 2 represents a str parameter.
State Machine: Transition
def find(req_path):
    req_path = req_path.lstrip('/')    #ex: "/a/b/c" -> "a/b/c"
    items = req_path.split('/')        #ex: "a/b/c" -> ["a","b","c"]
    d = transition; params = []
    for s in items:
        if s in d:    d = d[s]
        elif 1 in d:  d = d[1]; params.append(int(s))
        elif 2 in d:  d = d[2]; params.append(str(s))
        else:         return None
    if None not in d: return None
    klass, funcs = d[None]
    return klass, funcs, params

>>> find("/api/books/123")
(BooksAPI, {"GET": do_show, ...}, [123])
Router Class
class StateMachineRouter(Router):

    def __init__(self, mapping):
        self._mapping_dict = {}
        self._mapping_list = []
        self._transition = {}
        for path, klass, funcs in mapping:
            if '{' not in path:
                self._mapping_dict[path] = (klass, funcs, [])
            else:
                self._register(path, klass, funcs)
Router Class
    ...
    PARAM_TYPES = {"int": 1, "str": 2}

    def _register(self, path, klass, funcs):
        ptypes = self.PARAM_TYPES
        d = self._transition
        for s in path[1:].split('/'):
            key = s
            if s[0] == "{" and s[-1] == "}":
                ## ex: "{id:int}" -> ("id", "int")
                pname, ptype = s[1:-1].split(':', 1)
                key = ptypes.get(ptype) or ptypes["str"]
            d = d.setdefault(key, {})
        d[None] = (klass, funcs)
Router Class
    ...
    def lookup(self, req_meth, req_path):
        t = self._mapping_dict.get(req_path)    # fixed paths registered in __init__
        if t:
            klass, funcs, params = t
            return klass, funcs.get(req_meth), params
        d = self._transition
        params = []
        for s in req_path[1:].split('/'):
            if s in d:    d = d[s]
            elif 1 in d:  d = d[1]; params.append(int(s))
            elif 2 in d:  d = d[2]; params.append(str(s))
            else:         return None, None, None
        if None in d:
            klass, funcs = d[None]
            func = funcs.get(req_meth)
            return klass, func, params
        return None, None, None
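The pieces above can be combined into one small self-contained class. The sketch below is a simplified variant, not the sample code itself. It registers fixed paths in the transition table too, it checks that a segment is numeric before following an int transition (so a request like /api/books/abc yields a 404 result instead of a ValueError), and the names `TinyStateMachineRouter`, `INT`, and `STR` are hypothetical.

```python
INT, STR = 1, 2                      # keys for parameter transitions, as on the slides
PARAM_TYPES = {"int": INT, "str": STR}

class TinyStateMachineRouter:

    def __init__(self, mapping):
        self._transition = {}
        for path, klass, funcs in mapping:
            d = self._transition
            for s in path[1:].split("/"):
                key = s
                if s.startswith("{") and s.endswith("}"):
                    _, _, ptype = s[1:-1].partition(":")   # "{id:int}" -> "int"
                    key = PARAM_TYPES.get(ptype, STR)
                d = d.setdefault(key, {})
            d[None] = (klass, funcs)                       # None marks an accepted state

    def lookup(self, req_meth, req_path):
        d = self._transition
        params = []
        for s in req_path[1:].split("/"):
            if s in d:
                d = d[s]
            elif INT in d and s.isdecimal():               # added guard: digits only
                d = d[INT]; params.append(int(s))
            elif STR in d and s:
                d = d[STR]; params.append(s)
            else:
                return None, None, None
        if None not in d:
            return None, None, None
        klass, funcs = d[None]
        return klass, funcs.get(req_meth), params


# Hypothetical stand-ins for the handler class and functions.
class BooksAPI: pass
def do_index(): pass
def do_show(): pass

router = TinyStateMachineRouter([
    ("/api/books",          BooksAPI, {"GET": do_index}),
    ("/api/books/{id:int}", BooksAPI, {"GET": do_show}),
])
```

Each lookup costs one dict access per path segment, regardless of how many routes are registered, which is why the position in the mapping list no longer matters.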
Benchmark
[Bar chart: seconds per 1M requests (axis 0 to 50 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart, Optimized), StateMachine.]
/api/aaa/{id} and /api/zzz/{id} have the same performance.
Benchmark (PyPy3.5)
[Bar chart: seconds per 1M requests on PyPy3.5 (axis 0 to 50 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart, Optimized), StateMachine.]
Regular expressions are very slow in PyPy3.5.
String operations are very fast because they are JIT friendly.
Benchmark (PyPy3.5)
[Bar chart: the same PyPy3.5 benchmark, zoomed in (axis 0 to 5 sec; shorter bars are faster) for /api/aaa, /api/aaa/{id}, /api/zzz, and /api/zzz/{id}. Series: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart, Optimized), StateMachine.]
Chart annotations:
The fastest method, because it is regexp-free (= JIT friendly).
A little slower than StateMachine, because it contains a regexp.
Pros & Cons
Pros:
✓ Performance champion in the routing area.
✓ Much faster on PyPy3.5, because it is regexp-free (JIT friendly!).
Cons:
✗ Does not support complicated patterns.
✗ Requires some effort to support URL path suffixes (ex: /api/books/123.json).
Conclusion
Conclusion
‣ Linear search is slow.
  ‣ A prefix string and a fixed path dict make it faster.
‣ Regular expressions are very fast.
  ‣ Do your best to avoid named groups (named captures).
‣ A state machine is the fastest method in Python.
  ‣ Especially on PyPy3, because it is regexp-free (= JIT friendly).
One More Thing
My Products
‣ Benchmarker.py
  Awesome benchmarking utility.
  https://pythonhosted.org/Benchmarker/
‣ Oktest.py
  New generation of testing framework.
  https://pythonhosted.org/Oktest/
‣ PyTenjin
  Super fast and feature-rich template engine.
  https://www.kuwata-lab.com/tenjin/pytenjin-users-guide.html
  https://bit.ly/tenjinpy_slide (presentation)
Thank You

How to make the fastest Router in Python

  • 1. How To Make The Fastest Router In Python Makoto Kuwata https://github.com/kwatch/ PloneConference 2018 Tokyo
  • 2. Abstract ‣ TL;DR ‣ You can make router much faster (max: x10) ‣ Requirements ‣ Python3 ‣ Experience of Web Application Framework (Django, Flask, Plone, etc) ‣ Sample Code ‣ https://github.com/kwatch/router-sample/ ‣ https://github.com/kwatch/keight/tree/python/ (Framework)
  • 3. Table of Contents ‣ What is Router? ‣ Linear Search ‣ Naive / Prefix String / Fixed Path Dictionary ‣ Regular Expression ‣ Naive / Smart / Optimized ‣ State Machine ‣ Conclusion
  • 5. What is Router? ‣ Router is a component of web app framework (WAF). ‣ Router determines request handler according to request method and request path.
 Handler A App Server Handler B Handler C Client : HTTP Request : HTTP Response WSGI App Server Side Router determines "which handler?"
  • 6. Request Handler Example class BooksAPI(RequestHandler): with on.path('/api/books/'): @on('GET') def do_index(self): return {"action": "index"} @on('POST') def do_create(self): return {"action": "create"} .... ← handler class ← URL Path ← request method ← handler func ← request method ← handler func
  • 7. Request Handler Example .... with on.path('/api/books/{id:int}'): @on('GET') def do_show(self, id): return {"action": "show", "id": id} @on('PUT') def do_update(self, id): return {"action": "update", "id": id} @on('DELETE') def do_delete(self, id): return {"action": "delete", "id": id} ← URL Path ← request method ← handler func ← request method ← handler func ← request method ← handler func
  • 8. Mapping Example: Request to Handler mapping_table = [ ## method path class func ("GET" , r"/api/books/" , BooksAPI , do_index), ("POST" , r"/api/books/" , BooksAPI , do_create),
 ("GET" , r"/api/books/(d+)" , BooksAPI , do_show), ("PUT" , r"/api/books/(d+)" , BooksAPI , do_update),
 ("DELETE", r"/api/books/(d+)" , BooksAPI , do_delete),
 ("GET" , r"/api/orders/" , OrdersAPI, do_index),
 ("POST" , r"/api/orders/" , OrdersAPI, do_create),
 ("GET" , r"/api/orders/(d+)", OrdersAPI, do_show), ("PUT" , r"/api/orders/(d+)", OrdersAPI, do_update),
 ("DELETE", r"/api/orders/(d+)", OrdersAPI, do_delete),
 .... ]
  • 9. Mapping Example: Request to Handler mapping_list = [ ## path class {method: func} (r"/api/books/" , BooksAPI , {"GET": do_index, "POST": do_create}), (r"/api/books/(d+)" , BooksAPI , {"GET": do_show, "PUT": do_update, "DELETE": do_delete}),
 (r"/api/orders/" , OrdersAPI, {"GET": do_index, "POST": do_create}), (r"/api/orders/(d+)", OrdersAPI, {"GET": do_show, "PUT": do_update, "DELETE": do_delete}),
 .... ] Same information in different format
  • 10. Router Example >>> router = Router(mapping_list) >>> router.lookup("GET", "/api/books/") (BooksAPI, do_index, []) >>> router.lookup("GET", "/api/books/123") (BooksAPI, do_show, [123])
  • 11. Router Example ### 404 Not Found >>> router.lookup("GET", "/api/books/123/comments") (None, None, None) ### 405 Method Not Allowed >>> router.lookup("POST", "/api/books/123") (BooksAPI, None, [123])
  • 13. Linear Search mapping = [ (r"/api/books/" , BooksAPI , {"GET": do_index, "POST": do_create}), (r"/api/books/(d+)" , BooksAPI , {"GET": do_show, "PUT": do_update, "DELETE": do_delete}),
 (r"/api/orders/" , OrdersAPI, {"GET": do_index, "POST": do_create}), (r"/api/orders/(d+)", OrdersAPI, {"GET": do_show, "PUT": do_update, "DELETE": do_delete}),
 .... ]
  • 14. Router Class class LinearNaiveRouter(Router): def __init__(self, mapping): self._mapping_list = [ (compile_path(path), klass, funcs) for path, klass, funcs in mapping ] def lookup(req_meth, req_path): for rexp, klass, funcs in self._mapping_list: m = rexp.match(req_path) if m: params = [ int(v) for v in m.groups() ] func = funcs.get(req_meth) return klass, func, params return None, None, None
  • 15. Benchmark (Data) mapping_list = [ (r'/api/aaa' , DummyAPI, {"GET": ...}), (r'/api/aaa/{id:int}', DummyAPI, {"GET": ...}), (r'/api/bbb' , DummyAPI, {"GET": ...}), (r'/api/bbb/{id:int}', DummyAPI, {"GET": ...}), .... (r'/api/yyy' , DummyAPI, {"GET": ...}), (r'/api/yyy/{id:int}', DummyAPI, {"GET": ...}), (r'/api/zzz' , DummyAPI, {"GET": ...}), (r'/api/zzz/{id:int}', DummyAPI, {"GET": ...}), ] ### Benchmark environment: ### AWS EC2 t3.nano, Ubuntu 18.04, Python 3.6.6 See sample code for details
  • 16. Benchmark 0 10 20 30 40 50 Linear Naive Seconds (1M Requests) /api/aaa /api/aaa/{id} /api/zzz /api/zzz/{id} sec SlowerFaster very fast on top of list (/api/aaa, /api/aaa/{id}) very slow on bottom of list (/api/zzz, /api/zzz/{id})
  • 17. Pros. Cons. Pros & Cons ✗ Very slow when many mapping entries exist. ✓ Easy to understand and implement
  • 19. Prefix String
      mapping_list = [
          ## prefix   , regexp          , class    , funcs
          ("/books"   , r"/books"       , BooksAPI , {"GET": ...}),
          ("/books/"  , r"/books/(\d+)" , BooksAPI , {"GET": ...}),
          ("/orders"  , r"/orders"      , OrdersAPI, {"GET": ...}),
          ("/orders/" , r"/orders/(\d+)", OrdersAPI, {"GET": ...}),
      ]
      for prefix, rexp, klass, funcs in mapping_list:
          ## startswith() is much faster than rexp.match()
          ## (replace an expensive operation with a cheap one)
          if not "/orders/123".startswith(prefix):
              continue
          m = rexp.match("/orders/123")
          if m:
              ...
  • 20. Router Class
      def prefix_str(s):
          return s.split('{', 1)[0]

      class PrefixLinearRouter(Router):

          def __init__(self, mapping):
              self._mapping_list = []
              for path, klass, funcs in mapping:
                  prefix = prefix_str(path)
                  rexp = compile_path(path)
                  t = (prefix, rexp, klass, funcs)
                  self._mapping_list.append(t)
          ...
  • 21. Router Class
          ....
          def lookup(self, req_meth, req_path):
              for prefix, rexp, klass, funcs in self._mapping_list:
                  ## startswith() is much faster than rexp.match()
                  if not req_path.startswith(prefix):
                      continue
                  m = rexp.match(req_path)
                  if m:
                      params = [ int(v) for v in m.groups() ]
                      func = funcs.get(req_meth)
                      return klass, func, params
              return None, None, None
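The claim that `startswith()` is much cheaper than `rexp.match()` can be checked with a rough micro-benchmark. The sketch below uses the standard `timeit` module on a single entry that does not match the request; the iteration count and the chosen paths are arbitrary assumptions, and absolute numbers will vary by machine and Python version.

```python
import re
import timeit


def prefix_str(s):
    # Same helper as in the slides: everything before the first "{".
    return s.split('{', 1)[0]


path_pattern = "/api/zzz/{id:int}"
prefix = prefix_str(path_pattern)            # "/api/zzz/"
rexp = re.compile(r"^/api/zzz/(\d+)$")
req_path = "/api/aaa/123"                    # a request this entry does NOT match


def with_regexp_only():
    return rexp.match(req_path) is not None


def with_prefix_check():
    # The cheap string test short-circuits before the regexp is tried.
    return req_path.startswith(prefix) and rexp.match(req_path) is not None


t_regexp = timeit.timeit(with_regexp_only, number=200000)
t_prefix = timeit.timeit(with_prefix_check, number=200000)
print("regexp only: %.3fs, prefix check first: %.3fs" % (t_regexp, t_prefix))
```

Both functions return the same answer for every request; only the cost of rejecting a non-matching entry differs.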
  • 22. Benchmark
      [Bar chart: seconds per 1M requests (lower is faster) for /api/aaa, /api/aaa/{id}, /api/zzz, /api/zzz/{id}. Routers shown: Linear Naive, Linear Prefix Str.]
      About twice as fast as the naive implementation.
  • 23. Pros & Cons
      ✗ Still slow when many mapping entries exist.
      ✓ Makes linear search faster.
      ✓ Easy to understand and implement.
  • 25. Fixed Path Dictionary
      ## variable paths (contain one or more path parameters) stay in the list
      mapping_list = [
          ("/books"   , r"/books"       , BooksAPI , {"GET": ...}),   # fixed: moved to dict
          ("/books/"  , r"/books/(\d+)" , BooksAPI , {"GET": ...}),
          ("/orders"  , r"/orders"      , OrdersAPI, {"GET": ...}),   # fixed: moved to dict
          ("/orders/" , r"/orders/(\d+)", OrdersAPI, {"GET": ...}),
      ]
      ## fixed paths (contain no path parameters) use the path as the dict key
      mapping_dict = {
          r"/books" : (BooksAPI , {"GET": ...}, []),
          r"/orders": (OrdersAPI, {"GET": ...}, []),
      }
  • 26. Router Class
      class FixedLinearRouter(object):

          def __init__(self, mapping):
              self._mapping_dict = {}
              self._mapping_list = []
              for path, klass, funcs in mapping:
                  if '{' not in path:
                      self._mapping_dict[path] = (klass, funcs, [])
                  else:
                      prefix = prefix_str(path)
                      rexp = compile_path(path)
                      t = (prefix, rexp, klass, funcs)
                      self._mapping_list.append(t)
          ....
  • 27. Router Class
          ....
          def lookup(self, req_meth, req_path):
              ## dict lookup is much faster than a for-loop
              t = self._mapping_dict.get(req_path)
              if t:
                  klass, funcs, params = t
                  return klass, funcs.get(req_meth), params
              ## the number of entries in the list is reduced
              for prefix, rexp, klass, funcs in self._mapping_list:
                  if not req_path.startswith(prefix):
                      continue
                  m = rexp.match(req_path)
                  if m:
                      params = [ int(v) for v in m.groups() ]
                      func = funcs.get(req_meth)
                      return klass, func, params
              return None, None, None
  • 28. Benchmark
      [Bar chart: seconds per 1M requests (lower is faster) for /api/aaa, /api/aaa/{id}, /api/zzz, /api/zzz/{id}. Routers shown: Linear Naive, Linear Prefix Str, Linear Fixed Path.]
      Super fast on fixed paths!
      Three times faster than the naive implementation.
  • 29. Pros & Cons
      ✗ Still slow when many mapping entries exist.
      ✓ Makes fixed path search much faster.
      ✓ Makes variable path search faster, because the number of entries is reduced.
      ✓ Easy to understand and implement.
  • 30. Notice
      ‣ Don't use r"/api/v{version:int}".
          ‣ Otherwise all API paths are regarded as variable paths.
      ‣ Instead, use r"/api/v1", r"/api/v2", ...
          ‣ This increases the number of fixed paths.
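The effect of this advice can be seen by counting how many paths would land in the fixed-path dict under each style. This is a small sketch using the same `'{' not in path` test as the router classes; `count_fixed_and_variable` and the example paths are hypothetical.

```python
def count_fixed_and_variable(paths):
    """Return (number of fixed paths, number of variable paths)."""
    fixed = [p for p in paths if "{" not in p]
    variable = [p for p in paths if "{" in p]
    return len(fixed), len(variable)


# Version expressed as a path parameter: every path becomes variable.
versioned_by_param = [
    "/api/v{version:int}/books",
    "/api/v{version:int}/books/{id:int}",
    "/api/v{version:int}/orders",
    "/api/v{version:int}/orders/{id:int}",
]

# Version written literally: list-style paths stay fixed.
versioned_by_literal = [
    "/api/v1/books",
    "/api/v1/books/{id:int}",
    "/api/v1/orders",
    "/api/v1/orders/{id:int}",
]

print(count_fixed_and_variable(versioned_by_param))
print(count_fixed_and_variable(versioned_by_literal))
```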
  • 32. Concatenate Regular Expressions
      mapping_list = [
          (r"/api/books/(\d+)" , BooksAPI , {"GET": ...}),
          (r"/api/orders/(\d+)", OrdersAPI, {"GET": ...}),
          (r"/api/users/(\d+)" , UsersAPI , {"GET": ...}),
      ]
      ## wrap each pattern in a named group
      arr = [
          r"(?P<_0>^/api/books/(\d+)$)",
          r"(?P<_1>^/api/orders/(\d+)$)",
          r"(?P<_2>^/api/users/(\d+)$)",
      ]
      all_rexp = re.compile("|".join(arr))
  • 33. Matching
      m = all_rexp.match("/api/users/123")
      d = m.groupdict()   #=> {"_0": None,
                          #    "_1": None,
                          #    "_2": "/api/users/123"}
      for k, v in d.items():
          if v:
              i = int(k[1:])   # ex: "_2" -> 2
              break
      _, klass, funcs = mapping_list[i]
      arr = m.groups()    #=> (None, None, None, None,
                          #    "/api/users/123", "123")
      params = arr[5:6]   #=> ("123",)
  • 34. Router Class
      class NaiveRegexpRouter(Router):

          def __init__(self, mapping):
              self._mapping_dict = {}
              self._mapping_list = []
              arr = []; i = 0; pos = 0
              for path, klass, funcs in mapping:
                  if '{' not in path:
                      self._mapping_dict[path] = (klass, funcs, [])
                  else:
                      rexp = compile_path(path); pat = rexp.pattern
                      arr.append("(?P<_%s>%s)" % (i, pat))
                      t = (klass, funcs, pos, path.count('{'))
                      self._mapping_list.append(t)
                      i += 1; pos += 1 + path.count('{')
              self._all_rexp = re.compile("|".join(arr))
  • 35. Router Class
          ....
          def lookup(self, req_meth, req_path):
              t = self._mapping_dict.get(req_path)
              if t:
                  klass, funcs, params = t
                  return klass, funcs.get(req_meth), params
              m = self._all_rexp.match(req_path)
              if m:
                  ## find index in list
                  for k, v in m.groupdict().items():
                      if v:
                          i = int(k[1:])
                          break
                  klass, funcs, pos, nparams = self._mapping_list[i]
                  ## find param values (skip the named group itself at pos)
                  params = m.groups()[pos+1:pos+1+nparams]
                  func = funcs.get(req_meth)
                  return klass, func, params
              return None, None, None
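The bookkeeping of `pos` and `nparams` is easy to get wrong, so here is a self-contained sketch of the named-group approach on three patterns. The handler names are plain strings used as stand-ins, the patterns are written out by hand instead of going through `compile_path`, and the function only returns the handler and the raw parameter strings.

```python
import re

entries = [   # (pattern, handler name); handler names are stand-ins
    (r"^/api/books/(\d+)$", "BooksAPI"),
    (r"^/api/orders/(\d+)$", "OrdersAPI"),
    (r"^/api/users/(\d+)$", "UsersAPI"),
]

parts = []
table = []    # (handler, pos, nparams); pos = index of the named group in m.groups()
pos = 0
for i, (pat, handler) in enumerate(entries):
    parts.append("(?P<_%d>%s)" % (i, pat))
    nparams = re.compile(pat).groups      # number of capturing groups in this pattern
    table.append((handler, pos, nparams))
    pos += 1 + nparams                    # one named group plus its parameter groups
all_rexp = re.compile("|".join(parts))


def lookup(req_path):
    m = all_rexp.match(req_path)
    if m is None:
        return None, None
    # Exactly one named group is not None: the alternative that matched.
    for name, value in m.groupdict().items():
        if value is not None:
            i = int(name[1:])             # ex: "_2" -> 2
            break
    handler, pos, nparams = table[i]
    # groups()[pos] is the whole path; the parameters follow it.
    return handler, m.groups()[pos + 1 : pos + 1 + nparams]


print(lookup("/api/users/123"))
print(lookup("/api/unknown/1"))
```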
  • 36. Benchmark
      [Bar chart: seconds per 1M requests (lower is faster) for /api/aaa, /api/aaa/{id}, /api/zzz, /api/zzz/{id}. Routers shown: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive).]
      Slower than linear search :(
  • 37. Pros & Cons
      ✗ Slower than linear search.
      ✓ Nothing :(
  • 38. Notice
      $ python3 --version
      3.4.5
      $ python3
      >>> import re
      >>> arr = ['^/(\d+)$'] * 101
      >>> re.compile("|".join(arr))
        File "/opt/vs/python/3.4.5/lib/python3.4/sre_compile.py", line 579, in compile
          "sorry, but this version only supports 100 named groups"
      AssertionError: sorry, but this version only supports 100 named groups
      Python <= 3.4 limits the number of groups in a regular expression, and there is no workaround :(
  • 40. Improved Regular Expression
      mapping_list = [
          (r"/api/books/(\d+)"  , BooksAPI  , {"GET": ...}),
          (r"/api/orders/(\d+)" , OrdersAPI , {"GET": ...}),
          (r"/api/users/(\d+)"  , UsersAPI  , {"GET": ...}),
      ]
      ## no more named groups
      arr = [
          r"^/api/books/(?:\d+)($)",
          r"^/api/orders/(?:\d+)($)",
          r"^/api/users/(?:\d+)($)",
      ]
      all_rexp = re.compile("|".join(arr))
      m = all_rexp.match("/api/users/123")
      ## a tuple is much more lightweight than a dict
      arr = m.groups()     #=> (None, None, "")
      ## index() is faster than a for-loop
      i = arr.index("")    #=> 2
      t = mapping_list[i]  #=> (r"/api/users/(\d+)",
                           #    UsersAPI, {"GET": ...})
  • 41. Router Class
      class SmartRegexpRouter(Router):

          def __init__(self, mapping):
              self._mapping_dict = {}
              self._mapping_list = []
              arr = []
              for path, klass, funcs in mapping:
                  if '{' not in path:
                      self._mapping_dict[path] = (klass, funcs, [])
                  else:
                      rexp = compile_path(path); pat = rexp.pattern
                      arr.append(pat.replace("(", "(?:")
                                    .replace("$", "($)"))
                      t = (rexp, klass, funcs)
                      self._mapping_list.append(t)
              self._all_rexp = re.compile("|".join(arr))
  • 42. Router Class
          ...
          def lookup(self, req_meth, req_path):
              t = self._mapping_dict.get(req_path)
              if t:
                  klass, funcs, params = t
                  return klass, funcs.get(req_meth), params
              ## first match: find index in list
              m = self._all_rexp.match(req_path)
              if m:
                  i = m.groups().index("")
                  rexp, klass, funcs = self._mapping_list[i]
                  ## second match: get param values
                  m2 = rexp.match(req_path)
                  params = [ int(v) for v in m2.groups() ]
                  func = funcs.get(req_meth)
                  return klass, func, params
              return None, None, None
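The two-step matching can be tried in isolation. The sketch below rewrites three hand-written patterns the same way the slide does (`"("` to `"(?:"`, `"$"` to `"($)"`), then uses the position of the empty-string group to pick the entry. Handler names are stand-in strings, and the naive `replace()` calls assume the patterns contain no other parentheses or dollar signs.

```python
import re

patterns = [
    r"^/api/books/(\d+)$",
    r"^/api/orders/(\d+)$",
    r"^/api/users/(\d+)$",
]
handlers = ["BooksAPI", "OrdersAPI", "UsersAPI"]   # stand-in names

# Parameter groups become non-capturing; "$" becomes the only capturing group.
combined = [p.replace("(", "(?:").replace("$", "($)") for p in patterns]
all_rexp = re.compile("|".join(combined))
each_rexp = [re.compile(p) for p in patterns]


def lookup(req_path):
    # First match: which alternative matched? Its "($)" group is "", others are None.
    m = all_rexp.match(req_path)
    if m is None:
        return None, None
    i = m.groups().index("")
    # Second match: run the original pattern to extract the parameter values.
    m2 = each_rexp[i].match(req_path)
    return handlers[i], [int(v) for v in m2.groups()]


print(combined[0])
print(lookup("/api/users/123"))
```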
  • 43. Benchmark
      [Bar chart: seconds per 1M requests (lower is faster) for /api/aaa, /api/aaa/{id}, /api/zzz, /api/zzz/{id}. Routers shown: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart).]
      The difference between /api/aaa/{id} and /api/zzz/{id} is small.
  • 44. Pros & Cons
      ✗ Slower when the number of entries is small (due to the overhead of matching twice).
      ✗ A large regular expression may be difficult to debug.
      ✓ Much faster than the previous approaches, especially when many mapping entries exist.
  • 46. Optimize Regular Expression
      ## before
      arr = [r"^/api/books/(?:\d+)($)",
             r"^/api/orders/(?:\d+)($)",
             r"^/api/users/(?:\d+)($)"]
      all_rexp = re.compile("|".join(arr))
      ## after (common prefix "/api" is factored out)
      arr = [r"^/api",
             r"(?:",
             "|".join([r"/books/(?:\d+)($)",
                       r"/orders/(?:\d+)($)",
                       r"/users/(?:\d+)($)"]),
             r")?"]
      all_rexp = re.compile("".join(arr))
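One simple way to factor out a shared prefix is sketched below. It uses `os.path.commonprefix`, which compares strings character by character, and it is only safe when the common prefix is plain text, as it is here; the sample code's actual optimizer is more elaborate and is not reproduced. The sketch also uses a non-optional group, so a request that matches only the prefix is rejected.

```python
import os.path
import re

patterns = [
    r"/api/books/(?:\d+)($)",
    r"/api/orders/(?:\d+)($)",
    r"/api/users/(?:\d+)($)",
]

# Longest common leading string; assumed to contain no regexp syntax.
prefix = os.path.commonprefix(patterns)
rests = [p[len(prefix):] for p in patterns]

# The prefix is tested once; only the remainders are alternatives.
all_rexp = re.compile("^" + re.escape(prefix) + "(?:" + "|".join(rests) + ")")

m = all_rexp.match("/api/users/123")
print(prefix)
print(rests)
print(m.groups().index(""))
```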
  • 47. Router Class
      class OptimizedRegexpRouter(Router):

          def __init__(self, mapping):
              ## Code is too complicated to show here.
              ## Please download sample code from github.
              ## https://github.com/kwatch/router-sample/

          def lookup(self, req_meth, req_path):
              ## nothing changed; same as previous section
  • 48. Benchmark
      [Bar chart: seconds per 1M requests (lower is faster) for /api/aaa, /api/aaa/{id}, /api/zzz, /api/zzz/{id}. Routers shown: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart, Optimized).]
      A little faster on /api/zzz/{id}.
  • 49. Pros & Cons
      ✗ Performance benefit is very small (on Python).
      ✗ Rather difficult to implement and debug.
      ✓ A little faster than the smart regular expression when a lot of variable paths exist.
  • 52. State Machine: Definition
      path = "/api/books"
      transition = {
          "api": {
              "books": {
                  None: (BooksAPI, {"GET": do_index, ...}),
              },
          },
      }
      >>> transition["api"]["books"][None]
      (BooksAPI, {"GET": do_index, ...})
      >>> transition["api"]["users"][None]
      KeyError: 'users'
      Use None as the terminator (mark of accepted status).
  • 53. State Machine: Definition
      path = "/api/books/{id:int}"
      transition = {
          "api": {
              "books": {
                  None: (BooksAPI, {"GET": do_index, ...}),
                  1: {
                      None: (BooksAPI, {"GET": do_show, ...}),
                  },
              },
          },
      }
      >>> transition["api"]["books"][1][None]
      (BooksAPI, {"GET": do_show, ...})
      1 represents an int parameter, 2 represents a str parameter.
  • 54. State Machine: Transition
      def find(req_path):
          req_path = req_path.lstrip('/')   #ex: "/a/b/c" -> "a/b/c"
          items = req_path.split('/')       #ex: "a/b/c" -> ["a","b","c"]
          d = transition; params = []
          for s in items:
              if s in d:   d = d[s]
              elif 1 in d: d = d[1]; params.append(int(s))
              elif 2 in d: d = d[2]; params.append(str(s))
              else:        return None
          if None not in d: return None
          klass, funcs = d[None]
          return klass, funcs, params
      >>> find("/api/books/123")
      (BooksAPI, {"GET": do_show, ...}, [123])
  • 55. Router Class
      class StateMachineRouter(Router):

          def __init__(self, mapping):
              self._mapping_dict = {}
              self._mapping_list = []
              self._transition = {}
              for path, klass, funcs in mapping:
                  if '{' not in path:
                      self._mapping_dict[path] = (klass, funcs, [])
                  else:
                      self._register(path, klass, funcs)
  • 56. Router Class
          ...
          PARAM_TYPES = {"int": 1, "str": 2}

          def _register(self, path, klass, funcs):
              ptypes = self.PARAM_TYPES
              d = self._transition
              for s in path[1:].split('/'):
                  key = s
                  if s[0] == "{" and s[-1] == "}":
                      ## ex: "{id:int}" -> ("id", "int")
                      pname, ptype = s[1:-1].split(':', 1)
                      key = ptypes.get(ptype) or ptypes["str"]
                  d = d.setdefault(key, {})
              d[None] = (klass, funcs)
  • 57. Router Class
          ...
          def lookup(self, req_meth, req_path):
              ## fixed paths were registered in the dict, so check it first
              t = self._mapping_dict.get(req_path)
              if t:
                  klass, funcs, params = t
                  return klass, funcs.get(req_meth), params
              d = self._transition
              params = []
              for s in req_path[1:].split('/'):
                  if s in d:   d = d[s]
                  elif 1 in d: d = d[1]; params.append(int(s))
                  elif 2 in d: d = d[2]; params.append(str(s))
                  else:        return None, None, None
              if None in d:
                  klass, funcs = d[None]
                  func = funcs.get(req_meth)
                  return klass, func, params
              return None, None, None
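The registration and transition steps can be exercised together without the router class. In the sketch below, `register` and `find` are standalone hypothetical functions, the registered values are plain strings, and the `isdecimal()` guard is an addition not present in the slides: it makes a non-numeric segment fall through instead of raising ValueError in `int()`.

```python
INT, STR = 1, 2   # dict keys marking int / str parameters, as in the slides


def register(transition, path, value):
    """Add one path to the nested-dict state machine."""
    d = transition
    for s in path[1:].split("/"):
        key = s
        if s.startswith("{") and s.endswith("}"):
            # ex: "{id:int}" -> ptype "int"; anything else is treated as str
            _name, _sep, ptype = s[1:-1].partition(":")
            key = INT if ptype == "int" else STR
        d = d.setdefault(key, {})
    d[None] = value   # None marks an accepting state


def find(transition, req_path):
    """Walk the state machine one path segment at a time."""
    d = transition
    params = []
    for s in req_path[1:].split("/"):
        if s in d:
            d = d[s]
        elif INT in d and s.isdecimal():   # guard added in this sketch
            d = d[INT]
            params.append(int(s))
        elif STR in d:
            d = d[STR]
            params.append(s)
        else:
            return None
    if None not in d:
        return None
    return d[None], params


transition = {}
register(transition, "/api/books", "index")
register(transition, "/api/books/{id:int}", "show")
print(find(transition, "/api/books/123"))
print(find(transition, "/api/users"))
```

Each lookup costs one dict access per path segment, independent of how many routes are registered, which is consistent with /api/aaa/{id} and /api/zzz/{id} performing the same in the benchmark.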
  • 58. Benchmark
      [Bar chart: seconds per 1M requests (lower is faster) for /api/aaa, /api/aaa/{id}, /api/zzz, /api/zzz/{id}. Routers shown: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart, Optimized), StateMachine.]
      /api/aaa/{id} and /api/zzz/{id} have the same performance.
  • 59. Benchmark (PyPy3.5)
      [Bar chart, scale 0 to 50 sec: seconds per 1M requests (lower is faster) for /api/aaa, /api/aaa/{id}, /api/zzz, /api/zzz/{id}. Routers shown: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart, Optimized), StateMachine.]
      Regular expressions are very slow in PyPy3.5.
      String operations are very fast because they are JIT friendly.
  • 60. Benchmark (PyPy3.5)
      [Bar chart, scale 0 to 5 sec: seconds per 1M requests (lower is faster) for /api/aaa, /api/aaa/{id}, /api/zzz, /api/zzz/{id}. Routers shown: Linear (Naive, Prefix Str, Fixed Path), Regexp (Naive, Smart, Optimized), StateMachine.]
      Chart annotations: "The fastest method, because it is regexp-free (= JIT friendly)"; "A little slower than StateMachine, because it contains a regexp".
  • 61. Pros & Cons
      ✗ Does not support complicated patterns.
      ✗ Requires some effort to support URL path suffixes (ex: /api/books/123.json).
      ✓ Performance champion in the routing area.
      ✓ Much faster in PyPy3.5, because it is regexp-free. JIT friendly!
  • 63. Conclusion
      ‣ Linear Search is slow.
          ‣ Prefix string and Fixed path dict make it faster.
      ‣ Regular expression is very fast.
          ‣ Do your best to avoid named groups (or named captures).
      ‣ State Machine is the fastest method in Python.
          ‣ Especially in PyPy3, because it is regexp-free (= JIT friendly).
  • 65. My Products
      ‣ Benchmarker.py: Awesome benchmarking utility.
        https://pythonhosted.org/Benchmarker/
      ‣ Oktest.py: New generation of testing framework.
        https://pythonhosted.org/Oktest/
      ‣ PyTenjin: Super fast and feature-rich template engine.
        https://www.kuwata-lab.com/tenjin/pytenjin-users-guide.html
        https://bit.ly/tenjinpy_slide (presentation)