BASEL | BERN | BRUGG | BUCHAREST | DÜSSELDORF | FRANKFURT A.M. | FREIBURG I.BR. | GENEVA
HAMBURG | COPENHAGEN | LAUSANNE | MANNHEIM | MUNICH | STUTTGART | VIENNA | ZURICH
http://guidoschmutz.wordpress.com @gschmutz
Location Analytics
Real-Time Geofencing using Kafka
Guido Schmutz
Guido Schmutz
Working at Trivadis for more than 22 years
Consultant, Trainer, Software Architect for Java, Oracle, SOA and Big Data / Fast Data
Oracle Groundbreaker Ambassador & Oracle ACE Director
Head of Trivadis Architecture Board
Technology Manager @ Trivadis
More than 30 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
167th edition
Agenda
1. Introduction & Motivation
2. Using KSQL
3. Using Kafka Streams
4. Using Tile38
5. Visualization using ArcadiaData
6. Summary
Guido Schmutz
Working at Trivadis for more than 22 years
Oracle Groundbreaker Ambassador & Oracle ACE Director
Consultant, Trainer, Software Architect for Java, AWS, Azure,
Oracle Cloud, SOA and Big Data / Fast Data
Platform Architect & Head of Trivadis Architecture Board
More than 30 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
155th edition
Introduction
Geofencing – What is it?
• the use of GPS or RFID technology to
create a virtual geographic boundary,
enabling software to trigger a response
when an object/device enters or leaves a
particular area
• Possible Events
• OUTSIDE
INSIDE
• ENTER
• EXIT
Source: https://tile38.com
Geofencing – What can we do with it?
• On-Demand and Delivery Services - assign
orders to an area's designated service
provider
• On-Demand Transportation - track Electronic
Transportation Devices and their distance
from charging stations
• Transportation Management - track flow of
people using public transport systems
• Commercial Real Estate - Identify how many
people drive or walk by a specific location
• Retail Shopper Guidance - Guide
customer to a specific product once they
are in your store
• Property Security - Open or lock doors as
individuals with designated devices
approach or leave a building or vehicle.
• Property Control - restrict vehicles to be
operational only inside a geofenced area –
like drones or construction equipment
Geo-Processing
• Well-known text (WKT) is a text markup language for
representing vector geometry objects on a map
• GeoTools is a free software GIS toolkit for developing standards
compliant solutions
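To make this concrete, here is a minimal sketch (not taken from the talk) of reading a WKT geometry with the JTS classes bundled with GeoTools and extracting its bounding box – the two building blocks used later for the point-in-polygon UDF and the geohash coverage. Depending on the GeoTools/JTS version the package is org.locationtech.jts or com.vividsolutions.jts.

import org.locationtech.jts.geom.Envelope;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.io.WKTReader;

public class WktExample {
    public static void main(String[] args) throws Exception {
        // parse a WKT polygon (coordinates are longitude/latitude, as in the slides)
        Geometry berlin = new WKTReader().read(
            "POLYGON ((13.2979 52.5619, 13.2440 52.5302, 13.3511 52.4482, 13.2979 52.5619))");
        // bounding box of the geometry - used later to derive the covering geohashes
        Envelope bbox = berlin.getEnvelopeInternal();
        System.out.println(bbox.getMinX() + " " + bbox.getMinY()
            + " " + bbox.getMaxX() + " " + bbox.getMaxY());
    }
}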
Apache Kafka – A Streaming Platform
Source
Connector
Sink
Connector
trucking_driver
KSQL Engine
Kafka Streams
Kafka Broker
Dashboard
High Level Overview of Use Case
geofence
Join Position
& Geofences
Vehicle
Position
object
position
pos &
geofences
Geo
fencing
geofence
status
key=10
{ "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
key=3
{"id":3,"name":"Berlin, Germany","geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443,
…))","last_update":1560607149015}
Geofence
Mgmt
Vehicle
Position
Weather
Service
Using KSQL
KSQL – Streams and Tables
geofence
Table
vehicle
position
Stream
CREATE STREAM vehicle_position_s
(id VARCHAR,
latitude DOUBLE,
longitude DOUBLE)
WITH (KAFKA_TOPIC='vehicle_position',
VALUE_FORMAT='DELIMITED');
CREATE TABLE geo_fence_t
(id BIGINT,
name VARCHAR,
geometry_wkt VARCHAR)
WITH (KAFKA_TOPIC='geo_fence',
VALUE_FORMAT='JSON',
KEY = 'id');
Geofencing
How to determine "inside" or "outside" geofence?
Only one standard UDF for geo processing in KSQL: GEO_DISTANCE
Implement custom UDF using functionality from GeoTools Java library
public String geo_fence(final double latitude, final double longitude,
final String geometryWKT){ .. }
public List<String> geo_fence_bulk(final double latitude
, final double longitude, List<String> idGeometryListWKT) { .. }
ksql> SELECT geo_fence(latitude, longitude, 'POLYGON ((13.297920227050781
52.56195151687443, 13.2440185546875 52.530216577830124, ...))')
FROM test_geo_udf_s;
52.4497 | 13.3096 | OUTSIDE
52.4556 | 13.3178 | INSIDE
Custom UDF to determine if Point is inside a geometry
@Udf(description = "determines if a lat/long is inside or outside the
geometry passed as the 3rd parameter as WKT encoded ...")
public String geo_fence(final double latitude, final double longitude,
final String geometryWKT) {
String status = "";
GeometryFactory geometryFactory = JTSFactoryFinder.getGeometryFactory();
WKTReader reader = new WKTReader(geometryFactory);
Polygon polygon = (Polygon) reader.read(geometryWKT);
Coordinate coord = new Coordinate(longitude, latitude);
Point point = geometryFactory.createPoint(coord);
if (point.within(polygon)) {
status = "INSIDE";
} else {
status = "OUTSIDE";
}
return status;
}
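The geo_fence_bulk variant used in the later examples is not shown on the slides. A minimal sketch of how it could be implemented on top of the scalar geo_fence UDF (as a second method in the same UDF class, java.util imports assumed) is given below; it assumes that each list entry is encoded as "id:wkt" – which matches the collect_set(id + ':' + geometry_wkt) aggregation used later – and that the result is returned as "id:STATUS":

@Udf(description = "checks a lat/long against a list of geofences encoded as 'id:wkt'")
public List<String> geo_fence_bulk(final double latitude, final double longitude,
        final List<String> idGeometryListWKT) {
    List<String> result = new ArrayList<>();
    if (idGeometryListWKT == null) {
        return result;
    }
    for (String idAndWkt : idGeometryListWKT) {
        // split only at the first ':' so the WKT itself is not cut apart
        int pos = idAndWkt.indexOf(':');
        String id = idAndWkt.substring(0, pos);
        String wkt = idAndWkt.substring(pos + 1);
        // reuse the scalar UDF shown above for the actual point-in-polygon test
        result.add(id + ":" + geo_fence(latitude, longitude, wkt));
    }
    return result;
}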
1) Using Cross Join
geofence
Table
Join Position
& Geofences
vehicle
position
Stream
Stream
pos &
geofences
CREATE STREAM vp_join_gf_s
AS
SELECT vp.id, vp.latitude, vp.longitude,
gf.geometry_wkt
FROM vehicle_position_s AS vp
CROSS JOIN geo_fence_t AS gf
There is no Cross Join
in KSQL!
2) INNER Join
geofence
Stream
Join Position
& Geofences
vehicle
position
Stream
Stream
pos &
geofences
{ "group":1", "name":"St. Louis",
"geometry_wkt":"POLYGON ((13.297920227050781
52.56195151687443, …))",
"last_update":1560607149015}
{ "group":1", "name":"Berlin", "geometry_wkt":"POLYGON
((-90.23345947265625 38.484769753492536,…))",
"last_update":1560607149015}
Enrich Group
Table
geofences
by group 1
Enrich Group
Stream
position by
group 1 Cannot insert into Table
from Stream
>INSERT INTO geo_fence_t
>SELECT '1' AS group_id, geof.id, …
>FROM geo_fence_s geof;
INSERT INTO can only be used to insert into
a stream. A02_GEO_FENCE_T is a table.
{ "group":"1", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
3) Geofences aggregated in one group
Join Position
& Geofences
Stream
geofence
status
Geofences
aggby group
Table
{ "group":1", "name":"St. Louis", "geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
geo_fence_bulk
geofence
Stream
vehicle
position
Stream
{ "group":1", "name":"St. Louis",
"geometry_wkt":"POLYGON ((13.297920227050781
52.56195151687443, …))",
"last_update":1560607149015}
{ "group":1", "name":"Berlin", "geometry_wkt":"POLYGON
((-90.23345947265625 38.484769753492536,…))",
"last_update":1560607149015}
Enrich With
Group-1
Stream
geofences
by group 1
Enrich With
Group-1
Stream
position by
group 1
geofences
by group 1
(Slide scorecard: Scalable, Latency and "Code Smell", each rated on a low / medium / high scale)
{ "group":"1", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
3) Geofences aggregated in one group
CREATE TABLE a03_geo_fence_aggby_group_t
AS
SELECT group_id
, collect_set(id + ':' + geometry_wkt) AS id_geometry_wkt_list
FROM a03_geo_fence_by_group_s geof
GROUP BY group_id;
CREATE STREAM a03_vehicle_position_by_group_s
AS
SELECT '1' group_id, vehp.id, vehp.latitude, vehp.longitude
FROM vehicle_position_s vehp
PARTITION BY group_id;
3) Geofences aggregated in one group
CREATE STREAM a03_geo_fence_status_s
AS
SELECT vehp.id, vehp.latitude, vehp.longitude,
geo_fence_bulk(vehp.latitude, vehp.longitude,
geofagg.id_geometry_wkt_list) AS geofence_status
FROM a03_vehicle_position_by_group_s vehp
LEFT JOIN a03_geo_fence_aggby_group_t geofagg
ON vehp.group_id = geofagg.group_id;
ksql> SELECT * FROM a03_geo_fence_status_s;
46 | 52.47546 | 13.34851 | [1:OUTSIDE, 3:INSIDE]
46 | 52.47521 | 13.34881 | [1:OUTSIDE, 3:INSIDE]
...
As many as there are geo-fences
Geo Hash for a better distribution
Geohash is a geocoding system which
encodes a geographic location
into a short string of letters and
digits
Length Cell width x height
1 5,009.4km x 4,992.6km
2 1,252.3km x 624.1km
3 156.5km x 156km
4 39.1km x 19.5km
12 3.7cm x 1.9cm
http://geohash.gofreerange.com/
Geo Hash Custom UDF
ksql> SELECT latitude, longitude, geo_hash(latitude, longitude, 3)
>FROM test_geo_udf_s;
38.484769753492536 | -90.23345947265625 | 9yz
public String geohash(final double latitude,
final double longitude, int length)
public List<String> neighbours(String geohash)
public String adjacentHash(String geohash, String directionString)
public List<String> coverBoundingBox(String geometryWKT, int length)
ksql> SELECT geometry_wkt, geo_hash(geometry_wkt, 5)
>FROM test_geo_udf_s;
POLYGON ((-90.23345947265625 38.484769753492536, -90.25886535644531
38.47455675836861, ...)) | [9yzf6, 9yzf7, 9yzfd, 9yzfe, 9yzff, 9yzfg, 9yzfk,
9yzfs, 9yzfu]
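The geohash UDFs above are listed only by signature. A minimal sketch of the two most important ones is shown below; it assumes the com.github.davidmoten:geo library for the geohash math and JTS for the WKT handling (the library actually used in the talk is not stated). It mirrors the GeoHashUtil.coverBoundingBox(...) helper that the Kafka Streams code later relies on:

import java.util.ArrayList;
import java.util.List;
import org.locationtech.jts.geom.Envelope;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.io.WKTReader;
import com.github.davidmoten.geo.Coverage;
import com.github.davidmoten.geo.GeoHash;

public class GeoHashUtil {
    // geohash of a single position, e.g. geohash(38.4847..., -90.2334..., 3) -> "9yz"
    public static String geohash(double latitude, double longitude, int length) {
        return GeoHash.encodeHash(latitude, longitude, length);
    }

    // all geohashes of the given length covering the bounding box of a WKT geometry
    public static List<String> coverBoundingBox(String geometryWKT, int length) throws Exception {
        Geometry geometry = new WKTReader().read(geometryWKT);
        Envelope env = geometry.getEnvelopeInternal();
        // coverBoundingBox expects top-left (max lat, min lon) and bottom-right (min lat, max lon)
        Coverage coverage = GeoHash.coverBoundingBox(
            env.getMaxY(), env.getMinX(), env.getMinY(), env.getMaxX(), length);
        return new ArrayList<>(coverage.getHashes());
    }
}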
4) Geofences aggregated by GeoHash
Join Position
& Geofences
Stream
geofence
status
Geofences
gpby geohash
Table
{ "geohash":"u33", "name":"Postdam",
"geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"geohash":"u33", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))",
"last_update":1560607149015}
geo_fence_bulk()
geofence
Table
vehicle
position
Stream
{ "geohash":"u33", "name":"Potsam",
"geometry_wkt":"POLYGON ((13.297920227050781
52.56195151687443, …))",
"last_update":1560607149015}
{ "group":"u33", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
Enrich with
GeoHash
Stream
geofences
& geohash
Enrich with
GeoHash
Stream
position &
geohash
geofences
by geohash
geo_hash()
geo_hash()
(Slide scorecard: Scalable, Latency and "Code Smell", each rated on a low / medium / high scale)
{ "geohash":"u33", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
4) Geofences aggregated by GeoHash
CREATE STREAM a04_geo_fence_by_geohash_s
AS
SELECT geo_hash(geometry_wkt, 3)[0] geo_hash, id, name, geometry_wkt
FROM a04_geo_fence_s
PARTITION by geo_hash;
INSERT INTO a04_geo_fence_by_geohash_s
SELECT geo_hash(geometry_wkt, 3)[1] geo_hash, id, name, geometry_wkt
FROM a04_geo_fence_s
WHERE geo_hash(geometry_wkt, 3)[1] IS NOT NULL
PARTITION BY geo_hash;
INSERT INTO a04_geo_fence_by_geohash_s
SELECT ...
There is no explode()
functionality in KSQL! https://github.com/confluentinc/ksql/issues/527
4) Geofences aggregated by GeoHash
CREATE TABLE a04_geo_fence_by_geohash_t
AS
SELECT geo_hash,
COLLECT_SET(id + ':' + geometry_wkt) AS id_geometry_wkt_list,
COLLECT_SET(id) id_list
FROM a04_geo_fence_by_geohash_s
GROUP BY geo_hash;
CREATE STREAM a04_vehicle_position_by_geohash_s
AS
SELECT vp.id, vp.latitude, vp.longitude,
geo_hash(vp.latitude, vp.longitude, 3) geo_hash
FROM vehicle_position_s vp
PARTITION BY geo_hash;
4) Geofences aggregated by GeoHash
CREATE STREAM a04_geo_fence_status_s
AS
SELECT vp.geo_hash, vp.id, vp.latitude, vp.longitude,
geo_fence_bulk (vp.latitude, vp.longitude, gf.id_geometry_wkt_list)
AS fence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash);
ksql> SELECT * FROM a04_geo_fence_status_s;
u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
9yz | 12 | 38.34409 | -90.15034 | [2:OUTSIDE, 1:OUTSIDE]
...
As many as there are geo-fences in
geohash
4a) Geofences aggregated by GeoHash
Join Position
& Geofences
Geofences
gpby geohash
Table
{ "group":"u33", "name":" Potsdam",
"geometry_wkt":"POLYGON ((5.668945 51.416016, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
geo_fence_bulk()
geofence
Table
vehicle
position
Stream
{ "geohash":u33", "name":"Postsdam",
"geometry_wkt":"POLYGON ((5.668945 51.416016, …))",
"last_update":1560607149015}
{ "geohash":"u33", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
Enrich with
GeoHash
Stream
geofences
& geohash
Enrich with
GeoHash
Stream
position &
geohash
geofences
by geohash
geo_hash()
geo_hash()
Stream
udf
status
geofence
status
(Slide scorecard: Scalable, Latency and "Code Smell", each rated on a low / medium / high scale)
{ "geohash":"u33", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
4b) Geofences aggregated by GeoHash
Join Position
& Geofences
Geofences
gpby geohash
Table
{ "geohash":"u33", "name":"Potsdam",
"geometry_wkt":"POLYGON ((5.668945 51.416016, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin", "geometry_wkt":"POLYGON
((-90.23345947265625 38.484769753492536,…))",
"last_update":1560607149015}
geo_fence()
geofence
Table
vehicle
position
Stream
{ "geohash":"u33", "name":"Potsdam",
"geometry_wkt":"POLYGON ((5.668945 51.416016, …))",
"last_update":1560607149015}
{ "group":"u33", "name":"Berlin",
"geometry_wkt":"POLYGON ((-90.23345947265625
38.484769753492536,…))", "last_update":1560607149015}
Enrich with
GeoHash
Stream
geofences
& geohash
Enrich with
GeoHash
Stream
position &
geohash
geofences by
geohash
geo_hash()
geo_hash()
Stream
position &
geofence
Explode
Geofences
Stream
geofence
status
(Slide scorecard: Scalable, Latency and "Code Smell", each rated on a low / medium / high scale)
{ "geohash":"u33", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
4b) Geofences aggregated by GeoHash
CREATE STREAM a04b_geofence_udf_status_s
AS
SELECT id, latitude, longitude, id_list[0] AS geofence_id,
geo_fence(latitude, longitude, geometry_wkt_list[0]) AS geofence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash);
INSERT INTO a04b_geofence_udf_status_s
SELECT id, latitude, longitude, id_list[1] geofence_id,
geo_fence(latitude, longitude, geometry_wkt_list[1]) AS geofence_status
FROM a04_vehicle_position_by_geohash_s vp
LEFT JOIN a04_geo_fence_by_geohash_t gf
ON (vp.geo_hash = gf.geo_hash)
WHERE id_list[1] IS NOT NULL;
It works ... but ...
• By re-partitioning by geohash we lose the guaranteed order for a given vehicle
• Can be problematic, if there is a backlog in one of the topics/partitions
(Map example: the geohash cells u0m4, u0m5, u0m6 and u0m7 around Berne / Fribourg end up spread over Consumer 1 and Consumer 2)
Using Kafka Streams
Geo-Fencing with Kafka Streams and Global KTable
Enrich Position with GeoHash
& Join with Geofences
Global
KTable
{ "geohash":u33", "name":"Potsdam",
"geometry_wkt":"POLYGON ((5.668945
51.416016, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin",
"geometry_wkt":"POLYGON ((-
90.23345947265625
38.484769753492536,…))",
"last_update":1560607149015}
geofence
KTable
vehicle
position
{ "geohash":u33", "name":"Potsdam",
"geometry_wkt":"POLYGON ((5.668945
51.416016, …))",
"last_update":1560607149015}
{ "group":u33", "name":"Berlin",
"geometry_wkt":"POLYGON ((-
90.23345947265625
38.484769753492536,…))",
"last_update":1560607149015}
Enrich and Group
by GeoHash
matched
geofences
Detect Geo
Event
geofence_status
(Slide scorecard: Scalable, Latency and "Code Smell", each rated on a low / medium / high scale)
geofence
by geohash
{"id":"10", "latitude" : 52.3924,
"longitude" : 13.0514, [
{"name":"Berlin"} ] }
{ "geohash":"u33", "id" : "10", "latitude" : 52.3924, "longitude" : 13.0514}
{"id":"10", "status" : "ENTER", "geofenceName":"Berlin"} }
position &
geohash
Geo-Fencing with Kafka Streams and Global KTable
KStream<String, GeoFence> geoFence = builder.stream(GEO_FENCE);
KStream<String, GeoFence> geoFenceByGeoHash =
geoFence.map((k,v) -> KeyValue.<GeoFence, List<String>> pair(v,
GeoHashUtil.coverBoundingBox(v.getWkt().toString(), 5)))
.flatMapValues(v -> v)
.map((k,v) -> KeyValue.<String,GeoFence>pair(v, createFrom(k, v)));
KTable<String, GeoFenceList> geofencesByGeohash =
geoFenceByGeoHash.groupByKey().aggregate(
() -> new GeoFenceList(new ArrayList<GeoFenceItem>()),
(aggKey, newValue, aggValue) -> {
GeoFenceItem geoFenceItem = new
GeoFenceItem(newValue.getId(), newValue.getName(),
newValue.getWkt(), "");
if (!aggValue.getGeoFences().contains(geoFenceItem))
aggValue.getGeoFences().add(geoFenceItem);
return aggValue;
},
Materialized.<String, GeoFenceList,
KeyValueStore<Bytes,byte[]>>as("geofences-by-geohash-store"));
geofencesByGeohash.toStream().to(GEO_FENCES_KEYEDBY_GEOHASH,
Produced.<String, GeoFenceList> keySerde(stringSerde));
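The join on the next slide reads from a vehiclePositionsWithGeoHash stream whose construction is not shown. Continuing the same StreamsBuilder topology, a minimal sketch could look like the following; the VehiclePosition value type with getLatitude()/getLongitude()/setGeohash() and the VEHICLE_POSITION topic constant are assumptions, and the geohash length 5 matches the length used for the geofence coverage above:

KStream<String, VehiclePosition> vehiclePositions = builder.stream(VEHICLE_POSITION);

// enrich each position with the geohash of its coordinates; with a GlobalKTable join
// there is no need to re-key the stream, the geohash is simply carried in the value
KStream<String, VehiclePosition> vehiclePositionsWithGeoHash =
    vehiclePositions.mapValues(pos -> {
        pos.setGeohash(GeoHashUtil.geohash(pos.getLatitude(), pos.getLongitude(), 5));
        return pos;
    });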
Geo-Fencing with Kafka Streams and Global KTable
final GlobalKTable<String, GeoFenceList> geofences =
builder.globalTable(GEO_FENCES_KEYEDBY_GEOHASH);
KStream<String, VehiclePositionWithMatchedGeoFences> positionWithMatchedGeoFences =
vehiclePositionsWithGeoHash.leftJoin(geofences,
(k, pos) -> pos.getGeohash().toString(),
(pos, geofenceList) -> {
List<MatchedGeoFence> matchedGeofences = new ArrayList<MatchedGeoFence>();
if(geofenceList != null) {
for (GeoFenceItem geoFenceItem : geofenceList.getGeoFences()) {
boolean geofenceStatus =
GeoFenceUtil.geofence(pos.getLatitude(), pos.getLongitude(),
geoFenceItem.getWkt().toString());
if(geofenceStatus)
matchedGeofences.add(new MatchedGeoFence(geoFenceItem.getId(),
geoFenceItem.getName(), null));
}
}
return new VehiclePositionWithMatchedGeoFences(pos.getVehicleId(), 0L,
pos.getLatitude(), pos.getLongitude(),
pos.getEventTime(), matchedGeofences);
});
Geo-Fencing with Kafka Streams and Global KTable
final KStream<String, VehiclePositionWithMatchedGeoFences> positionWithMatchedGeoFences =
builder.stream(MATCHED_FENCE_STREAM);
final StoreBuilder<KeyValueStore<String, VehiclePositionWithMatchedGeoFences>>
vehicleGeoFenceStatusStore = Stores
.keyValueStoreBuilder(Stores.persistentKeyValueStore("GeoFenceSnapshotStore"),
Serdes.String(), positionWithMatchedGeoFencesSerde)
.withCachingEnabled();
builder.addStateStore(vehicleGeoFenceStatusStore);
KStream<String, List<GeoEvent>> geoEvents = positionWithMatchedGeoFences.transformValues(
() -> new GeoEventEmitter(vehicleGeoFenceStatusStore.name())
,vehicleGeoFenceStatusStore.name());
KStream<String, GeoEvent> geoEvent = geoEvents.flatMapValues(v -> v);
KStream<String, GeoEvent> geoEventByVehicleId =
geoEvent.selectKey((k, v) -> v.getVehicleId().toString());
geoEventByVehicleId.to(GEO_EVENT_STREAM);
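The GeoEventEmitter that turns consecutive matched-geofence snapshots into ENTER/EXIT events is referenced above but not shown. A minimal sketch is given below; the getter and constructor names of VehiclePositionWithMatchedGeoFences, MatchedGeoFence and GeoEvent are assumptions based on how they are used on the previous slides:

import java.util.ArrayList;
import java.util.List;
import org.apache.kafka.streams.kstream.ValueTransformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class GeoEventEmitter implements ValueTransformer<VehiclePositionWithMatchedGeoFences, List<GeoEvent>> {
    private final String storeName;
    private KeyValueStore<String, VehiclePositionWithMatchedGeoFences> store;

    public GeoEventEmitter(String storeName) { this.storeName = storeName; }

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        store = (KeyValueStore<String, VehiclePositionWithMatchedGeoFences>) context.getStateStore(storeName);
    }

    @Override
    public List<GeoEvent> transform(VehiclePositionWithMatchedGeoFences pos) {
        List<GeoEvent> events = new ArrayList<>();
        String vehicleId = pos.getVehicleId().toString();
        VehiclePositionWithMatchedGeoFences previous = store.get(vehicleId);

        // ENTER: geofence matched now, but not in the previous snapshot of this vehicle
        for (MatchedGeoFence fence : pos.getMatchedGeoFences()) {
            if (previous == null || !previous.getMatchedGeoFences().contains(fence)) {
                events.add(new GeoEvent(vehicleId, "ENTER", fence.getName()));
            }
        }
        // EXIT: geofence matched in the previous snapshot, but not any longer
        if (previous != null) {
            for (MatchedGeoFence fence : previous.getMatchedGeoFences()) {
                if (!pos.getMatchedGeoFences().contains(fence)) {
                    events.add(new GeoEvent(vehicleId, "EXIT", fence.getName()));
                }
            }
        }
        store.put(vehicleId, pos); // remember the current snapshot for the next position
        return events;
    }

    @Override
    public void close() { }
}

The contains() check assumes MatchedGeoFence implements equals() on the geofence id.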
Using Tile38
Tile38
• https://tile38.com
• Open Source Geospatial Database & Geofencing Server
• Real Time Geofencing
• Roaming Geofencing
• Fast Spatial Indices
• Pluggable Event Notifications
Tile38 – How does it work?
> SETCHAN berlin WITHIN vehicle FENCE OBJECT {"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[13.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473],[13.501167297363281,52.47148826410652], ...]]}
> SUBSCRIBE berlin
{"ok":true,"command":"subscribe","channel":"berlin","num":1,"elapsed":"5.85µs"}
.
.
.
{"command":"set","group":"5d07581689807d000193ac33","detect":"outside","hook":"berlin","key":"vehicle","time":"2019-06-17T09:06:30.624923584Z","id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}
SET vehicle 10 POINT 52.4497 13.3096
Tile38 – How does it work?
> SETHOOK berlin_hook kafka://broker-1:9092/tile38_geofence_status WITHIN vehicle FENCE OBJECT {"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[13.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473],[13.501167297363281,52.47148826410652], ...]]}
bigdata@bigdata:~$ kafkacat -b localhost -t tile38_geofence_status
% Auto-selecting Consumer mode (use -P or -C to override)
{"command":"set","group":"5d07581689807d000193ac34","detect":"outside","hook":"berlin_hook","key":"vehicle","time":"2019-06-17T09:12:00.488599119Z","id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}
SET vehicle 10 POINT 52.4497 13.3096
1) Enrich with GeoFences – aggregated by geohash
geofence
Stream
vehicle
position
Stream
Invoke UDF
{"vehicle_id":10", "name":"St. Louis", "geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin", "geometry_wkt":"POLYGON ((-
90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Invoke UDF
Geofence
Service
geofence
status
set_pos()
set_fence()
Stream
udf
status
(Slide scorecard: Scalable, Latency and "Code Smell", each rated on a low / medium / high scale)
2) Using Custom Kafka Connector for Tile38
geofence
vehicle
position
{"vehicle_id":10", "name":"St. Louis", "geometry_wkt":"POLYGON
((13.297920227050781 52.56195151687443, …))",
"last_update":1560607149015}
{"vehicle_id":10", "name":"Berlin", "geometry_wkt":"POLYGON ((-
90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
{ "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Geofence
Service
kafka-to-
tile38
kafka-to-
tile38
geofence
status
(Slide scorecard: Scalable, Latency and "Code Smell", each rated on a low / medium / high scale)
2) Using Custom Kafka Connector for Tile38
curl -X PUT \
/api/kafka-connect-1/connectors/Tile38SinkConnector/config \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
"connector.class":
"com.trivadis.geofence.kafka.connect.Tile38SinkConnector",
"topics": "vehicle_position",
"tasks.max": "1",
"tile38.key": "vehicle",
"tile38.operation": "SET",
"tile38.hosts": "tile38:9851"
}'
Currently only supports SET command
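The connector implementation itself is not part of the slides. Since Tile38 speaks the Redis protocol, a sink task can forward every vehicle position as a SET ... POINT command with any Redis client. The sketch below uses Jedis and assumes the record value arrives as a Map with id/latitude/longitude fields – the client choice and the field handling are assumptions, not taken from the talk:

import java.util.Collection;
import java.util.Map;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Protocol;

public class Tile38SinkTask extends SinkTask {
    private Jedis tile38;
    private String key;

    @Override
    public void start(Map<String, String> props) {
        // "tile38.hosts" is e.g. "tile38:9851", as in the connector config above
        String[] hostPort = props.get("tile38.hosts").split(":");
        tile38 = new Jedis(hostPort[0], Integer.parseInt(hostPort[1]));
        key = props.get("tile38.key");
    }

    @Override
    public void put(Collection<SinkRecord> records) {
        for (SinkRecord record : records) {
            Map<String, Object> value = (Map<String, Object>) record.value();
            // translates to: SET vehicle <id> POINT <latitude> <longitude>
            tile38.sendCommand(Protocol.Command.SET,
                key, value.get("id").toString(),
                "POINT", value.get("latitude").toString(), value.get("longitude").toString());
        }
    }

    @Override
    public void stop() {
        if (tile38 != null) tile38.close();
    }

    @Override
    public String version() {
        return "1.0";
    }
}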
Visualization using Arcadia Data
Arcadia Data https://www.arcadiadata.com/
Summary
Summary & Outlook
• Summary
• Geo Fencing is doable using Kafka and KSQL
• KSQL is similar to SQL, but don't think relational
• UDFs and UDAFs are a powerful way to extend KSQL
• Use Geo Hashes to partition work
• Outlook
• Performance Tests
• Cleanup code of UDFs and UDAFs
• Implement Kafka Source Connector for Tile38
Technology on its own won't help you.
You need to know how to use it properly.