Neo4j after 1 year in production

Neo4j
After 1 year in production
with Andrey Nikishaev

What we will talk about today
Neo4j internals
Cypher - query language
Developing extensions
Neo4j in production
Conclusion

Data
Properties
A linked list of property records, each holding a key:value pair.
Node
Refers to its first property and to the first relationship in its relationship chain.
Relationship
Refers to its first property and to its start and end nodes. It also refers to the previous/next relationships of its start and end nodes.
All data in Neo4j is stored as linked lists of fixed-size records.
● ID lookup = O(1)
● It's great at localized searches, e.g. getting the people you follow.
● It's not great at aggregation. Nodes and relationships aren't stored in any sorted order, so deriving the 20 most popular users requires a full scan.
● It suffers from the "supernode problem". At least currently, a node's neighboring relationships are stored as a flat list, so if you have a million followers, fetching even one person you follow is slow.
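The two properties above (O(1) ID lookup and slow supernode traversal) both follow from fixed-size records chained together. A minimal sketch of the idea, assuming a made-up 16-byte record and a plain array as the relationship chain; the names RecordStoreSketch, offsetOf and walkChain are hypothetical and this is not Neo4j's real on-disk layout:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: record size and layout are assumptions, not Neo4j's store format.
final class RecordStoreSketch {
    static final int RECORD_SIZE = 16;  // hypothetical fixed record size in bytes
    static final int NO_NEXT = -1;      // end-of-chain marker

    // O(1) lookup: with fixed-size records the byte offset is computed from the ID.
    static long offsetOf(long id) {
        return id * RECORD_SIZE;
    }

    // nextRel[i] is the ID of the relationship that follows relationship i in a
    // node's chain. Reaching a given neighbor means walking the links one by one,
    // which is why a node with a million relationships is slow to search.
    static List<Integer> walkChain(int firstRel, int[] nextRel) {
        List<Integer> visited = new ArrayList<>();
        for (int rel = firstRel; rel != NO_NEXT; rel = nextRel[rel]) {
            visited.add(rel);
        }
        return visited;
    }

    public static void main(String[] args) {
        int[] nextRel = {2, NO_NEXT, 1};            // chain: 0 -> 2 -> 1
        System.out.println(offsetOf(42));           // byte offset of record 42
        System.out.println(walkChain(0, nextRel));  // relationship IDs in chain order
    }
}
```

The lookup cost does not depend on store size, while the chain walk grows linearly with the number of relationships attached to the node.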

Caching
File Cache
Blocks of the same size.
Blocks are mapped to memory with OS mmap.
Evicts data by LFU policy (hits vs misses).
Object Cache (removed in v2.3+)
Keeps serialized data in memory to speed up queries.
No eviction policy (it can eat all your memory).
Entries are evicted only on transaction log sync (HA) or data deletion.
To use it, warm it up with a query like this:
MATCH (n)
OPTIONAL MATCH (n)-[r]->()
RETURN count(n.prop) + count(r.prop);

Transactions
A Tx uses a thread-local object as its context.
Commit steps:
● Gather the list of commands.
● Sort the commands (predictable execution order).
● Write the commands to the Tx log.
● Mark the Tx in the log as finished.
● Write to the DB.
Diagram labels: Tx Log, Tx ID.
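The commit steps can be sketched roughly as below. This is a toy model under stated assumptions, not Neo4j's implementation: the class TxCommitSketch, the Command type, the string log format and the in-memory map standing in for the store files are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy write-ahead commit: log first, then apply. All names are hypothetical.
final class TxCommitSketch {
    // Hypothetical command: set record `recordId` to `value`.
    static final class Command {
        final long recordId;
        final String value;
        Command(long recordId, String value) {
            this.recordId = recordId;
            this.value = value;
        }
    }

    final List<String> txLog = new ArrayList<>();     // stand-in for the Tx log
    final Map<Long, String> store = new TreeMap<>();  // stand-in for the store files
    private long lastTxId = 0;

    long commit(List<Command> gathered) {
        long txId = ++lastTxId;
        // Sort the gathered commands so the execution order is predictable.
        List<Command> sorted = new ArrayList<>(gathered);
        sorted.sort(Comparator.comparingLong((Command c) -> c.recordId));
        // Write the commands to the log before touching the store.
        for (Command c : sorted) {
            txLog.add("tx " + txId + " set " + c.recordId + "=" + c.value);
        }
        // Mark the Tx as finished in the log.
        txLog.add("tx " + txId + " done");
        // Only now apply the changes to the store.
        for (Command c : sorted) {
            store.put(c.recordId, c.value);
        }
        return txId;
    }
}
```

Because the log entry is complete before the store is modified, a crash between the last two steps can be recovered by replaying the finished Tx from the log.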

HA
Only master-slave replication
● Slaves sync at a configurable interval.
● All writes go through the master. Writes sent to a slave are slower.
● Same node/relationship IDs on all servers.
● Writes need a quorum; otherwise the cluster goes into read-only mode.
● IDs are allocated in blocks.
● The master is elected by these rules:
○ Highest Tx ID.
○ If multiple: instance that was master for this Tx.
○ If unavailable: instance with the lowest clock value.
○ If multiple: instance with the lowest ID.
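Read as a chain of tie-breakers, the election rules map naturally onto a comparator. The sketch below is one interpretation of the four rules, not Neo4j's election code; the Instance fields and the class name MasterElectionSketch are hypothetical.

```java
import java.util.Comparator;
import java.util.List;

// One reading of the election rules as a comparator chain. Names are hypothetical.
final class MasterElectionSketch {
    static final class Instance {
        final int id;
        final long lastTxId;
        final boolean wasMasterForTx;
        final long clock;
        Instance(int id, long lastTxId, boolean wasMasterForTx, long clock) {
            this.id = id;
            this.lastTxId = lastTxId;
            this.wasMasterForTx = wasMasterForTx;
            this.clock = clock;
        }
    }

    // The best candidate sorts first.
    static final Comparator<Instance> ELECTION_ORDER =
            Comparator.comparingLong((Instance i) -> i.lastTxId).reversed() // highest Tx ID
                    .thenComparing((Instance i) -> !i.wasMasterForTx)       // former master first (false < true)
                    .thenComparingLong(i -> i.clock)                        // lowest clock value
                    .thenComparingInt(i -> i.id);                           // lowest instance ID

    static Instance elect(List<Instance> candidates) {
        return candidates.stream()
                .min(ELECTION_ORDER)
                .orElseThrow(() -> new IllegalArgumentException("no candidates"));
    }
}
```

If no candidate was master for the latest Tx, the second key is equal for all of them and the comparison falls through to the clock value, which matches the "if unavailable" rule.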

Cypher
MATCH (girl:Girl)
WHERE girl.age > 18 AND girl.age < 25
AND (
NOT (girl)-[:HAS_BOYFRIEND]->(:Guy)
OR NOT (girl)-[:HAS_BOYFRIEND]->(:Guy)-[:ENGAGED_IN]->(:Gym)
)
RETURN girl
ORDER BY girl.age ASC

Cypher
No query watcher
You have to control every query that goes to the server, because a single query can kill the server.
Reads all data first
When a query touches properties (expand operation), the data gets cached in memory. If it does not fit, the query (or even the server) will crash. Even MATCH (n) DELETE n will fail if you have many nodes.
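Locking
Running an update query does not mean that you hold an update lock, even in a transaction.
FAIL:
MATCH (n:Node)
SET n.count = n.count + 1
PASS:
MATCH (n:Node)
SET n._lock = true
SET n.count = n.count + 1
The second query writes a dummy property first, which takes the write lock on the node before n.count is read.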
More about this at: http://goo.gl/Cy3MEU

Cypher
You can try it on real data for free here: https://neo4j.com/sandbox-v2/
Similarity example.
Dataset used: recommendations, 32314 nodes, 332622 relationships
Top 25 similar users:
MATCH (u1:User)-[:RATED]->(:Movie)<-[:RATED]-(u2:User)
RETURN [u1.name, u2.name] AS pairs, count(*) AS cnt
ORDER BY cnt DESC
LIMIT 25
Run time: 16366 ms. Number of pairs: 6 246 674
Most queries will not work without warming up.
Use indexes as much as possible.

Cypher
> Sushi restaurants in New York that my friends like.
MATCH (person:Person)-[:IS_FRIEND_OF]->(friend),
(friend)-[:LIKES]->(restaurant:Restaurant),
(restaurant)-[:LOCATED_IN]->(loc:Location),
(restaurant)-[:SERVES]->(type:Cuisine)
WHERE person.name = 'Philip'
AND loc.location = 'New York'
AND type.cuisine = 'Sushi'
RETURN restaurant.name, count(*) AS occurrence
ORDER BY occurrence DESC
LIMIT 5
https://neo4j.com/developer/guide-build-a-recommendation-engine/

Developing extensions
User-Defined Procedures & Functions
Same as in SQL DBs.
Unmanaged server extensions
Extensions that can create a new API for working with Neo4j. You can even create a new dashboard.
Server plugins
Extensions that can only extend the Neo4j core API.
Kernel extensions
Here you can do almost anything.
https://github.com/creotiv/neo4j-kernel-plugin-example

User-Defined Procedures & Functions (procedures v3.0+, functions v3.1+)
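package example;

import java.util.List;
import org.neo4j.procedure.Description;
import org.neo4j.procedure.Name;
import org.neo4j.procedure.UserFunction;

public class Join
{
    // Registered as example.join: the function name is the package name plus the method name.
    @UserFunction
    @Description("example.join(['s1','s2',...], delimiter) - join the given strings with the given delimiter.")
    public String join(
            @Name("strings") List<String> strings,
            @Name(value = "delimiter", defaultValue = ",") String delimiter) {
        if (strings == null || delimiter == null) {
            return null;
        }
        return String.join(delimiter, strings);
    }
}
Calling:
MATCH (p: Person)
WHERE p.age = 36
RETURN example.join(collect(p.names))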

Unmanaged extensions
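import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.Response.Status;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.string.UTF8;

@Path("/helloworld")
public class HelloWorldResource {
    private final GraphDatabaseService database;

    // The server injects the database through @Context.
    public HelloWorldResource(@Context GraphDatabaseService database) {
        this.database = database;
    }

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    @Path("/{nodeId}")
    public Response hello(@PathParam("nodeId") long nodeId) {
        return Response.status(Status.OK).entity(
                UTF8.encode("Hello World, nodeId=" + nodeId)).build();
    }
}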

Kernel extensions - Factory
public class ExampleKernelExtensionFactory
        extends KernelExtensionFactory<ExampleKernelExtensionFactory.Dependencies> {

    public static abstract class ExampleSettings {
        public static Setting<Boolean> debug =
                setting("examplekernelextension.debug", BOOLEAN, Settings.FALSE);
    }

    public ExampleKernelExtensionFactory() {
        super(SERVICE_NAME);
    }

    @Override
    public Lifecycle newKernelExtension(Dependencies dependencies) throws Throwable {
        Config config = dependencies.getConfig();
        return new ExampleExtension(dependencies.getGraphDatabaseService(),
                config.get(ExampleSettings.debug), ...);
    }

    public interface Dependencies {
        GraphDatabaseService getGraphDatabaseService();
        Config getConfig();
    }
}

Kernel extensions - Extension
public class ExampleExtension implements Lifecycle {
    ...
    public ExampleExtension(GraphDatabaseService gds, Boolean debug, String somevar) {
        this.gds = gds;
        this.debug = debug;
        this.somevar = somevar;
    }

    @Override
    public void init() throws Throwable {
        handler = new ExampleEventHandler(gds, debug, somevar);
        gds.registerTransactionEventHandler(handler);
    }

    ... Start/Stop methods ...

    @Override
    public void shutdown() throws Throwable {
        gds.unregisterTransactionEventHandler(handler);
    }
}

Kernel extensions - Event Handler
class ExampleEventHandler implements TransactionEventHandler<String> {
    ...
    @Override
    public String beforeCommit(TransactionData transactionData) throws Exception {
        updateConstraints();
        return prepareCreatedNodes(transactionData);
    }

    @Override
    public void afterCommit(TransactionData transactionData, String result) {
        processCreatedNodes(result);
    }

    @Override
    public void afterRollback(TransactionData transactionData, String result) {
        error("Something bad happened, Harry: " + result);
    }
}

Kernel extensions - Event Handler
Problems
beforeCommit (which should run while the DB is not yet changed)
You can't access the properties, labels or relationships of deleted nodes, because they are already deleted. Yeah, strange. So you need to gather them from the event data.
afterCommit (which should run after the transaction is committed and closed)
It is executed while the transaction is still open, which leads to a deadlock (with no info and no exception) if you try to update your local DB.
Local DB
- Bad API.
- You can't access the HA status of the local server; you need to go through the REST API.
- No way to access the user request.
- Plugins can conflict with each other and cause deadlocks.

Neo4j in Production - Cache-Based Sharding
Diagram: a Router in front of Cache A, Cache B and Cache C.

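The routing idea behind cache-based sharding can be sketched as follows: every instance holds the full data set, but the router always sends the same key to the same instance, so each instance keeps only its share of the data warm in cache. This is a minimal sketch with a plain hash-modulo scheme; the class CacheShardingRouterSketch, the instance names and the choice of routing key are assumptions, not something taken from the slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical router: maps a routing key (e.g. a user ID) to one instance.
final class CacheShardingRouterSketch {
    private final List<String> instances;

    CacheShardingRouterSketch(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("need at least one instance");
        }
        this.instances = new ArrayList<>(instances);
    }

    // The same key always lands on the same instance, so that instance's
    // cache stays warm for the part of the graph around that key.
    String route(String routingKey) {
        int index = Math.floorMod(routingKey.hashCode(), instances.size());
        return instances.get(index);
    }
}
```

A usage example would be `new CacheShardingRouterSketch(Arrays.asList("A", "B", "C")).route("user:42")`. Hash-modulo reshuffles most keys when an instance is added or removed; consistent hashing would be the usual alternative if that matters.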
Neo4j in Production - Settings
Log slow queries
dbms.querylog.enabled=true
dbms.querylog.threshold=4s
Logical logs for debug
keep_logical_logs=7 days
Enable online backup
online_backup_enabled=true
online_backup_server=127.0.0.1:6362
Number of threads (for concurrent access)
org.neo4j.server.webserver.maxthreads=64
(default number of CPUs)
Memory used for page cache
dbms.pagecache.memory=2g
Time of pulling updates from master
ha.pull_interval=10 (seconds)
Without timeout replication
Number of slaves to which a Tx is pushed upon commit on the master (optimistic: the Tx can be marked successful even if some pushes failed)
ha.tx_push_factor=1
Push strategy
"fixed" pushes Txs based on server ID order.
ha.tx_push_strategy=fixed|round_robin
Master to slave communication chunk size
ha.com_chunk_size=2M
Maximum number of connections a slave can have to the master
ha.max_concurrent_channels_per_slave=20
http://neo4j.com/docs/stable/ha-configuration.html

Neo4j in Production - Performance
Use SSD
It is much cheaper than 16-32 GB of RAM.
IO tuning
Disable file and directory access time updates.
Set the deadline scheduler for disk operations. This will increase read speed but decrease write speed.
$ echo 'deadline' > /sys/block/sda/queue/scheduler
$ cat /sys/block/sda/queue/scheduler
Memory tuning
Set dbms.pagecache.memory to the size of the *store*.db files + 20-40% for growth.
Leave some memory for the OS:
OS Memory = 1GB + (size of graph.db/index) + (size of graph.db/schema)
If you see swapping, increase the OS memory size.
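The two sizing rules are simple arithmetic. A small sketch, assuming sizes are given in bytes and the growth headroom as a whole percentage; the class and method names are hypothetical:

```java
// Worked arithmetic for the sizing rules above. Names are hypothetical.
final class MemorySizingSketch {
    static final long GB = 1024L * 1024L * 1024L;

    // Page cache = size of the store files plus headroom for growth (20-40%).
    static long pageCacheBytes(long storeBytes, int growthPercent) {
        return storeBytes + storeBytes * growthPercent / 100;
    }

    // OS memory = 1 GB + size of graph.db/index + size of graph.db/schema.
    static long osReserveBytes(long indexBytes, long schemaBytes) {
        return GB + indexBytes + schemaBytes;
    }

    public static void main(String[] args) {
        long pageCache = pageCacheBytes(10 * GB, 30);  // 10 GB of store files, 30% headroom
        long osReserve = osReserveBytes(2 * GB, GB);   // 2 GB index, 1 GB schema
        System.out.println(pageCache / GB + " GB page cache, " + osReserve / GB + " GB for the OS");
    }
}
```

The heap size comes on top of both numbers, so the machine needs at least page cache + OS reserve + JVM heap.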
JVM tuning
Set dbms.memory.heap.initial_size and dbms.memory.heap.max_size to the same size to avoid unwanted full garbage collection pauses.
Use a concurrent garbage collector: -XX:+UseG1GC
Set the old/new generation ratio with -XX:NewRatio=N (minimum 1, calculated as old/new = ratio).
The more data is updated in Txs, the lower the ratio you need.
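Since old/new = ratio, the young generation gets heap / (ratio + 1) and the old generation gets the rest. A short sketch of that arithmetic, with hypothetical names and sizes in whatever unit the heap is given in:

```java
// Generation sizes implied by -XX:NewRatio=N (old/new = N). Names are hypothetical.
final class NewRatioSketch {
    static long youngSize(long heapSize, int newRatio) {
        if (newRatio < 1) {
            throw new IllegalArgumentException("NewRatio must be at least 1");
        }
        return heapSize / (newRatio + 1);
    }

    static long oldSize(long heapSize, int newRatio) {
        return heapSize - youngSize(heapSize, newRatio);
    }
}
```

For example, an 8192 MB heap with NewRatio=3 leaves 2048 MB for the young generation, while NewRatio=1 splits the heap evenly, which suits workloads where transactions update a lot of data.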

Neo4j in Production - Problems
- Based on Java
- Not stable
- Problems with memory use and control
- No control over queries
- Problems with some silly queries like “delete all”
- No sharding
- No cross-DC replication
- No master-master replication
- Query planning is a mystery
- Can't work without a large amount of memory
- Dashboard shows unrealistic execution times
- Plugin deployment is hell
- Problems with data loss when the master dies
- Problems with unsynced data during requests
- Coming soon ...

Thank You!
User Stories: https://neo4j.com/case-studies/
Free sandbox with data: https://neo4j.com/sandbox-v2/
Kernel extension example https://github.com/creotiv/neo4j-kernel-plugin-example
Advanced locking: http://goo.gl/Cy3MEU
HA configuration: http://neo4j.com/docs/stable/ha-configuration.html
Andrey Nikishaev
creotiv@gmail.com
fb.me/anikishaev
