datastax 2016 presentations event breakouts c* sessions talks apache cassandra cassandra summit nosql datastax enterprise big data cassandra dse the last pickle spark real time database datastax enterprise graph database cql enterprise database data modeling data management netflix cluster apache spark open source solr data model graph database "cassandra summit hadoop datastax opscenter rdbms nodes apache solr instaclustr data clusters opscenter real-time cassandra query language distributed database kafka fault tolerance high availability operations distributed systems apache hadoop iot azure challenges thrift aws pythian nodetool datascale analytics graph financial services replication cx 3.0 time series fraud patrick mcfadin stream processing machine learning accenture oracle scalability duyhai doan jon haddad expero real time developer tools cql3 uptime ben slater the cloud microservices docker customer experience sony s3 fault tolerant data mining achilles at scale applications ansible real time data features use case database administrator insights dba load cloud applications data analytics database security smack stack comcast backup data science java 3.x christopher batey improvements bootstrapping virtual nodes monitoring protectwise designing apache tinkerpop distributed computing data ingestion deployment spotify building testing brian hess vnodes data centers jdbc application development schema alexander dejanovski elasticsearch issues logs fraud detection architecture apache cassandra committer mongodb google cloud platform spark shark example sstables acid apis python vinay chella latency predictive analytics cassandra data modeling quorum rob murphy scalable philip thompson ananth ram dse spark iot data startup in-memory database dse search integration alain rodriguez tombstones read path performance application nate mccall apache kafta ben lackey google codecentric ag microsoft security features high-performance sla big data application 3.4 time series data cloud target cassandra internals production ecommerce rachel pedreschi office 365 aaron morton user experience search datacenter single datacenter deployment repairs bootstrapping nodes fast reads/writes low volume clusters video sharing multi-datacenter clusters compactions high latency high volume restore backed up usability enhancements platform benchmark database migrate effective approach yabin meng exposed client drivers not supported protocol moving away writes improve throughput adam zegelin client application reduce cpu micro-batching submitting real world insurance company killrvideo indexing changes developer cassandra 2.x common mistakes art streamlined achilles 4.x syntax mistake ds object mapping code base automated restoratiom linearly scalable fleet jim peregord robust element corp stack pluggable iag blackrock cross-wan consistency multi-region clusters investment management platform aladdin glob metric path queries display plug-in aggregate analyse graphite cyanite sasi indexes store markus hofer sstable files bucketing datamodeling tombstoneoverwhelmingexception optimizing queries maintain buckets microservice architectures it consultant hofer dba's knewton's restoration strategy vpcs system_traces keyspace diagnose issues open sourced tools which again leverages ansible jeffrey berge cassandra-tracing tool real-time output metadata processing feature extraction centric architecture kildane software technologies kerry koitzsch image formats high accuracy analysis image processing applications implementation technology types quality messaging services randy fradin scalable storage scalable data architecture validation metadata data exploration test tool perform under stress re-writes cassandra-stress best and worst production cluster functions cassandra operations advanced interface jmx internal structures process changes cql changes benjamin lerer cassandra 2.2 disaster avoidance customized scripts split partitions ajay upadhyay scalable persistence streaming services senior software engineer arun agrawal payment information bookmarks billing distributed backups viewing history aaron ploetz regions providers lead technical architect crossfit relational compound keys high-volume database systems cassandra data adam hutson data architect composite keys clustering keys primary ad tech streaming processing roopa tangirala infrastructure prasanna padmanabhan recommendations time machine personalization cloud application architecture master data management customer 360 degree bank fraud monitor risk intuit trenches availability charles rich jkool demands streaming data analyzing i20 coursera spark analytics cloud comput) cassandra at instagram patches high scalability facebook low latency core infra team dikang gu yahoo japan next generation infrastructure nosql team satoshi konno automation emilio del tessandoro terabytes parallelizable problem trivially exporting data tooling cassandra exports storage infrastructure messaging tips brooke jensen internet of things diagnosis methods vp technical operations clear capital cqlengine triggers transactions real ooyala real t mysql data mode apache hadoop (software) real-time database message bus redis open message bus database engineer distributed processing cep storm cassadra atomic batchs leveled compaction databases sourceninja apache hadoop enterprise cap theorem data consistency data partitioning cassandra training healthcare technology infographic shark databricks free spark driver apache 2.0 license apache 2.0 silicon valley headquarters employees culture x1 dvr data center distributed architecture performance tuning training tunable consistency eventual consistency bulk loading pci-dss security compliance gazzang cassandra dba matija gobec failure smartcat hardware sasi ride performant indices full text query data like '%term%' full text search columns accuracy view jason cacciatore monitoring cassandra health monitoring system false alarms reactive jeffrey carpenter international choice hotels microservice cloud-based reservation system distributed eddie satterly leverage australia datanexus integrate silos anubhav kale running 400 node 300 tb job and talent data models carlos alonso core concepts parameters detailed look stability cassandra.yaml file configuration settings configuration files edward capriolo above and beyond data corruption cassandra tuning network splits schema design disk driver internals coordinator reading failures selecting replicas paging problems stephan kepser lessons event sourcing cqrs matthias niehoff standard deletes without tombstones fail eric stevens limitations deleting data deletion options solutions ttls scaling instagram infrastructure use cases key-value storage andrew baker multiple datacenters mesos abhishek verma uber node repairs framework machine utilization automate statically partitioned wide row store cdm eventually consistent shortcut to awesome dataset manager partition messaging system high throughput airbnb massive real-time datasets apache kafka data integration confluent linkedin ewen cheslack-postava tyler hobbs cloud database symantec endorse collisions shu zhang no-sql pl/sql ilya sokolov proxy nodes simplereach decommissioning nodes eric lubow dc clusters multiple data centers antientropy data inconsistency cassandra-11206 000 cells per partition large partitions 100 100mb robert stupp out-of-memory failures aws regions datadog grafana dashboards isolation 2.2.5 report generation 2.1.13 2.0.14 reconciliation anti-entropy deleted data nodetool repair primary range alexandra klimova pricing managers data visualisation allianz deutschland ag pipelines enterprise platforms flink db single installation scalable solution life-cycle management clustered solutions prepaid billing voucher management ericsson spark nodes centralised system playstation4 videos storage available platform alexander filipchik dustin pham mutation propagate streams support amazon kinesis replication lag norton ndbench open source tools iops heap pressure pool configurations 99th latency wide partition relational database avoid library component agent radovan zvoncek cluster topology replacing nodes hecuba2 bug free api expanding christos kalantzis distributed databases center of excellence db engineering rabbitmq users time-series pat patterson streamsets data collector cleanse user defined aggregates ingest collect full-table scan atomicity distributed software oom errors batching parameters memory russell spitzer throughput matt stump large-scale software vorstella strong consistency lwt replicas semantics data centres " light weight transactions syntax buffer objects custom scripts knewton collection g1 garbage jvm g1gc carlos monroy java tuning cassandra-7019 tick-tock release line delete data go90 datetieredcompactionstrategy videos scala mobile entertainment activity feed sstable 2.1 the pythian group john schulz transform extract ihs markit execution framework jobs jim hatcher etl protectwise " ip addresses approximate data data structures low cost ben kornmeier michaël figuière speculative retries 2.0 drivers cloud storage java driver 3.0 j.b. langston troubleshooting task automation black friday gdpr hybrid cloud contextual payments aci worldwide webinar banking finance digital transformation cloud computing linkurious datastax managed cloud softeam dsp2 virtual reality mesosphere psd2 payment services directive inventory adam hutson conceptual data model service scheduling application data modeler enterprise physical data model logical data model data storage on demand expiring columns live geospatial geometrical transformations exclusion search features convex hull complex polygons metrics 1.0 operational tooling chris lohfink software engineer cold storage joshua hollander parquet patch final approval local quorum consistency level performant systems keyspaces christopher bradford consistency rack networktopologystrategy node topology data enrichment responsive data platform sigmoid rahul kumar visualisation apache mesos version 2.1 soft delete cloud platform distributed architecture " carlos rolo subsystems release model james witschey tick-tock encryption ameesh divatia customer data data breaches baffle.io 3.6 javadocs lcs wei deng solutions (cassandra-10805) jiras dba-free rotating clusters mock data disk utilization dtcs security monitoring platform threat stack natural use case intersection bounding box stratio's cassandra lucene centroid live coding rdd apache zeppelin hardware requirements secundary indexes materialized views carlos rolo cassandra single node compaction strategies udf small cluster jbod shrink ben bromhead token pinning ebs backed disks scale reduce costs design performant scalable data model software development techniques design session remi trouville kibana independant elassandra cassandra-stress tool load test scaling requirements vaibhav puranik reporting visualization 5.0 enhancements high level nick panahi ariel weisberg future features beyond recovery thomas valley multi-data center failure scenarios pagerduty " donny nadolny conflicting data clock skew rimas silkaitis data analysis postgres heroku massive data ingest http based api cassieq no dependencies building queues anton kropp authentication distributed data stores installation distributed queue is hard php client libraries ruby c/c++ cloud datacenters c# “always on” in-memory performance apache ignite gridgain sql-99 read latencies oltp supply full sql-99 slas graphframes gke gce google compute engine kubernetes ravi madasu google container engine open source tracing http consultant mick semb wever cassandra-10392 zipkin opentracing project single tracing cluster-wide metadata daily batched gumgum microbatching lambda architecture access data store data development steps optimizing diagnostics setups design patterns micro batch processing system grpc luke tillman falcor cql 3 language re-write storage engine emodb non-blocking conflict-free replicated datatype cross data center communication distributed compactions json json delta global writes restful crdt solution architecture cpus murali kannan gradle open-source ci servers predrag knežević docker swarms production code dockerized hdfs speed level security integrated security ssl streaming applications widows dc kerberos human error manikandan srinivasan custom scripting mike lococo configuration deviations protecting cliff gilmore constrained deployment advanced replication replicating data multi-dc hub-and-spoke configurations auditing attribute based access control (abac) securing network communication secure deployments union devices sensors traversal language marko rodriguez graph structure olap-based vendor-agnostic dsegraph oltp- gremlin graph graph systems large-scale distributed alex popescu development teams large scale pivotal cloud foundry bosh cloud native applications platforms-as-a-service automated lifecycle manual deployment paas zero downtime developer experience usability system integration dsefs functional coverage rocco varela dse file system detection data graph theory identity theft synthetic identities agile sql client applications jacek lewandowski cqlsh property graphs conceptual-logical-physical artem chebotko flexible graph data model always-on applications spark integrations data analysis services application design robbie strickland configuration enable linearly scalable academic network source code pragmatic problem bob briody inspection reproducibles real-world analysis concept network analysis techniques node.js ldap/active directory authentication with kerberos key-management interoperability protocol (kmip) role-based authorization encrypting data files ldap role assignment batch analysis operational data dse analytics operational analytics ui tools strategies visualize meaningful audience graph data chris lacava user-centered gary stewart ing devops christopher middleware engineering distributed data smart meter zookeeper kafka brokers kafka-rest wei deng dse spark masters schema registry adversarial modeling
See more