Log Analysis of Social Apps Using Hadoop and MongoDB
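def mapper(key, value):
    # Emit (word, 1) for every whitespace-separated word in the input line.
    for word in value.split():
        yield word, 1

def reducer(key, values):
    # Sum the counts collected for each word.
    yield key, sum(values)

if __name__ == "__main__":
    import dumbo
    dumbo.run(mapper, reducer)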




dumbo start wordcount.py \
    -hadoop /path/to/hadoop \
    -input wc_input.txt -output wc_output
python wordcount.py map < wc_input.txt | sort | \
    python wordcount.py red > wc_output.txt
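
The local pipeline above (map, then sort, then reduce) can also be emulated in a single Python process, which is handy for checking mapper and reducer logic before submitting a job. The following is a minimal sketch that does not use dumbo or Hadoop at all; `run_local` is a hypothetical helper name, and using the line index as the mapper key is an assumption made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def mapper(key, value):
    # Same logic as the wordcount mapper: one (word, 1) pair per word.
    for word in value.split():
        yield word, 1


def reducer(key, values):
    # Same logic as the wordcount reducer: sum the counts for one word.
    yield key, sum(values)


def run_local(lines):
    """Emulate `map | sort | reduce` in-process (no Hadoop, no dumbo)."""
    # Map phase: the key is the line index here, which the mapper ignores.
    mapped = [pair for i, line in enumerate(lines) for pair in mapper(i, line)]
    # Sort phase: bring identical keys next to each other.
    mapped.sort(key=itemgetter(0))
    # Reduce phase: call the reducer once per distinct key.
    results = []
    for word, group in groupby(mapped, key=itemgetter(0)):
        results.extend(reducer(word, (count for _, count in group)))
    return results


word_counts = run_local(["a b a", "b c"])
print(word_counts)
```

The sort step matters: `groupby` only merges adjacent equal keys, which mirrors why the shell pipeline needs `sort` between the two Python invocations.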
 -----Change-----
 ActionLogger    a{ChangeP}           (Point,1371,1383)
 ActionLogger    a{ChangeP}           (Point,2373,2423)

 ActionLogger    a{ChangeMedal}       (lucky_star,9,10)
 ActionLogger    a{ChangeMedal}       (lucky_sea_bream,0,1)

 ActionLogger    a{ChangeG}

 ActionLogger    a{ChangeSubG}        (SubGold,13,16)

 ActionLogger    a{ChangeWakuwakuP}   (buy,0,30)
 ActionLogger    a{ChangeWakuwakuP}   (by gacha,30,0)

 -----Get-----
 ActionLogger    a{GetMaterial}       (syouhinnomoto,0,-1)
 ActionLogger    a{GetMaterial}       usesyouhinnomoto
 ActionLogger    a{GetMaterial}       (omotyanomotoPRO,1,6)
 ActionLogger    a{GetMaterial}       (sui-tunomoto,5,4)

 ActionLogger    a{GetInterior}       (bakery_counter,0,1)

 ActionLogger    a{GetAvatarPart}     (190167,0,1)
 ActionLogger    a{GetAvatarPart}     (old_girl_09,0,1)

 -----Trade-----
 ActionLogger    a{Trade}             buy 3 itigoke-kis from gree.jp:xxxxx
2010-07-26 00:00:02,446 INFO catalina-exec-483 ActionLogger – userId a{Make} make item onsenmanjyuu
2010-07-26 00:00:02,478 INFO catalina-exec-411 ActionLogger – userId a{LifeCycle} Login

userId 2010-07-26 00:00:02,446 a{Make}        {onsenmanjyuu,1}
userId 2010-07-26 00:00:02,478 a{LifeCycle}   {Login,1}
userId 2010-07-26 00:00:02,478 a{GetMaterial} {omotyanomotoPRO,5}
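
The cleansing step shown above, from raw application log lines to `userId timestamp action {name,count}` records, could look roughly like the sketch below. It is an illustration only: the regular expressions are inferred from the few sample lines in this deck, `normalize` is a hypothetical name, and two rules are assumptions rather than the author's documented logic: a `(name,before,after)` triple is turned into the difference `after - before` (which matches `(omotyanomotoPRO,1,6)` becoming `{omotyanomotoPRO,5}`), and any other detail text counts as one event named by its last word.

```python
import re

# Hypothetical pattern inferred from the sample lines; real logs may differ.
RAW_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\S+ \S+ ActionLogger \S+ "
    r"(?P<user>\S+) (?P<action>a\{\w+\})\s*(?P<detail>.*)$"
)
# Detail of the form (name,before,after).
TRIPLE = re.compile(r"^\((?P<name>[^,]+),(?P<before>-?\d+),(?P<after>-?\d+)\)$")


def normalize(line):
    """Return 'userId timestamp action {name,count}', or None if unparseable."""
    m = RAW_LINE.match(line.strip())
    if m is None:
        return None
    detail = m.group("detail").strip()
    triple = TRIPLE.match(detail)
    if triple:
        # Assumption: the count is the change between the two values.
        name = triple.group("name")
        count = int(triple.group("after")) - int(triple.group("before"))
    else:
        # Assumption: free-text details count as one event, named by the last word.
        name = detail.split()[-1] if detail else m.group("action")
        count = 1
    return "%s %s %s {%s,%d}" % (
        m.group("user"), m.group("ts"), m.group("action"), name, count)


raw_make = ("2010-07-26 00:00:02,446 INFO catalina-exec-483 ActionLogger - "
            "userId a{Make} make item onsenmanjyuu")
raw_get = ("2010-07-26 00:00:02,478 INFO catalina-exec-411 ActionLogger - "
           "userId a{GetMaterial} (omotyanomotoPRO,1,6)")
print(normalize(raw_make))
print(normalize(raw_get))
```

Lines that do not match the pattern return `None`, so a Hadoop mapper built on this function could simply skip them.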
     {
        "_id" : "2010-06-27+xxxxx+a{ChangeP}",
        "lastUpdate" : "2010-09-17",
        "date" : "2010-06-27",
        "userId" : "xxxxx",
        "actionType" : "a{ChangeP}",
        "actionDetail" : { "Point" : 600 }
     }
     {
        "_id" : "2010-06-27+xxxxx+a{LifeCycle}",
        "lastUpdate" : "2010-09-17",
        "date" : "2010-06-27",
        "userId" : "xxxxx",
        "actionType" : "a{LifeCycle}",
        "actionDetail" : { "Login" : 3 }
     }
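
The documents above have one entry per date, user, and action type, with an `_id` built as `date+userId+actionType`. The sketch below shows one way normalized events could be rolled up into that shape. It is a plain-Python illustration, not the Hadoop job used in the deck; `aggregate_daily` and the event tuple layout are hypothetical, and summing the per-event counts into `actionDetail` is an assumption.

```python
from collections import defaultdict


def aggregate_daily(events, last_update):
    """Roll up (user_id, date, action_type, name, count) tuples into documents."""
    totals = defaultdict(lambda: defaultdict(int))
    for user_id, date, action_type, name, count in events:
        # Assumption: counts for the same name on the same day are summed.
        totals[(date, user_id, action_type)][name] += count

    docs = []
    for (date, user_id, action_type), detail in sorted(totals.items()):
        docs.append({
            "_id": "%s+%s+%s" % (date, user_id, action_type),
            "lastUpdate": last_update,
            "date": date,
            "userId": user_id,
            "actionType": action_type,
            "actionDetail": dict(detail),
        })
    return docs


events = [
    ("xxxxx", "2010-06-27", "a{LifeCycle}", "Login", 1),
    ("xxxxx", "2010-06-27", "a{ChangeP}", "Point", 100),
    ("xxxxx", "2010-06-27", "a{LifeCycle}", "Login", 1),
    ("xxxxx", "2010-06-27", "a{ChangeP}", "Point", 500),
    ("xxxxx", "2010-06-27", "a{LifeCycle}", "Login", 1),
]
daily_docs = aggregate_daily(events, last_update="2010-09-17")
for doc in daily_docs:
    print(doc)
```

Because the `_id` is deterministic, re-running the aggregation for the same day would overwrite the same document rather than create a duplicate, provided it is written with an upsert.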
     {
        "_id" : "2010-08-31+group+a{PutOn}",
        "date" : "2010-08-31",
        "lastUpdate" : "2010-09-21",
        "actionType" : "a{PutOn}",
        "actionDetail" : { "a{PutOn}" : 52050 }
     }
     { ...
        "actionType" : "a{Make}",
        "actionDetail" : {
            "syurijyou" : 11,
            "aisukuri-mu" : 378,
            "kinnokarakuridokei" : 103,
            "puramoderu" : 22,
            "guremurinno_n" : 164,
            "kyodaipenginno_n" : 76,
            "patinko" : 67,
            "wakizasi" : 250,
            "dendendaiko" : 13651,
            ... (over 100 items)
        }
     }
MySQL:   select * from things where x=3 and y="foo"
MongoDB: db.things.find( { x : 3, y : "foo" } );


MySQL:   select z from things where x=3
MongoDB: db.things.find( { x : 3 }, { z : 1 } );


db.collection.find({ "field" : { $gt: value } } );
// matches documents where field > value

db.collection.find({ "field" : { $lt: value } } );
// matches documents where field < value

db.collection.find({ "field" : { $gt: value1, $lt: value2 } });
// matches documents where value1 < field < value2 (both bounds exclusive)
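
`$gt` and `$lt` are strict comparisons; MongoDB also provides `$gte` and `$lte` for inclusive bounds. The difference is easy to see with a small in-memory emulation. This sketch does not talk to a MongoDB server; `matches` is a hypothetical helper that only mimics how these four operators filter a single field.

```python
import operator

# The four MongoDB range operators and their Python equivalents.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(doc, field, condition):
    """Return True if doc[field] satisfies every operator in `condition`."""
    if field not in doc:
        return False
    value = doc[field]
    return all(OPERATORS[op](value, bound) for op, bound in condition.items())


docs = [{"x": 1}, {"x": 2}, {"x": 3}]
strict = [d["x"] for d in docs if matches(d, "x", {"$gt": 1, "$lt": 3})]
inclusive = [d["x"] for d in docs if matches(d, "x", {"$gte": 1, "$lte": 3})]
print(strict)
print(inclusive)
```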
MySQL:   select * from things where x in (b,a,c)
MongoDB: db.collection.find( { "field" : { $in : array } } );

     db.things.find({j:{$in: [2,4,6]}});



db.customers.find( { name : /acme.*corp/i } );


db.myCollection.find().sort( { ts : -1 } ); // sort by ts, descending



>   m = function() { emit(this.user_id, 1); }
>   r = function(k,vals) { return 1; }
>   res = db.events.mapReduce(m, r, { query : {type:'sale'} });
>   db[res.result].find().limit(2)
{   "_id" : 8321073716060 , "value" : 1 }
{   "_id" : 7921232311289 , "value" : 1 }
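
The map/reduce call above emits `(user_id, 1)` for every matching event and reduces each key to `1`, so the result has one row per distinct user who made a sale. The sketch below reproduces that data flow in plain Python so it can be run without a database. `map_reduce` is a hypothetical helper, the sample events are made up for illustration, and unlike MongoDB this version always calls the reducer, even for keys with a single emitted value.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn, query=None):
    """In-memory imitation of the map/reduce data flow over a list of dicts."""
    query = query or {}
    emitted = defaultdict(list)
    for doc in docs:
        # Apply the equality filter first, like the `query` option.
        if all(doc.get(k) == v for k, v in query.items()):
            for key, value in map_fn(doc):
                emitted[key].append(value)
    # One output row per distinct key.
    return [{"_id": key, "value": reduce_fn(key, values)}
            for key, values in sorted(emitted.items())]


# Made-up sample events for illustration.
events = [
    {"user_id": 8, "type": "sale"},
    {"user_id": 7, "type": "sale"},
    {"user_id": 8, "type": "sale"},
    {"user_id": 9, "type": "view"},
]
distinct_buyers = map_reduce(
    events,
    map_fn=lambda d: [(d["user_id"], 1)],
    reduce_fn=lambda key, values: 1,
    query={"type": "sale"},
)
print(distinct_buyers)
```

Replacing the reducer with `lambda key, values: sum(values)` would turn the same job into a per-user sale count.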
{
 "_id" : "2010-06-28+xxxx+Charge",
 "lastUpdate" : "2010-09-20",
 "userId" : "xxxx",
 "date" : "2010-06-28",
 "actionType" : "Charge",
 "totalCharge" : 1210,
 "boughtItem" : { "          EX 5 " : 1,
                  "           5 " : 1,
                  "         5 " : 1,
                  "          " : 1,
                  "     " : 2 }
}
     {
          "_id" : "2010-07-11+xxxxx+Registration",
          "lastUpdate" : "2010-09-25",
          "actionType" : "Registration",
          "userId" : "xxxxx",
          "date" : "2010-07-11",
          "firstCharge" : "2010-07-12",
          "lastCharge" : "2010-09-02",
          "lastLogin" : "2010-09-02",
          "firstChargeTerm" : 1,
          "playTerm" : 50,
          "totalMonthCharge" : 1000,
          "totalMonthChargeDetail" : {
              "1th" : 74.3,
              "2th" : 17.1,
              "3th" : 8.6,
              "4th" : 0
          },
          "totalCumlativeCharge" : 10000,
          "totalCumlativeChargeDetail" : {
              "1th" : 2,
              "2th" : 0.5,
              "3th" : 0.2,
              "4th" : 0,
              "5th" : 0.1,
              "6th" : 27.5,
              "7th" : 1.2,
              "8th" : 49,
              "9th" : 19.5
          }
     }
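
Two of the derived fields in the document above can be reproduced from its other values, under assumptions that the deck itself does not spell out. The sketch below assumes `firstChargeTerm` is the number of days from registration (`date`) to `firstCharge`, and that each `...Detail` entry is a bucket's percentage share of the total (the values in each detail object sum to 100). `days_between` and `percentage_shares` are hypothetical names, what the `1th`, `2th`, ... buckets denote is not recoverable from the extracted text, and the bucket amounts used here are made up so that they reproduce the sample percentages.

```python
from datetime import date


def days_between(start, end):
    """Number of days from one ISO date string to another."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days


def percentage_shares(amounts):
    """Each amount as a percentage of the total, rounded to one decimal."""
    total = sum(amounts)
    if total == 0:
        return [0.0 for _ in amounts]
    return [round(100.0 * amount / total, 1) for amount in amounts]


# Registration 2010-07-11, first charge 2010-07-12, as in the sample document.
first_charge_term = days_between("2010-07-11", "2010-07-12")

# Hypothetical per-bucket amounts adding up to totalMonthCharge = 1000.
month_shares = percentage_shares([743, 171, 86, 0])

print(first_charge_term)
print(month_shares)
```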
topMonthCharge = function(n){
 return db.user_registration.find({},{
   totalMonthCharge:true,
   totalMonthChargeDetail:true,
   userId:true
 }).sort({totalMonthCharge:-1}).limit(n);
}

> topMonthCharge(20)
{
   "_id" : "2010-07-10+9999+Registration",
   "userId" : "9999",
   "totalMonthCharge" : 10000,
   "totalMonthChargeDetail" : { "5th" : 13.7, "4th" : 27.6, "3th" : 21, "2th" : 16.2, "1th" : 21.5 }
}
…
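
The `topMonthCharge` helper above is a projection, a descending sort, and a limit. The same three steps can be tried on a plain list of dictionaries, as in the sketch below. This is an in-memory illustration with made-up records, not a MongoDB client call; `top_month_charge` is a hypothetical name.

```python
def top_month_charge(docs, n):
    """Return the n documents with the highest totalMonthCharge, projected."""
    # Fields kept in the result, mirroring the projection in the shell helper
    # (_id is included because MongoDB returns it by default).
    fields = ("_id", "userId", "totalMonthCharge", "totalMonthChargeDetail")
    ranked = sorted(docs, key=lambda d: d.get("totalMonthCharge", 0), reverse=True)
    return [{k: d[k] for k in fields if k in d} for d in ranked[:n]]


# Made-up registration records for illustration.
registrations = [
    {"_id": "a", "userId": "1", "totalMonthCharge": 300, "playTerm": 10},
    {"_id": "b", "userId": "2", "totalMonthCharge": 10000, "playTerm": 50},
    {"_id": "c", "userId": "3", "totalMonthCharge": 1200, "playTerm": 7},
]
top_two = top_month_charge(registrations, 2)
print(top_two)
```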
findUserCharge = function(x){
 return db.user_charge.find({userId:x},{
   userId:true,
   totalCharge:true,
   boughtItem:true}).sort({date:-1})
}
> findUserCharge("9999")
{
     "_id" : "2010-09-08+9223458+Charge",
     "totalCharge" : 2000,
     "userId" : "9999",
     "boughtItem" : {
         "        110 " : 2
     }
}
{
     "_id" : "2010-09-07+9223458+Charge",
     "totalCharge" : 5000,
     "userId" : "9999",
     "boughtItem" : {
         "        350 " : 1,
         "        110 " : 2
     }
}
…
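[Diagram: MongoDB collections, with db.user_attr drawn in the middle of the others. The source labels for some collections were lost in extraction.]
  db.user_error
  db.user_access
  db.user_trace
  db.user_attr
  db.user_status (from Cassandra)
  db.user_charge (from MySQL)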