Frequency-based Constraint Relaxation
for Private Query Processing in Cloud Databases
Junpei Kawamoto (Kyushu University, Japan)
Patricia L. Gillett (École Polytechnique de Montréal)
Cloud services and Privacy
• Cloud services as cloud databases.
• Sometimes people want to keep secret what they request from databases.
May 6, 2014 Frequency-based Constraint Relaxation for Private Query Processing 2
• E.g. location-based services: "Find restaurants near current location x."
• E.g. forum sites: "Want to read article x."
Private query processing
• Methodologies to obtain data without exposing queries.
• Several protocols, such as cPIR† and bbPIR††, have been proposed.
[Figure: e.g. location-based services. The user encodes the current location x and sends it as a query to the cloud database, which computes the query result (res.) without decoding the query.]
†Kushilevitz, E. and Ostrovsky, R.: Replication Is Not Needed: Single Database, Computationally-Private Information Retrieval, Proc. of the 38th Annual Symposium on Foundations of Computer Science, pp. 364-373, 1997.
††Wang, S., Agrawal, D., and Abbadi, A.: Generalizing PIR for Practical Private Retrieval of Public Data, Proc. of the 24th Annual IFIP WG 11.3 Working Conference on Data and Applications Security and Privacy, pp. 1-16, 2010.
Three ideas of private queries
• We introduce three ideas for our discussion:
• Search intention: what users hope to obtain from cloud services,
• Query: request users send to servers to obtain data,
• Handled set: data set which servers must check to compute the results.
[Figure: a user at location x wants information associated with location x (search intention) and sends a query to the cloud DB; the server must check the items in the handled set for x to compute the result (res.).]
Existing private querying protocols
• Most existing protocols impose two constraints:
1. Queries are encoded in such a way that servers can handle query
processes but cannot actually decode queries;
2. Servers are made to check all data in the databases when computing
any query result.
• Constraint (2) means servers cannot tell which data in the DB a query actually needs.
• Servers spend O(n) computational cost executing each query, where
the database has n entries.
[Figure: for any query, the cloud DB checks all data every time.]
Constraint Relaxation
• There are cases in which we do not need to retrieve the entire database to sufficiently obscure search intentions.
• It may be enough to hide where I am in downtown.
• What area is enough to ensure our privacy?
• E.g.
• Servers may guess the search intention is x with high probability.
• We should consider frequency of search intentions.
[Figure: a handled set covering shop x (popular place) and shops a, b, and c (unpopular places).]
Database model & frequency
• Database model
• Database D consists of n items: D = {t_1, t_2, …, t_n}.
• Users request the x-th item t_x (the search intention is x).
• Handled set H(x)
• The item set servers must check to compute the results of a given
query associated with search intention x.
• Frequency of search intentions
• Freq(x) denotes the frequency of search intention x.
• We assume Freq(x) is normalized.
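As a small illustration of the model above, the following sketch turns a query log into normalized frequencies, so that the values of Freq(x) sum to 1. The function name `normalized_freq` and the toy log are assumptions made for this example; they do not come from the slides.

```python
from collections import Counter
from fractions import Fraction

def normalized_freq(query_log, n):
    """Return Freq(x) for x = 1..n so that the values sum to 1.

    query_log is a list of search intentions (item indices); entries
    outside 1..n are ignored. Hypothetical helper for illustration.
    """
    counts = Counter(x for x in query_log if 1 <= x <= n)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("query log has no valid entries")
    # Exact fractions avoid floating-point drift in the sum.
    return {x: Fraction(counts.get(x, 0), total) for x in range(1, n + 1)}

# Toy log: item 1 once, item 2 twice, item 3 three times, item 4 never.
freq = normalized_freq([1, 2, 2, 3, 3, 3], n=4)
```

Items that never appear in the log get frequency 0, which keeps every item of D present in the mapping.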
[Figure: the user wants t_x (search intention) and sends a query to the DB; the server must check the items in the handled set H(x) to compute the result (res.).]
Definition: Query Risk
• A measure of exposure risk for private queries.
• The query risk of search intention x and handled set H(x) is
$$\mathrm{Risk}(x \mid H(x)) = \frac{\mathrm{Freq}(x)}{\sum_{y \in H(x)} \mathrm{Freq}(y)}$$
• Conditional probability that the search intention is x given that we know the handled set H used.
• The more frequent x is, the higher the risk becomes.
[Figure; axis label: frequency.]
E.g.
• Risk(3|{1,2,3,4,5}) = 3/9
• Risk(1|{1,2,3,4,5}) = 1/9
• Risk(1|{1,2}) = 1/3
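The examples above can be checked with a few lines of code. This sketch computes the query risk as Freq(x) divided by the total frequency of the handled set. The raw counts are an assumption: the three examples fix the ratios for items 1 to 3 and the combined weight of items 4 and 5, but the split between items 4 and 5 is a guess.

```python
from fractions import Fraction

def risk(x, handled_set, freq):
    """Risk(x | H) = Freq(x) / sum of Freq(y) over y in H."""
    if x not in handled_set:
        raise ValueError("x must belong to its handled set")
    total = sum(freq[y] for y in handled_set)
    return Fraction(freq[x], total)

# Assumed raw counts (hypothetical); the values for items 4 and 5 are a guess.
freq = {1: 1, 2: 2, 3: 3, 4: 2, 5: 1}

r3_full = risk(3, {1, 2, 3, 4, 5}, freq)  # 3/9
r1_full = risk(1, {1, 2, 3, 4, 5}, freq)  # 1/9
r1_small = risk(1, {1, 2}, freq)          # 1/3
```

Shrinking the handled set from five items to two raises the risk for item 1 from 1/9 to 1/3, which is the trade-off the relaxation has to control.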
Definition: Privateness
• The maximum risk of complete protocols (i.e. w/o relaxation):
$$\max_{x \in D} \mathrm{Risk}(x \mid H_{\mathrm{complete}}(x))$$
• We assume query risks should be less than or equal to the maximum risk of complete protocols:
$$\mathrm{Risk}(x \mid H_{\mathrm{relaxed}}(x)) \le \max_{y \in D} \mathrm{Risk}(y \mid H_{\mathrm{complete}}(y))$$
where H_relaxed(x) is our handled set and H_complete(y) is the handled set for existing protocols.
• Complete protocols are widely accepted as private, so their maximum risk is accepted, too.
• Our approach will also be considered private if it satisfies this condition.
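A minimal sketch of this privateness check follows. It assumes that a complete protocol uses the whole database as the handled set for every x, which follows from the constraint that servers check all data. The function names and the toy counts are assumptions for illustration.

```python
from fractions import Fraction

def max_complete_risk(freq):
    """Maximum risk of a complete protocol, where H(x) = D for every x."""
    total = sum(freq.values())
    return max(Fraction(f, total) for f in freq.values())

def is_private(x, handled_set, freq):
    """True if Risk(x | H) does not exceed the complete-protocol maximum."""
    r = Fraction(freq[x], sum(freq[y] for y in handled_set))
    return r <= max_complete_risk(freq)

# Hypothetical raw counts for a five-item database.
freq = {1: 1, 2: 2, 3: 3, 4: 2, 5: 1}

bound = max_complete_risk(freq)      # 3/9, reached by the most frequent item
ok = is_private(1, {1, 2}, freq)     # risk 1/3 meets the bound
bad = is_private(3, {1, 2, 3}, freq) # risk 3/6 exceeds the bound
```

Under this assumption the bound is simply the normalized frequency of the most popular item, so rare items can hide in much smaller handled sets than popular ones.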
Definition: Query processing cost
• Query processing costs on servers.
• We evaluate them by the size of handled sets.
• The cost of search intention x is $\mathrm{Cost}(x) = |H(x)|$.
[Figure: two panels compared (vs.), each showing a query and result (res.) against a DB with handled set H(x).]
The problem
• Find handled set H such that, for search intention x:
1. $x \in H(x)$,
2. $\mathrm{Risk}(x \mid H(x)) \le \max_{y \in D} \mathrm{Risk}(y \mid D)$,
3. minimize Cost(x),
4. if multiple solutions have equal cost, choose the one that maximizes $\sum_{y \in H(x)} \mathrm{Freq}(y)$.
• We solve this problem by dynamic programming.
• The algorithm runs in O(n); details are in our paper.
• We also extend the problem and the algorithm to
• range queries in 1D data,
• exact match queries in 2D data.
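To make the four conditions concrete, here is a greedy sketch for a single search intention x. It is not the dynamic-programming algorithm from the paper, whose details are not given in these slides and which may impose further structure on handled sets. It only illustrates the constraints as stated: start from {x} and add the most frequent remaining items until the risk bound holds. The function name and the toy counts are assumptions.

```python
def greedy_handled_set(x, counts):
    """Smallest H containing x with Risk(x | H) <= max_y Risk(y | D).

    counts maps each item to a non-negative integer query count
    (an unnormalized frequency). Adding the most frequent items first
    minimizes |H| and, among sets of that size, maximizes the total
    frequency of H.
    """
    total = sum(counts.values())
    top = max(counts.values())
    # Most frequent items first; ties broken by item id for determinism.
    others = sorted((y for y in counts if y != x),
                    key=lambda y: (-counts[y], y))
    chosen = {x}
    mass = counts[x]
    for y in others:
        # Risk(x | H) <= top / total  <=>  counts[x] * total <= top * mass
        if counts[x] * total <= top * mass:
            break
        chosen.add(y)
        mass += counts[y]
    return chosen

# Hypothetical raw counts for a five-item database.
counts = {1: 1, 2: 2, 3: 3, 4: 2, 5: 1}
h_rare = greedy_handled_set(1, counts)     # {1, 3}
h_popular = greedy_handled_set(3, counts)  # the whole database
```

The comparison uses integer cross-multiplication rather than division, so the bound is tested exactly. In this toy run the most popular item still needs the whole database, while a rare item needs only two entries.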
The protocol
• Our protocol employs some existing protocol (PIR, etc.).
• We assume the frequencies are public information.
• User, whose search intention is x:
1. Compute optimized handled set H(x) using query frequencies.
2. Compute a private query for x assuming a DB has only items in H(x).
3. Send the query and H(x) to the cloud server.
• The cloud server receiving the query,
1. Consider the sub-database consisting of the items in H(x).
2. Process the received query and return the result to the user.
[Figure: the server processes the query over the sub-DB H(x) and returns the result (res.). Cost: O(|H(x)|).]
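The message flow above can be sketched as follows. This is a toy with no cryptography: the plain 0/1 selection vector stands in for an encoded PIR query, which a real protocol such as cPIR would keep unreadable to the server. All names and the toy database are assumptions; the point is only that the server touches the items in H(x) and nothing else.

```python
def user_query(x, handled):
    """User side: build a selection vector over H(x) only."""
    if x not in handled:
        raise ValueError("x must be in its handled set")
    return [1 if idx == x else 0 for idx in sorted(handled)]

def server_answer(database, handled, selection):
    """Server side: process the query over the sub-database H(x)."""
    ordered = sorted(handled)
    if len(selection) != len(ordered):
        raise ValueError("query length must match |H(x)|")
    # One multiply-add per item in H(x): cost O(|H(x)|), not O(n).
    return sum(bit * database[idx] for bit, idx in zip(selection, ordered))

database = {1: 10, 2: 20, 3: 30, 4: 40, 5: 50}  # toy item values
handled = {1, 3}  # H(x), assumed to be computed from public frequencies
query = user_query(1, handled)
answer = server_answer(database, handled, query)
```

Because both sides order H(x) the same way, the user can send the query and H(x) together and the server needs no other knowledge of x.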
Evaluation
• Dataset
• Query logs from the Last.fm dataset†.
• Sampled 100,000 songs and 1,800,145 queries.
[Figure: frequency of search intention x. x-axis: search intentions x; y-axis: number of times users requested item t_x.]
†http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/lastfm-1K.html
Evaluation
• Comparison of query risks.
• In most cases, risks are higher than those of the complete protocols.
• However, they do not exceed the maximum risk of the complete protocols.
[Figure: query risk of search intention x.]
relaxed (avg.) is computed by $\sum_{x \in D} \mathrm{Freq}(x)\,\mathrm{Risk}(x \mid H(x))$
          min.        max.        avg.
comp.     5.6×10^−7   3.9×10^−3   1.0×10^−5
relaxed   3.4×10^−3   3.9×10^−3   3.5×10^−3
Evaluation
• Comparison of query costs.
• Our relaxation methodologies reduce costs in most cases.
• The average cost is 6.5% of that of complete protocols.
[Figure: query cost of search intention x.]
relaxed (avg.) is computed by $\sum_{x \in D} \mathrm{Freq}(x) \times \mathrm{Cost}(x)$
          min.     max.     avg.
comp.     100000   100000   100000
relaxed   2        100000   6417
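The two frequency-weighted averages used in the evaluation, the sum of Freq(x) times Risk(x | H(x)) and the sum of Freq(x) times Cost(x), can be computed as in the sketch below. The function name and the toy inputs are assumptions; the numbers are not related to the Last.fm results.

```python
from fractions import Fraction

def weighted_averages(counts, handled):
    """Frequency-weighted average risk and cost.

    counts: item -> integer query count; handled: item -> handled set H(x).
    """
    total = sum(counts.values())
    avg_risk = Fraction(0)
    avg_cost = Fraction(0)
    for x, c in counts.items():
        if c == 0:
            continue  # never-requested items carry zero weight
        freq_x = Fraction(c, total)
        mass = sum(counts[y] for y in handled[x])
        avg_risk += freq_x * Fraction(c, mass)  # Freq(x) * Risk(x | H(x))
        avg_cost += freq_x * len(handled[x])    # Freq(x) * Cost(x)
    return avg_risk, avg_cost

# Toy two-item database under a complete protocol: H(x) = D for every x.
counts = {1: 1, 2: 3}
handled = {1: {1, 2}, 2: {1, 2}}
avg_risk, avg_cost = weighted_averages(counts, handled)
```

For this toy input the average risk is 5/8 and the average cost is 2, the full database size, as expected for a complete protocol.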
Conclusion
• We introduced a frequency-based constraint relaxation
methodology for private queries.
• We relaxed constraint (2) of the complete protocols ("Servers must check all data in the databases when computing any query result.") so that only a subset of the database is retrieved for each query.
• We evaluated our proposal using a real dataset from Last.fm.
• Our protocol can reduce computational costs on servers in most cases.
• The risk of a query being exposed is not bigger than the maximum risk in complete protocols.
Acknowledgement
• This work is partly supported by
• The Nakajima Foundation,
• Artificial Intelligence Research Promotion Foundation,
• Grant-in-Aid for Young Scientists (B) (26730065), Japan Society for
the Promotion of Science (JSPS).
Thank you for your attention!
