Responsible Data Use
Sofus A. Macskássy
Data Science @ LinkedIn
smacskassy@linkedin.com
November 2019
Pillars of
Responsible
Data Use
Bias
Privacy
Explainability
Governance
The Coded Gaze [Joy Buolamwini 2016]
Face detection software: Fails for some darker faces
Bias
• Facial analysis software:
Higher accuracy for light
skinned men
• Error rates for dark skinned
women: 20% - 34%
Gender Shades
[Joy Buolamwini &
Timnit Gebru, 2018]
Bias
• Ethical challenges posed
by AI systems
• Inherent biases present in
society
• Reflected in training data
• AI/ML models prone to
amplifying such biases
Algorithmic Bias
Bias
Massachusetts Group
Insurance Commission
(1997): Anonymized medical
history of state employees
William Weld vs
Latanya Sweeney
Latanya Sweeney (MIT grad
student): $20 – Cambridge
voter roll
born July 31, 1945
resident of 02138
Privacy
64% uniquely identifiable with ZIP + birth date + gender (in the US population)
Golle, “Revisiting the Uniqueness of Simple Demographics in the US Population”, WPES 2006
Privacy
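To see why such quasi-identifiers are risky, one can measure how many records in a dataset are unique on (ZIP code, birth date, gender). A minimal sketch in Python with pandas, using a small hypothetical table (column names and data are illustrative):

```python
import pandas as pd

def uniqueness_rate(df: pd.DataFrame, quasi_identifiers: list) -> float:
    """Fraction of rows uniquely identified by the given quasi-identifier columns."""
    group_sizes = df.groupby(quasi_identifiers, dropna=False).size()
    return (group_sizes == 1).sum() / len(df)

# Hypothetical example data (not real member data).
people = pd.DataFrame({
    "zip_code":   ["02138", "02138", "94043", "10001"],
    "birth_date": ["1945-07-31", "1951-02-14", "1945-07-31", "1951-02-14"],
    "gender":     ["M", "F", "M", "F"],
})
print(uniqueness_rate(people, ["zip_code", "birth_date", "gender"]))  # 1.0: every row is unique here
```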
A self-driving car struck and killed a pedestrian in Tempe, AZ in 2018.
- Who is to blame? (accountability)
- How do we prevent this? (safety)
- Should we ban self-driving cars? (liability and policy evaluation)
The need for XAI
Explainability
https://www.nytimes.com/2018/03/19/technology/uber-driverless-fatality.html
A research paper showed that a classifier trained to distinguish wolves from huskies was basing its decisions solely on the presence of snow in the background.
The need for XAI
Ribeiro, Singh, and Guestrin. 2016. "Why Should I Trust You?": Explaining
the Predictions of Any Classifier. SIGKDD 2016.
Explainability
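The cited work (LIME) explains individual predictions; as a simpler illustration of the same failure mode, the sketch below builds a synthetic wolf-vs-husky style dataset with a spurious "snow background" feature and uses scikit-learn's permutation importance to show that the model leans on it. Feature names and data are made up for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 2000
# Synthetic stand-in for the wolf-vs-husky setup: the label depends on "animal" features,
# but a spurious snow_background feature is almost perfectly correlated with it.
snout_length = rng.normal(0, 1, n)
ear_shape = rng.normal(0, 1, n)
y = (snout_length + 0.5 * ear_shape + rng.normal(0, 0.5, n) > 0).astype(int)
snow_background = (y + (rng.random(n) < 0.05)).clip(0, 1)

X = np.column_stack([snout_length, ear_shape, snow_background])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Permutation importance: how much does accuracy drop when each feature is shuffled?
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, imp in zip(["snout_length", "ear_shape", "snow_background"], result.importances_mean):
    print(f"{name:16s} {imp:.3f}")
# A large importance for snow_background is a red flag that the model keys on the
# spurious background rather than the animal itself.
```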
The need for XAI
Explainable AI is good for multiple reasons:
- Builds trust (why did you do this?)
- Can be judged (how much do I believe the prediction?)
- Can be corrected (new training or tweaks to correct errors)
- Is actionable (I know what to do next)
- …
Explainability
Data Governance
Governance
Reflects company policies
Ensures compliance
Protects data
Protects company
Involves all orgs in a company
https://www.dama.org/sites/default/files/download/DAMA-DMBOK2-Framework-V2-20140317-FINAL.pdf
Laws against Discrimination
- Citizenship: Immigration Reform and Control Act
- Disability status: Rehabilitation Act of 1973; Americans with Disabilities Act of 1990
- Race: Civil Rights Act of 1964
- Age: Age Discrimination in Employment Act of 1967
- Sex: Equal Pay Act of 1963; Civil Rights Act of 1964
And more...
Fairness Privacy
Transparency Explainability
Responsible
Data Use @
LinkedIn
Case studies
- Bias
- Privacy
- Governance
LinkedIn operates the largest professional network on the Internet
Tell your story
- 645M+ members
- 30M+ companies are represented on LinkedIn
- 90K+ schools listed (high school & college)
- 35K+ skills listed
- 20M+ open jobs on LinkedIn Jobs
- 280B feed updates
Bias @
LinkedIn
Fairness-aware Talent
Search Ranking
Guiding Principle:
“Diversity by Design”
Insights to
Identify Diverse
Talent Pools
Representative
Talent Search
Results
Diversity
Learning
Curriculum
“Diversity by Design” in LinkedIn’s Talent Solutions
Plan for Diversity
Identify Diverse Talent Pools
Inclusive Job Descriptions / Recruiter Outreach
Representative Ranking for Talent Search
S. C. Geyik, S. Ambler, K. Kenthapadi, Fairness-Aware Ranking in Search & Recommendation Systems with Application to LinkedIn Talent Search, KDD'19.
[Microsoft’s AI/ML
conference
(MLADS’18). Distinguished
Contribution Award]
Building Representative
Talent Search at LinkedIn
(LinkedIn engineering blog)
Intuition for Measuring and Achieving
Representativeness
• Ideal: Top ranked results should follow a desired distribution on
gender/age/…
• E.g., same distribution as the underlying talent pool
• Inspired by “Equal Opportunity” definition [Hardt et al, NIPS’16]
• Defined measures (skew, divergence) based on this intuition
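As an illustration of the skew idea (the exact measures are defined in the KDD'19 paper cited above), a minimal sketch comparing the share of an attribute value in the top-k results against its desired share:

```python
import math
from collections import Counter

def skew_at_k(ranked_attributes, desired_dist, value, k):
    """Log-ratio of a value's share in the top-k results vs. its desired share.
    Positive => over-represented; negative => under-represented; 0 => on target."""
    top_k = ranked_attributes[:k]
    observed = Counter(top_k)[value] / k
    eps = 1e-9  # avoid log(0)
    return math.log((observed + eps) / (desired_dist[value] + eps))

# Illustrative only: desired distribution taken from the underlying talent pool.
ranking = ["M", "M", "M", "F", "M", "M", "F", "M", "M", "M"]
desired = {"M": 0.7, "F": 0.3}
print(skew_at_k(ranking, desired, "F", k=10))  # negative: "F" is under-represented in the top 10
```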
Fairness-aware Reranking Algorithm (Simplified)
• Partition the set of potential candidates into different buckets for each
attribute value
• Rank the candidates in each bucket according to the scores assigned by
the machine-learned model
• Merge the ranked lists, balancing the representation requirements and
the selection of highest scored candidates
• Algorithmic variants based on how we choose the next attribute
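A minimal sketch of the simplified algorithm above: partition candidates by attribute value, keep each bucket sorted by model score, and greedily merge by picking the value that is furthest below its target share. This is an illustrative greedy variant, not the production implementation:

```python
from collections import defaultdict

def fairness_aware_rerank(candidates, desired_dist, k):
    """Greedy re-ranking sketch.
    candidates: list of (id, attribute_value, model_score); desired_dist: value -> target share."""
    # 1. Partition candidates into per-attribute-value buckets, each sorted by model score.
    buckets = defaultdict(list)
    for cand in candidates:
        buckets[cand[1]].append(cand)
    for value in buckets:
        buckets[value].sort(key=lambda c: c[2], reverse=True)

    # 2. Merge: at each position, pick the head of the bucket that is furthest
    #    below its target share so far, breaking ties by model score.
    reranked, counts = [], defaultdict(int)
    while len(reranked) < k and any(buckets.values()):
        def deficit(value):
            target = desired_dist.get(value, 0.0) * (len(reranked) + 1)
            return target - counts[value]
        value = max((v for v in buckets if buckets[v]),
                    key=lambda v: (deficit(v), buckets[v][0][2]))
        reranked.append(buckets[value].pop(0))
        counts[value] += 1
    return reranked

# Illustrative usage with hypothetical scores.
cands = [("a", "M", 0.95), ("b", "M", 0.92), ("c", "F", 0.90),
         ("d", "M", 0.88), ("e", "F", 0.85), ("f", "M", 0.80)]
print(fairness_aware_rerank(cands, {"M": 0.6, "F": 0.4}, k=4))
```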
Architecture
Validating Our Approach
• Gender Representativeness
• Over 95% of all searches are representative compared to the qualified
population of the search
• Business Metrics
• A/B test over LinkedIn Recruiter users for two weeks
• No significant change in business metrics (e.g., # InMails sent or accepted)
• Ramped to 100% of LinkedIn Recruiter users worldwide
Lessons
learned
• Post-processing approach desirable
• Model agnostic
• Scalable across different model choices
for our application
• Acts as a “fail-safe”
• Robust to application-specific business
logic
• Easier to incorporate as part of existing
systems
• Build as a stand-alone service or
component for post-processing
• No significant modifications to the existing
components
• Complementary to efforts to reduce bias from
training data & during model training
Engineering for Fairness in AI Lifecycle
Stages: Problem Formation → Dataset Construction → Algorithm Selection → Training Process → Testing Process → Deployment → Feedback
Questions to ask along the way:
- Is an algorithm an ethical solution to our problem?
- Does our data include enough minority samples? Are there missing/biased features? Do we need to apply debiasing algorithms to preprocess our data?
- Do we need to include fairness constraints in the function?
- Have we evaluated the model using relevant fairness metrics?
- Is the model's effect similar across all users? Are we deploying our model on a population that we did not train/test on?
- Does the model encourage feedback loops that can produce increasingly unfair outcomes?
Credit: K. Browne & J. Draper
Engineering for Fairness in AI Lifecycle
S. Vasudevan, K. Kenthapadi, FairScale: A Scalable Framework for Measuring Fairness in AI Applications, 2019
Fairness-aware Experimentation
[Saint-Jacques & Sepehri, KDD’19 Social Impact Workshop]
Imagine LinkedIn has 10 members, each with 1 session a day.
A new product increases sessions by +1 session per member on average.
That average can be achieved in very different ways: for example, every member gaining one extra session, or a single member gaining all ten extra sessions while the others gain none.
Both are +1 session / member on average, yet one is much more unequal than the other. We want to catch that.
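One way to catch this is to measure the inequality of the treatment effect in addition to its mean. A sketch using the Gini coefficient as one simple inequality measure (the workshop paper uses related inequality measures) on two hypothetical 10-member scenarios:

```python
import numpy as np

def gini(x):
    """Gini coefficient of a non-negative array (0 = perfectly equal, -> 1 = maximally unequal)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    cum = np.cumsum(x)
    return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

baseline = np.ones(10)                 # 10 members, 1 session each
equal_lift = baseline + 1              # everyone gains +1 session
concentrated_lift = baseline.copy()
concentrated_lift[0] += 10             # one member gains all 10 extra sessions

# Both treatments add +1 session per member on average ...
print(equal_lift.mean(), concentrated_lift.mean())   # 2.0 2.0
# ... but the inequality of the outcome differs sharply.
print(gini(equal_lift), gini(concentrated_lift))     # 0.0 vs ~0.45
```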
Privacy @
LinkedIn
Framework to compute robust, privacy-preserving analytics
Analytics & Reporting Products at LinkedIn
Profile View
Analytics
Content
Analytics
Ad Campaign
Analytics
All showing demographics
of members engaging
with the product
• Admit only a small # of predetermined query types
• Querying for the number of member actions, for a specified time period,
together with the top demographic breakdowns
Analytics & Reporting Products at LinkedIn
E.g., Title = “Senior
Director”
E.g., Clicks on a
given ad
Privacy Requirements
• Attacker cannot infer whether a member performed an action
• E.g., click on an article or an ad
• Attacker may use auxiliary knowledge
• E.g., knowledge of attributes associated with the target member (say,
obtained from this member’s LinkedIn profile)
• E.g., knowledge of all other members that performed similar action (say, by
creating fake accounts)
Possible Privacy Attacks
Targeting:
Senior directors in US, who studied at Cornell
Matches ~16k LinkedIn members
→ over minimum targeting threshold
Demographic breakdown:
Company = X
May match exactly one person
→ can determine whether the person
clicks on the ad or not
Require minimum reporting threshold
Attacker could create fake profiles!
E.g. if threshold is 10, create 9 fake profiles
that all click.
Rounding mechanism
E.g., report counts in increments of 10
Still amenable to attacks
E.g., using incremental counts over time to infer individuals' actions
Need rigorous techniques to preserve member privacy
(not reveal exact aggregate counts)
Problem Statement
• Compute robust, reliable analytics in a privacy-preserving manner, while addressing the product needs.
Differential Privacy
Defining Privacy
[Figure: a trusted curator collects members' data and releases only aggregate outputs, computed with and without your data]
Differential Privacy
Databases D and D′ are neighbors if they differ in one person’s data.
Differential Privacy: The distribution of the curator’s output M(D) on database
D is (nearly) the same as M(D′).
Dwork, McSherry, Nissim, Smith [TCC 2006]
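Formally, ε-differential privacy requires Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D′) ∈ S] for all neighboring databases D, D′ and output sets S. For count queries, the classic Laplace mechanism achieves this by adding noise of scale 1/ε; a minimal sketch (not LinkedIn's production system):

```python
import numpy as np

_rng = np.random.default_rng()

def dp_count(true_count: int, epsilon: float) -> float:
    """epsilon-differentially private count via the Laplace mechanism.
    A count query has sensitivity 1 (adding or removing one member changes it by at most 1),
    so Laplace noise with scale 1/epsilon suffices."""
    return true_count + _rng.laplace(loc=0.0, scale=1.0 / epsilon)

# E.g., reporting ad clicks broken down by title with a privacy budget of epsilon = 0.5.
true_clicks = {"Senior Director": 23, "Engineer": 112}
noisy_clicks = {title: round(dp_count(count, epsilon=0.5)) for title, count in true_clicks.items()}
print(noisy_clicks)  # rounding is post-processing, so the result stays differentially private
```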
Privacy System Architecture
Governance
@ LinkedIn
Keeping our data safe and
secure for members
Problem statement
• We have a lot of data
• Some may contain PII (personally identifiable information)
• How do we keep this secure?
• Removing PII data
• Tracking access
Policy: Keeping the data safe
Solution through Technology
• Metadata store
• Tag all attributes in all tables
• Know which fields are PII
• Know which fields need protection
• Audit access to data
• Obfuscate data wherever possible
Tracking pedigree of data
• Tables can be combined to
create new tables
• Automatically track
pedigree of attributes and
their PII value
• Assess new attributes for
PII as well
• Have authors be
accountable
Name | Type   | PII?
Name | string | Yes
Age  | string | Yes
A1   | string | No
A2   | url    | No

Name  | Type    | PII?
Name  | string  | Yes
Adult | boolean | No
B2    | string  | No
C1    | number  | No

Name | Type   | PII?
Name | string | Yes
B1   | number | No
B2   | string | No
B3   | string | No
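A minimal sketch of how a metadata store might tag columns and propagate PII status and lineage when tables are combined, as in the example above. All names are illustrative, not LinkedIn's internal system:

```python
from dataclasses import dataclass, field

@dataclass
class Column:
    name: str
    dtype: str
    pii: bool                                   # tagged when the column is registered
    lineage: set = field(default_factory=set)   # upstream columns this one was derived from

def derive_column(name, dtype, sources, declared_pii=None):
    """Create a derived column: it inherits PII status from its sources unless the author
    explicitly declares otherwise (the recorded lineage keeps that author accountable)."""
    inherited_pii = any(src.pii for src in sources)
    pii = inherited_pii if declared_pii is None else declared_pii
    lineage = {src.name for src in sources} | set().union(*(src.lineage for src in sources))
    return Column(name, dtype, pii, lineage)

age = Column("Age", "string", pii=True)
# "Adult" is derived from "Age"; the author declares it non-PII (a boolean reveals far less),
# and the lineage lets auditors review that decision.
adult = derive_column("Adult", "boolean", sources=[age], declared_pii=False)
print(adult)  # Column(name='Adult', dtype='boolean', pii=False, lineage={'Age'})
```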
Reflections
Fairness in ML
• Application specific challenges
• Conversational AI systems: Unique bias/fairness/ethics considerations
• E.g., Hate speech, Complex failure modes
• Beyond protected categories, e.g., accent, dialect
• Entire ecosystem (e.g., including apps such as Alexa skills)
• Two-sided markets: e.g., fairness to buyers and to sellers, or to content consumers
and producers
• Fairness in advertising (externalities)
• Tools for ensuring fairness (measuring & mitigating bias) in AI lifecycle
• Pre-processing (representative datasets; modifying features/labels)
• ML model training with fairness constraints
• Post-processing
• Experimentation & Post-deployment
Explainability in ML
• Actionable explanations
• Balance between explanations & model secrecy
• Robustness of explanations to failure modes (Interaction between ML
components)
• Application-specific challenges
• Conversational AI systems: contextual explanations
• Gradation of explanations
• Tools for explanations across AI lifecycle
• Pre & post-deployment for ML models
• Model developer vs. End user focused
Privacy in ML
• Privacy-preserving model training, robust against adversarial
membership inference attacks
• Privacy for highly sensitive data: model training & analytics using secure
enclaves, homomorphic encryption, federated learning / on-device
learning, or a hybrid
• Privacy-preserving transfer learning (broadly, privacy-preserving
mechanisms for data marketplaces)
Thank you
Sofus A. Macskássy
Data Science @ LinkedIn
smacskassy@linkedin.com