Christian’s Corner

Privacy for C2PA signers

2026-03-13T00:00:00+00:00

We can’t trust the images and videos we see online anymore. Recent generative AI improvements support the creation and modification of convincing digital media in quasi real time. We live in an era where these fakes are routinely shared online, sometimes for harmless fun, but increasingly to influence public opinion.

Fortunately, technologies exist to embed cryptographic signatures and watermarks in these digital assets, proving their origin. The Coalition for Content Provenance and Authenticity (C2PA) specification has become the leading mechanism to add cryptographic authenticity to digital media, and has been adopted by many technology providers, camera manufacturers, and news media organizations.¹ Major deployments have started in 2025 and will accelerate in 2026. We can soon imagine a world where assets with a verified origin can be positively flagged or prioritized by online platforms, versus those without, similarly to the shift of trust that happened in the web transition from HTTP to HTTPS.

In many contexts, however (e.g., conflict zones, protests, corruption reporting), asset creators might be reluctant to share certified images and videos that identify them, for fear of retribution. Thankfully, there are ways to reconcile the need for authenticated assets with the privacy of their creators. This post explores strategies to achieve various levels of privacy for C2PA signers, ranging from pseudonymity to full anonymity.

Note that I only consider the privacy of the signer (a person or organization) here, not the ability to redact or modify the asset itself. Redaction is possible using the C2PA assertion redaction mechanism, or more advanced cryptographic mechanisms.²

Example scenarios

Many scenarios require privacy for the signer, for example:

1. A photographer documenting human rights abuses wants to remain anonymous to avoid retribution from a hostile government but would like to disclose their affiliation (e.g., AFP, Reuters, or AP). The news organization publishing the images and videos doesn’t want the ability to identify the specific photographer to avoid the risk of being compelled to disclose their identity; they just want to know it came from one member of their trusted network.
2. A news organization receives a signed video from a confidential source for a story, and wants to share it without disclosing the source’s identity (or leaking identifiable attributes, such as a device identifier) while preserving the authenticity guarantees of the video.
3. A whistleblower records some incriminating conversations using their phone and releases the audio files signed by their corporate identity. They want to remain anonymous without anyone (including the employer who issued the identity credential) being able to trace their identity.
4. A Banksy-style artist creates authenticated pictures and posts them at various online locations. They want to remain pseudonymous reusing the same signing credential, but without linking it to their real life identity.

Linkability

In this blog post, I explain the subtle ways an identity credential can be tracked and traced by its issuer and various verifiers. The current C2PA specification only supports X.509 certificates to generate (claim) signatures, which is an inescapably linkable credential type: an issuer (i.e., Certificate Authority) can always recognize the certificate it issued to a specific device or user. Other privacy-friendly credential types could be added to a C2PA manifest using, e.g., an identity assertion, but the claim signer’s X.509-based signature remains an unavoidably linkable element.

The following strategies will address this linkability, resulting in various levels of privacy. The strategies achieving the highest levels of privacy would require either updates to the specification, or post-processing on a C2PA asset.

Privacy strategies

Let’s now explore some techniques to provide privacy to a C2PA claim signer.

Pseudonymous certificates

The simplest strategy compatible with the current specification is to generate a self-signed X.509 certificate and use it to sign digital assets (i.e., use it as the claim generator certificate). The certificate would then need to be obtained out-of-band by verifiers. This technique doesn’t allow signers to prove attributes certified by a third party (e.g., memberships or entitlements); it only demonstrates ownership of a public key, so it is only useful for scenario #4. Retrieving the certificate from the signer’s well-known website would be a good way to convince verifiers that, e.g., “this image was signed by the owner of https://example.com”.

One-use certificates

Re-using an X.509 certificate creates linkable signatures: even if a certificate doesn’t identify its owner, all the resulting signatures can be associated to the same entity.

To achieve unlinkability between multiple signed assets, a signer could obtain a new certificate for each signature. Some deployers might opt to deploy this strategy, even given the extra burden on the infrastructure, like Google did for their Pixel 10 C2PA signing. This prevents a verifier from linking two images to the same signer; it does not, however, prevent the issuer from recognizing and tracing each individual certificate.
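To make the verifier-side linkability concrete, here is a minimal Python sketch of what anyone could do with nothing more than the signing certificates attached to a set of assets. The byte strings are placeholders standing in for DER-encoded certificates, and `fingerprint` and `link_assets` are hypothetical helper names, not part of any C2PA SDK.

```python
import hashlib
from collections import defaultdict


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a certificate's raw bytes."""
    return hashlib.sha256(cert_der).hexdigest()


def link_assets(assets):
    """Group asset names by the fingerprint of their signing certificate."""
    groups = defaultdict(list)
    for name, cert_der in assets:
        groups[fingerprint(cert_der)].append(name)
    return [sorted(names) for names in groups.values()]


# Placeholder bytes stand in for real DER-encoded certificates (assumption).
reused = b"pseudonymous-cert"
reused_assets = [
    ("photo1.jpg", reused),
    ("photo2.jpg", reused),
    ("clip.mp4", reused),
]
one_use_assets = [
    ("photo1.jpg", b"cert-A"),
    ("photo2.jpg", b"cert-B"),
    ("clip.mp4", b"cert-C"),
]

# A reused certificate collapses all assets into one group...
print(link_assets(reused_assets))
# ...while one-use certificates leave every asset in its own group.
print(link_assets(one_use_assets))
```

The issuer is in a different position: it can keep a record of which certificate it issued to which account, so the same fingerprints would still let it trace each asset back to a signer.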

Unlinkable signatures

Cryptographic unlinkable signatures allow creating privacy-supporting certified assets without disclosing the holder’s full identity to verifiers. One of the leading algorithm candidates is BBS, undergoing standardization in the IETF. In a BBS credential flow, an issuer signs a credential containing holder attributes, and the holder later derives a selective-disclosure proof for a verifier. Augmenting the C2PA specification to support BBS-enabled presentations³ would allow a manifest to reveal only selected attributes, while binding the presentation to the asset being certified.

Zero-knowledge proofs over X.509 certificates

A Zero-Knowledge Proof (ZKP) is a cryptographic mechanism allowing someone to prove properties about some data without disclosing the data itself. Given some data signed by an X.509 certificate, a user could prove that the signature and certificate are valid without disclosing the identifiable parts of the certificate (e.g., serial number, public key, issuer signature, validity period). A C2PA manifest could be redacted using a ZKP allowing anyone to verify that:

the digital asset hasn’t been modified
the signer’s cert was valid when the asset was processed/anonymized
the signer cert’s CA is trusted (either by disclosing the CA, or proving it is part of a trusted group)

This technique is very promising as it is compatible with the current C2PA specification and doesn’t require changes to the key management infrastructure (to introduce new signing algorithms).

Comparison

The following table compares at a high-level the strategies I covered:

Strategy	Unlinkable wrt verifiers	Unlinkable wrt the issuer	Supported by current spec
Pseudonymous certificates	❌	N/A (self-signed)	✅
One-use certificates	✅	❌	✅
Unlinkable signatures (BBS)	✅	✅	❌
ZKP over X.509	✅	✅	⚠️⁴
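As a small aid for reasoning about the table, the sketch below encodes its rows as data and filters them by privacy requirement. The `Strategy` type and `candidates` function are hypothetical names introduced for illustration, and treating "N/A" as satisfying issuer unlinkability (there is no issuer to link against) is my assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    unlinkable_to_verifiers: bool
    unlinkable_to_issuer: Optional[bool]  # None means N/A (self-signed, no issuer)
    spec_support: str  # "yes", "no", or "partial"


# Rows transcribed from the comparison table.
STRATEGIES = [
    Strategy("Pseudonymous certificates", False, None, "yes"),
    Strategy("One-use certificates", True, False, "yes"),
    Strategy("Unlinkable signatures (BBS)", True, True, "no"),
    Strategy("ZKP over X.509", True, True, "partial"),
]


def candidates(hide_from_verifiers: bool, hide_from_issuer: bool, spec_only: bool) -> list[str]:
    """Return the names of strategies that meet the stated requirements."""
    result = []
    for s in STRATEGIES:
        if hide_from_verifiers and not s.unlinkable_to_verifiers:
            continue
        # Assumption: N/A counts as satisfied, since no issuer exists.
        if hide_from_issuer and s.unlinkable_to_issuer is False:
            continue
        if spec_only and s.spec_support != "yes":
            continue
        result.append(s.name)
    return result


# Full unlinkability, regardless of spec support.
print(candidates(True, True, False))
# Verifier unlinkability only, within the current spec.
print(candidates(True, False, True))
```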

A prototype

I’ve built a prototype for the two most privacy-preserving options explored here:

The BBS prototype models a simple issuer/holder/verifier flow: a demo issuer signs a toy credential, the holder presents that credential bound to a C2PA asset hash, and the verifier checks both the disclosed attributes and the content binding.
The zero-knowledge prototype keeps conventional X.509/ECDSA signing, then post-processes the asset into an anonymized variant whose proof is bound to the C2PA asset bytes and to the signer’s CA.

The code is available in this GitHub c2pa-signer-privacy project.

Next steps

The emergence of provenance technologies is a welcome tool to help fight disinformation and help establish trust in online content. It is however not too early to consider the negative privacy impact this might have, which could slow down adoption.

My hope is that the community will use simple prototypes like these to discuss concrete scenarios, compare privacy goals, and experiment with alternative trust models before the ecosystem hardens around only one approach. Which scenarios really need pseudonymity versus unlinkability? Which attributes should remain visible to verifiers? How much can be achieved within the current specification, and where are specification changes justified? These are design questions worth debating now, while experimentation is still cheap.

I’d like to thank my colleague Greg Zaverucha for his insights and feedback on this project.

Footnotes

¹ Notably, the International Press Telecommunications Council (IPTC) has created a list of Verified News Publishers to help the verifier ecosystem recognize known news media organizations.
² One such approach, the VerITAS system, has been prototyped by Stanford researchers Trisha Datta, Binyi Chen, and Dan Boneh.
³ This could be achieved by either specifying an X.509 profile supporting BBS signatures, or allowing other credential types supporting BBS (e.g., Verifiable Credentials) to natively sign a C2PA manifest.
⁴ The initial signing step is fully spec-compliant (standard X.509/ECDSA), but the anonymized manifest uses a non-standard COSE algorithm and a custom assertion, which current verifiers cannot process without modifications.

Where are you from?

2025-06-26T00:00:00+00:00

“Where are you from?” is a common introductory question we ask when meeting someone. Unfortunately, we can’t ask the same of a picture, video, or audio clip found online. In this era of generative AI, it has become practically impossible to distinguish digital assets captured from “real life” from those artificially created by generative systems. In the latest Last Week Tonight episode, John Oliver highlights the rise of “AI slop.” You know an issue has gone mainstream when it makes it onto his show.

Of course, “AI or not” isn’t the most important question. We’re long past the days when media and music weren’t modified by computer-assisted tools, many of which now incorporate AI by default. The more meaningful questions are: “Where is this asset from?”, “How was it generated?”, “By whom?”, and “How was it transformed?”

Fortunately, technologies exist to help answer these questions. Content Credentials, developed by the Coalition for Content Provenance and Authenticity (C2PA), can be attached to digital assets to describe how and by whom they were created, much like a clothing label tells us about a garment’s origin. For example, when a photographer takes a picture with a C2PA-compliant camera, cryptographically signed provenance data (including optional metadata like a secure timestamp or geolocation) can be embedded in the image.

Incidentally, the image above wasn’t taken by a camera; it was generated using Copilot. We know this because Copilot, like many generative AI systems, attaches Content Credentials to everything it creates. Inspecting the image in a validation portal or with our browser extension reveals its provenance.

We’re at the beginning of a major deployment phase for C2PA. Today, AI-generated content floods our digital spaces. But as C2PA adoption grows, platforms will be better equipped to determine the origin of the content they serve, allowing them to surface trusted content more prominently.

Recent highlights

I joined the C2PA Technical Working Group nearly two years ago, and we’ve been working hard to bring this technology forward. The past few months have been especially active:

We released version 2.2 of the core specification last month, incorporating feedback from implementers.
The Conformance Program v0.1 launched two weeks ago, enabling the creation of a trusted ecosystem of hardware and software implementations.
The IPTC continues to expand its Verified News Publisher list, increasing trust in the news media ecosystem, just one of many verticals to come.
Earlier this month, many of us gathered at the 3rd Content Authenticity Summit, where numerous updates were shared and implementations demonstrated. You can watch the keynote presentations here.

This momentum is encouraging, and I’m excited to see what the second half of the year brings for content provenance.

A layered approach

Of course, C2PA is not a silver bullet. Just like a clothing label, Content Credentials can be removed, omitted (for non-compliant creators), or even faked. For instance:

Credentials might be stripped from an asset, either deliberately or due to format incompatibilities.
Malicious systems generating deceptive content may omit credentials altogether.
Attackers might attempt to forge Content Credentials. While they can’t successfully impersonate another party cryptographically, their fakes could still mislead users, similar to how cybersquatting (or typosquatting) domains can trick visitors. (The C2PA UX Working Group is actively developing guidance to address these issues.)

Other provenance tools, such as digital watermarks and fingerprints, can complement C2PA. They can be used to recover missing manifests or determine an asset’s origin independently. (The C2PA Watermarking Working Group is focused on defining these capabilities.)

And in cases where provenance metadata isn’t embedded at all, AI detectors can serve as a last line of defense to flag potentially AI-generated content. Hany Farid gave a compelling keynote on this topic at the CA Summit.

Tag everything?

Longtime readers of this blog know that privacy is a major concern of mine. Adding digital signatures to all images and videos has significant privacy implications, not only for individuals, but also for organizations like newsrooms seeking to protect the identity of their journalists or sources. The C2PA Threat Model addresses many of these concerns, but today’s specification still has limits.

Fortunately, privacy-preserving cryptography, like zero-knowledge proofs (as we are developing in Crescent), can help meet stringent threat models, even enabling certified anonymity. Be assured that this is on our research radar…

Crescent Credentials

2024-12-19T00:00:00+00:00

Providing strong privacy for identity credentials is becoming an important goal. Recent frameworks such as the Selective Disclosure for JSON Web Tokens (SD-JWT) and mobile Driver’s License (mDL) support selective disclosure of attributes to prevent data over-sharing. This is an important capability, but both frameworks lack another critical one that is hard to achieve: unlinkability.

An unlinkable (a.k.a. untraceable or untrackable) credential is one that can be issued to a user and presented to a verifier without any data correlations between these two actions.¹ In particular, nothing in the credential’s construct (e.g., serial numbers, cryptographic values such as public keys and signatures, etc.) could be (mis-)used as a correlation handle (other than the attributes themselves, which of course could identify a user if disclosed). To achieve this feature, you not only need to sanitize the always-disclosed data to avoid linkable values (e.g., serial numbers, GUIDs, validity periods, etc.), but you also need to use special cryptography to avoid creating inescapably linkable values (e.g., the issuer signature, which acts as a unique fingerprint). I’ve described this in detail in the Where’s Waldo Been post.
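To illustrate why selective disclosure alone does not give unlinkability, here is a simplified Python sketch of salted-hash disclosure in the style of SD-JWT. It is not a compliant SD-JWT implementation: the issuer signature is omitted, and the helper names (`make_disclosure`, `present`, `verify`) and the claim values are assumptions made for this example.

```python
import base64
import hashlib
import json
import secrets


def b64url(data: bytes) -> str:
    """Unpadded base64url encoding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_disclosure(name: str, value) -> str:
    """Encode one claim as a salted disclosure string."""
    salt = b64url(secrets.token_bytes(16))
    return b64url(json.dumps([salt, name, value]).encode("utf-8"))


def digest(disclosure: str) -> str:
    return b64url(hashlib.sha256(disclosure.encode("ascii")).digest())


# Issuance: the issuer would sign this digest list (signature omitted here).
claims = {"name": "Alice", "employer": "Contoso", "over_18": True}
disclosures = {name: make_disclosure(name, value) for name, value in claims.items()}
signed_digests = sorted(digest(d) for d in disclosures.values())


def present(reveal: list[str]) -> dict:
    """What a verifier receives: every signed digest plus the chosen disclosures."""
    return {
        "signed_digests": signed_digests,
        "disclosures": [disclosures[name] for name in reveal],
    }


def verify(presentation: dict) -> dict:
    """Return the revealed claims whose digests appear in the signed list."""
    revealed = {}
    for d in presentation["disclosures"]:
        if digest(d) not in presentation["signed_digests"]:
            raise ValueError("disclosure not covered by the issuer's digests")
        padded = d + "=" * (-len(d) % 4)
        _salt, name, value = json.loads(base64.urlsafe_b64decode(padded))
        revealed[name] = value
    return revealed


to_clinic = present(["employer"])
to_social = present(["over_18"])
# Each verifier learns only the claim that was disclosed...
print(verify(to_clinic), verify(to_social))
# ...but both receive the same digest list, a ready-made correlation handle.
print(to_clinic["signed_digests"] == to_social["signed_digests"])
```

In a real deployment the issuer’s signature over those digests would be identical across presentations as well, which is exactly the kind of inescapably linkable value described above.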

There are two general strategies to achieve unlinkability in an identity system. The first one is to use a cryptographic signature scheme that supports unlinkability, such as blind signatures (e.g., U-Prove) or proofs-of-knowledge (e.g., BBS).² One challenge with this approach is that it requires major changes to current identity issuance systems: new algorithms need to be standardized, implemented, and integrated into various platforms. BBS is currently being standardized by the IETF, but even when that concludes, there will be a lot of integration work required to match the ubiquity of RSA or ECDSA, and it will be difficult to ask for an identity ecosystem overhaul when the industry is already preparing for the post-quantum cryptographic transition.

The second strategy is to present conventional, existing credentials in an unlinkable way using zero-knowledge proofs. We recently released Crescent, a cryptographic library that achieves exactly that. Zero-knowledge proofs are cryptographic building blocks with seemingly magical properties, allowing a prover to convince a verifier of the veracity of certain facts about some data without disclosing the data and without the verifier learning anything more than the statements being proven (the verifier gains “zero” extra knowledge). These primitives were introduced close to 40 years ago, and have only recently become efficient enough for use in practice. Crescent introduces two important properties, the ability to:

share the large zero-knowledge parameters among all issuers using the same credential schema and signing algorithm (e.g., you would only need one set for the AAMVA mDL ecosystem); and
offload the expensive zero-knowledge user calculations to a “prepare” stage that only needs to run once asynchronously per credential; subsequent presentations and verifications are then very efficient.

Crescent currently supports two types of credentials: JSON Web Tokens (JWT) and mobile Driver’s Licenses (mDL); more are on our roadmap (such as X.509). To learn more about Crescent, consult our technical paper.

We’ve created a sample to illustrate the capabilities and practicality of the system.

For simplicity, the sample defines its own issuance and presentation protocol, but it is easy to imagine how this could be integrated into higher-level identity frameworks (e.g., OpenID/OAuth, Verifiable Credentials, mDL ecosystem); some of our future work will focus on these integrations.

You can find details about the sample here, but in summary:

A Crescent Service has pre-generated the zero-knowledge parameters to create and verify zero-knowledge proofs from JWTs and mDLs.
The user has a pre-imported (mock-up) mDL.
The user obtains a proof-of-employment JWT from her employer Contoso.
These credentials are stored in the browser extension Crescent wallet that communicates with a local Client Helper whose role is to handle the heavy computation and storage.
The user presents an employment proof using her JWT to a mental health clinic Fabrikam.
The user presents an over-18 proof using her mDL to a social network.

You can see these components in action in this demo video.

It’s exciting to see the rapid development in zero-knowledge proof research, which brings the technology closer to deployment in real-life systems. We continue working on each part of the system, from the low-level cryptographic building blocks to the integration in the identity layer. Stay tuned…

Footnotes

¹ Technically, an observer (which could even be the issuer, the verifier(s), or both in collusion!) looking at the issuance and presentation messages should not be able to tell which user presented which credential. Some timing (e.g., close issuance and presentation time) or network (e.g., IP address) metadata, or presented attributes could, of course, leak some information; these can be addressed using various privacy-maximizing strategies.
² I’ve compared both approaches in this blog post.

C2PA Browser Extension Validator

2024-04-17T00:00:00+00:00

In this era of disinformation exacerbated by ever-evolving AI tools, the creation of seemingly authentic fake content can be quite dangerous, with risks ranging from harming one’s reputation to damaging society as a whole.

Fortunately, provenance technologies are emerging to fight this problem. The Coalition for Content Provenance and Authenticity (C2PA) is the leading effort that allows creators to cryptographically sign their digital assets and editors to record subsequent transformations, helping consumers confirm their origin and authenticity while keeping an auditable history of the data. It has been adopted by leading technology providers (Microsoft, Google, Meta, Intel), camera manufacturers (Sony, Canon, Nikon), image/video editors (Adobe), generative AI systems (Copilot, OpenAI, Midjourney), and news organizations (BBC, CBC/Radio-Canada, New York Times). The C2PA is also at the forefront of the fight against election disinformation and was one of two technologies mentioned in the recent AI Elections accord signed at the Munich security conference.

I’m happy to announce the release of a new open-source browser extension validation tool to verify C2PA assets on a web page. The extension, currently a developer preview prototype, scans a web page for C2PA images and videos, validates them, and overlays a content credential logo to display their status. Our goals are to 1) allow people to experiment with C2PA technologies, and 2) be able to rapidly prototype new C2PA features (and ones from its sister Creative Assertions Working Group). This is very much work-in-progress and we’d love feedback and suggestions from the community, so please visit the GitHub project and try it out!

A validated C2PA-signed image from the Bing Creator.

Project Origin Verified Publisher Trust List

Project Origin, one of the two initiatives that merged to become the C2PA, recently announced the creation of a trust list of verified publishers to help fight disinformation in the digital news ecosystem.

Our browser extension can use the Verified Publisher Trust List hosted by the IPTC to verify content from trusted signers. You can watch a demonstration of our extension verifying a video from the BBC that recently started to use the provenance technology.

Demo of our browser extension validator using the Origin Verified Publisher Trust List to verify a video from the BBC.

Road ahead

It’s exciting to see the growing momentum behind the C2PA and all the organizations it brought together. I believe that down the line, C2PA-protected assets will be like HTTPS-protected websites: content without certified provenance information will look very suspicious! The road to get us there will be challenging, but I hope that these initial steps will allow us to explore and prototype the capabilities that will make this technology ubiquitous.

Onward!

Recent highlights on the quantum safe journey

2024-04-12T00:00:00+00:00

NIST’s 5th PQC Standardization Conference just concluded. It was a great event and, as always, a great opportunity to (re-)connect with my academic and industry colleagues. A lot has happened since the first iteration of the conference. In fact, things have accelerated greatly since the release of the first PQC FIPS drafts last August.

Here are a few personal highlights:

Microsoft established its Quantum Safe Program, helping customers on their quantum safe journey.
The Open Quantum Safe project has joined the Linux Foundation’s PQC Alliance; this will allow the project to extend its activities and provide components that are better-suited for real-life deployments. It’s been humbling to see the impact of OQS, a project I joined in its early days; it has been used by almost everyone to test PQC algorithms and prototype their integration into protocols and various systems. In fact, if a vendor is offering a PQC solution today, it likely uses OQS.
NIST released the first draft reports of the NCCoE PQC Migration project. I have the pleasure of leading the project’s Interoperability and Performance workstream, and our report describes results of testing quantum-safe algorithms in TLS, SSH, QUIC, X.509, and HSMs, using various software and hardware components from my esteemed collaborators.
In January, the White House held a PQC roundtable meeting I had the honor of attending. It’s good to see PQC leadership at the highest level to help move the transition forward.
Just last week, my colleagues announced an important milestone in building a reliable quantum topological qubit, bringing us one step closer to realizing the quantum vision, and reminding us that the transition work is very important.

Once upon a time

I’ve been on a quantum journey for a long time. Back when my fellow Canadian Bryan Adams’s “Have You Ever Really Loved a Woman?” was topping the charts, I started to study with Gilles Brassard (the co-inventor of Quantum Key Distribution (QKD) and Quantum Teleportation) at the University of Montreal. The whole lab was then focussed on this crazy new idea of using quantum mechanics to build computing models. I too caught the quantum bug and ended up specializing in quantum cryptography, designing ways to use quantum error correcting codes to extend QKD distances. During that time, Peter Shor and Lov Grover published their famous algorithms. I left the academic world right before Y2K and started a career working on a more “practical” side of cryptography engineering.

Decades later, Quantumania is in full force. I’m excited by the possibilities of the quantum revolution, a dream that is becoming reality, and I’m proud of the hard preparatory work our industry is doing to keep our systems safe.

What a journey. Onward!

Introducing the Cross-Platform Origin of Content (XPOC) framework

2023-09-08T00:00:00+00:00

It is quite challenging today to verify the origin of online content. Content creators (individuals or organizations) with well-known websites typically provide a way to discover their various associated accounts on (social) media platforms (e.g., YouTube, X/Twitter, Facebook, Instagram, etc.), most commonly using the well-known logo icons.

Manual validation of such information can be complicated and assumes that you know the creator’s website. Moreover, a lot of content is created as a collaboration among multiple creators (e.g., a podcast, a video interview, a conference panel, etc.) and the resulting media could be posted under the account of one of the creators or a host.

In this era of disinformation, made worse by ever-evolving AI tools, the creation of seemingly-authentic fake accounts and content can be quite dangerous, with risks ranging from damaging one’s reputation to having society-wide impact. AI detection tools are playing (and losing!) a cat and mouse game against rapid technological developments. Many media hosting platforms have account and content validation processes with various levels of quality, but these systems are disconnected and can’t rely on a standardized verification mechanism that could be shared among them.

To address this issue, we’ve created the Cross-Platform Origin of Content (XPOC) framework, allowing content creators 1) to publish an authoritative manifest of their accounts and approved content across various platforms, and 2) to tag said accounts and content with a special XPOC URI linking back to their manifest. This allows validation tools to automatically determine the origin of the protected content.

XPOC manifest for christianpaquin.github.io

This framework would be useful to politicians, celebrities, companies, government entities and many other stakeholders to provide a signal of authenticity for the accounts they control and the content they created or approved.

The XPOC framework was designed to be simple to implement and deploy, allowing content creators to publish manifests and tag protected content without the explicit collaboration of the media platforms; in turn, platforms implementing the framework would improve their users’ experience and content origin validation.

The GitHub project contains the framework specification, a TypeScript reference library, and some sample implementations for XPOC manifest editing tools and XPOC URI validation tools.

System Overview

I’ll illustrate the XPOC framework using myself as an example. My main website is christianpaquin.github.io; it hosts this very blog. This is the central point of my professional persona, where you can find links to some of the accounts I control on other platforms (e.g., my X/Twitter handle and my GitHub account). I created an XPOC manifest to list all of them in a standardized and discoverable manner. Moreover, I added links to content I created or participated in that has been posted by accounts I don’t control (e.g., this panel on societal resilience posted on the MSR YouTube channel or this post-quantum crypto presentation I gave at DEF CON 27). You can take a look at the xpoc-manifest.json file hosted on this web site.

XPOC manifest for christianpaquin.github.io

It is important to note that this manifest only contains accounts and content I associate with my MSR persona; I have other personal accounts and content that I didn’t list here (but that could be listed in a personal webpage’s manifest).

If someone or some system would like to find all the accounts and content associated with me, they could retrieve and inspect this file. Humans would prefer more user-friendly discovery tools, such as the sample XPOC viewer portal from our GitHub repository.

Sample XPOC manifest viewer

An important feature of the framework is the ability to link these accounts and content back to my main website, to let visitors know who is behind my social handles and account names. I did this by adding the xpoc://christianpaquin.github.io! XPOC URI to my account pages and content. The xpoc:// prefix and terminating character ! help parsing tools to find the URI in the page’s HTML; they would then extract the base URL christianpaquin.github.io and fetch the corresponding manifest at https://christianpaquin.github.io/xpoc-manifest.json.
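The parsing step can be sketched in a few lines of Python. This is a minimal illustration under the rules stated above (an `xpoc://` prefix, a terminating `!`, and a manifest at `/xpoc-manifest.json`); the function name `find_manifest_urls` and the sample bio text are hypothetical, and the framework specification remains the authoritative reference for the exact URI grammar.

```python
import re

# Matches "xpoc://<base URL>!" and captures the base URL.
XPOC_URI = re.compile(r"xpoc://([^\s!]+)!")


def find_manifest_urls(page_text: str) -> list[str]:
    """Return the manifest URLs for every XPOC URI found in the text."""
    urls = []
    for base in XPOC_URI.findall(page_text):
        url = f"https://{base.rstrip('/')}/xpoc-manifest.json"
        if url not in urls:  # keep first occurrence, drop duplicates
            urls.append(url)
    return urls


# Hypothetical account bio containing an XPOC URI.
bio = "Cryptographer. xpoc://christianpaquin.github.io! Opinions are my own."
print(find_manifest_urls(bio))
# prints ['https://christianpaquin.github.io/xpoc-manifest.json']
```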

For example, I’ve put my XPOC URI in my X/Twitter bio, and in this (test) YouTube video description, allowing validation tools to find my manifest.

Our GitHub repository contains a sample browser extension that can be used to verify these URIs, allowing people to verify that these accounts and content are actually from me and not someone pretending to be.
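The core check a validation tool performs can be sketched as follows: once the manifest referenced by an XPOC URI has been fetched, confirm that the page carrying the URI is actually listed in it. The manifest field names used here (`accounts`, `content`, `url`), the `normalize` and `is_listed` helpers, and the example URLs are all assumptions made for illustration; consult the framework specification for the real schema and matching rules.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to a comparable form (hypothetical matching rule)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    query = f"?{parts.query}" if parts.query else ""
    return host + parts.path.rstrip("/") + query


def is_listed(manifest: dict, page_url: str) -> bool:
    """True if page_url matches an account or content entry in the manifest."""
    entries = manifest.get("accounts", []) + manifest.get("content", [])
    listed = {normalize(entry["url"]) for entry in entries}
    return normalize(page_url) in listed


# Hypothetical manifest, as it might look after being fetched and parsed.
manifest = {
    "accounts": [{"url": "https://social.example/@alice"}],
    "content": [{"url": "https://video.example/watch?v=abc123"}],
}

# The genuine account is listed; a look-alike handle is not.
print(is_listed(manifest, "https://www.social.example/@alice/"))
print(is_listed(manifest, "https://social.example/@al1ce"))
```

A page that displays someone’s XPOC URI but is absent from their manifest is the signal of a possible impersonation.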

Validation of X/Twitter XPOC URI

The road ahead

Fighting disinformation and fake content is a challenging task, and we are only at the beginning of the battle.

The XPOC framework won’t magically solve the issue, but it can help address a slice of the problem. If content creators (especially those targeted by fake content) start publishing manifests of their content, then verifiers (fact checkers, journalists, hosting platforms) would be better equipped to confirm its origin, and this would shrink the window of opportunity for malicious content to cause damage.

Early XPOC adopters falling victim to attacks could help educate verifiers: the fake accounts or content could have been detected, since they weren’t listed in the creator’s manifest. These early incidents would help turn the adoption wheel.

As adoption grows, it will become more difficult to target creators who are using the XPOC framework, and attackers would likely focus on those who don’t. Similarly to what happened during the HTTP-to-HTTPS transition, early adopters benefited from stronger security, and those late to the party had their unprotected site flagged as insecure or even blocked. I can imagine the same situation happening if creators start using XPOC: the last minority of unprotected accounts would look suspicious, and might go through stronger account and content review processes on the hosting platform.

Let us know what you think about the XPOC framework, how it could be used and improved.

Onward!

Of U-Prove and BBS

2023-07-13T00:00:00+00:00

We, members of the W3C DIF BBS working group, have recently published the 3rd IETF draft of the BBS specification that integrates the optimizations of Stefano Tessaro and Chenzhi Zhu, reducing the signature and proof size (and proving the security of the scheme). I updated our implementation accordingly.

Given that I’m also developing U-Prove – a technology with similar characteristics – I’m often asked about the difference between the two schemes. This post sheds some light on this topic.

A bit of history

Cryptographers have long imagined how digital identity credentials should be built to provide both security and privacy: a user should be able to present a credential containing multiple attributes to various verifiers in a way that minimizes data disclosure, presenting only the required information; nothing more, nothing less. One important feature to support is the selective disclosure of the attributes, presenting only those needed for a particular transaction.¹ Another important one is to prevent inescapable linkability between the issuance and presentation of the credential due to unique identifiers (see my previous post on the subject). We say that a system supports minimal disclosure if both principles are met.²

There are different mechanisms to achieve selective disclosure; for example the OAuth working group is currently working on a specification for JSON Web Tokens (you can experiment with my implementation). Achieving unlinkability, on the other hand, is more complicated.

David Chaum invented the concept of unlinkable signatures. His original technique, called blind signatures, allows an issuer to sign a message without learning its value. This scheme is useful when creating tokens with no attributes (as used in the privacy pass specification).³
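To make the blinding idea concrete, here is a minimal sketch of a textbook RSA blind signature, in the spirit of Chaum's original construction. The tiny key, the function names, and the absence of hashing and padding are all my simplifications for illustration; this is not the Privacy Pass protocol, and it is not secure as written.

```python
# Toy RSA blind signature. Textbook RSA with a tiny hypothetical key:
# for illustration only, never for real use.
import math
import secrets
from typing import Tuple

# Issuer key pair (toy values, chosen only to keep the numbers small).
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent

def blind(message: int) -> Tuple[int, int]:
    """User side: hide the message with a random factor r."""
    while True:
        r = secrets.randbelow(n - 2) + 2
        if math.gcd(r, n) == 1:
            break
    return (message * pow(r, e, n)) % n, r

def sign_blinded(blinded: int) -> int:
    """Issuer side: sign the blinded value without seeing the message."""
    return pow(blinded, d, n)

def unblind(blinded_signature: int, r: int) -> int:
    """User side: remove the blinding factor to get a signature on the message."""
    return (blinded_signature * pow(r, -1, n)) % n

def verify(message: int, signature: int) -> bool:
    """Anyone: check the signature against the issuer's public key."""
    return pow(signature, e, n) == message % n

message = 42
blinded, r = blind(message)
signature = unblind(sign_blinded(blinded), r)
```

The issuer only ever sees `blinded`, which is randomized by `r`, yet the unblinded `signature` verifies under its public key. That is the property that breaks the link between issuance and later use.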

Stefan Brands generalized the concept by creating restrictive blind signatures, which allows an issuer to unlinkably sign credentials containing multiple attributes, which can in turn be selectively disclosed to a verifier. This is the scheme at the core of the U-Prove technology.

U-Prove flow. 1) Issuer signs a credential encoding multiple attributes. 2) User randomizes the signature (during issuance). 3) User presents transformed credential as-is, selectively disclosing the attributes.

Jan Camenisch and Anna Lysyanskaya later created the CL signature scheme, which allows an issuer to sign a multi-message credential, which can also be selectively disclosed to a verifier, proving that it was signed by the issuer without disclosing the signature itself. One benefit of using CL signatures is that a user could present their credential multiple times in an unlinkable manner (a.k.a., multi-show unlinkability). In contrast, multiple one-show U-Prove credentials must be obtained to achieve this feature. One issue with CL signatures, however, is their efficiency, as they are an order of magnitude slower and bigger than Brands’. CL signatures are the core of the Idemix system developed by IBM.

Finally, Dan Boneh, Xavier Boyen, and Hovav Shacham created the more efficient pairing-based BBS signature scheme, which after a few updates (including ideas from Camenisch and Lysyanskaya) became the basis for the current BBS specification.⁴

BBS flow. 1) Issuer signs a credential encoding multiple attributes. 2) User stores the credential as-is. 3) User presents the credential selectively disclosing the attributes and proving it was properly signed (without disclosing the signature).

So, which one should I choose?

At first glance, it seems that BBS’s multi-show unlinkability makes it more versatile, so it should be the obvious choice, but there are a few points to consider. It is important to understand where U-Prove and BBS fit in the protocol stack.

Brands’ signature scheme has been standardized in ISO/IEC 18370-2:2016. U-Prove builds on Brands’ signatures to create a credential system, specifying how issuers create their public parameters, and how users can obtain and present tokens. U-Prove must, however, be further profiled to be integrated into a framework or application. Over the years, we created X.509, SAML, and WS-Trust profiles to create privacy-preserving versions of these credentials. More recently, we created a JSON Framework to create privacy-preserving JSON Web Tokens and Signatures (and here is a proposal to integrate it into JSON Web Proofs).

BBS, similarly to Brands, is a low-level signature scheme, detailing how to sign and present messages. Some useful features are still in development, such as blind issuance of attributes and user-bound signatures. It would also need to be profiled in order to be integrated into higher-level components. There are efforts to integrate it into JSON Web Proofs, Verifiable Credentials, and version 2 of the Hyperledger anoncreds project (which evolved from the Idemix system).

U-Prove and BBS use different types of mathematical constructions. U-Prove can be implemented over any prime-order group, and currently uses standard elliptic curves, making it easy to implement on any platform.⁵ BBS, on the other hand, uses pairing curves, which makes its implementation less efficient and more complicated. Fortunately, since the BBS specification uses the popular BLS curve used in many projects (e.g., Ethereum and Zcash), more libraries are becoming available, simplifying its development.

Time is of the essence

Long-lived credentials should be revocable. We’ve learned from the PKI world that revocation is a difficult feature to deploy at scale. Indeed, deployers have a choice to use hard-to-maintain and bulky revocation lists (e.g., CRLs), privacy-reducing call-to-the-issuer (e.g., OCSP), or other in-between strategies (e.g., status list). The problem is exacerbated when dealing with minimal disclosure credentials, where we want to hide identifiable revocation identifiers from verifiers. Various revocation schemes have been designed to achieve this, the most promising ones using cryptographic accumulators to keep the revocation artifacts small.⁶ Like their conventional counterparts, these systems are also complicated and hard to deploy. Deployers of minimal disclosure credentials might therefore prefer to limit the validity period of the credentials and only issue short-lived ones, avoiding the need for a revocation system altogether. In this case, the multi-show unlinkability feature becomes less important, since the credentials need to be re-issued on a frequent basis.

Parting words

At the end of the day, the system requirements (including standards, performance, complexity) determine which cryptographic building block should be used in a given system (same as for any other cryptographic primitives we use). One-show unlinkability is often sufficient in systems where users want to establish a pseudonymous relationship with a verifier (e.g., when presenting the same identifier or claims in repeat visits). For users, it makes no difference which technology is used to present their application-level identity credential. In fact, we demonstrated this in the EU-funded ABC4Trust project where users were issued both U-Prove and Idemix credentials which could be used interchangeably and transparently to access various resources (under the hood, multiple U-Prove tokens are retrieved efficiently in-batch to maintain unlinkability across presentations).

The bottom line is that BBS signatures do not replace the need for simpler blind-signature schemes such as U-Prove, but rather complement the cryptographic toolbox with which we can create powerful privacy-preserving frameworks and the user-centric identity systems of tomorrow. I’m very excited by the progress we continue making in the BBS working group, and by the integration path into identity frameworks such as JSON Web Proofs and Verifiable Credentials, bringing the technology closer to deployment maturity.

Onward!

Footnotes

Selective disclosure of attributes can be generalized to present properties of attributes, without disclosing their values directly. For example, a user could prove that their name is not contained in a blocklist or that their credential expiration date is later than today, without disclosing either of the attributes. These predicate proofs, however, are more complicated, and we stick with the simpler notion of selective disclosure in this post, as supported by the core U-Prove and BBS specifications. You can learn more about some of these U-Prove extensions in this paper. ↩
These types of credentials are often referred to as anonymous credentials, but I prefer the term minimal disclosure credentials, as we rarely need full anonymity in identity scenarios. Minimal disclosure covers the full identity spectrum, from anonymity, to pseudonymity, to full identification, as required by the application. ↩
One could encode different attributes with few possible values using different issuer keys, but this is not scalable for general attribute schemas. This technique was used in early e-cash systems encoding different e-coin denominations. ↩
BBS was originally proposed by Boneh, Boyen, and Shacham in their Short Group Signature paper at Crypto 2004. The scheme was modified by Man Ho Au, Willy Susilo, and Yi Mu in their Constant-Size Dynamic k-TAA SCN 2006 paper, following an idea of Jan Camenisch and Anna Lysyanskaya from their Crypto 2004 Signature Schemes and Anonymous Credentials from Bilinear Maps paper. Finally, a few tweaks from Jan Camenisch, Manu Drijvers, and Anja Lehmann presented in section 4 of their TRUST 2016 Anonymous Attestation Using the Strong Diffie Hellman Assumption Revisited paper, and the optimizations by Stefano Tessaro and Chenzhi Zhu in their recent Revisiting BBS EuroCrypt 2023 paper, led to the IETF BBS specification. ↩
The specification recommends the prime curves from NIST, but any prime-order group curve could be used, such as FourQ or ristretto255. ↩
Most schemes originate from a design by Lan Nguyen, including this one we built for U-Prove, and the pairing-based ALLOSAUR and zk-SAM which could be used with BBS. ↩

Where’s Waldo been?

2023-06-16T00:00:00+00:00

Let’s imagine an identity-themed game of Where’s Waldo. The goal is to figure out where Waldo used an identity document, say his driver’s license.¹ The credential, issued by the DMV (which I’ll also call the issuer), contains his name, his date of birth, and his address. Now let’s say that Waldo presents his credential at a casino to prove he is over 18.²

Now, can the player – let’s call him Odlaw, Waldo’s infamous nemesis – figure out that Waldo visited this location after the fact? We assume Odlaw is very powerful, and has access to the DMV’s and the casino’s internal systems (e.g., logs, employees, etc.) Can he track the usage of Waldo’s identity document?

In real life³ it would be hard for Odlaw to learn where Waldo presented his driver’s license. Given that the casino clerk is not “plugged-in the Matrix” and likely doesn’t have photographic memory, the simple enter-or-not decision they make after glancing at the driver’s license protects Waldo’s privacy fairly well. (I deliberately ignore here other information sources such as surveillance cameras, ATM usage, etc.)

Now, let’s transpose this game online. Waldo now obtains an electronic version of his driver’s license and presents it at an online casino.

If the credential (a.k.a. the identity token) is obtained using a federated protocol such as OpenID Connect or OAuth, then it is retrieved on-demand from the issuer who then learns where the user is coming from,⁴ therefore allowing Odlaw to trivially win the game.

To make the game more difficult, Waldo should therefore obtain a credential that can be presented to any web site without disclosing the destination to the issuer. Let’s say this happens, then Waldo can present his e-license to enter the online casino; Odlaw must now work a bit harder to figure out where Waldo has been. Since the online casino’s system has a long and perfect memory (vs. the human checking the physical ID), it will remember (in its logs) not only Waldo’s date of birth, but also his name and address. To find Waldo, Odlaw simply needs to get access to the casino’s audit log.

Ok, ok, wait, why would Odlaw do that? Well, we’re playing a game here, so it’s Odlaw’s goal to find Waldo. It may feel like a weird game for now, but keep reading…

What if Waldo could only show his date of birth without also disclosing his name and address; would that solve the issue? Well, it would certainly help. There are different strategies to achieve this, the most practical would be to use a mechanism to disclose only the attributes (a.k.a. claims) requested by the web site. The ISO mobile Driver License (mDL) standard allows just that, by using a hash-based selective disclosure mechanism.⁵ Using such a credential, Waldo would only disclose his date of birth, therefore reducing the disclosed information and making it harder to find him. A birth date is however more information than strictly needed by the casino; they only care that their visitors are over 18. Inspecting the casino’s logs and learning the disclosed birth dates would give Odlaw an unfair statistical advantage in the game.
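Here is a minimal sketch of how salted-hash selective disclosure works in general. It is my simplified illustration, not the actual mDL or SD-JWT encoding: the function names and the attribute values are made up, and the issuer's signature over the digest list is omitted.

```python
import hashlib
import json
import secrets

def digest(salt: str, name: str, value) -> str:
    """Salted hash of one attribute; the random salt prevents guessing hidden values."""
    encoded = json.dumps([salt, name, value]).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def issue(attributes: dict):
    """Issuer side: build the digest list (which would be signed) and the user's disclosures."""
    disclosures = {name: (secrets.token_hex(16), value) for name, value in attributes.items()}
    signed_digests = sorted(digest(salt, name, value) for name, (salt, value) in disclosures.items())
    return signed_digests, disclosures

def present(disclosures: dict, names: list) -> dict:
    """User side: reveal the salt and value of the chosen attributes only."""
    return {name: disclosures[name] for name in names}

def verify(signed_digests: list, presented: dict) -> dict:
    """Verifier side: accept only values whose salted hash appears in the signed list."""
    verified = {}
    for name, (salt, value) in presented.items():
        if digest(salt, name, value) not in signed_digests:
            raise ValueError(f"attribute {name!r} does not match a signed digest")
        verified[name] = value
    return verified

signed_digests, disclosures = issue(
    {"name": "Waldo", "address": "1 Main St", "date_of_birth": "1987-09-21"}
)
shown = verify(signed_digests, present(disclosures, ["date_of_birth"]))
```

The casino learns the date of birth and nothing about the name or address, but it still sees the same list of digests (and, in a real credential, the same issuer signature) that the issuer produced.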

It’s easy to fix this by having the issuer include an over-18 binary attribute in the credential, allowing users to only disclose its true/false value.⁶ Ok, if Waldo uses this new credential, the only thing Odlaw would learn by studying the logs is that the visitor is over-18, without knowing which credential owner presented it.

This is not the end of the game, however; although the disclosed attribute value is close to anonymous, its cryptographic container is not! Indeed, the credential is signed by the issuer, and the digital signature is a unique number that acts as a digital fingerprint that can identify Waldo. To figure out where’s Waldo, Odlaw simply needs to correlate the signature between the issuer’s logs and the casino’s.
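Odlaw's correlation attack is little more than a join on the signature value. The sketch below uses made-up log entries and field names to show how little effort it takes.

```python
# Hypothetical log entries; the signature values stand in for real signature bytes.
issuer_log = [
    {"holder": "Wilma", "signature": "a3f1"},
    {"holder": "Waldo", "signature": "9c2e"},
    {"holder": "Wenda", "signature": "77b0"},
]
casino_log = [
    {"visit": "2023-06-01T20:15", "over_18": True, "signature": "9c2e"},
    {"visit": "2023-06-02T01:40", "over_18": True, "signature": "51dd"},
]

def correlate(issuer_entries, verifier_entries):
    """Match presentations to issuances using the signature as a join key."""
    holder_by_signature = {entry["signature"]: entry["holder"] for entry in issuer_entries}
    return [
        (holder_by_signature[entry["signature"]], entry["visit"])
        for entry in verifier_entries
        if entry["signature"] in holder_by_signature
    ]

matches = correlate(issuer_log, casino_log)
```

Even though the casino only logged an over-18 claim, `matches` pins the first visit on Waldo.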

Ok, let’s take some time to justify the game. Who is Odlaw, and why would he want to track and trace Waldo across the web? Well, perhaps Odlaw is a data broker or ad platform, amassing large quantities of user data, analyzing leaked logs, having web signals on many sites, or having access to insiders at the DMV or the casino (data which can later be sold to various governments, as was confirmed in the US this week). Perhaps he is an oppressive government trying to track a specific (set of) dissident user(s), to block (censor) their access or retaliate against them. Maybe Odlaw is a private investigator or a paparazzo, trying to dig dirt on politicians or celebrities. The web as we know it was built on a business model relying on tracking user activities through all sorts of mechanisms, some of which are now being addressed through technical means (e.g., elimination of 3rd party cookies) and legislation (e.g., GDPR). As more digital identity and authentication systems are being built (some of which in response to the threat of AI), using inescapably traceable cryptography would result in a dangerous privacy decrease.

How can Waldo prevent this unfair searching capability? The issue is that a unique value (the signature) is applied on the credential by the issuer and is shown as-is by Waldo to the casino. Fortunately, cryptography isn’t necessarily a source of woe for Waldo; it also gives us the tools to break the linkage between the issuance and presentation of the credential. There are two main techniques to do so: Waldo can either randomize the issuer’s signature before presenting it to the casino or convince the casino that his credential is correctly signed without disclosing the signature itself.⁷ The former technique involves so-called blind signatures (as used in U-Prove), and the latter involves zero-knowledge proofs (as in BBS).⁸ Using these privacy-preserving signature mechanisms, Waldo can receive a credential with various attributes, and minimally disclose⁹ an anonymous over-18 claim without fear of being tracked. Indeed, by inspecting the logs of the issuer and the casino, Odlaw can’t figure out if Waldo has been here.

Are we done? Is the “Where’s Waldo been” game impossible to win now? Perhaps in theory, but not in practice. There are many signals that could still be used to track Waldo, e.g., timing correlations between issuance and presentation, or transport layer leakage (e.g., IP address tracing). These must be addressed separately, and we concentrate here on data privacy. There are many subtleties that we must consider to clean up all correlatable information in the credential. For example, long-lived identity credentials have an expiration date, which if precise enough could be used as a unique fingerprint to identify its owner. Credentials should therefore encode validity periods using statistically hiding ranges or buckets, to obfuscate to whom they belong. Privacy-minded deployers must therefore check all the credential’s metadata for information that could (help) identify the user (revocation information, validity periods, usage restrictions, etc.)

Ok, surely now, we must be done. If Waldo has a privacy-protecting credential, supporting minimal disclosure allowing him to only prove he’s over 18 in an unlinkable manner, and makes sure to leave time between credential issuance and presentation, and obfuscates his IP address and other fingerprintable signals by using Tor, surely Odlaw can’t win the “Where’s Waldo been” game.

Weeeeell… the game isn’t called where’s Wilma or where’s Wenda, it’s called where’s Waldo. If Odlaw (again, this could be the issuer, an insider, a hacker, a 3rd party, the destination website, or any combination of these) is reeeally motivated to figure out where Waldo is going, they could try to tag him in some fashion.¹⁰

What if the issuer orders the attributes differently in Waldo’s credential, e.g., everybody has a [name,address,DoB,over-18] credential but Waldo receives a [name,address,over-18,DoB] one? Easy then to spot the needle in the log stack. What if Waldo’s credential contains a capitalized “True” value for its over-18 attribute, while all other adults have a “true” value? To minimize the risk of leaking such information, the attribute schema should be made public by the issuer as part of its parameters, and Waldo’s client should check the proper formatting of his credential and values before presenting it.
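A client-side check along these lines could look like the following sketch. The schema layout and function name are hypothetical, chosen only to mirror the two tagging tricks described above (reordered attributes and non-canonical values).

```python
# Hypothetical published schema: the expected attribute order and, where the
# value set is small, the exact strings an honest issuer may use.
PUBLISHED_SCHEMA = {
    "order": ["name", "address", "DoB", "over-18"],
    "allowed_values": {"over-18": {"true", "false"}},
}

def check_credential(attributes, schema=PUBLISHED_SCHEMA):
    """Return a list of problems found in a credential.

    `attributes` is a list of (name, value) pairs in the order they appear
    in the credential. An empty list means nothing suspicious was found.
    """
    problems = []
    names = [name for name, _ in attributes]
    if names != schema["order"]:
        problems.append(f"unexpected attribute order: {names}")
    for name, value in attributes:
        allowed = schema["allowed_values"].get(name)
        if allowed is not None and value not in allowed:
            problems.append(f"non-canonical value for {name}: {value!r}")
    return problems

honest = [("name", "Waldo"), ("address", "1 Main St"), ("DoB", "1987-09-21"), ("over-18", "true")]
tagged = [("name", "Waldo"), ("address", "1 Main St"), ("DoB", "1987-09-21"), ("over-18", "True")]
```

Calling `check_credential(tagged)` reports the capitalized value, so the client can refuse to present the credential instead of silently carrying the tag around.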

What if the issuer uses a different set of parameters (and signing key) for Waldo than for other users? As if it would sign everyone’s credential using its “blue” pen but would use its “red” one for Waldo’s. It would again be trivial to spot Waldo in the web logs. This is a difficult problem to tackle, and key (and parameter) transparency techniques (like in PKI) can help Waldo protect his privacy.

I hope you have come to appreciate that deploying privacy-protecting identity credentials is a challenging effort. Even when using proper cryptography, the identity framework must offer sufficient protections to minimize data leakage. This is what we aimed to do when designing the U-Prove JSON Framework (UPJF), where the token’s public key and signature are randomized, attribute schemas are specified in the public issuer parameters, and expiration dates are specified in privacy protecting “buckets.” The User-centric Web Attestation framework further constrains the UPJF by forcing issuers to publish the attribute schema (indexed set of valid claim values) as part of their public parameters.

In an identity ecosystem where privacy-protecting cryptography is used, and where audit mechanisms are in place to make sure issuers are well-behaved, the “Where’s Waldo been” game becomes very difficult to win for Odlaw, which in turn is a win for Waldo and all internet users.

Footnotes

This post has a North American bias, using the name Waldo (vs. the original Wally) and a driver’s license as an example identity document; I’ll leave it as an exercise to the reader to mentally localize the examples (I’m for example replacing Waldo with Charlie, from my cherished childhood “Où est Charlie?” books). ↩
Let’s say the casino is in Washington state where the gambling age is 18, which matches many non-US jurisdictions. ↩
Not to say that online activities aren’t an integral part of life, but you know what I mean… ↩
Very often, the destination is encoded in the token to prevent an attacker from stealing and reusing it. Even if not explicitly stated in the token itself, the web issuer must redirect the user’s browser/app to the destination, therefore learning which site is visited. ↩
Instead of encoding the attributes directly, the issuer encodes a salted hash digest allowing the user to disclose the attributes of their choice by attaching their pre-image values. The OAuth SD-JWT specification provides a similar mechanism for general JSON Web Tokens. ↩
Some advanced cryptographic techniques, e.g., zero-knowledge proofs, allow users to prove that their birth date is such that they are over-18 without disclosing it, but these are quite complex and it’s unclear if we’ll see them deployed at scale anytime soon. Even if it were practical to do so, it is way simpler for this use case to encode a couple of useful Boolean values (e.g., over-13, over-18, over-21, over-65) rather than using the more complicated approach. In security, simplicity is your friend… ↩
Identity credentials often encode a public key which is also a unique number that must be dealt with using these same techniques. I omit it for simplicity here. ↩
These two classes of schemes have different pros and cons; I’ll write more about them in a follow-up post. Other simpler mechanisms exist to issue unlinkable “membership” credentials without attributes, such as privacy pass, or this recent proposal by my esteemed MSR colleagues. ↩
I use the term minimal disclosure to describe the ability to only present the required information, nothing less, nothing more. Sometimes, the minimum required is your full identity (e.g., if you board a plane), sometimes, it could simply be an over-18 proof (like in our example). The term selective disclosure is often used in systems where only a subset of the attributes are revealed. Selective disclosure is however not always minimal: if for example the signature is linkable, or if you are disclosing an attribute that reveals more information than required (reusing again our birth date vs. a simple over-18 claim example). More advanced zero-knowledge cryptographic techniques allow us to prove properties of undisclosed attributes, e.g., that my undisclosed date of birth is such that I’m over 18, that my name doesn’t appear on a block list, or that my state of residence is one of those allowing online gambling. ↩
Of course, the reader can imagine that the attacker could target a group of “Waldos”. ↩

User-centric Web Attestations

2023-05-30T00:00:00+00:00

Over the course of our life, we receive many attestations that we carry and display proudly. I framed my university diploma in my study; professionals (doctors, psychologists, etc.) display theirs in their offices. I also carry many attestations in my wallet (membership cards, employment badge, etc.) that I can easily show to anyone.

Online attestations also exist, but unlike the physical ones, they are mostly tied to one environment, making them difficult to present across system boundaries. For example, some social media sites display badges for verified users, gamers can display achievements in their profiles, and NFTs (especially the soul-bound ones) can represent memberships or entitlements.

What if you could display attestations on the web from whomever, wherever you wanted? What if we could build a portable blue check mark?

To explore this idea, we released a proof-of-concept User-centric Web Attestations (UWA) framework, with which users can obtain certified, cryptographic attestations from various issuers, and attach them to any web property they control (e.g., a personal blog, a social media profile, etc.). The project contains a sample issuer Express server and an Edge/Chrome browser extension that can create and verify web attestations.

Strong privacy

There is a strong push for verified user information in this era of generative AI, but to avoid further eroding the already fragile online privacy and autonomy of users, we designed the UWA framework to support the strongest privacy threat model.

It is often desirable to attach attestations to your “real life identity”: e.g., attach your academic achievements to your resumé, your employment history to your LinkedIn profile, etc. Other times, you might prefer to link them to a pseudonymous identity not linkable to your real-life one. Using the UWA framework, anyone trusting the issuer can verify the attestations, but no one (including the issuer itself) can link their issuance to their presence on a web page. In other words: I can get attestations that state “I’m a member of Community XYZ,” “I’m over-18,” or simply “I’m a human,” and that’s the only information one could infer from them without being able to link it back to the actual person to whom it was issued (even if a malicious insider actively tries to track usage of the UWAs it issues).

We achieve this strong privacy property by encoding attestations using U-Prove tokens.

System overview

The following diagram gives an overview of a web attestation life cycle:

An Issuer sets up its public parameters and publishes them in a well-known location. These specify the contents of the U-Prove tokens, which can contain an application-specific label to make the attestations more informative. Users and Verifiers must obtain the Issuer parameters before creating or verifying web attestations.
The User obtains U-Prove tokens from an Issuer. Authentication to the Issuer is application-specific (e.g., an Issuer might issue attestations to its members, or provide a notary validation service for paying customers). U-Prove tokens are stored in the web browser extension.
When visiting a website, the User can create a web attestation from an issued token using the web browser extension (encoded either as a string or a QR code), and attach it to the site. The U-Prove token is then deleted from the browser extension to prevent linkability with newly created attestations (new tokens will be automatically obtained from the Issuer if they expire or if they are running out.)
Other users visiting the same website can verify attached web attestations from trusted Issuers using the web browser extension. Unknown Issuers can be added to the trusted list by the User. Invalid attestations (for example: forged, or copied from a different site) are marked as such; malformed ones are simply ignored.

An example

I created a web attestation to attach to this blog post issued by the project’s sample issuer.

uwa://eyJhbGciOiJVUDI1NiJ9.eyJzY29wZSI6Imh0dHBzOi8vY2hyaXN0aWFucGFxdWluLmdpdGh1Yi5pby8yMDIzLTA1LTMwLXVzZXItY2VudHJpYy13ZWItYXR0ZXN0YXRpb25zLmh0bWwiLCJ0aW1lc3RhbXAiOjE2ODUxMzQzMzg3Mjl9.eyJ1cHQiOnsiVUlEUCI6IlVXemxCU3VyRkl3ZnVxVy0xaFdYa2VPcGNaQzlvYjNITlRKOUdvM2hUN1EiLCJoIjoiQlBhZVI2VFB2dV9YU2tLMHNoZnR0dVdXV3Z5MVBfQVRkV0txalNGZTJqSVJPM3JQSm1pSnkxZXJWMll6VFNwZEVuLWYtN1BoR0ZSMzNvd29LWmtBelVjIiwiVEkiOiJleUpwYzNNaU9pSm9kSFJ3Y3pvdkwzSmhkeTVuYVhSb2RXSjFjMlZ5WTI5dWRHVnVkQzVqYjIwdmJXbGpjbTl6YjJaMEwzZGxZaTFoZEhSbGMzUmhkR2x2YmkxellXMXdiR1V2YldGcGJpOXpZVzF3YkdVdGFYTnpkV1Z5TDNOaGJYQnNaU0lzSW1WNGNDSTZNakF3TURRc0lteGliQ0k2TVgwIiwiUEkiOiIiLCJzWnAiOiJCTm5IUHg2QllJZ1BrTlE0VVBrV0VndW5veWxIck5XR2hkM2R2TW93bkk0MlZ3QXRlS1FsSzdXMFpBREtSRHhsanV0aElYRy10Z2REcWxWTlFKenpTbHMiLCJzQ3AiOiJMUXhadGNxTVZiYnk1dEZwYVU1WlBabFFqMUhDOWpYSnJjSlNCZzl6bEF3Iiwic1JwIjoiU2t6Qng3dEdnYmdOOUtpbFJTNlJYWHNRb3RNN3FpR0tQT1l4SjFDbXBsSSJ9LCJwcCI6eyJhIjoiZGV0NDVpU3gwTUtLZ3czdU94Q2pYdV9VOGFkV2tzQ2kwUjdpckJZbUpKWSIsInIiOlsia3lRbmVJcnBGZGZHTTJIQ2VnVTloQnlKeVFDWHRMWXNPSHBxLXdoaGpkYyJdfX0

This just looks like an opaque URI¹, but if you installed the browser extension, it recognizes the UWA string, verifies the web attestation (valid issuer and scope), and displays a verified badge.

The browser extension can also generate and verify a UWA encoded as a QR code, which is useful for environments that limit the number of characters a user can provide (e.g., in social media bios or posts).

This web attestation doesn’t attest to much; it just shows that I was able to set up the sample issuer, obtain demo tokens, and create a UWA for my blog post. This, however, demonstrates the mechanics of issuing and presenting such attestations. The project’s README page walks through a deployment example describing how Alice can obtain U-Prove tokens from the https://commun.ity website and attach a corresponding membership attestation to her https://soc.ial/@pr1v4cy social profile, and how another user can attach a humanity attestation to their page; see it in action in this YouTube video.

Concluding thoughts

The UWA framework provides a simple, yet powerful mechanism, allowing users to openly share attestations of any type across boundaries, while preserving any degree of privacy they desire. There are many subtleties and pitfalls when designing privacy-preserving systems; the project’s README and the UWA specification contain important details to consider for a real deployment and discuss further extensions that would improve the system.

I’m curious to see what you think, so go try the project, and give us feedback.

Footnotes

The UWA string is composed of a uwa:// prefix followed by a JSON Web Signature (JWS) encoding a U-Prove token presentation. You can try base64url-decoding the three dot-separated parts of the JWS to peek inside. ↩
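If you want to try this, a few lines of Python are enough. The helper names below are mine, not part of the UWA project, and the sketch only decodes the parts; it performs no cryptographic verification.

```python
import base64
import json

def b64url_decode(part: str) -> bytes:
    """Decode base64url without padding by restoring the '=' characters first."""
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def decode_uwa(uwa: str) -> list:
    """Split a UWA string into its three parts and parse each one as JSON.

    This only inspects the contents; it does not check any signature.
    """
    prefix = "uwa://"
    if not uwa.startswith(prefix):
        raise ValueError("not a UWA string")
    parts = uwa[len(prefix):].split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated parts")
    return [json.loads(b64url_decode(part)) for part in parts]

# First part of the sample attestation in this post.
header = json.loads(b64url_decode("eyJhbGciOiJVUDI1NiJ9"))
print(header)  # {'alg': 'UP256'}
```

Passing the full string from the example to `decode_uwa` returns the header, the payload (with the scope and timestamp), and the token presentation, in that order.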

U-Prove JSON Framework Overview

2023-04-03T00:00:00+00:00

In this post, I’ll give an overview of the recently released U-Prove JSON Framework (UPJF), which defines how to use JSON to encode U-Prove artifacts, allowing developers to easily integrate the privacy-protecting cryptographic technology into web applications.

U-Prove Technology Overview

First things first, let’s quickly review U-Prove: the cryptographic system that allows the issuance of signed tokens containing selectively-disclosable attributes bound to a user key pair that can be presented in an unlinkable manner. Let’s break down what that means:

Attributes can be of any type; think of them as claims in the JSON Web Token (JWT) world. Only the attribute values are encoded in a U-Prove token; the types (and ordering) are specified in the issuer parameters.
Nothing in the token other than the encoded attributes can identify the user. In particular, the token’s public key and signatures are randomized in the issuance protocol and are never seen by the issuer.¹ The issuer can still recognize tokens it issued (as would any verifier), but can’t determine in which issuance session they were created. As an analogy, imagine the issuer is a bank issuing dollar bills. All bills of the same denomination may look the same, but a withdrawal and spending can be linked on the basis of the bill’s unique serial number; coins on the other hand all look alike and can’t be distinguished.
The user can decide, at presentation time, which attributes to disclose to a verifier. Hidden attributes are perfectly hidden, they cannot be brute-forced.

The issuer creates its key pair and publishes its public parameters (including its public key and details about the to-be-issued tokens). The UPJF only supports the pre-generated recommended parameters which use the NIST prime-order curves (P256, P384, P521) and the issuance of tokens with a maximum of 50 attributes.

Multiple tokens encoding the same attributes can be issued in parallel in a single issuance session; each will have a unique key pair and signature. Tokens can be presented to verifiers by signing a message (which could be a unique challenge for access scenarios, or some application-specific data for digital signature scenarios). Tokens can be presented once for anonymous interactions, or reused for pseudonymous ones.

These core features are supported in the JSON profiles and implemented in the TypeScript node library, and are suitable for many web scenarios.²

For more information, check out the technology overview or if you feel adventurous, the crypto specification contains all the gory details. Note that the sample values used in the post were generated using the JSONFrameworkSample in the library.

Issuer setup

The issuer first generates its parameters containing its public key, a description of which curve and generators to use, a unique identifier for the parameters, and an application-specific description of the tokens’ content. Here is an example of a JSON Web Key (JWK) object containing a set of issuer parameters:

{
    "kty": "UP",
    "alg": "UP256",
    "kid": "a0A3quUdeEoIJT9R_-Ysy_kr7CTmJ2w9GSSZSHBvP3I",
    "g0": "BLaj8knVriRtGjLfGVg9MX1HvaPZDbhq0PmNcpxrA4oGZYoBPV-Nkcf0yfyI0mLMA10ykCj4DHKfol4T2D3HvsQ",
    "spec": "eyJuIjozLCJleHBUeXBlIjoiZGF5IiwiYXR0clR5cGVzIjpbIm5hbWUiLCJlbWFpbCIsIm92ZXItMjEiXX0"
}

The key type kty is always set to “UP” for U-Prove.
The curve and generators are identified by the alg identifier, which is one of “UP256”, “UP384”, or “UP521” (corresponding to the NIST P-256, P-384, and P-521 curves, respectively, and the matching pre-calculated generators).
The key identifier kid corresponds to the issuer parameters UID which needs to be unique for the application realm. Developers can simply let the library generate these as the hash digest of the other fields.
g0 is the issuer public key.
The specification spec is a base64 encoding of a JSON object describing the content of to-be-issued tokens. Here, the value is:

{
    "n":3,
    "expType":"day",
    "attrTypes":["name","email","over-21"]
}

specifying that tokens will have 3 (n) attributes of type (attrTypes) “name”, “email”, and “over-21” respectively (these should be normal JWT claim types), and that the token expiration values will be measured in days (expType).
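To make the encoding concrete, here is a minimal Python sketch that decodes the spec value from the sample parameters above. It uses only the standard library rather than the U-Prove library's API, and it assumes the value is unpadded base64url, as is conventional for JWK fields. The helper name b64url_decode is made up for this example.

```python
import base64
import json


def b64url_decode(value: str) -> bytes:
    """Decode base64url, restoring the padding that JWK values usually omit."""
    return base64.urlsafe_b64decode(value + "=" * (-len(value) % 4))


# The spec value copied from the sample issuer parameters above.
SPEC = "eyJuIjozLCJleHBUeXBlIjoiZGF5IiwiYXR0clR5cGVzIjpbIm5hbWUiLCJlbWFpbCIsIm92ZXItMjEiXX0"

spec = json.loads(b64url_decode(SPEC))

# A simple consistency check: n should match the number of attribute types.
consistent = spec["n"] == len(spec["attrTypes"])
print(spec, consistent)
```

A verifier could run the same consistency check before trusting a set of parameters, since the issuer commits to the token structure through this field.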

The UPJF recommends expressing expiration dates in privacy-friendly buckets (hours, days, weeks, and years), to avoid introducing precise expiration dates that could be used to track and trace users; in this example, all expiration dates of tokens will be set to midnight after the specified number of days. We are taking great care to avoid introducing undesired correlatable values in U-Prove tokens that could break their unlinkability; this is why the issuer is committing to their structure in its parameters.³

Anatomy of a U-Prove token

U-Prove tokens have the following structure:

The values for the attributes 1 to n (given in the issuer parameters). The user will be able to decide which of these attributes to disclose to verifiers and which ones to hide. U-Prove tokens could have no attributes (when n = 0).
The token information field contains a base64url encoding of a JSON object with claims that are always disclosed during presentation. This is useful to encode token metadata, such as an expiration date, and an issuer identifier (for example, a URL describing where to retrieve the issuer parameters; the UPJF recommends publishing them at [ISSUER_URL]/.well-known/jwks.json).
The prover information field contains a base64url encoding of a JSON object with claims that are always disclosed during presentation, but are unknown to the issuer. This is useful to encode a fresh challenge from a verifier while issuing on-demand tokens, or to tie a token to an external artifact that the user specifies at issuance (e.g., an encryption key).
The token public key corresponds to the private key that remains secret to the user. It is generated by the user but never seen by the issuer.
The issuer signature provides authenticity and integrity guarantees on the token. It is generated by the issuer but randomized by the user at issuance.

Issuance protocol

The U-Prove issuance protocol is a 4-leg request-commitment-challenge-response exchange, initiated by the user. How the issuer authenticates the user and validates the attributes to issue is application specific. Multiple tokens encoding the same attributes can be obtained in batch.

The initial token issuance request might contain user-suggested parameters, for example, the desired number of tokens and attribute values; ultimately, the issuer decides on these values.

The following examples show the messages for issuing 5 tokens encoding the attribute values “Joe Example” (name), “joe@example.com” (email), and “true” (over-21), corresponding to the types specified in the issuer parameters. The token information field encodes the following JSON object, which contains the URL of the issuer and an expiration date set 100 days after the issuance date (expressed as the total number of days since the Unix epoch):

{
  "iss":"https://issuer",
  "exp":19548
}
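As a rough illustration of the day-bucket arithmetic, the Python sketch below computes such an expiration value and encodes the token information field. The function names are hypothetical, the issuance date of 2023-04-01 is an assumption picked so that the result matches the sample, and the compact JSON serialization with this key order is inferred from the sample values rather than taken from the UPJF text.

```python
import base64
import json
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)


def expiration_in_days(issued_on: date, valid_days: int) -> int:
    """Expiration expressed as whole days since the Unix epoch."""
    return (issued_on - UNIX_EPOCH).days + valid_days


def encode_token_information(iss: str, exp: int) -> str:
    """Serialize the always-disclosed claims compactly, then base64url-encode."""
    compact = json.dumps({"iss": iss, "exp": exp}, separators=(",", ":"))
    return base64.urlsafe_b64encode(compact.encode()).decode().rstrip("=")


# Assumed issuance date (not stated in the post); validity is 100 days.
exp = expiration_in_days(date(2023, 4, 1), 100)
expires_on = UNIX_EPOCH + timedelta(days=exp)
ti = encode_token_information("https://issuer", exp)
print(exp, expires_on, ti)
```

Under these assumptions the sketch reproduces the exp value of 19548 shown above, which corresponds to 2023-07-10. Because only the day count is encoded, every token issued that day with the same validity period carries the same expiration value.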

The first issuance message sent by the issuer (containing one shared value, and two arrays of per-token values) would look like this:

{
  "sZ": "BMH5uVuMC9Tc+4rOKMC27UwiTc0z9n8kX93MTw1reg1dG8EB1T8zOH/OFRnyS7Q90+mOx6YuEHhXLT4mZXLaqfI=",
  "sA": [
    "BMAuA4uPOURdO+S5nnw1G8m34uCfSI2XdD0fPeP/f4Vx8+FyniY4/9R9lWH06GZS6j14nfDyM0XH8tJY3cOyy40=",
    "BILCZFHxq/C+daMwRdUltW51fTw8d2a6p05tkie3SCeAylRxpDm3yNKwKPsUTY3NSfHw4XpC6M6Cr9yxivcflmk=",
    "BHlk5nsWKNAbuF33+FzyCtxYQE1FRP1jUUxYyHNH1fT0nqupI3PRmk3ZtW1Qjy5RLeHPji1pqgLjEZL0PpLXZrQ=",
    "BAEVNZiaxozc8uOTw7TbyZLOZdle2/l/GCJAiLsoxZXm7k/j+ulrjUX0zZus7S1GLAjFAXg91hu+p956Hu/Vt60=",
    "BEokpCt90Xcc126fs3gaudDdhGaKJfQffIjHFoVpcgZNkyIFrcO4nuxFVMn8AO9vaoQzsj+DhfbpRLrEELN1cdU="
  ],
  "sB": [
    "BBx9PkdtDgydZnVbiY4kwm9NevdUE+wyiPuR9RlCZXSBaYSvnXJGBNYO1aNpm+16QzHvYhBOYBXN9m/zHy1QcOU=",
    "BImKvZECOGf0PM9aBQLa3mRiOmJZ/p69t4MPjerevDl0F8c/8XwcznCqgxRYXYYSv8LXKk8LvLN/YsIvdm0vVrU=",
    "BNeSJoeb2/hRtEQlfLY9UqE12+onQIyYGELl7uXu76+S5kYdNjhiTwRrsta7B4VYMq+2akYWBvVMFvVfrim3XLI=",
    "BEzfVQw8mfxZERCUBwLLQ2Qz2S1nyFl2Q/Q8kqG3FrAmiiADBfaGkNopgaXjLbsUuEAdkvqYwEmKPKrFInG/Ths=",
    "BKF+fQ+KopnQTv+O+/yB3D3BSJYycaEScGd01wrIdisZjLWinSdK809adTUCm50V20feJ+C1DtdRWACSf5JiE4Q="
  ]
}

The second issuance message, sent by the user, would look like this:

{
  "sC": [
    "5hhj/QG76zXXPe1zks/H5Fsvq32jL81OBTsANhIFoRE=",
    "SUOh117WBGAWVHZ7eOBIGUNxe9/eT6SjS1gQwBV6Lsw=",
    "pweN4QIp72gVT53mMOrTI0RQd6HmFxLT9727qj/ZPNw=",
    "v0RQ75GEwps/trPdzvG6NETPmf4oNlVuLRew2pFVgz8=",
    "A8dUvtq6aXNyyqgUNM7NILoVnyRaYO1Ej1FhGhJy4mM="
  ]
}

Finally, the third issuance message, sent by the issuer, looks like this:

{
  "sR": [
    "hcru+mdXDgGPnfovKr7pQtpTkLmhj11N1DkfVfRuuGQ=",
    "8bkl56JadyLGfwa9KYoCNUoDdUTMYQEvbv1PJbPA0a4=",
    "1J8sxxAyGSbn3WxV3enTH1e90YOQJ3lr8hw2QedDpgg=",
    "4/Pb/94JGCGJGpNa22GjYDQMgShI1wRdXYJk0p0RQZQ=",
    "8jT5vOUytgk1pI5/ekZtdtOk2glS1LA9ioe1s64Naos="
  ]
}
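The three messages above share a simple shape: one value per token in each array, plus a single shared sZ. The following Python sketch checks that shape for a batch. It is not part of the U-Prove library; the function name is hypothetical, and the byte lengths (65-byte uncompressed points and scalars of at most 32 bytes for UP256) are assumptions read off the sample values.

```python
import base64

POINT_LEN = 65   # assumed: uncompressed P-256 point (0x04 || X || Y)
SCALAR_LEN = 32  # assumed: upper bound for a P-256 scalar


def _decoded_len(value: str) -> int:
    return len(base64.b64decode(value))


def check_issuance_batch(first: dict, second: dict, third: dict, count: int) -> None:
    """Raise ValueError if the three messages do not describe `count` tokens."""
    if _decoded_len(first["sZ"]) != POINT_LEN:
        raise ValueError("sZ does not look like a P-256 point")
    # (field name, message holding it, whether the values are curve points)
    layout = (("sA", first, True), ("sB", first, True),
              ("sC", second, False), ("sR", third, False))
    for name, message, is_point in layout:
        values = message[name]
        if len(values) != count:
            raise ValueError(f"{name}: expected {count} values, got {len(values)}")
        for value in values:
            size = _decoded_len(value)
            ok = size == POINT_LEN if is_point else 0 < size <= SCALAR_LEN
            if not ok:
                raise ValueError(f"{name}: unexpected value length {size}")
```

Calling check_issuance_batch with the three sample messages and count=5 should pass under these assumptions. This only validates the framing; the cryptographic checks are the job of the library.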

The user would then generate 5 U-Prove tokens and corresponding private keys. Here is a sample token:

{
  "UIDP": "a0A3quUdeEoIJT9R/+Ysy/kr7CTmJ2w9GSSZSHBvP3I=",
  "h": "BPh3qOzYuhnUTaK6bJ74wRYDSbyPiFVDuB+T4tcqFvm03ayALw4u4zPUBMZmKpSvcWw00n2g5WvcKUQYUMfdyQM=",
  "TI": "eyJpc3MiOiJodHRwczovL2lzc3VlciIsImV4cCI6MTk1NDh9",
  "PI": "",
  "sZp": "BLwfHuKESMPexDOVuAwwPeHPSe3S9tNnRGSaD8t4ZzIUbq9Efq8Z1l0k7tNzENHU0ouQppr2RUVyjNo1c9r2R7Q=",
  "sCp": "ZX0kIbTi78a7NhInOMQVwzGFgWdWaaJr+UkdLsaefeY=",
  "sRp": "BuhVEPOJK0g+5fQLx/eAngMp4M3OuI6eeGMNy8aAOz4="
}

where UIDP is the unique cryptographic identifier of the issuer parameters; h is the token’s public key; TI and PI are the base64url encoding of the token and prover information fields, respectively (see above for the TI value; PI is unused in this sample); and sZp, sCp, and sRp form the issuer signature.
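A few of these fields can be inspected without any cryptography. The Python sketch below (standard library only, with a made-up helper name) checks that the token's UIDP matches the kid of the sample issuer parameters, decodes TI, and looks at the encoding of h. The lenient decoder is an assumption of mine to cope with the samples mixing base64url (in the JWK) and standard base64 (in the token).

```python
import base64
import json


def b64_decode_any(value: str) -> bytes:
    """Decode standard or URL-safe base64, with or without padding."""
    normalized = value.replace("-", "+").replace("_", "/")
    return base64.b64decode(normalized + "=" * (-len(normalized) % 4))


# Values copied from the sample issuer parameters and the sample token above.
KID = "a0A3quUdeEoIJT9R_-Ysy_kr7CTmJ2w9GSSZSHBvP3I"
TOKEN = {
    "UIDP": "a0A3quUdeEoIJT9R/+Ysy/kr7CTmJ2w9GSSZSHBvP3I=",
    "h": "BPh3qOzYuhnUTaK6bJ74wRYDSbyPiFVDuB+T4tcqFvm03ayALw4u4zPUBMZmKpSvcWw00n2g5WvcKUQYUMfdyQM=",
    "TI": "eyJpc3MiOiJodHRwczovL2lzc3VlciIsImV4cCI6MTk1NDh9",
}

# Same bytes, two encodings: the token points at the sample issuer parameters.
same_issuer = b64_decode_any(TOKEN["UIDP"]) == b64_decode_any(KID)
token_information = json.loads(b64_decode_any(TOKEN["TI"]))
public_key = b64_decode_any(TOKEN["h"])
print(same_issuer, token_information, len(public_key), public_key[0])
```

The leading byte of h is 0x04, which is the usual marker for an uncompressed elliptic curve point, followed by two 32-byte coordinates for P-256.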

Presentation protocol

The user can later present a U-Prove token to a verifier by using the corresponding private key to sign a presentation message; the user selects which attributes to disclose in the process. The presentation proof acts as a digital signature on the presentation message, which can contain application-specific data. To prevent replay attacks (to the same verifier, or to a different one), the message should contain an unpredictable challenge; this could, for example, be a verifier-specified random number or, for non-interactive presentations, a combination of a verifier identifier, a timestamp, and a nonce.⁴ Here is an example of a presentation proof, disclosing the third attribute (the “over-21” boolean), generated on a random verifier-provided challenge:

{
  "a": "SAcAOx7nmwulPc1xxdXfBjZPW5KfEycBNA0jPvkRYMc=",
  "r": [
    "0WZkALFePshcl2fxKOL0+Tr8rpN+J58gZNdYFnStymU=",
    "/ta9XKkI4hRWvT+bWaanveKhijRJVJQup8qVIJWifA0=",
    "FFp3WBMUzUYdP2lVNhdPHtT7XZAlEjI5KT8ClZxVUbY="
  ],
  "A": {
    "3": "dHJ1ZQ=="
  }
}

The r array contains one value per hidden attribute plus an extra one, and the A object contains the disclosed attributes (in this example, the base64 encoding of the 3rd attribute value: “true”). UPJF describes how token presentations can be encoded into a JSON Web Signature (JWS), where the header encodes the U-Prove algorithm (same as in the issuer parameters), the payload encodes the presentation message, and the signature encodes the presented U-Prove tokens and the presentation proof. Here is a sample JWS:

eyJhbGciOiJVUDI1NiJ9.-678Cjbk8hHW8l55QgtCXw.eyJ1cHQiOnsiVUlEUCI6IjVYV3NkbGMwclhQdk94ZVRRNkF4RjY2UVJmSGxkU0N4U2VCbitreWNRdkE9IiwiaCI6IkJQY0IzSm02MXVSa3BZektjT3h4d0wvckFGVmU3L1JucjZTbmw4b01rYU1PSW42eU9zSGNIc3VjQy8xb3ZBRVEzQkdlV2lmOGpDejFXNm5pOWJUUk9yOD0iLCJUSSI6ImV5SnBjM01pT2lKb2RIUndjem92TDJsemMzVmxjaUlzSW1WNGNDSTZNVGsxTlRGOSIsIlBJIjoiIiwic1pwIjoiQkNSdWdKUzhwQnNGbG8ydzZOUXZhRkxIYzR2K1ZKSm82OGEyVGRJaDIzdVRsY3QwbHhOU3JUQnI3QlFOQ0x2S3hrMnZTV293NTlleWo5SUpBWTd2VjhrPSIsInNDcCI6ImY3eCt5VlJ5OTVJcCtvcEpJNVlUQ2drZ3BBYktYZWhIZHRFL3lBajlNbnc9Iiwic1JwIjoiVC9kdExrZ3pQZEdxYlVEdUxXMnBoQUVLcFoyWjRlakhCME1adzlXTnc2VT0ifSwicHAiOnsiYSI6ImlGblR4ZGJ0UFg2bll4UndiV0VtbGhpRXNYU25kVEVuZTEvN2ZodFkxUUE9IiwiciI6WyJqNTY0Y081d1B3R1VLcE1jZDhTS1VoYmpKVmwyaVg3a055ODF6ODIySUh3PSIsIjRRM1Y0WWFhZ1EzcENzakFTaG1wNm1lbGZ2d1dpQUllT3NuNmFzMjFsenc9IiwiWHoyTUtISklvNTdyeWlMV3RjR2Y0eTFicVcxU1NiUWswNzQ4RDQ0bzVhZz0iXSwiQSI6eyIzIjoiZEhKMVpRPT0ifX19
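To see what is inside such a JWS, the following Python sketch splits it into its three segments and decodes them. It does not verify anything and does not use the U-Prove library; the function names are hypothetical, and the field names upt (token) and pp (presentation proof) are simply what the decoded sample above contains.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def parse_upjf_jws(jws: str) -> dict:
    """Split a compact JWS and decode the U-Prove token and proof it carries."""
    header, payload, signature = jws.split(".")
    bundle = json.loads(b64url_decode(signature))
    return {
        "alg": json.loads(b64url_decode(header))["alg"],
        "message": b64url_decode(payload),  # the presentation message
        "token": bundle["upt"],
        "proof": bundle["pp"],
    }


def disclosed_attributes(proof: dict) -> dict:
    """Map 1-based attribute indices to their decoded values."""
    return {int(i): base64.b64decode(v).decode() for i, v in proof.get("A", {}).items()}


def hidden_attribute_count(proof: dict) -> int:
    # r holds one value per hidden attribute plus one extra value.
    return len(proof["r"]) - 1


# The header segment of the sample JWS decodes to the algorithm identifier.
header = json.loads(b64url_decode("eyJhbGciOiJVUDI1NiJ9"))
print(header)
```

Passing the full sample JWS to parse_upjf_jws should yield the algorithm, the presentation message bytes, the token, and a proof whose single disclosed attribute is the “over-21” value.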

The verifier can then validate the presented U-Prove token and presentation proof, given it has access to an authentic copy of the issuer parameters. These could be retrieved directly from the issuer using the URL encoded in the token information field.
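As part of validation, the verifier also needs to check that the presentation message is fresh. Here is a minimal Python sketch of the non-interactive variant described earlier, where the message combines a verifier identifier, a timestamp, and a nonce. The field names (vid, ts, nonce), the class name, and the 5-minute window are all assumptions for illustration and are not defined by the UPJF. In a real flow, the message would be the data signed by the presentation proof.

```python
import json
import secrets
import time
from typing import Optional

MAX_AGE_SECONDS = 300  # assumed freshness window, not a UPJF value


def new_presentation_message(verifier_id: str, now: Optional[int] = None) -> bytes:
    """User side: build a message bound to one verifier, a time, and a nonce."""
    message = {
        "vid": verifier_id,
        "ts": int(time.time()) if now is None else now,
        "nonce": secrets.token_hex(16),
    }
    return json.dumps(message, separators=(",", ":")).encode()


class ReplayGuard:
    """Verifier side: accept each fresh message addressed to us only once."""

    def __init__(self, verifier_id: str) -> None:
        self.verifier_id = verifier_id
        self.seen = {}  # nonce -> message timestamp

    def accept(self, message: bytes, now: Optional[int] = None) -> bool:
        now = int(time.time()) if now is None else now
        fields = json.loads(message)
        # Forget nonces too old to pass the freshness check anyway.
        self.seen = {n: ts for n, ts in self.seen.items()
                     if now - ts <= MAX_AGE_SECONDS}
        if fields["vid"] != self.verifier_id:
            return False  # made for a different verifier
        if abs(now - fields["ts"]) > MAX_AGE_SECONDS:
            return False  # stale, or too far in the future
        if fields["nonce"] in self.seen:
            return False  # replayed
        self.seen[fields["nonce"]] = fields["ts"]
        return True
```

The timestamp is what lets the verifier bound its memory: once a message is older than the window it is rejected as stale, so its nonce no longer needs to be remembered.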

Try it out!

The U-Prove technology provides unique security and privacy benefits useful in many access, authentication, and attestation scenarios. The U-Prove JSON framework and its implementation in the TypeScript library make it easy to experiment with the technology in web environments. I’m excited to hear what use you’ll make of it; don’t be shy and engage with us on the project’s GitHub page.

Footnotes

1. The issuance protocol uses a technique called restrictive blinding, which is a generalization of blind signatures allowing them to be applied on messages known to the issuer.
2. The cryptographic specification (and the C# SDK) support more features (including device-binding, domain/scope pseudonym generation, attribute commitments), and the extensions support an even broader set of features to build more advanced credential systems (for example, proving attribute equality and non-equality, range proofs, revocation).
3. A malicious issuer could try to “tag” a target user by using different issuer parameters, or by encoding attributes differently or in a different order. Auditing/transparency systems can help with the former, and specifying the token’s content in the issuer parameters helps with the latter.
4. The verifier identifier prevents the proof from being replayed to a different verifier, the nonce prevents replays in general, and the timestamp limits how long nonces should be remembered by the verifier.
