The POST RELEASE TECHNOLOGIES OF CRYSIS 3
Twitter: @coolbeenz
Email: stewart@crytek.com
Job done?
Introduction
CONTENTS
1. The reasoning
Why, What, How
2. Data Patching
Asset systems, Patch paks, Multiplayer flow, Handling failure & messaging
3. Telemetry
Collection, Storage, Syncing, Analysing, Matchmaking telemetry case study
4. Release-Debug
Other production mechanisms for gathering data
5. Summary
Lessons learned and future developments
6. Questions?
Over to you...
THE REASONING
PART 1
“What are THEY for?”
Post-Release Technologies...
TWEAKING the gameplay
IMPROVING the game experience
DIAGNOSING the cause of problems
FIXING bugs
FACILITATING themed weekends
“What EXACTLY are THEY?”
Post-Release Technologies...
POST RELEASE TECHNOLOGIES = DATA PATCHING + RELEASE DEBUG + TELEMETRY
“WHY DO WE NEED THEM?”
Post-Release Technologies...
Because things do not always go to plan
The CRYSIS 3 TEST SCHEDULE
(T200 = EA Worldwide Tech 200)
T200 (X360): 27th Sept
T200 (PC): 4th Oct
T200 (PS3): 11th Oct
Closed Alpha: 2nd Nov
T200 (X360): 8th Nov
T200 (PS3): 22nd Nov
T200 (PC): 29th Nov
Open Beta: 29th Jan
Because despite alphas, betas and numerous large-scale tests, things will still slip through the net. The players are your most thorough QA.
As a way to DEPLOY ASSET FIXES RAPIDLY
... For certification failures
... On discovering copyrighted content
... When players are abusing an exploit
The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
BECAUSE CERTIFICATION COSTS TIME & MONEY
[Timeline, weekly from 03-Dec 2012 to 11-Mar 2013. Milestones: Open-beta cert, Open-beta live, Final cert, RTM, Release, Day 10 cert, Day 10 live]
40% of commits during CERT & RTM were ASSETS & DATA
BECAUSE WE WANT PEOPLE TO
KEEP PLAYING THE GAME
Because things don’t always go to plan
SELL YOUR THEMED WEEKENDS
SO THAT WE CAN REACT
TO FEEDBACK
AND BUILD A
COMMUNITY
DATA PATCHING
PART 2
CRYENGINE ASSET FILE SYSTEM - OVERVIEW
A virtual file system: files can be loose or part of asset package (.pak) files
Files referenced using paths: objects/level_specific/airport/architecture/terminal/main.cgf
Platform agnostic API: files can be stored in memory, on media or on HDD
CRYENGINE ASSET FILE SYSTEM - PAK FILES
A collection of files: these are essentially zip archives of a folder hierarchy
Anti-tamper mechanisms: paks are digitally signed and encrypted in mastered builds
Stack based searching: paks are searched in order of most recently opened
CRYENGINE ASSET FILE SYSTEM - PAK FILES
gEnv->pCryPak->OpenPak("objects1.pak");
gEnv->pCryPak->OpenPak("objects2.pak");
gEnv->pCryPak->OpenPak("objects3.pak");
gEnv->pCryPak->FOpen("objects/level_specific/airport/architecture/terminal/main.cgf", "rbx");
Search order (most recently opened first): objects3.pak, objects2.pak, objects1.pak
CRYENGINE ASSET FILE SYSTEM - PAK FILES
Contents generally organised by type: objects, animation, scripts, music, sounds, etc.
Some created for specific loading: level loading, MPModeSwitch.pak
Some created for streaming: .dds0, .chr, .cgf, .cga
PATCH PAKS
A simple way to override ANY EXISTING ASSET?
... Create a patch.pak containing updated versions of specific assets
... Mount this new pak file: mount it last or mark it with a special 'priority' flag
... New assets will be prioritised: any subsequent file requests will be serviced by these patched files first
... Patching at the asset system level: individual game subsystems are oblivious
... Only suitable for Title Updates and DLC: we need to hardcode the loading of this pak file in a new executable
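The override behaviour described on this slide can be sketched with a small in-memory model. This is not the CryPak implementation: `Pak`, `PakStack`, `Open`, `Read` and the boolean `priority` field are hypothetical names chosen for illustration, and file contents are plain strings. It only shows the idea that lookups walk the mounted paks from most recently opened to oldest, with priority paks always searched first.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for a mounted .pak: asset path -> contents.
struct Pak {
    std::string name;
    bool priority;  // assumed equivalent of the slide's 'priority' flag
    std::map<std::string, std::string> files;

    Pak(std::string pakName, bool isPriority)
        : name(std::move(pakName)), priority(isPriority) {}
};

class PakStack {
public:
    // Mount a pak. Priority paks go on top of everything; normal paks go on
    // top of other normal paks but below any priority pak already mounted.
    void Open(const Pak& pak) {
        auto it = m_paks.end();
        if (!pak.priority) {
            while (it != m_paks.begin() && (it - 1)->priority) --it;
        }
        m_paks.insert(it, pak);
    }

    // Search from the top of the stack down; the first pak that contains the
    // path wins. Returns nullptr when no mounted pak has the file.
    const std::string* Read(const std::string& path) const {
        for (auto it = m_paks.rbegin(); it != m_paks.rend(); ++it) {
            auto found = it->files.find(path);
            if (found != it->files.end()) return &found->second;
        }
        return nullptr;
    }

private:
    std::vector<Pak> m_paks;  // back of the vector = top of the search stack
};
```

Because the lookup happens inside the stack, code that calls `Read` never needs to know whether an asset came from the shipped pak or from a patch, which is the "subsystems are oblivious" point above.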
ON DEMAND PATCHING
DOWNLOADING & APPLYING PATCH PAKS TRANSPARENTLY
“Why do we need to support a variable number of patch paks?”
... Differing lifetimes: Double XP Weekend vs level setup fixes
... Separate hot/cold assets: weapon balancing vs player stats fixes
... Risk reduction: smaller files mean less chance of failure
ON DEMAND PATCHING
CRYSIS 3 IMPLEMENTATION DETAILS
Multiplayer only: patch paks are un-mounted on returning to single player
Process hidden within the transition to MP: we already show a loading screen and re-initialise most game systems anyway
Cache size of 2 MB (X360 only): self-imposed limitation to reduce risk
Regularly check for new updates: so that players can be informed if they need to re-enter MP
ON DEMAND PATCHING
DOWNLOAD PAKS INTO MEMORY OVER HTTP
It all starts with a file called Permissions.xml...
MULTIPLAYER FLOW
Overview
User selects Multiplayer -> Login Online Services -> TCR Reqs -> Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak -> Mount paks -> Init Game systems
MULTIPLAYER FLOW
Points of failure
User selects Multiplayer -> Login Online Services -> TCR Reqs -> Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak -> Mount paks -> Init Game systems
MULTIPLAYER FLOW
TCR Requirements
User selects Multiplayer -> Login Online Services -> TCR Reqs
Online play checks: cannot proceed unless allowed online
Need extra storage to cache paks: require an extra 2 MB in save game
How do we handle these? Hook into existing handling
MULTIPLAYER FLOW
Failing to download
TCR Reqs -> Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak
What can go wrong? General networking failures; bespoke networking configurations
How do we handle these? Abort! No patches, no telemetry
MULTIPLAYER FLOW
Failing to download
Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak -> Mount paks
What can go wrong? General networking failures; timeouts; MD5 checks
How do we handle these? Cache paks (anti-tamper checks); continue to download in the background; provide help with manual download?
MULTIPLAYER FLOW
Failing to download
Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak -> Mount paks
Implement a configurable timeout
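The cache check, hash check and timeout from the last few slides can be sketched as one decision function. Everything here is an assumption made for illustration: `PatchEntry`, `Cache`, `Downloader`, `AcquirePak` and `PakSource` are hypothetical names, the hash is an opaque string rather than a real MD5, and the downloader is injected so that the network layer stays out of the sketch. It is not the shipped Crysis 3 code.

```cpp
#include <functional>
#include <map>
#include <string>

// Where a usable copy of the pak came from, if anywhere.
enum class PakSource { Cache, Download, Unavailable };

// One entry as it might be listed in Permissions.xml (fields are assumed).
struct PatchEntry {
    std::string name;
    std::string expectedHash;  // stands in for the pak's MD5
};

// Cached paks, keyed by name, storing the hash of the cached data.
using Cache = std::map<std::string, std::string>;

// Injected download step: returns false on network failure or timeout,
// otherwise writes the hash of the received data to outHash.
using Downloader =
    std::function<bool(const PatchEntry&, int timeoutMs, std::string& outHash)>;

PakSource AcquirePak(const PatchEntry& entry, Cache& cache,
                     const Downloader& download, int timeoutMs) {
    // 1. Reuse the cached copy only if its hash still matches.
    auto cached = cache.find(entry.name);
    if (cached != cache.end() && cached->second == entry.expectedHash) {
        return PakSource::Cache;
    }
    // 2. Otherwise download, bounded by the configurable timeout.
    std::string hash;
    if (download(entry, timeoutMs, hash) && hash == entry.expectedHash) {
        cache[entry.name] = hash;  // remember the verified copy
        return PakSource::Download;
    }
    // 3. Timed out, failed or corrupt: the caller decides what to do next.
    return PakSource::Unavailable;
}
```

Returning `Unavailable` instead of blocking leaves the policy to the caller, for example carrying on without that pak while a background download continues, as the slides suggest.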
But...
“WON’T THIS RESULT IN PLAYERS HAVING MISMATCHING SETS OF PATCHES?”
YES
But it is ok because we have a plan...
1. ISOLATE PLAYERS
This is basically using the same checks used to isolate people running old builds (Retail & Development)
Version code used as a matchmaking filter & during context establishment.
Server 1 (version 0xA5BC): Client A (0xA5BC), Client C (0xA5BC)
Server 2 (version 0x3370): Client B (0x3370), Client D (0x3370)
[Diagram: patch paks P1 and P2 shown against the clients]
1. ISOLATE PLAYERS
XOR in the MD5s of each patch pak to create a unique version code
Exe XOR P1 XOR P2 = Matchmaking version
Values shown: 0x96CC, 0x0100, 0xA4BC; XOR of all three = 0x3370
Server 1 (version 0xA5BC): Client A (0xA5BC), Client C (0xA5BC)
Server 2 (version 0x3370): Client B (0x3370), Client D (0x3370)
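As a worked example of the XOR step, the sketch below folds an executable version code and a list of patch pak hashes into a single matchmaking version. The function name `MatchmakingVersion` is hypothetical, and the 16-bit width simply mirrors the shortened figures on the slide; a real MD5 is 128 bits and would need folding down first. Which of the three slide values belongs to the executable is not stated, so treating 0xA4BC as the executable is an assumption.

```cpp
#include <cstdint>
#include <vector>

// Combine the executable's version code with the hash of every mounted
// patch pak. XOR is order independent, so clients that mount the same set
// of paks in a different order still arrive at the same version.
std::uint16_t MatchmakingVersion(std::uint16_t exeVersion,
                                 const std::vector<std::uint16_t>& pakHashes) {
    std::uint16_t version = exeVersion;
    for (std::uint16_t hash : pakHashes) {
        version = static_cast<std::uint16_t>(version ^ hash);
    }
    return version;
}
```

With the slide's numbers, `MatchmakingVersion(0xA4BC, {0x0100, 0x96CC})` gives 0x3370, while a client that has only mounted the 0x0100 pak gets 0xA5BC and is therefore filtered into a different pool.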
2. COMMUNICATE
Let players know that they are matchmaking against a reduced pool
DATA PATCHING FUTURE DEVELOPMENTS
ASSET DELTAS
Full file must be deployed for small modification: some of our XML files can be up to 500 KB in size
Text based assets: XML & Lua files can easily have a delta injected after asset loading
Regularly check for new updates
DATA PATCHING FUTURE DEVELOPMENTS
ASSET DELTAS
Patch XML nodes: add, remove or modify at a node level
More complicated but huge savings: extra tools & build steps required, but XML patches reduced in size to 1-2% of original
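Node-level patching can be sketched against a toy XML tree. The deck does not describe the actual delta format, so `XmlNode`, `NodePatch`, `PatchOp` and `ApplyPatch` are hypothetical names, nodes are addressed by a simple list of child names, and only the first child with a matching name is considered.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy XML tree: a name, attributes and child nodes.
struct XmlNode {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::vector<XmlNode> children;
};

enum class PatchOp { Add, Remove, Modify };

// One delta. For Add, path names the parent and payload is the new child.
// For Remove and Modify, path names the target node itself; Modify merges
// payload.attrs into the target's attributes.
struct NodePatch {
    PatchOp op;
    std::vector<std::string> path;  // child names, starting below the root
    XmlNode payload;
};

static XmlNode* FindChild(XmlNode& parent, const std::string& name) {
    for (XmlNode& child : parent.children) {
        if (child.name == name) return &child;
    }
    return nullptr;
}

// Apply one delta in place. Returns false if the addressed node is missing.
bool ApplyPatch(XmlNode& root, const NodePatch& patch) {
    if (patch.op != PatchOp::Add && patch.path.empty()) return false;
    // Walk to the parent of the target (or to the target itself for Add).
    std::size_t depth = patch.path.size() - (patch.op == PatchOp::Add ? 0 : 1);
    XmlNode* node = &root;
    for (std::size_t i = 0; i < depth; ++i) {
        node = FindChild(*node, patch.path[i]);
        if (!node) return false;
    }
    switch (patch.op) {
    case PatchOp::Add:
        node->children.push_back(patch.payload);
        return true;
    case PatchOp::Remove:
        for (auto it = node->children.begin(); it != node->children.end(); ++it) {
            if (it->name == patch.path.back()) {
                node->children.erase(it);
                return true;
            }
        }
        return false;
    case PatchOp::Modify: {
        XmlNode* target = FindChild(*node, patch.path.back());
        if (!target) return false;
        for (const auto& attr : patch.payload.attrs) {
            target->attrs[attr.first] = attr.second;
        }
        return true;
    }
    }
    return false;
}
```

A delta that changes one attribute carries only the path and the new value, which is where the size saving over shipping a whole file would come from.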
DATA PATCHING FUTURE DEVELOPMENTS
RE-DIRECT HTTP REQUESTS
Current permissions.xml end-point fixed: this makes testing new patches difficult
Need a way to redirect the request externally: using build-version, SKU-ID, tags etc.
Added bonus: could use this to patch net-tests, fix dev builds etc.
DATA PATCHING FUTURE DEVELOPMENTS
DIFFERENTIATE GAME CHANGING PAKS
Some patches are not gameplay critical: for example cosmetic asset changes or players' personal stats configurations
Exclude these from any filtering: basically, do not XOR this pak's MD5 into the matchmaking version
Exe XOR P1 XOR P2 = Matchmaking version
Values shown: 0xA4BC, 0x96CC, 0x0100; matchmaking version = 0xA4BC XOR 0x0100 = 0xA5BC (0x96CC is left out)
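The proposed exclusion can be sketched by tagging each pak and skipping the untagged ones when building the version code. `PatchPak`, its `gameplayCritical` field and `FilteredMatchmakingVersion` are hypothetical names; the slide's figures are consistent with 0x96CC being the excluded pak, and treating 0xA4BC as the executable is an assumption.

```cpp
#include <cstdint>
#include <vector>

// A mounted patch pak, reduced to what the version code needs.
struct PatchPak {
    std::uint16_t hash;     // shortened stand-in for the pak's MD5
    bool gameplayCritical;  // false for cosmetic or per-player stats paks
};

// XOR only the gameplay-critical paks into the version, so that purely
// cosmetic patches do not split the matchmaking pool.
std::uint16_t FilteredMatchmakingVersion(std::uint16_t exeVersion,
                                         const std::vector<PatchPak>& paks) {
    std::uint16_t version = exeVersion;
    for (const PatchPak& pak : paks) {
        if (pak.gameplayCritical) {
            version = static_cast<std::uint16_t>(version ^ pak.hash);
        }
    }
    return version;
}
```

With 0x0100 marked critical and 0x96CC marked non-critical, the result is 0xA5BC whether or not the client has mounted the 0x96CC pak.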
TELEMETRY COLLECTION
PART 3
TELEMETRY COLLECTION - CLIENT OVERVIEW
Collection and uploading via HTTP: simple API to push data from files or memory
Compressed and streamed: data zipped up and streamed asynchronously
No guarantees: fire & forget, upload may fail for numerous reasons
TELEMETRY COLLECTION - SERVER OVERVIEW
No complex processing on the server: no requirements for immediate results
Storage of files received only: organised by date, platform and type
Anonymous data: any usernames & accounts salted and hashed
TELEMETRY COLLECTION - SYNCING DATA
Server data kept for fixed time period: data deleted after seven days
Downloaded to Crytek servers: rsync-ed daily to internal servers
Analysed locally: ultimately discarded
TELEMETRY COLLECTION - PROCESSING
Turning raw telemetry into useful data: achieved with a mixture of Python & Excel
Manually triggered and collated: considered the weakest link in the chain
Processing is slow and intensive: optimising has never been a high priority
So...
“HOW DO YOU HANDLE HUNDREDS OF THOUSANDS OF CLIENTS UPLOADING SIMULTANEOUSLY?”
SAMPLE PLAYERS
Sample deterministically at the client end
bool shouldUpload = (Hash( username ) % denominator) < numerator;
User: coolbeenz
Hash = 0x12345678
Hash % 1000 = 896 (0x380)
896 < 100 ? NO
SAMPLE PLAYERS
Sample deterministically at the client end
Select a large denominator and do not change this: this sets the amount the sampling ratio can be incremented by
Choose a numerator to give you the desired sampling ratio: e.g. 100/1000 = 10%
Vary the numerator to meet changing sampling demands: the individual users being sampled remain consistent
[Diagram: users such as coolbeenz split into Upload / Do not upload, with numerator 100 and denominator 1000]
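The sampling rule from these two slides can be written out in full as below. The deck does not say which hash function was used, so FNV-1a appears here purely as a stand-in, and `HashUsername`, `ShouldUploadForHash` and `ShouldUpload` are hypothetical names. The denominator is assumed to be non-zero.

```cpp
#include <cstdint>
#include <string>

// 32-bit FNV-1a, used only as a stand-in for the unspecified Hash().
std::uint32_t HashUsername(const std::string& username) {
    std::uint32_t hash = 2166136261u;
    for (unsigned char c : username) {
        hash ^= c;
        hash *= 16777619u;
    }
    return hash;
}

// The slide's rule applied to an already computed hash.
bool ShouldUploadForHash(std::uint32_t hash, std::uint32_t numerator,
                         std::uint32_t denominator) {
    return (hash % denominator) < numerator;
}

// Deterministic per-user decision: the same user always lands in the same
// bucket, so raising the numerator only ever adds users to the sample.
bool ShouldUpload(const std::string& username, std::uint32_t numerator,
                  std::uint32_t denominator) {
    return ShouldUploadForHash(HashUsername(username), numerator, denominator);
}
```

For a hash of 0x12345678 the bucket is 896 out of 1000, so that user is not sampled at a numerator of 100 and only starts uploading once the numerator reaches 897.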
And...
“WHAT KIND OF TELEMETRY DO YOU COLLECT?”
CRYSIS 3 MATCHMAKING TELEMETRY
Matchmaking one of the top 5 complaints: based on MyCrysis forum feedback
Find a session fast but find a good session: this essentially boils down to ping times
For consoles & PC: PC also has a quick match option as well as a server browser
QUANTIFYING THE BLACK BOX
Tricky to balance and impossible to predict: requires constant re-evaluation even with adaptive algorithms
User experience feedback not good enough: you know people are not happy, but why exactly?
CRYSIS 3 MATCHMAKING TELEMETRY
SO WHERE DO WE START?
Create a system which is data driven: if we are going to collect telemetry we need to be able to action a response
Server side: used Blaze servers. Rule based, highly configurable, including relaxation criteria
Client side: the rules and times used can be configured and therefore data patched
CRYSIS 3 MATCHMAKING TELEMETRY
DECIDE WHAT QUESTIONS NEED ANSWERING
Q. How many times does a player matchmake?
Q. What kind of ping times do players get during that session?
Q. How long does it take a player to get into a session successfully?
Q. What is the most popular method of joining a session?
Q. What is the average matchmaking time?
CRYSIS 3 MATCHMAKING TELEMETRY
IMPLEMENT AN APPROPRIATE TELEMETRY SOLUTION
Need a solution that does not result in GBs of data: but still want to be flexible enough to answer a range of questions
Collect a series of timestamped events in XML: also collect metadata for each event
Timestamps based on a zero base time: but still store a server timestamp for collating multiple clients' data
<AttemptConnection Method="MatchMake" Timestamp="0.000" />
Other Method values: "GameBrowser", "Join Session in progress", "Friend Invite", "Join Squad"
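Producing an event line like the one on this slide could look roughly as follows. `FormatAttemptConnection` and its parameters are hypothetical; the sketch assumes times are held as seconds in a `double`, that the method string is one of the fixed identifiers listed above (so no XML escaping is done), and that three decimal places are wanted because that is what the slide shows.

```cpp
#include <cstdio>
#include <string>

// Format one AttemptConnection event. The timestamp is written relative to
// the session's base time, so the first event of a session reads 0.000.
std::string FormatAttemptConnection(const std::string& method,
                                    double nowSeconds, double baseSeconds) {
    char stamp[32];
    std::snprintf(stamp, sizeof(stamp), "%.3f", nowSeconds - baseSeconds);
    return "<AttemptConnection Method=\"" + method + "\" Timestamp=\"" +
           stamp + "\" />";
}
```

Keeping timestamps relative to a zero base means events within one client's session can be compared without trusting the client's clock; the server timestamp mentioned on the slide covers lining up sessions across clients.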
CRYSIS 3 MATCHMAKING TELEMETRY
IMPLEMENT AN APPROPRIATE TELEMETRY SOLUTION
Collect time stamped events with metadata
Q. How many times does a player matchmake?
Q. How long does it take a player to get into a session successfully?
Q. What is the most popular method of joining a session?
Q. What is the average matchmaking time?
RESULTS
Matchmaking Telemetry
The results were very insightful: resulted in several iterative improvements
The most surprising result was that there were still 2 major bugs in the client side code: one of these was fixed with a data patch. Win!
Initially 65% of players took less than 5 seconds to find a match: eventually this was increased to 82%
1 in 15 matchmaking requests fail: still not perfect but there are many external factors at play
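A figure such as "65% of players took less than 5 seconds" comes down to counting durations under a threshold. The talk says the real analysis used Python & Excel; the C++ below is only a stand-in showing the calculation, and `PercentMatchedWithin` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Percentage of matchmaking attempts that completed in strictly less than
// thresholdSeconds. Returns 0 for an empty input.
double PercentMatchedWithin(const std::vector<double>& secondsToMatch,
                            double thresholdSeconds) {
    if (secondsToMatch.empty()) return 0.0;
    std::size_t matched = 0;
    for (double seconds : secondsToMatch) {
        if (seconds < thresholdSeconds) ++matched;
    }
    return 100.0 * static_cast<double>(matched) /
           static_cast<double>(secondsToMatch.size());
}
```

Evaluating the same function at each whole second gives the cumulative "% users matched" curve against time.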
RESULTS
Matchmaking Telemetry
[Chart: Time to matchmake. X axis: Time (s); Y axis: % Users Matched, 0 to 100]
RESULTS
Matchmaking Telemetry
How do players join a session?
[Chart, PC: Quick Match, Join Squad - Already In Game, Join Squad - Lobby, Private Game, Join Friends Game, Server Browser]
[Chart, Console: Quick Match, Join Squad - Already In Game, Join Squad - Lobby, Join Friends Game]
FUTURE DEVELOPMENTS
Matchmaking Telemetry
Automate the analysis of the telemetry: manual process meant delays in turning around changes
Utilise A/B testing: the results change over time so results can be skewed by a different player pool
User actions telemetry: we did not collect all user action events, for example when the user backed out
RELEASE DEBUG
PART 4
DEBUG SCREENS
Ensure you can gather the info you need in large scale public testing
ERROR CODES
Embed error codes as well as user friendly (TCR) messaging
SUMMARY
Start early: think ahead, the technology involved is complex and cannot be bolted on
Collecting telemetry is easy: turning that into useful information is difficult
Have the ability to scale collection: be able to balance server load and fail safe
Make it easy to test: don't underestimate the amount of testing required in development
Automate as much as you can: any manual elements of the system become its weakest point
Get buy-in from management: it is difficult to justify continued support when the returns are not directly financial
“Do you HAVE ANY QUESTIONS?”
That is it!
THANK YOU FOR LISTENING
Any feedback, positive or negative, welcomed
Twitter: @coolbeenz
Email: stewart@crytek.com
