Digitocracy without Borders
The Unifying and Destabilizing Effects of
Software Internationalization
Software Economics
Software is no longer treated
as a traditional, consumable
product that can be shipped to
stores and customers in
boxes, purchased routinely
and frequently.
Instead, software is a service
and is generally expected to
be free or mostly free.
Software Economics
Software monetization
strategies therefore rely on
“hockey stick” user growth
curves that rely on network
effects.
With enough user density,
ecosystems form that support
digital products.
Software Economics
In a competitive marketplace,
apps must target international
audiences: there is no such
thing as “farm to table”
software.
Most software is built with
localization best practices
from the start.
Traditional political and
economic institutions are
constrained by
geography and ideology:
they cannot regulate
panopolies.
We have become more
connected than ever
before and more people
have the opportunity to
have a better life and
benefit from technology.
What have been the effects of software internationalization?
The Past
Open Source
The Present
Machine Translation
The Future
Global Consistency
The Past: Open Source
What is Open
Source?
Open Source
Extremely complex, very
important, and evolving rapidly.
- Culture and activism
- A mechanism to build good
software
- An economic model
- A commercial model
- Technology-related
phenomenon
A Brief History of Open Source
Hackers and the MIT AI Lab
“Sharing recipes is as old as cooking”
Free Software Movement
“Free as in speech, not as in beer”
Open Source
“Given enough eyeballs, all bugs are shallow”
Affero GPL
“Closing the SaaS loophole”
GNU Emacs
Small number of developers hand curating every
line of code and feature.
The Cathedral and the Bazaar
Linux Kernel
Source code developed in public over the
Internet with many contributors.
Yellowbrick
NumFOCUS and District Data Labs
affiliated PyData project for machine
learning visualization.
15 releases, 78 contributors, 2125
stars, 357 forks.
Libraries.io SourceRank: 14
3 languages translated docs
https://www.scikit-yb.org/en/latest/
https://github.com/DistrictDataLabs/yellowbrick
[Chart: Yellowbrick documentation unique users, annotated with GSoC and PyData]
2009 Open Source Index: Georgia Tech and Red Hat
http://jsk-sde.blogspot.com/2009/04/open-source-activity-map.html
https://www.redhat.com/en/about/press-releases/open-source-index
78% of companies run on open source; less than 3% don’t use OSS in any way.
https://www.zdnet.com/article/its-an-open-source-world-78-percent-of-companies-run-open-source-software/
Critical Infrastructure runs on Open Source
In 2012 the Heartbleed bug was
introduced into OpenSSL; it was
publicly disclosed in 2014.
Security flaw creates a hole that
affects 17.5% of servers and 68% of
Internet traffic.
At the time, OpenSSL was maintained
by one full-time, underpaid maintainer.
The Present: Machine Translation
Automatic Translation
State of the Art
Attention-Based RNN
(LSTM) Encoder/RNN
(LSTM) Decoder network
with source and target
embeddings.
Deep Neural Networks.
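A minimal sketch of the attention step in an encoder/decoder network like the one named above. It assumes dot-product scoring, which is only one of several scoring functions (the slide does not say which is used), and the vectors are made-up toy values rather than real LSTM states or embeddings.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for a single decoding step."""
    # Score each source position by its dot product with the decoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the encoder states.
    dim = len(decoder_state)
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical toy values: three source positions, two dimensions each.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

The decoder "looks" hardest at the source positions whose encoder state is most aligned with its current state; here the first and third positions receive equal, larger weights than the second.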
Artificial Neural Networks
Arbitrary set of inputs →
arbitrary set of outputs
Hidden layer of weighted
functions (neurons) that combine
many other functions (a series
of simple weighted functions)
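A minimal sketch of the idea, assuming a single hidden layer and a sigmoid nonlinearity. The weights below are hypothetical and untrained; they only show how inputs flow through weighted sums to outputs.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Each neuron is a weighted sum of its inputs passed through a nonlinearity."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Map inputs to outputs through one hidden layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)


# Hypothetical, untrained weights: 2 inputs -> 3 hidden neurons -> 1 output.
hidden_w = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]]
hidden_b = [0.0, 0.0, 0.0]
out_w = [[1.0, -1.0, 0.5]]
out_b = [0.0]

output = forward([1.0, 2.0], hidden_w, hidden_b, out_w, out_b)
```

Training would adjust the weights and biases to reduce error on known examples; that step is omitted here.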
Deep Learning
Neural architectures that
embed increasingly
complex learning layers to
optimize for computation
or error minimization.
Essentially, brute force
function approximation.
Bringing neural networks
to mobile devices
Method for modeling neural
networks’ power consumption could
help make the systems portable.
Meanwhile, more smartphones in
Africa than ever before.
http://news.mit.edu/2017/bringing-neural-networks-cellphones-0718
https://www.gsma.com/r/mobileeconomy/sub-saharan-africa/
NLP/AI toolset has failed
Facebook in addressing
hate speech, which may
have unintentionally made
the social media
application a framework
for genocide.
In Burmese
ကုလား (Kalar) commonly used as a slur against Muslims
ကုလားပဲ kalar-pe (chickpeas)
ကုလားထိုင် kalar-tain (armchair)
ကုလားတည် kalar-de (pickle).
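A sketch of why this is hard for automated tools, assuming a hypothetical filter that flags any text containing the slur as a substring. This is an illustration only, not a description of Facebook's actual system.

```python
# The slur from the slide, and three everyday words that contain it.
SLUR = "ကုလား"  # kalar

BENIGN_WORDS = {
    "ကုလားပဲ": "chickpeas",
    "ကုလားထိုင်": "armchair",
    "ကုလားတည်": "pickle",
}


def naive_flag(text, banned=SLUR):
    """Hypothetical filter: flag text that contains the banned string anywhere."""
    return banned in text


# Every benign word is flagged, because each one begins with the same syllables.
flagged = {gloss: naive_flag(word) for word, gloss in BENIGN_WORDS.items()}
```

A substring match cannot tell the slur from the food or the furniture; telling them apart requires word segmentation and context.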
Zawgyi vs Unicode encoding
Facebook in
Myanmar
The Future: Global Consistency
Context: A Single Server
Let’s suppose a simple data
storage system where
clients can:
- GET(key) → value
- PUT(key, value)
- DEL(key)
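The three operations above can be sketched as a small in-memory class. The class and method names are illustrative assumptions, not part of any particular system.

```python
class KeyValueStore:
    """A single-server, in-memory key/value store (illustrative sketch)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        """GET(key): return the value, or None if the key is not found."""
        return self._data.get(key)

    def put(self, key, value):
        """PUT(key, value): create or overwrite the value for key."""
        self._data[key] = value

    def delete(self, key):
        """DEL(key): remove the key if it exists."""
        self._data.pop(key, None)


store = KeyValueStore()
store.put("x", 42)
value = store.get("x")    # the value just written
store.delete("x")
missing = store.get("x")  # None, because the key was deleted
```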
Context: A Single Server
The order of the operations
determines the responses
the system gives.
In a single server system,
the server can determine
any order it wants.
It is always consistent.
Consistency
The system responds to
requests predictably
Context: A Distributed System
A system is distributed when it
contains more than one server that
must communicate.
If one server in the system fails, it
doesn’t necessarily become
unavailable because other servers
can answer requests.
If data is replicated we can also
avoid data loss.
Inconsistency
When multiple servers participate
in the system, they need to
communicate in order to ensure
they remain in the same state.
Communication takes time
(latency) and the more servers are
in the system, the more time it
takes to synchronize.
PUT(x, 42)
ok
GET(x)
not found
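The exchange above can be reproduced with a toy model, assuming two replicas and asynchronous replication in which a write is acknowledged before the second replica has received it. All names here are hypothetical.

```python
class Replica:
    """One server holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def get(self, key):
        return self.data.get(key, "not found")


class Cluster:
    """Two replicas with asynchronous replication (hypothetical model)."""

    def __init__(self):
        self.primary = Replica("a")
        self.secondary = Replica("b")
        self.in_flight = []  # writes sent to the secondary but not yet delivered

    def put(self, key, value):
        self.primary.data[key] = value
        self.in_flight.append((key, value))
        return "ok"  # acknowledged before the secondary has the write

    def deliver(self):
        """Simulate the replication messages finally arriving."""
        for key, value in self.in_flight:
            self.secondary.data[key] = value
        self.in_flight.clear()


cluster = Cluster()
ack = cluster.put("x", 42)          # the client is told "ok"
stale = cluster.secondary.get("x")  # read reaches the other replica too early
cluster.deliver()
fresh = cluster.secondary.get("x")  # after replication the write is visible
```

The window between `put` and `deliver` is the communication latency; the longer it is, the more likely a client sees the stale answer.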
Concurrency
Because of delays in
communication, it is possible for
two clients to concurrently
perform two operations - from the
system’s perspective, they
happen at the same time.
The order of these operations
determines the response.
PUT(x, 42)
PUT(x, 27)
GET(x) → ?
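A small sketch of the question mark above: applying the same two concurrent writes in each possible order yields different final values, so replicas that disagree on the order will disagree on the answer. The helper name is an assumption for illustration.

```python
from itertools import permutations


def apply_in_order(writes):
    """Apply PUT operations in sequence and return the final state."""
    state = {}
    for key, value in writes:
        state[key] = value
    return state


concurrent_writes = [("x", 42), ("x", 27)]

# Collect the value of x under every possible ordering of the two writes.
outcomes = {
    apply_in_order(order)["x"] for order in permutations(concurrent_writes)
}
```

Both 42 and 27 are possible results of GET(x); the system has to pick one order and make every server agree on it.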
Distributed
Consensus
Fault Tolerant Decision Making
for a Network of Replicas.
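A sketch of the arithmetic behind majority-based decision making, assuming crash failures only. Real consensus protocols such as Paxos or Raft add leaders, rounds, and logs on top of this; the function names here are illustrative.

```python
def quorum_size(replicas):
    """Smallest group that is guaranteed to overlap with any other majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can fail while a majority can still be formed."""
    return replicas - quorum_size(replicas)


def decide(votes, replicas):
    """Return the value accepted by a majority, or None if there is none yet."""
    needed = quorum_size(replicas)
    counts = {}
    for value in votes:
        counts[value] = counts.get(value, 0) + 1
        if counts[value] >= needed:
            return value
    return None


decided = decide([42, 27, 42, 42], replicas=5)  # three of five agree on 42
undecided = decide([42, 27], replicas=5)        # no majority yet
```

With five replicas a decision needs three votes, so the system keeps working when up to two replicas are down or unreachable.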
Geo Replication
Distributed Systems of Servers
that Span Continents
State of the Art: Spanner
Academic Research: Alia/Calvin
Geographic Distribution
Things get worse when you try to
put servers in different data
centers:
- Latency increases because of
physical limits
- Network partitions can cut off
groups of servers
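The physical limit on latency can be estimated with simple arithmetic. This sketch assumes light in optical fiber travels at roughly two thirds of its speed in a vacuum and ignores routing, queuing, and processing delays, so real round trips are slower.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_FACTOR = 2 / 3               # assumption: approximate slowdown in fiber


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over a fiber path of the given length."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


# A hypothetical 10,000 km path between data centers on different continents.
rtt = min_round_trip_ms(10_000)  # roughly 100 ms
```

No amount of engineering removes this floor, and every round of coordination between distant servers pays it at least once.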
Geographic Distribution
There are big benefits to
geographic distribution, however:
- Resilient even in catastrophic
failure of a data center.
- Servers respond more quickly
to colocated clients.
Global Clouds
Cloud service providers are expanding their network of data centers across the globe.
Legalities of Global Systems
Systems are regulated in the
countries where the data resides
(where the servers are).
Cloud coverage means it’s easy
for service providers to quickly
switch the location of their
service.
Project Loon and Facebook Athena
Destabilizing Effects
Unifying Effects
