Urban Wage Gaps and Geography
Abstract
We document that isolated cities have less wage inequality in American census data. To explain
this correlation and other correlations between population and wages, we build an equilibrium
empirical model that incorporates high and low-skill labor, costly trade, and both agglom-
eration and congestion forces. Our paper bridges the gap between the economic geography
literature which abstracts from inequality, and the spatial inequality literature which abstracts
from geography. We find that geographical location explains 16.5% of observed variation in
wage inequality across American cities. We use our model to simulate counterfactual trade and
technology shocks. Reductions in domestic trade costs benefit both skill groups but low-skill
workers benefit more.
∗
An earlier version of this paper was circulated with the title “Trade and Inequality in the Spatial Economy”.
We gratefully acknowledge support from the Danish Council for Independent Research Grant 8019-00031B, as well as
support from the Otto Mønsteds Fond. We thank Sina Smid and Karolina Stachlewska for excellent research assistance.
We thank two anonymous referees, Treb Allen, Nathaniel Baum-Snow, Jonathan Dingel, Jeff Lin, Tobias Seidal, and
participants in seminars at the Asian Meeting of the Econometric Society, Cardiff University, Copenhagen Business
School, Copenhagen University, Hitotsubashi University, Indiana Bloomington University, the Nordic International
Trade Seminars, the European Trade Study Group, the North American Regional Science Association, Penn State,
Purdue, the Society for Economic Dynamics, the Shanghai University of Economics and Finance, and University of
Tokyo for helpful suggestions.
1 Introduction
Inequality has long fascinated economists, and growing income inequality has been recently
and heatedly discussed in public forums.1 This public discussion has been complemented by a
number of academic studies highlighting the spatial distribution of wage inequality. We have
learned that there is a strong and increasing positive relationship between wage inequality and
city size (Baum-Snow and Pavan, 2012; Moretti, 2013; Lindley and Machin, 2014), and that high
and low-skill workers are increasingly segregated across cities (Diamond, 2015). In this paper,
we add a further attribute of a city to this discussion: spatial position. We first document
the relationship between inequality and geography in the American data. Then we build and
estimate an equilibrium model to measure the importance of geography for wage inequality, and
to study the effects of trade and productivity shocks on welfare and inequality.
Using American census data, we show that geographical location has significant power in
explaining observed skill wage premia. This result holds across a wide variety of specifications
and weighting strategies. In a word, the closer a city is to the ocean and the nearer it is to
other cities, the more unequal it tends to be.2 For example, Minneapolis is around one standard
deviation more isolated than Miami, and has wage inequality around two standard deviations
lower than Miami. In order to explain this correlation together with previously documented
facts on population and wages, we develop an estimable equilibrium model of domestic trade
and inequality.
While our research speaks to several literatures, our primary contribution is in developing
an estimable equilibrium model of spatial wage inequality in which geography matters. Follow-
ing and contributing to the popular debate on inequality, several authors have expanded our
understanding of wage and welfare inequality in American data (Baum-Snow and Pavan, 2012;
Combes et al., 2012b; Moretti, 2013; Davis and Dingel, 2014; Diamond, 2015; Farrokhi, 2018).
As a shorthand, we refer to these papers as the spatial inequality literature. To date, the spatial
inequality literature has abstracted from geography. Either cities are unable to trade with each
other, or able to trade with each other costlessly. In both of these extremes, the geographic
location of a city relative to other cities is irrelevant, so questions about the interaction of geog-
raphy with inequality cannot be addressed. By including costly trade between cities in a model
of mobile heterogeneous labor, we can measure the contribution of geography to inequality.
In order to solve an equilibrium model of inequality, we use tools recently introduced to
the economic geography literature by Allen and Arkolakis (2014). We follow a growing body
of literature estimating structural economic geography models to evaluate the effects of eco-
nomic policy on migration and welfare (Bartelme, 2015; Desmet et al., 2016; Allen et al., 2016).
The economic geography literature as a whole has typically focused on welfare at the aggregate
1
The literature on the causes of the rise in wage inequality in the United States is large. For an extensive
treatment, see Goldin and Katz (2009). There is also a growing body of literature on consequences of inequality. For
example some studies link income inequality to the recent rise of populism in the United States (McCarty et al., 2016),
others to adverse health outcomes (Wilkinson and Pickett, 2006).
2
These concepts will be defined precisely in Section 2.2.
(Krugman, 1991b; Fujita et al., 2001; Fajgelbaum et al., 2015; Monte et al., 2015).3 We com-
plement this literature by studying the effects of policy not only on aggregate welfare but also
on welfare inequality.
Our modeling approach allows us to fully solve for counterfactual outcomes taking general
equilibrium effects into account. In contrast, the spatial inequality literature has often used
equilibrium models without solving for equilibrium. Recent spatial inequality contributions
employ instrumental variables and equilibrium relationships to identify a handful of parameters
of interest (Moretti, 2013; Baum-Snow et al., 2014). This methodology is sufficient to test
alternative hypotheses about sources of inequality, but it limits a researcher’s ability to run
counterfactual policy experiments. The closest paper in this recent literature to ours is Diamond
(2015), who estimates a rich structural spatial inequality model based on discrete choices of
workers over where to live. While Diamond (2015) allows for a more flexible specification, we
adopt a more stylized model. The advantage of our stylized model is that we can fully solve our
model for equilibrium population, wage, and inequality levels at a wide range of counterfactual
parameter values.
In our model, we have a continuum of locations. In each location, there are immobile
landlords, immobile firms, and perfectly mobile workers. Workers come in two types, high-skill
and low-skill, and each worker has an idiosyncratic utility from living in each location. A worker
decides where to live taking prices and wages as given. A firm also takes local wages as given,
and produces a tradeable good using high-skill and low-skill labor as inputs. The key difference
between high and low-skill workers is that high-skill workers benefit more from agglomeration.4
In equilibrium, welfare of marginal workers in each skill group equalizes across space.
We require a model that generates higher skill wage premia in less remote cities. The
interplay between two critical features of our model delivers the required relationship. These two
features are stronger agglomeration forces for high-skill workers, and heterogeneous location
preferences. The intuition behind this interaction can be described in a few sentences. Consider
a city near other cities, a centrally-located city. Its access to cheap tradeable goods and nearby
markets make this city attractive to live in. This leads the city, all else equal, to have a
relatively high population of both high and low-skill workers compared with a remote city. Due
to agglomeration forces, high-skill workers are relatively more productive in the centrally-located
city. If the ratio of high to low-skill wages in the centrally-located city were the same as in the
remote city, firms would demand a larger ratio of high to low-skill workers in the centrally-
located city. In order for the demand for high-skill labor to equal its supply, in equilibrium
the high to low-skill wage ratio must be higher in the centrally-located city. Because location
preferences matter, high-skill workers elsewhere do not fully arbitrage the higher wages in the
3
One notable exception is Fujita and Thisse (2006), which focuses on inequality and costly trade in an international
trade context with only two regions and only high-skill workers mobile. In addition, Fajgelbaum and Gaubert
(2018) characterize optimal spatial policy in a setting with trade and skill groups, and compare the observed
concentration of skill in the United States with the efficient outcome.
4
Davis and Dingel (2019) microfound a mechanism for this assumption related to complementarity between idea
exchange and ability.
centrally-located city away.
We interpret American census data in 2000 as the equilibrium outcome of our model, and
Core Based Statistical Areas as our cities or geographical units of observation. We estimate our
model parameters using equilibrium relationships that describe labor supply and demand across
these cities. In addition, we estimate costs of trading goods between cities in a similar way to
Allen and Arkolakis (2014).
We use our estimated model to perform several quantitative exercises. First, we use our es-
timated amenities, productivities, and trade costs to decompose sources of variation in observed
wage premia across American cities. We find that geographical position explains 16.5% of the
variation in wage premia across cities.
In addition, we simulate the model equilibrium when domestic trade costs change. We find
that reductions in domestic trade costs benefit both types of labor, but low-skill labor gains
more than high-skill labor. This result is in contrast to a number of papers that study the
effects of international trade on inequality (Antràs et al., 2006; Hummels et al., 2014).5 In our
exercise, better trading infrastructure tends to spread out the population in the United States
so that high-skill workers lose some of their agglomeration advantage over low-skill workers. The
negative effect of trade on wage inequality in the international context is reversed when labor
is mobile in the presence of agglomeration economies in the national context.6
Lastly, we simulate the equilibrium effects of the rise of Silicon Valley by implementing a
counterfactual productivity shock to all cities in California such that our model generates actual
changes to the share of high and low-skill population in California between 1980 and 2000. We
find that this skill-biased technology shock increased the expected welfare of high-skill workers
nationally by 1.3% and of low-skill workers nationally by 0.3%.
We want to compare inequality in different locations. As agglomeration will be an important
component of our model, the size of a location will be critical for our analysis. Different authors
in the literature have used different regions as units of analysis. For our purposes, a location
will be either a Core Based Statistical Area (CBSA) or the non-CBSA part of a census area
known as a Public Use Microdata Area (PUMA). As shorthand, we will sometimes refer to these
areas as “cities”. A CBSA is a set of counties with a high degree of social and economic ties
to a central urbanized area as measured by commuting ties (Census, 2012). PUMAs are drawn
to completely cover the United States. In order to comply with census disclosure rules, each
PUMA contains between 100,000 and 300,000 residents. By including the non-CBSA parts of
PUMAs in our analysis, we widen the scope of our study to the entire continental United States.
In addition to the IPUMS data, we need information on the geographical position of each
location as well as information on trade flows between locations. We use geographical position
data from the Missouri Census Data Center. For trade flows we use publicly available data
from the U.S. Commodity Flow Survey (CFS). Our data on trade flows is from 2007, as this
is the first year in which data is available at the required level of disaggregation. The 2007
CFS covers business establishments with paid employees in mining, manufacturing, wholesale
trade, and selected retail and services trade industries. In the survey, a sample of
approximately 102,000 establishments is selected from a universe of 754,000 establishments.
For further discussion of data sources and manipulation, see Appendix A.
wage premium.7
Next we assign to each location several measures of isolation from other locations. None of
these simple, atheoretical measures is completely satisfying alone. We simply aim to use these
measures of isolation to build a case that geography matters, and motivate our subsequent
structural analysis rooted in preferences, production technology, and trade costs. To this end,
we assign each location two primary measures of isolation: distance to ocean and remoteness.8
We measure a location’s domestic isolation using remoteness, a concept we borrow from the
international trade literature (Head, 2003). Each location is labeled with a number i = 1 . . . N .
The distance between location i and location j is dij . The distance we use here is structurally
estimated later in this paper, and captures the iceberg trade cost between every pair of locations
given the network of transportation infrastructure in the United States. The remoteness of
location i, Remi , is the weighted, generalized mean of the distances between location i and all
other locations:
$$\mathrm{Rem}_i = \Bigl( \sum_j \omega_j \, d_{ij}^{\,1-\sigma} \Bigr)^{\frac{1}{1-\sigma}}$$
In words, a location with low transport costs to other locations will have low remoteness. In a
standard trade model with a CES demand system, the price index of tradeable goods in location
i follows a similar expression, with weights ωj related to economic size, and σ interpretable as
the elasticity of substitution in utility of differentiated goods. We set ωj to the population of
location j, and set σ = 4, following recent estimates in the international trade literature (Broda
and Weinstein, 2006; Simonovska and Waugh, 2014).
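As a rough illustration of the remoteness formula, the sketch below computes it with NumPy for a small made-up distance matrix. The function name `remoteness`, the `exclude_self` switch (which reads "all other locations" as a sum over j ≠ i), and the three-location numbers are assumptions for illustration, not the data or code used in the paper.

```python
import numpy as np

def remoteness(dist, weights, sigma=4.0, exclude_self=True):
    """Weighted generalized mean of distances with exponent 1 - sigma.

    dist[i, j] is the (trade-cost) distance between locations i and j;
    weights[j] plays the role of omega_j (population in the text).
    """
    dist = np.asarray(dist, dtype=float)
    weights = np.asarray(weights, dtype=float)
    powered = dist ** (1.0 - sigma)            # d_ij^(1 - sigma)
    if exclude_self:
        # Assumption: the sum runs over j != i ("all other locations").
        np.fill_diagonal(powered, 0.0)
    return (powered @ weights) ** (1.0 / (1.0 - sigma))

# Hypothetical three-location example (illustrative numbers, not data):
# locations 0 and 1 are close to each other, location 2 is far from both.
dist = np.array([[1.0, 2.0, 4.0],
                 [2.0, 1.0, 4.0],
                 [4.0, 4.0, 1.0]])
pop = np.array([1.0, 1.0, 1.0])
rem = remoteness(dist, pop)
```

In this toy example the third location, which is far from the other two, receives the highest remoteness value.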
In addition, we use distance from coastlines to proxy for a location’s isolation from inter-
national trade.9 We measure nearest distance to the ocean as the crow flies using data from
Natural Earth.10 This data comes at a very fine level of disaggregation. To aggregate up to
the level of our locations, we assign each location the mean distance to the ocean within its
borders.11
Table 1 reports descriptive statistics, and Figure 1 shows how our measures vary across
the United States. The borders in this map are the intersection of PUMAs and CBSAs, but
are colored based on the geographical unit of analysis described in Section 2.1. Remoteness is
highest in the North and North-West of the United States. Distance from the ocean is highest
7
Our results are robust to alternative measures of wages such as a direct measure of mean log hourly wage as well
as log wage residuals obtained from regressing log hourly wage against gender, experience, and race. Our results are
also robust to defining a high-skill worker as a worker with 4 years of college or more, and low-skill as all other workers.
8
To be clear here about terminology, remoteness is one specific measure of isolation. By isolation, we mean
aggregated distance from economically important locations.
9
Coşar and Fajgelbaum (2016) show how distance from the ocean affects trade patterns within a country.
10
Natural Earth is a free source of physical geographical data in the public domain maintained by the North American
Cartographic Information Society. More information at naturalearthdata.com .
11
Our data comes projected in spherical coordinates. For ease of interpretation, we convert our spherical distances
to approximate kilometers using the rule of thumb that one spherical degree in the United States is approximately
equal to 100 km. All of our analysis is in logarithms, so scaling errors will only affect the constant.
in the Center-North of the United States.12 The skill wage premium is higher in the parts of
the country which are less isolated and have higher population.
Below, we report correlations between our reduced-form measures of isolation and the skill
premium to motivate the claim that geography matters for inequality. In Section 5, we will use a
structurally estimated price index to quantify precisely how much of the observed variation in the
skill premium can be explained by geography.
[Figure 1: (a) Remoteness; (b) Distance from Coast]
and other policies implemented at the level of states. The disadvantage is that by including
state fixed effects, our main right-hand-side variable of interest, geography, might not vary suf-
ficiently within states. We weight all regressions by population, because our dependent variable
is itself composed of data means. Removing these weights does not affect the signs or statistical
significance of our estimates. We report results from additional specifications in Appendix B.
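To make the population weighting concrete, here is a minimal sketch of a weighted least-squares regression in NumPy. It covers the weighting only, not the state fixed effects; the helper name `weighted_ols` and all numbers are synthetic assumptions, not the paper's specification or data.

```python
import numpy as np

def weighted_ols(y, X, w):
    """Weighted least squares: minimizes sum_i w_i * (y_i - X_i b)^2."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))
    # Scaling each row by sqrt(w_i) turns WLS into ordinary least squares.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

# Synthetic illustration only: these numbers are made up.
rng = np.random.default_rng(0)
n = 200
log_remoteness = rng.normal(size=n)
population = rng.uniform(1e5, 5e6, size=n)
log_premium = 0.5 - 0.1 * log_remoteness       # noiseless by construction
X = np.column_stack([np.ones(n), log_remoteness])
beta = weighted_ols(log_premium, X, population)  # [intercept, slope]
```

Because the synthetic outcome is noiseless, the weights do not change the estimates here; with real data they shift the fit toward larger locations.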
In columns (1)-(4), we find that locations that are more remote within the United States or
more distant from coastlines appear to have less wage inequality. This relationship is statistically
significant in all regressions in Panel A, but it drops in absolute value, from 0.167 to 0.074,
when population is controlled for. In Panel B, with state fixed effects, the remoteness coefficient
loses statistical significance when population is included. Results remain the same when we
replace population with college population. The effect of geography on the skill wage premium
is mitigated when population is controlled for, and in some specifications the effect is not
significantly distinguishable from zero.
In columns (5)-(6), we find that remoteness is negatively correlated with population or with
college population across cities. This negative correlation, together with the positive association
between city size and wage premium, is consistent with a hypothesis in which remoteness affects
wage premium through the population channel.
Overall, these regressions demonstrate that geographic features of cities correlate with wage
inequality across a wide range of specifications, but this effect is mitigated or loses its statistical
significance when population (or college population) is controlled for.
[Table. Dependent variable: log wage of individual workers. Columns (1)-(7).]
[Table. Columns (1)-(4): log college wage premium. Column (5): log population. Column (6): log college population. Panel A: not controlling for state effects.]
3 Theory
In the last section, we presented evidence that skill wage premium tends to be lower in more
isolated locations. To explain this correlation, we build a model that incorporates high- and
low-skill labor, costly trade, and both agglomeration and congestion forces. The model helps us
examine the equilibrium responses of inequality to shocks that stem from trade or technology.
3.1 Setup
The model is static, with a continuum of locations j ∈ J, a continuum of high-skill workers
labeled as H, and a continuum of low-skill workers labeled as L. The set of locations J, and
total population of skill groups, NL and NH , are given. Workers can choose to reside and work
in any single location. Firms in each location produce a location-specific variety of a tradeable
final good using the two types of labor as inputs into a constant elasticity of substitution
production function. Each location produces a single tradeable location-specific final good.
Consumers cannot perfectly substitute across these location-specific final goods. That is, trade
is Armington. Both workers and firms are price takers in perfectly competitive markets.
Here, δ ∈ (0, 1) is the share of expenditures on tradeables.13 The tradeable goods are differen-
tiated by the location of production. The bundle Q(i) aggregates quantities of consumption in
location i from goods produced in j, q(j, i), under a constant elasticity of substitution σ > 0,
$$Q(i) = \left[ \int_J q(j,i)^{\frac{\sigma-1}{\sigma}} \, dj \right]^{\frac{\sigma}{\sigma-1}}$$
A worker with skill s who resides in location i earns wages ws (i), and faces the following budget
constraint,
$$w_s(i) = C(i) Z(i) + \int_J p(j,i)\, q(j,i) \, dj, \tag{2}$$
where C(i) is price per unit of housing in i, and p(j, i) is price of good j in destination i.
While the system of preferences is homothetic, we capture potential heterogeneity across skill
groups by letting them value local amenities differently. The idiosyncratic preference shock, ε,
is independent across workers and locations, and follows a Fréchet distribution,
$\Pr(\varepsilon \le x) = \exp(-x^{-\theta})$, where θ governs the dispersion of the location preference shocks.
A worker has two decisions to make. She decides where to live, and how much to consume.
Given a choice of location, the second problem is standard. Utility maximization implies that
a worker spends a share δ of her income on tradeable goods and the rest on housing. A worker of
13
While housing services is usually estimated to be a weak necessity good (Aguiar and Bils (2015) report an income
elasticity of 0.92), for simplicity we follow the recent spatial inequality literature in assuming constant expenditure
shares (Moretti, 2013; Diamond, 2015).
type s in location i spends xs (j, i) on goods produced in j,
$$x_s(j,i) = \delta\, w_s(i) \left[ \frac{p(j,i)}{P(i)} \right]^{1-\sigma} \tag{3}$$
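A small numerical sketch of equation (3) may help. It assumes the standard CES price index for P(i), and the prices, wage, and function names are hypothetical; δ = 0.645 corresponds to the housing share 1 − δ = 0.355 used in the estimation.

```python
import numpy as np

def ces_price_index(prices, sigma):
    """Standard CES price index over delivered prices p(j, i) (assumed form)."""
    prices = np.asarray(prices, dtype=float)
    return np.sum(prices ** (1.0 - sigma)) ** (1.0 / (1.0 - sigma))

def tradeable_expenditure(wage, prices, delta, sigma):
    """Expenditure x_s(j, i) on the good from each origin j, as in equation (3)."""
    prices = np.asarray(prices, dtype=float)
    P = ces_price_index(prices, sigma)
    return delta * wage * (prices / P) ** (1.0 - sigma)

# Hypothetical wage and delivered prices from three origins.
x = tradeable_expenditure(wage=10.0, prices=[1.0, 1.5, 2.0],
                          delta=0.645, sigma=4.0)
```

Under this price index, spending across origins adds up to the share δ of the wage, and cheaper origins receive more spending when σ > 1.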
Land is owned by immobile landlords who receive housing rents as their income and, like
local workers, decide how much of each good and residential land to consume. The supply
of residential land, denoted by Z̄(i), is inelastically given. The land market clearing condition
immediately pins down the price per unit of housing,
$$C(i) = \frac{1-\delta}{\delta\, \bar{Z}(i)} \bigl[ n_L(i) w_L(i) + n_H(i) w_H(i) \bigr], \tag{5}$$
where $n_s(i)$ denotes the population of skill group s in location i. The price index in location
i combines the prices of tradeable goods and of housing, and is given by $P(i)^{\delta} C(i)^{1-\delta}$. Total income in
location i equals total wages plus housing rents, and is given by $\frac{1}{\delta}\bigl(n_L(i) w_L(i) + n_H(i) w_H(i)\bigr)$.
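The accounting behind equation (5) can be checked with a short sketch. The function names and the population and wage numbers are hypothetical; only the formulas come from the text.

```python
def housing_price(n_L, w_L, n_H, w_H, delta, land=1.0):
    """Price per unit of housing from land market clearing, equation (5)."""
    wage_bill = n_L * w_L + n_H * w_H
    return (1.0 - delta) / (delta * land) * wage_bill

def total_income(n_L, w_L, n_H, w_H, delta):
    """Total wages plus housing rents in a location."""
    return (n_L * w_L + n_H * w_H) / delta

# Hypothetical location: 100 low-skill and 50 high-skill workers.
delta = 0.645
C = housing_price(100.0, 1.0, 50.0, 2.0, delta, land=1.0)
Y = total_income(100.0, 1.0, 50.0, 2.0, delta)
rents = C * 1.0                    # price times land supply
```

Rents equal the share 1 − δ of total income, so wages and rents together exhaust income.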
The second decision a worker makes is where to live. A worker ω with skill level s faces the
following discrete choice problem of where to reside:
$$\max_{i \in J} \; \bar{u}_s(i)\, \varepsilon_\omega(i)\, \frac{w_s(i)}{P(i)^{\delta} C(i)^{1-\delta}}$$
Using the properties of the Fréchet distribution, the supply of type s labor in location i relative
to j is given by:
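$$\frac{n_s(i)}{n_s(j)} = \left[ \frac{w_s(i)\, \bar{u}_s(i) / \bigl( P(i)^{\delta} C(i)^{1-\delta} \bigr)}{w_s(j)\, \bar{u}_s(j) / \bigl( P(j)^{\delta} C(j)^{1-\delta} \bigr)} \right]^{\theta}.$$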
The elasticity of relative labor supply to relative wages equals θ. The variance of ε across
both workers and locations is decreasing in θ. When θ is large, unobserved location preferences
are similar across locations. Thus, small changes to wages, prices, or amenities induce large
movements of workers. That is, the supply curve of workers to a location is flat. When θ is
small, workers have widely varying preferences over locations, so that large changes in wages,
prices, or amenities are necessary to induce movement.
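The role of θ as a supply elasticity can be illustrated with a few lines of Python. The values of θ and the one percent wage change below are hypothetical, and `v` stands for the amenity-adjusted real wage $w_s \bar{u}_s / (P^{\delta} C^{1-\delta})$.

```python
import math

def relative_labor_supply(v_i, v_j, theta):
    """n_s(i) / n_s(j), where v = w * u_bar / (P^delta * C^(1 - delta))."""
    return (v_i / v_j) ** theta

def supply_response(theta, wage_increase=0.01):
    """Log change in relative supply after a proportional wage increase in i."""
    before = relative_labor_supply(1.0, 1.0, theta)
    after = relative_labor_supply(1.0 + wage_increase, 1.0, theta)
    return math.log(after / before)

flat = supply_response(theta=20.0)   # similar preferences: large response
steep = supply_response(theta=1.0)   # dispersed preferences: small response
```

In both cases the log response equals θ times the log wage change, which is the sense in which θ is the elasticity of relative labor supply.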
We define the well-being index, denoted by Ws , for population of skill s:
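$$W_s \equiv \left[ \int_{j \in J} \left( \frac{w_s(j)\, \bar{u}_s(j)}{P(j)^{\delta} C(j)^{1-\delta}} \right)^{\theta} dj \right]^{\frac{1}{\theta}}$$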
This index is proportional to the expected welfare of a worker of type s before she draws her
location preferences.14 The share of workers of type s in location i is given by:
$$\frac{n_s(i)}{N_s} = \left( \frac{w_s(i)\, \bar{u}_s(i) / \bigl( P(i)^{\delta} C(i)^{1-\delta} \bigr)}{W_s} \right)^{\theta} \tag{6}$$
If a location offers higher wages, better amenities, lower prices of tradeables, and lower housing
rents, it will attract more population, with the extent of the relationship governed by θ.
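The following sketch checks a discrete-location analog of equation (6) against simulated location choices. The three values in `v`, the value of θ, the number of simulated workers, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def wellbeing_index(v, theta):
    """Discrete analog of W_s: (sum_j v_j^theta)^(1/theta)."""
    v = np.asarray(v, dtype=float)
    return np.sum(v ** theta) ** (1.0 / theta)

def location_shares(v, theta):
    """Discrete analog of equation (6): n_s(i) / N_s = (v_i / W_s)^theta."""
    v = np.asarray(v, dtype=float)
    return (v / wellbeing_index(v, theta)) ** theta

def simulate_shares(v, theta, n_workers=200_000, seed=0):
    """Draw Frechet shocks and let each worker pick her best location."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v, dtype=float)
    u = rng.uniform(size=(n_workers, v.size))
    # Inverse-CDF draw from Pr(eps <= x) = exp(-x^(-theta)).
    eps = (-np.log(u)) ** (-1.0 / theta)
    choice = (v * eps).argmax(axis=1)
    return np.bincount(choice, minlength=v.size) / n_workers

v = np.array([1.0, 1.2, 0.8])   # hypothetical amenity-adjusted real wages
theta = 3.0                      # hypothetical value
shares = location_shares(v, theta)
sim_shares = simulate_shares(v, theta)
```

With enough simulated workers, the simulated shares should be close to the closed-form shares, and the location with the highest amenity-adjusted real wage attracts the most workers.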
where A(i) is total factor productivity in location i. ρ > 0 is the elasticity of substitution
between high- and low-skill workers. βH (i) > 0 and βL (i) > 0 are factor intensities. We
incorporate agglomeration forces by distinguishing two sources of productivity externalities.
First, we specify total factor productivity as:
with α > 0. This agglomeration force changes productivity of both low and high-skill workers.
A standard Krugman-type economic geography model with monopolistic competition and free
entry generates the same relation through endogenous measure of firms, with the exact relation
if α = 1/(1 + σ).15
In addition, the empirical literature on urban and labor economics substantiates that ag-
glomeration forces are stronger for high-skill workers.16 On the theoretical side, the literature
explains this fact by modeling spillovers through the exchange of ideas among high-skill workers
(Davis and Dingel, 2019). To capture this mechanism in our empirical model, we let skilled
workers' productivity covary positively with the population of skilled workers in a location,
where ϕ > 0 governs the agglomeration advantage that is specific to high-skill workers. By a nor-
14
To get the expected welfare, we must multiply $W_s$ by $\Gamma(1 - \frac{1}{\theta})$, where Γ is the gamma function. This scaling term
depends only upon θ, an exogenous preference parameter.
15
The literature has considered alternate sources of aggregate productivity externalities, for example the sorting of
firms as in Gaubert (2014) and Ziv (2017).
16
For example, Glaeser and Resseger (2010) find that “productivity increases with area population for skilled places,
but not for low-skill places,” and Bacolod et al. (2009) find that workers with stronger cognitive skills experience
stronger agglomeration. See Gould (2007), Matano and Naticchioni (2011), and Combes et al. (2012a) for more details
on stronger agglomeration gains for workers with higher skills and wages.
malization, we assign no further agglomeration benefit to low-skill labor. By cost minimization,
the unit cost of production equals
$$\frac{\nu(i)}{A(i)}, \quad \text{where } \nu(i) = \left[ \beta_H(i)^{\rho} w_H(i)^{1-\rho} + \beta_L(i)^{\rho} w_L(i)^{1-\rho} \right]^{\frac{1}{1-\rho}} \tag{9}$$
$$b(i) = \frac{\beta_H(i)^{\rho} w_H(i)^{1-\rho}}{\beta_H(i)^{\rho} w_H(i)^{1-\rho} + \beta_L(i)^{\rho} w_L(i)^{1-\rho}} \tag{10}$$
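Equations (9) and (10) translate directly into code. In the sketch below the function names and all parameter values are hypothetical, and b(i) is read as the high-skill share of the local wage bill.

```python
def unit_cost_index(beta_H, beta_L, w_H, w_L, rho):
    """nu(i) from equation (9); requires rho != 1."""
    return (beta_H ** rho * w_H ** (1.0 - rho)
            + beta_L ** rho * w_L ** (1.0 - rho)) ** (1.0 / (1.0 - rho))

def high_skill_wage_share(beta_H, beta_L, w_H, w_L, rho):
    """b(i) from equation (10): high-skill share of the wage bill."""
    high = beta_H ** rho * w_H ** (1.0 - rho)
    low = beta_L ** rho * w_L ** (1.0 - rho)
    return high / (high + low)

# Hypothetical factor intensities, wages, and elasticity of substitution.
nu = unit_cost_index(0.4, 0.6, 2.0, 1.0, rho=1.5)
b = high_skill_wage_share(0.4, 0.6, 2.0, 1.0, rho=1.5)
b_higher_wage = high_skill_wage_share(0.4, 0.6, 3.0, 1.0, rho=1.5)
```

With ρ > 1, a higher high-skill wage lowers the high-skill share of the wage bill, because firms substitute toward low-skill labor more than proportionately.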
Lastly, as markets are perfectly competitive, price equals marginal cost. Let d(i, j) be the
trade cost of shipping a good from i to j. The price of a good produced in location i and
consumed in location j is:
$$p(i,j) = \frac{\nu(i)\, d(i,j)}{A(i)} \tag{11}$$
On the supply side of the labor market, employment shares described by equation (6) imply
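$$\frac{n_H(i)}{n_L(i)} = \frac{N_H}{N_L} \left( \frac{W_H}{W_L} \right)^{-\theta} \left( \frac{\bar{u}_H(i)}{\bar{u}_L(i)} \right)^{\theta} \left( \frac{w_H(i)}{w_L(i)} \right)^{\theta} \tag{13}$$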
A necessary condition for labor market clearing is that skill premia simultaneously satisfy the
pairs of relative demand (12) and relative supply (13). Combining, we get:
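$$\frac{w_H(i)}{w_L(i)} = \left( \frac{W_H}{W_L} \right)^{\frac{\theta}{\theta+\rho}} \left( \frac{\bar{u}_H(i)}{\bar{u}_L(i)} \right)^{\frac{-\theta}{\theta+\rho}} \left( \frac{N_H}{N_L} \right)^{\frac{-1}{\theta+\rho}} \left( \frac{\bar{\beta}_H(i)}{\bar{\beta}_L(i)} \right)^{\frac{\rho}{\theta+\rho}} n_H(i)^{\frac{\varphi\rho}{\theta+\rho}} \tag{14}$$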
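The sketch below evaluates equation (14) and checks that, at the implied skill premium, relative supply (13) equals relative demand. The relative demand function is an assumption: it is the CES form implied by the cost function (9) with $\beta_H(i) = \bar{\beta}_H(i)\, n_H(i)^{\varphi}$ and $\beta_L(i) = \bar{\beta}_L(i)$, which is consistent with the exponents in (14). All parameter values are hypothetical.

```python
def skill_premium(n_H, W_ratio, u_ratio, N_ratio, beta_ratio, theta, rho, phi):
    """w_H(i) / w_L(i) from equation (14).

    W_ratio = W_H / W_L, u_ratio = u_H(i) / u_L(i), N_ratio = N_H / N_L,
    beta_ratio = beta_bar_H(i) / beta_bar_L(i).
    """
    k = theta + rho
    return (W_ratio ** (theta / k) * u_ratio ** (-theta / k)
            * N_ratio ** (-1.0 / k) * beta_ratio ** (rho / k)
            * n_H ** (phi * rho / k))

def relative_supply(premium, W_ratio, u_ratio, N_ratio, theta):
    """n_H(i) / n_L(i) from equation (13)."""
    return N_ratio * W_ratio ** (-theta) * u_ratio ** theta * premium ** theta

def relative_demand(premium, n_H, beta_ratio, rho, phi):
    """Assumed CES relative demand with beta_H = beta_bar_H * n_H^phi."""
    return (beta_ratio * n_H ** phi) ** rho * premium ** (-rho)

# Hypothetical parameter values and one location with 1,000 high-skill workers.
params = dict(W_ratio=1.5, u_ratio=0.9, N_ratio=0.4, beta_ratio=1.1,
              theta=3.0, rho=1.5, phi=0.05)
premium_small = skill_premium(1_000.0, **params)
premium_large = skill_premium(10_000.0, **params)
```

With φ > 0 the premium rises with the local high-skill population, and setting φ = 0 removes that dependence, in line with the discussion of the two model features.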
Labor market clearing also requires total wages received by all workers to be equal to total
payments to them,17
$$\frac{w_H(i)\, n_H(i)}{b(i)} = \int_J \frac{w_H(j)\, n_H(j)}{b(j)} \left[ \frac{p(i,j)}{P(j)} \right]^{1-\sigma} dj \tag{15}$$
17
Total wages in location i equal $w_H(i)\, n_H(i) / b(i)$, and total income (wages plus rents) equals $w_H(i)\, n_H(i) / (\delta\, b(i))$. Both workers
and landlords spend a share δ of their income on tradeables and the rest on housing. Thus, total wages in i equal
$\int_J \delta \, [\text{total income in } j] \, \bigl[ p(i,j) / P(j) \bigr]^{1-\sigma} \, dj$.
Equations 14 and 15 describe labor market clearing in relative terms and in levels. (Equivalently,
equation 15 describes the goods market clearing condition.)
A “spatial equilibrium” consists of wH (i), wL (i), nH (i), and nL (i) such that: (1) firms
optimize their labor demand, (2) workers optimize their labor supply, (3) markets clear, and (4)
the labor allocation is feasible.18 This completes our description of the economy.
3.1.4 Discussion
Suppose we were to shut down preference heterogeneity, θ → ∞. From (14) we see that skill
wage premia will be constant across locations. Alternatively, suppose there is no agglomeration
advantage for high-skill workers, ϕ = 0. Then skill premia can vary between destinations only
due to exogenous differences in tastes and productivities between skill groups. To have equilibria
with endogenously varying skill premia, we need both heterogeneity in unobserved location
preferences (finite θ), and an agglomeration advantage for high-skill workers (ϕ > 0). That is,
large cities demand relatively more high-skill workers due to agglomeration, but since unobserved
location preferences matter, high-skill workers do not fully arbitrage the wage increase away.19
To provide further intuition, suppose trade costs to and from a remote city fall. This shock
decreases the price of incoming tradeables, hence the supply of workers to the city rises. In
addition, the shock increases outgoing sales, hence labor demand in the city rises. If the
employment of low- and high-skill workers increases proportionately, agglomeration advantages will
make high-skill workers relatively more productive. That is, firms demand a higher ratio of high
to low-skill workers than their relative supply in the city. Equilibrium is restored only by raising
the skill wage premium in the city.
This relationship between trade costs and inequality does not depend on exogenous differ-
ences across skill groups. The exogenous differences are residuals in the relation between skill
population ratio and skill wage premium in equations of relative demand (12) and relative sup-
ply (13). These residuals reflect factors we do not model such as state and local tax incidence,
provision of welfare, and non-labor factor endowments.
Our model implies that the spatial distribution of workers contributes to welfare inequality.
Specifically, writing the distribution of low-skill labor as a function of that of high-skill labor,
18
That is simply $\int_J n_H(j)\, dj = N_H$ and $\int_J n_L(j)\, dj = N_L$.
19
Our model relies on differential agglomeration forces between high and low-skill workers and an upward sloping
labor supply curve to generate the observed negative relationship between remoteness and the skill premium. Alter-
natively, such a relationship could potentially be generated by non-homothetic preferences. In particular, suppose
that (poorer) low-skill workers consume a higher share of tradeables and a lower share of housing services. Then, in a
spatial equilibrium, low-skill workers would need to be compensated more for living in remote areas, delivering a lower
skill premium in remote areas.
Empirically, however, it is high-skill workers that consume a higher share of tradeables and a lower share of housing
services. As mentioned in footnote 13, studies typically find housing services to be a slight necessity good, and recent
research has found that higher income Americans consume a higher share of tradeables (Hummels and Lee, 2017).
Since non-homothetic preferences predict the opposite of the observed relationship between remoteness and the skill
premium, by assuming homothetic preferences we may be underestimating the strength of the agglomeration advantage
of high-skill workers.
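and after some algebra, we decompose three forces behind average welfare inequality,
$$\frac{W_H}{W_L} = \underbrace{\left( \frac{N_H}{N_L} \right)^{-\frac{1}{\rho}}}_{\text{aggregate scarcity}} \times \underbrace{(N_H)^{\varphi}}_{\text{aggregate agglom.}} \times \underbrace{\left[ \int_J \pi(i)\, di \right]^{-\frac{\theta+\rho}{\theta\rho}}}_{\text{distributional effect}} \tag{16}$$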
The first term reflects aggregate scarcity of high- to low-skill workers; the second term repre-
sents aggregate agglomeration advantage of high-skill workers; and the last term summarizes
dispersion forces.20 This last term depends on the entire distribution of population which in
turn endogenously changes with geography.
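A discrete-location sketch of the decomposition in equation (16) follows. It replaces the integral with a sum and takes π(i) as defined in the footnote to the decomposition; the function names are hypothetical, and any inputs passed to them would be illustrative rather than estimates from the paper.

```python
import numpy as np

def pi_term(beta_ratio, u_ratio, n_H, N_H, theta, rho, phi):
    """pi(i) for each location, with beta_ratio = beta_bar_H / beta_bar_L
    and u_ratio = u_bar_H / u_bar_L."""
    k = theta + rho
    return ((beta_ratio * u_ratio) ** (-theta * rho / k)
            * (n_H / N_H) ** ((theta * (1.0 - rho * phi) + rho) / k))

def welfare_inequality(N_H, N_L, pi, theta, rho, phi):
    """Three factors of W_H / W_L in equation (16), with the integral as a sum."""
    scarcity = (N_H / N_L) ** (-1.0 / rho)
    agglomeration = N_H ** phi
    distribution = np.sum(pi) ** (-(theta + rho) / (theta * rho))
    return scarcity, agglomeration, distribution
```

Multiplying the three returned factors gives W_H / W_L. Keeping them separate makes it possible to see how much of a counterfactual change in welfare inequality runs through the spatial distribution of workers rather than through aggregate skill supplies.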
In addition, normalizing the land supply to one, we can write housing rents as a function of
employment and wages of high-skill workers:
$$C(i) = \tilde{C}(i)\, w_H(i), \quad \text{where } \tilde{C}(i) \equiv \frac{(1-\delta)\, n_H(i)}{\delta\, b(i)} \tag{18}$$
First, replacing the price index of tradeables P (j) from employment share (6) into the goods
market clearing condition (15) results in:
Second, substituting the price index of tradeables P (j) from employment share (6) into the CES
price formula (4), results in:
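$$\bar{u}_H(i)^{\frac{1-\sigma}{\delta}} C(i)^{\frac{(\sigma-1)(1-\delta)}{\delta}} n_H(i)^{\frac{\sigma-1}{\delta\theta}} w_H(i)^{\frac{1-\sigma}{\delta}} = W_H^{\frac{1-\sigma}{\delta}} N_H^{\frac{\sigma-1}{\delta\theta}} \int_J d(j,i)^{1-\sigma} A(j)^{\sigma-1} \nu(j)^{1-\sigma} \, dj \tag{20}$$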
Equations (19) and (20) give us two systems of integral equations. Assuming that trade costs are
symmetric, we can reduce the two systems into one using a method from Allen and Arkolakis
(2014). If either system of integral equations holds along with the following relation, both systems of
−θρ
θ+ρ
θ(1−ρϕ)+ρ
θ+ρ
20 β̄H (i)ūH (i) nH (i)
Specifically, π(i) = β̄L (i)ūL (i) NH
17
integral equations must hold:
1−σ σ−1 1−σ (σ−1)(1−δ)
A(i)1−σ ν(i)σ−1 nH (i)wH (i)b(i)−1 = λūH (i) δ nH (i) δθ wH (i) δ C(i) δ (21)
4 Estimation
In this section we estimate our structural model. Our data consists of four vectors: high and
low-skill populations in each location, and high and low-skill wages in each location. Using
our model structure, we invert these four vectors of data to recover four vectors of exogenous
shifters: high-skill factor intensity β̄H (with β̄L = 1 − β̄H ), total factor productivity shifter Ā(i),
and amenity values to low and high-skill workers ūL (i) and ūH (i).
21 The way we model congestion differs slightly from Allen and Arkolakis (2014), but our models are
isomorphic once the conditions (i), (ii), and (iii) are fulfilled. We interpret the source of congestion as limited land for
housing. Allen and Arkolakis are agnostic about the source of congestion, only assuming that amenities are reduced
by population.
The inversion of the data into these exogenous shifters depends on the matrix of trade costs as
well as six key parameters: (i) high-skill agglomeration advantage ϕ, (ii) elasticity of substitution
across skill groups ρ, (iii) labor supply elasticity θ, (iv) common agglomeration parameter α, (v)
share of expenditures on housing 1 − δ, and (vi) elasticity of substitution across goods σ. We
estimate trade costs between American cities in a similar way to Allen and Arkolakis (2014). We
calculate housing share, 1 − δ = 0.355, based on the Consumer Expenditure Survey 2000.22 We
set the elasticity of substitution across goods σ = 4, in line with the empirical literature using
international trade data (Broda and Weinstein, 2006; Simonovska and Waugh, 2014). Following
a large literature, we use instrumental variables and equilibrium relationships to estimate the
other four parameters (Moretti, 2013; Desmet et al., 2016; Allen et al., 2016).
Since our estimation procedure contains several sequential steps, we present intermediate
results directly after we describe intermediate estimation steps. Trade costs are estimated first.
Next key elasticities are estimated from equilibrium labor demand and supply relationships.
We then invert a set of equilibrium integral equations to recover exogenous location-specific
productivities and amenities.
After we finish the first step, we know how much it costs to move goods on the road between
two locations, but only in terms of the units we assigned to road travel. We cannot compare
the cost of road travel to the cost of water transport because we do not know the exchange
rate, as it were, of road travel to water transport. The second step is to use a discrete choice
framework and data on trade flows via each mode between each pair of locations in order to
back out these cost ratios. Shippers have idiosyncratic, extreme value distributed costs for each
mode of transportation. If a large share of transport is via road, then it must be that road is
on average a cheaper mode of transportation.
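The logic of this second step can be sketched with a standard multinomial logit. The sketch assumes that idiosyncratic shocks are Gumbel distributed and additive in log costs, so that mode shares are logit in log costs and differences in log costs correspond to cost ratios. The paper's exact specification may differ, and the function names and the shape parameter `a` are hypothetical.

```python
import numpy as np

def mode_shares(log_costs, a=1.0):
    """Logit choice shares over transport modes given log costs."""
    v = -a * np.asarray(log_costs, dtype=float)
    v -= v.max()                      # stabilise the exponentials
    expv = np.exp(v)
    return expv / expv.sum()

def log_cost_gaps(shares, base=0, a=1.0):
    """Invert observed shares into log cost differences relative to a base mode."""
    shares = np.asarray(shares, dtype=float)
    return -(np.log(shares) - np.log(shares[base])) / a
```

Under these assumptions, a mode with a larger observed share is assigned a lower cost relative to the base mode (road, in the text's normalization), but only relative costs are identified. The level of costs has to come from a separate step.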
The discrete choice model will only give us the cost ratio between any two modes of trans-
portation, but we still need to pin down the level of costs. To do so, we use the gravity
specification implied by our model. Consistent with our later structural estimation, we set the
elasticity of substitution across goods equal to four. Estimating the gravity equation gives the
scale of trade costs. With the scaling parameter in hand, we can then calculate expected trade
costs between every pair of locations.
Our estimates for trade costs are summarized in Table 4. Road, by normalization, has no
fixed cost, and according to the estimation, has a mid-level marginal cost. Rail has a significant
fixed cost, but lower marginal cost than road transport. Water has both high fixed and marginal
cost, reflecting that little shipment within the United States is done by water. Air has a high
fixed cost, but a low marginal cost. To be more concrete, we estimate that the average iceberg
cost of shipping from Chicago to New York City and the average cost of shipping from Chicago
to Fargo, North Dakota are almost the same (1.27 and 1.26 respectively). Even though Chicago is
closer to Fargo (569 miles) than it is to New York City (714 miles) as the crow flies, the highway
system connecting Chicago to New York City is both higher quality and more direct.
Readers familiar with Allen and Arkolakis (2014) or Desmet et al. (2016) will notice that
our estimates are quantitatively somewhat different from those of these earlier studies, although
the ranking of variable and fixed costs is similar. One reason for the difference is that we set a
lower trade elasticity in our structural estimation, σ = 4 rather than σ = 9.24 Our trade costs
are likely higher in absolute terms than in Allen and Arkolakis (2014), as our products are more
differentiated.25
24 Regarding the difference between our estimates and those in Allen and Arkolakis (2014), even if we use σ = 9 we
get somewhat different results, even though we implement the same algorithm on the same data. We discuss reasons
for these differences in Appendix C.
25 A further technical issue is that 4.7% of our iceberg trade costs are estimated to be less than one. In the structural
estimation below, we normalize trade costs by scaling up all trade costs proportionally until the lowest iceberg trade
cost has a value of one.
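The normalization described in footnote 25 amounts to dividing every estimated iceberg cost by the smallest one. A minimal sketch, assuming the estimated costs are held in a NumPy array (the function name is hypothetical):

```python
import numpy as np

def normalize_iceberg(d):
    """Scale all trade costs proportionally so that the smallest equals one.

    Returns the rescaled matrix and the share of original entries below one.
    """
    d = np.asarray(d, dtype=float)
    share_below_one = float(np.mean(d < 1.0))
    scale = min(d.min(), 1.0)         # leave costs unchanged if none is below one
    return d / scale, share_below_one
```

Because the scaling is proportional, all ratios of trade costs between location pairs are preserved.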
Table 4 (column headers only): Road, Rail, Water, Air
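$$\tilde{w}(i) = \tilde{\kappa} + \frac{1}{\theta}\,\tilde{n}(i) - \tilde{u}(i) \tag{22}$$
$$\tilde{n}(i) = -\rho\,\tilde{w}(i) + \rho\phi \log n_H(i) + \rho\,\tilde{\beta}(i) \tag{23}$$
where
$$\tilde{n}(i) = \log\left[\frac{n_H(i)}{n_L(i)}\right], \quad \tilde{w}(i) = \log\left[\frac{w_H(i)}{w_L(i)}\right], \quad \tilde{\beta}(i) = \log\left[\frac{\bar{\beta}_H(i)}{\bar{\beta}_L(i)}\right], \quad \tilde{u}(i) = \log\left[\frac{\bar{u}_H(i)}{\bar{u}_L(i)}\right]$$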
and κ̃ is a constant.26 Estimating these equations by OLS can be problematic because the error
terms are correlated with the regressors. In equation (22), the skill population ratio, ñ,
is expected to be higher in locations where the ratio of amenity values for high-skill relative
to low-skill workers, ũ, is greater. Because ũ enters (22) with a negative sign, this correlation
means that OLS presumably underestimates 1/θ. In addition, in equation (23), the skill premium, w̃,
and high-skill population, nH , are presumably higher in locations where the ratio of high-skill to
low-skill productivity, β̃, is larger. This correlation implies that OLS underestimates ρ and overestimates ϕ.
We use instrumental variables to estimate equations (22) and (23). To estimate θ in the
relative supply function (22), we instrument the skill population ratio ñ using a variable that is
meant to exclusively capture shifts from the demand side. We are inspired by a large urban and
spatial inequality literature in constructing our exogenous shock using industry-level variation
across locations (Bartik, 1991; Moretti, 2013; Diamond, 2015). Let d index industry, $E_d(i)$
be the employment share of industry d in location i with $\sum_d E_d(i) = 1$, and
$N_{H,d}(-i)/N_{L,d}(-i)$ be the national ratio of high-skill to low-skill population in industry d
excluding location i itself. Our instrument is
$$\sum_d E_d(i) \log \frac{N_{H,d}(-i)}{N_{L,d}(-i)}$$
We assume that industry composition only affects the wage premium through its effect on the
skill population ratio, and is uncorrelated with relative amenities. Suppose relative employment
of high-skill workers is greater nationwide in certain industries. Then, cities with larger
employment shares in those industries will have more demand for high-skill relative to low-skill
workers. This creates a shift in demand for high-skill workers, which is presumably uncorrelated
with supply factors (amenities) in a location.
26 $\tilde{\kappa} = -\frac{1}{\theta} \log\left[\frac{N_H}{N_L}\left(\frac{W_H}{W_L}\right)^{-\theta}\right]$
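The shift-share instrument defined above can be sketched as follows. The sketch assumes employment shares and skill counts are stored as arrays with one row per location and one column per industry; the function and argument names are illustrative, not the paper's.

```python
import numpy as np

def shift_share_instrument(emp_share, N_H, N_L):
    """Leave-one-out shift-share instrument for the skill population ratio.

    emp_share : industry employment shares E_d(i), rows sum to one
    N_H, N_L  : high- and low-skill counts by location and industry
    All arrays have shape (locations, industries).
    """
    emp_share = np.asarray(emp_share, dtype=float)
    N_H = np.asarray(N_H, dtype=float)
    N_L = np.asarray(N_L, dtype=float)
    # National industry totals excluding the location itself
    H_out = N_H.sum(axis=0, keepdims=True) - N_H
    L_out = N_L.sum(axis=0, keepdims=True) - N_L
    return (emp_share * np.log(H_out / L_out)).sum(axis=1)
```

Excluding the location's own workers from the national ratio keeps local conditions from entering the instrument mechanically.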
The exogeneity assumption is that our instrument is uncorrelated with relative amenities,
that is, the amenities assigned to a location by high-skill workers relative to the amenities assigned to a
location by low-skill workers. In our model, we assume that amenities are immutable features
of locations. Our exogeneity assumption would be violated if skill-intensive industries chose
to locate in places relatively preferred by skilled workers. We believe that industry location is
more driven by other factors such as proximity to markets as in Krugman (1980), proximity to
natural resource endowments as in Ellison and Glaeser (1999), or simply historical accident as
in Krugman (1991a).
Since we do not have strong priors about what sort of amenities high skill workers prefer
relative to low skill workers, the exogeneity condition is difficult to test directly. We can however
show that the instrument is uncorrelated with a number of different measures of amenity levels.
Figure 2 contains scatter plots of the instrument against air pollution, humidity in the summer,
temperature in the winter, distance from the coast, log remoteness and the quality of life index
described in the following paragraph.27 The instrument is uncorrelated with all of these measures
except for a weak negative correlation with the quality of life index, which is only measured for MSAs.
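To illustrate why an instrument of this kind matters for the relative supply equation (22), the following simulation generates synthetic data in which the skill ratio is positively correlated with unobserved relative amenities, then compares OLS with a just-identified IV estimator. All numbers are simulated under stated assumptions and none are the paper's data; the helper `slope` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 20_000
inv_theta = 0.072                       # value of 1/theta used to simulate the data

z = rng.normal(size=n_obs)              # demand-side instrument
u = rng.normal(size=n_obs)              # relative amenities, unobserved by the econometrician
n_tilde = z + 0.8 * u + rng.normal(size=n_obs)   # skill ratio rises with amenities
w_tilde = inv_theta * n_tilde - u       # relative supply, eq. (22) with kappa = 0

def slope(y, x, instrument=None):
    """Slope of y on x with an intercept; uses `instrument` for x if supplied."""
    q = x if instrument is None else instrument
    q = q - q.mean()
    return (q @ (y - y.mean())) / (q @ (x - x.mean()))

ols = slope(w_tilde, n_tilde)                   # biased downward by corr(n_tilde, u)
iv = slope(w_tilde, n_tilde, instrument=z)      # consistent because z is independent of u
```

In this design the OLS slope falls well below the simulated 1/θ, while the IV slope lies close to it, which mirrors the direction of bias argued in the text.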
To estimate ρ and ϕ in the relative demand function (23), we use the residuals of the relative
supply function, ũ, as an instrument for skill premium, w̃. The orthogonality between this
instrument and the error terms is based on the assumption that the relative amenity valuation,
ũ, as a supply factor is uncorrelated with relative factor intensities, β̃, as a demand factor. In
addition, we instrument high-skill population nH (i) using an extended quality of life index that
we borrow from Albouy (2012).28 This index is only reported for MSAs. We extend the index to
our broader set of geographical units by regressing the index on a large set of observables, and
predicting missing values. Our estimates remain virtually the same if we restrict our sample to
only MSAs. This quality of life index is by construction uncorrelated with prices and wages in a
location, but as Albouy shows, it strongly correlates with a wide range of natural and artificial
amenities in a location. The orthogonality between this instrument and error terms is based on
the assumption that this measure of quality of life is not correlated with relative factor intensity.
27 Information on air quality is from Agency (2019), and maps of the US by temperature and humidity are from
Oceanic and Administration (2016).
28 Specifically, we use Albouy’s “adjusted” measure of quality of life.
Estimation results are summarized in Table 5. The Cragg-Donald F-statistics for the first
stage strongly reject that the instruments are weak. The more conservative
heteroskedasticity-robust Kleibergen-Paap F-statistics are somewhat lower, especially the F-statistic on the labor
demand regression of 10.0. For all parameters the IV regressions push the OLS estimates in
directions consistent with our priors explained above. According to our estimates, the dispersion
of location preferences θ = 1/0.072 = 13.8, the elasticity of substitution in production between
high and low-skill labor ρ = 3.276, and the agglomeration advantage of high-skill labor ϕ =
0.368/3.276 = 0.112.29 In addition, the residuals in equations (22) and (23) give us the exogenous
shifters of relative productivities and amenities β̃ and ũ.
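The mapping from regression coefficients to structural parameters follows directly from equations (22) and (23). The sketch below uses the rounded coefficients quoted in the text; the variable names are illustrative, and the sign of the coefficient on w̃ is inferred from equation (23) rather than read from Table 5.

```python
coef_n_supply = 0.072      # coefficient on n-tilde in eq. (22), equal to 1/theta
coef_w_demand = -3.276     # coefficient on w-tilde in eq. (23), equal to -rho (sign assumed)
coef_nH_demand = 0.368     # coefficient on log n_H in eq. (23), equal to rho * phi

theta = 1.0 / coef_n_supply
rho = -coef_w_demand
phi = coef_nH_demand / rho
```

With the rounded coefficient 0.072, this gives θ of about 13.9, slightly above the 13.8 reported in the text, presumably because the text works with unrounded estimates.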
29 Our estimate of the elasticity of substitution between high-skill and low-skill labor ρ is somewhat higher than estimates
reported in the literature. In a literature review, Katz et al. (1999) report values for this elasticity between 1.40 and
1.70. Ciccone and Peri (2006) obtain estimates between 1.3 and 2, Diamond (2015) estimates ρ = 1.6, and
Card (2009) finds that ρ = 2.5. There is a smaller literature estimating the dispersion of location preferences θ. Our
estimate of θ is close to the point estimate of 11.7 in Allen and Donaldson (2018) and higher than what others have
found in the literature. Monte et al. (2015) estimate a preference dispersion parameter of 3.30, and Serrato and Zidar
(2016) estimate a parameter between one and two.
Table 5 (column headers only): log skill premium, Eq. (22): OLS, IV; log population ratio, Eq. (23): OLS, IV
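Step 1. We first estimate total factor productivity A inclusive of spillovers as well as high-skill
amenity values ūH . To do so, we rewrite the two systems of integral equations as follows:
$$A(i)^{1-\sigma} = W_H^{\frac{1-\sigma}{\delta}} N_H^{\frac{\sigma-1}{\delta\theta}}\, \nu(i)^{1-\sigma}\, n_H(i)^{-1} w_H(i)^{-1} b(i) \times \int_J d(i,j)^{1-\sigma}\, \bar{u}_H(j)^{\frac{\sigma-1}{\delta}}\, C(j)^{\frac{(\sigma-1)(\delta-1)}{\delta}}\, n_H(j)^{\frac{1-\sigma+\delta\theta}{\delta\theta}}\, w_H(j)^{\frac{\sigma-1+\delta}{\delta}}\, b(j)^{-1}\, dj \tag{24}$$
$$\bar{u}_H(i)^{\frac{1-\sigma}{\delta}} = W_H^{\frac{1-\sigma}{\delta}} N_H^{\frac{\sigma-1}{\delta\theta}}\, n_H(i)^{\frac{1-\sigma}{\delta\theta}}\, w_H(i)^{\frac{\sigma-1}{\delta}}\, C(i)^{\frac{(\sigma-1)(\delta-1)}{\delta}} \times \int_J d(j,i)^{1-\sigma} A(j)^{\sigma-1} \nu(j)^{1-\sigma}\, dj \tag{25}$$
Here, A(i) and ūH (i) are unknown variables, whereas population and wages are known. As long
as trade costs are symmetric, d(i, j) = d(j, i), we can further reduce the two systems of equations
to one. If either of the above integral equations holds along with the following relation, then both
systems will hold:
$$\bar{u}_H(i)^{\frac{\sigma-1}{\delta}}\, C(i)^{\frac{(\sigma-1)(\delta-1)}{\delta}}\, n_H(i)^{\frac{1-\sigma+\delta\theta}{\delta\theta}}\, w_H(i)^{\frac{\sigma-1+\delta}{\delta}}\, b(i)^{-1} = \lambda\, A(i)^{\sigma-1} \nu(i)^{1-\sigma}, \tag{26}$$
where λ > 0 is a constant. The numerical algorithm by which we solve these equations is
described in detail in Appendix D.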
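For intuition only, equation (25) can be evaluated on a discrete set of locations by replacing the integral with a sum. The sketch below recovers ūH (i) for a given vector A; it is not the algorithm of Appendix D, which is not reproduced here. The function name and the label `market_access` are assumptions, ν is taken as a given input, and the aggregates W_H and N_H default to one purely for illustration.

```python
import numpy as np

def amenities_from_eq25(A, nu, n_H, w_H, C, d, sigma, delta, theta, W_H=1.0, N_H=1.0):
    """Recover u_H(i) from eq. (25) on a discrete grid, given A and the data.

    d[j, i] is the iceberg cost from j to i; the integral over J becomes a sum over j.
    """
    A, nu, n_H, w_H, C = (np.asarray(x, dtype=float) for x in (A, nu, n_H, w_H, C))
    d = np.asarray(d, dtype=float)
    # Sum over origins j of d(j,i)^(1-sigma) * A(j)^(sigma-1) * nu(j)^(1-sigma)
    origin_term = A ** (sigma - 1) * nu ** (1 - sigma)
    market_access = (d ** (1 - sigma) * origin_term[:, None]).sum(axis=0)
    rhs = (W_H ** ((1 - sigma) / delta) * N_H ** ((sigma - 1) / (delta * theta))
           * n_H ** ((1 - sigma) / (delta * theta))
           * w_H ** ((sigma - 1) / delta)
           * C ** ((sigma - 1) * (delta - 1) / delta)
           * market_access)
    # Invert the exponent (1 - sigma) / delta on u_H(i)
    return rhs ** (delta / (1 - sigma))
```

In a full inversion, A itself is unknown, so an evaluation like this would sit inside an iterative routine that also imposes relation (26).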
Step 2. We use our recovered productivities A(i) to estimate common agglomeration param-
eter α and to recover base productivities Ā(i). Taking logs of (7) we get:
We regress recovered log total factor productivity on log population, instrumenting population
with our estimated high-skill amenity values ūH (i). Results are reported in Table 6. We find
that the elasticity of Hicks-neutral productivity with respect to population is 0.305. The IV and
OLS results are similar. While not reported, removing population weights barely changes these
estimates.
Our estimate is broadly in line with the estimates of other recent quantitative economic
geography models; for example, Giannone (2017) finds a common agglomeration elasticity of
0.31. Estimates in the urban and macro literature, on the other hand, are typically less than
0.10 (see the survey by Rosenthal and Strange (2004)). The lower agglomeration elasticity in
the urban and macro literature is caused by the assumption that goods produced in different
locations are perfect substitutes, i.e. σ is infinity. In contrast, we assume a finite elasticity of
substitution across goods differentiated by location. If goods produced in different locations
are imperfect substitutes, demand falls less when productivity is low. Thus, relative to the
case of perfect substitutability, producers in a smaller city are in a better position to compete
with producers in larger cities. To explain the data, our model must therefore assign a larger
productivity advantage to larger cities.
At the calibrated values of Ā and ūH , our solution algorithm reproduces the exact data on
wages and population of low- and high-skill workers. This check confirms the accuracy of both
our calibration and simulation algorithms.
4.2.3 Results for the productivity and amenity shifters
In Figure 3, we present the estimated geographical distribution of the four exogenous shifters:
base productivity, high-skill amenities, relative productivity, and relative amenity valuation.
We estimate that common base productivity is higher in the coastal regions of the United
States as well as the Rocky Mountains. It is worth pointing out that, unlike Allen and Arkolakis
(2014), we do not find that cities are fundamentally more productive than other regions.30 Here
we avoid to some degree the critique of the new economic geography literature that cities are
exogenously more productive than nearby, naturally similar areas. Instead, we find that areas of
the United States which are either near coastline, or areas such as the Rocky Mountain region
that have relatively low humidity are more fundamentally productive.31 We do find, however,
that exogenous high-skill amenities ūH are strongly correlated with city size. Our results are
consistent with those of Albouy (2012) who shows that in many ways cities are attractive places
to live for reasons not related to productivity.
Figure 3 (surviving panel titles): (c) Base high-skill amenities ūH (i); (d) Relative high-skill amenities ūH (i)/ūL (i)
Turning to the relative measures, we find that both are reasonably smooth across geography.
We find that in relative terms, low-skill people prefer to live in the South, and tend to be
exogenously more productive in the Lower Midwest region, possibly reflecting the relatively
high soil quality in that region. High-skill people prefer to live in the Upper Midwest, Mountain
regions, and Northwest.
30 Neither do we find them consistently less productive than other regions.
31 The reader should keep in mind that these estimates are neither observed productivity nor amenities. Those
objects are functions of the distribution of population which in turn is an equilibrium object in our model.
5 Quantitative exercises
5.1 Role of geography in wage inequality
We motivated our modeling exercise in part as adding geography into a spatial inequality model.
To measure the contribution of geography to wage inequality, we decompose observed variation
in wage premia into variations in exogenous base productivities and amenities in absolute and
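relative terms, as well as geographic position. Consider the following relation:
$$\log\frac{w_H(i)}{w_L(i)} = \gamma_1 \log\frac{\bar{\beta}_H(i)}{\bar{\beta}_L(i)} + \gamma_2 \log\frac{\bar{u}_H(i)}{\bar{u}_L(i)} + \gamma_3 \log \bar{A}(i) + \gamma_4 \log \bar{u}_H(i) + \gamma_5 \log P(i) + \zeta(i)$$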
The first four terms on the right hand side are the four exogenous shifters in our model. The
fifth term is the tradeables price index P . The price index of tradeables in a location exclusively
embodies the geography of a location with respect to all other locations because it is the only
term that incorporates bilateral trade costs. Lastly, as our model does not imply the above
relation in closed form, we include an error term ζ.
                            Notation   Sk. prem    Sk. prem   Shp. R2   Sk. prem   Shp. R2   Sk. prem   Shp. R2
Log tradeable price         P          -0.052***   -0.008***  16.5%
Log remoteness              Rem                                         -0.053***   9.8%
Log unweighted remoteness                                                                    -0.005***   1.4%
Log amenity level           ūH                      0.063***  15.7%      0.067***  26.7%      0.07***   30.4%
Log base productivity       Ā                       0.002      2.4%      0.033***   2.0%      0.004**    2.1%
Log relative productivity   β̄H /β̄L                  0.228***  10.0%      0.242***   8.1%      0.227***   7.5%
Log relative amenities      ūH /ūL                 -0.826***  55.3%     -0.800***  53.3%     -0.830***  58.6%
Observations                           1267        1267                 1267                 1267
R-squared                              0.303       0.992                1.000                0.991
Note: Regressions report robust standard errors. All observations are weighted by population.
*** p<0.01, ** p<0.05, * p<0.1.
Table 7: Decomposition
We use this relation to quantify how much of the variation in wage premia across American
cities is explained by variation in their geographic features. In the first column of Table 7, we report
R2 for a simple regression of the log skill wage premium on the log price index of tradeables.
We find that the price index alone can explain 30% of the variation in the wage premium.
In the third column of Table 7, we report results from the full decomposition. The five
regressors together explain 99% of the variation in observed wage premia. Using the Shapley decomposition
method, we find that 16.5% of observed variation in the skill wage premium is due to the
variation in geographic features across American cities. Geographic features explain more of the
variation in wage premia than relative productivity and productivity levels combined. While
both geography and productivities contribute measurably to wage inequality across space, we
find that the largest part of the variation in wage premia is explained by variation in relative
amenities. The sign of each coefficient in the regression is as expected. We expect more productive
and nicer places, all else equal, to have higher population and thus more wage inequality. We
expect more remote places to have lower wage inequality. We also expect places with higher
relative productivity to have more inequality. Finally, we expect places which high-skill workers
value more to have lower wage inequality, since high-skill workers will be relatively attracted to
these places even if their wages there are relatively low.
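The Shapley decomposition of R2 used above assigns each regressor its average marginal contribution to R2 over all orderings in which regressors can be added. A minimal sketch follows, assuming unweighted OLS for brevity (the paper's regressions are population weighted); the function names are illustrative.

```python
from itertools import combinations
from math import factorial
import numpy as np

def r_squared(X, y, cols):
    """R^2 of an OLS regression of y on the chosen columns of X plus an intercept."""
    if not cols:
        return 0.0
    yc = y - y.mean()
    Z = np.column_stack([np.ones(len(y)), X[:, cols]])
    resid = yc - Z @ np.linalg.lstsq(Z, yc, rcond=None)[0]
    return 1.0 - (resid @ resid) / (yc @ yc)

def shapley_r2(X, y):
    """Shapley value of each column of X in explaining the R^2 of the full regression."""
    p = X.shape[1]
    values = np.zeros(p)
    for k in range(p):
        others = [j for j in range(p) if j != k]
        for size in range(p):
            # Probability that exactly `size` other regressors precede k in a random ordering
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                cols = list(subset)
                values[k] += weight * (r_squared(X, y, cols + [k]) - r_squared(X, y, cols))
    return values
```

By construction the values sum to the R2 of the full regression, so dividing each by that total yields shares of explained variation comparable to the Shp. R2 columns of Table 7.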
In the last four columns of Table 7, we report two similar decompositions, but with the price
index replaced by the remoteness measure from Section 2.2 and an unweighted version of the
remoteness measure, which is simply an unweighted average of iceberg trade costs. Results are
similar across decompositions, although geography explains less of the variation in skill premium
when we use remoteness, and even less when we use the unweighted version of remoteness. This
is expected, because the price index is the model-consistent weighted average of trade costs.
The population weights in the remoteness measure do not fully reflect relevant productivity
differences between cities. The unweighted remoteness measure weights all trade costs the same
way, but having low trade costs with a productive city is important, while having low trade
costs with an unproductive city is irrelevant.
Figure 4 (surviving panel titles): (a) High-skill welfare; (b) Low-skill welfare
It is not immediately obvious how we should expect population to change after an increase in
trade costs. One force causes additional concentration. After trade costs increase, cheap goods
from productive cities are no longer cheap in small towns, and conversely, already expensive
goods produced in small towns become even more expensive in cities. This mechanism encour-
ages workers to move to cities in order to access cheap goods and find lucrative jobs, leading
to an overall concentration of population and thus an increase in agglomeration forces. Since
high-skill workers benefit relatively more from agglomeration, the concentration in population
raises welfare inequality. A second force tends to make population diffuse. If trade costs ap-
proach infinity, all cities are equally isolated. The decrease in the cost advantage of formerly
well-connected cities encourages workers to move to formerly isolated cities to benefit from lower
housing costs there. This diffusion leads to a lessening of the strength of agglomeration forces
and drives welfare inequality down.
We find that the force causing concentration dominates. Figure 4 summarizes our results
from a large number of counterfactual experiments. In each experiment we increase all trade
costs from their baseline values uniformly by 1, 2, ..., 500 percent. Our basic finding is that both
high and low-skill welfare fall with increases in trade costs, but low-skill welfare falls more. In
the extreme case of five times measured trade cost, high and low-skill welfare decrease by 17.1%
and 21.8% respectively. Accordingly, the ratio of high to low-skill welfare increases by 6.0%. To
make a connection to the intuition we provided above, we also report changes to a Herfindahl
index in population, that is, the sum of squared population shares of American cities. As shown
in Figure 4, the Herfindahl index in population monotonically increases with trade costs.
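Two pieces of arithmetic in this paragraph can be checked directly: the change in the welfare ratio implied by the two reported welfare changes, and the Herfindahl index as the sum of squared population shares. The sketch below is illustrative; the function name is an assumption.

```python
import numpy as np

def herfindahl(population):
    """Sum of squared population shares across cities."""
    shares = np.asarray(population, dtype=float)
    shares = shares / shares.sum()
    return float((shares ** 2).sum())

# Welfare changes reported in the text for the largest trade cost increase
high_skill_change = -0.171
low_skill_change = -0.218
ratio_change = (1 + high_skill_change) / (1 + low_skill_change) - 1
```

The implied change in the ratio of high to low-skill welfare is 0.829/0.782 − 1, or about 6.0%, matching the figure in the text. The Herfindahl index rises as population concentrates: two equal cities give 0.5, while a 3:1 split gives 0.625.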
In addition to overall effects on welfare, our model allows us to analyze exactly which areas of
the United States grow and shrink as trade costs rise.32 Figure 5 contains percentage population
changes relative to our data when all trade costs are 500 percent higher. Blue indicates an
increase in population (light greater than 5%, and dark greater than 25%), white indicates no
change (-5% to 5% population growth), and red indicates a decrease in population (light greater
than 5%, and dark greater than 25%). Population concentrates in a small number of cities and
their surrounding areas across the United States.33
We find that population moves away from formerly well-connected areas in the South and
Midwest regions of the United States to the coasts. In particular, the formerly less connected
cities Seattle and Portland in the Pacific Northwest expand markedly. Conversely, formerly
highly connected cities with lower amenity levels, such as St. Louis or Cincinnati, largely shrink.
The five cities with the highest and the five cities with the lowest changes in population are
listed in Table 8.34 Also in Table 8, we calculate the elasticities of high and low-skill population
to a uniform increase in trade costs at the initial level. As a rule, high-skill population is
more sensitive to changes in trade costs. As cities grow, skilled workers become relatively more
productive, inducing more skilled in-migration. As cities shrink, skilled workers become relatively
less productive, inducing more skilled out-migration.
We have argued that rising trade costs lead both to concentration in large cities and to
dispersion away from well-connected cities. Since people prefer to live in well-connected cities,
city size and connectedness are highly positively correlated. We provide evidence for these two mechanisms
by first regressing log initial population on the log initial price index of tradeables. A city with
a low residual from this regression is better connected than we might expect given its
population. We would expect the dispersion force to dominate in such a city, and the population
to fall. The opposite is true for a city with a high residual. We would expect the concentration
force to dominate and the city to grow. Figure 6 is a scatter plot of log population change
against the residual from the regression described above. The circle size is proportional to the
initial population of the city. As predicted we see a positive slope, indicating that population
falls more on average in cities with low residuals.
These results complement a literature that studies the effects of international trade on in-
equality in developing countries (Antràs et al., 2006; Hummels et al., 2014). Indeed, and in
contrast to the Stolper-Samuelson theorem, globalization has increased inequality even in
developing countries (Davis and Mishra, 2007). We find instead that domestic trade costs and
inequality are positively correlated. The key difference in our context is that workers are mobile,
and thus agglomeration economies change endogenously with market integration. The
inequality-increasing effect of trade found in the international context is reversed when labor is mobile
across locations within a nation.
32 We can also examine changes to population ratios and skill premia across cities. These closely track changes to
population so that the maps do not qualitatively differ from that presented here on total population changes.
33 Surrounding areas grow because although trade costs are very high, they are not infinite.
34 Locations are only included in this table if their total population of high school graduates and holders of 4-year
college degrees is greater than 300,000 in the data.
Figure 5: Population changes when all trade costs are five times larger
Note: Blue: increase in population, White: no or little growth, Red: decrease in population.
Table 8: The five regions with the highest and the five with the lowest counterfactual
population growth (reported for locations initially larger than 300,000 high school and college workers)
Figure 6: Log population change against residual of log initial population on log price index
Average national high-skill welfare 1.3
Average national low-skill welfare 0.3
Average national welfare ratio 1.0
Table 9: The effects of California’s productivity shocks on welfare, prices, and wages (percentage change)
increase in housing price which dominates the fall in the price of tradeables, while in the rest of
the US cheaper tradeables are the main driver of the lower cost of living.
6 Extensions
In this section, we discuss a few ways we might extend our results.
Sorting into industries and occupations. We have shown that more centrally located cities
tend to have more wage inequality. We now ask how much of the relationship between geographic
location and inequality operates through the sorting of skills into industries and occupations. To
provide suggestive evidence, we add industry- and occupation-related controls to our individual-level regressions in
Section 2. We classify all industries into 30 groups and all occupations into 23 groups based
on IPUMS classifications (see Table 11 in the appendix). We then define skill intensity for an
industry (or occupation) as the share of college employment in that industry (or occupation) at
the national level. Table 10 reports our regression results. In column (1) we include industry
and occupation fixed effects. In column (2) we instead control for industry and occupation skill
intensity. In column (3), we reproduce column (5) of Table 2, where we interact the college dummy
with remoteness and population without controlling for industry or occupation. In
column (4), we also allow industry and occupation skill intensity to interact with population and
remoteness. As in Table 2, we are mainly interested in the coefficient on the interaction of the
college dummy with remoteness, which reflects how large wages of college graduates are compared
to those of high-school graduates in more remote cities.
Geographic location of a city remains correlated with skill premia across all specifications.
The coefficient of remoteness interacted with college shrinks slightly in magnitude, from -0.113 in column
(3) to -0.094 in column (4) where we add industry and occupation controls.36 We take this
as suggestive evidence that sorting to industries and occupations may mediate the relationship
between remoteness and skill wage premia, but only marginally.
36 Furthermore, the coefficient of remoteness for high-school graduates is not statistically significant in column (4),
suggesting that once industry and occupation are controlled for, the remoteness effect operates primarily through
college graduates.
dependent variable: log wage of individual workers
(1) (2) (3) (4)
Notes: Standard errors, clustered at city level, are reported in parentheses. In all regres-
sions, there are 3,050,723 observations, we weight individuals based on census sampling
weights, and we include individual-level gender and race dummies, a cubic polynomial of
years of experience, and state fixed effects. *** p<0.01, ** p<0.05, * p<0.1.
Table 10: Wages and remoteness controlling for industry and occupation, at the level of individual workers
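equation (12) gives
$$\frac{w_H(i)}{w_L(i)} = \left(\frac{W_H}{W_L}\right)^{\frac{\theta}{\theta+\tilde{\rho}}} \left(\frac{\bar{u}_H(i)}{\bar{u}_L(i)}\right)^{\frac{-\theta}{\theta+\tilde{\rho}}} \left(\frac{N_H}{N_L}\right)^{\frac{-1}{\theta+\tilde{\rho}}} \left(\frac{\bar{\beta}_H(i)}{\bar{\beta}_L(i)}\right)^{\frac{\tilde{\rho}}{\theta+\tilde{\rho}}} n_H(i)^{\frac{\phi\tilde{\rho}}{\theta+\tilde{\rho}}} \tag{28}$$
where $\tilde{\rho} \equiv \rho\,[1 - \theta(\eta_H - \eta_L)]$. This equation, as an equilibrium relationship in relative terms,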
collapses to equation (14) if ηH − ηL = 0. We examine how our parameter estimates may change
in this extended model. On the one hand, we rewrite our estimable equation (22),
$$\tilde{w}(i) = \tilde{\kappa} + \frac{1 - \theta(\eta_H - \eta_L)}{\theta}\,\tilde{n}(i) - \tilde{u}(i) \tag{29}$$
Our instrumental variables strategy, used to estimate the dispersion of location preferences in the
relative supply equation (22), should remain valid for estimating (29). In this endogenous
amenities version of the model, however, we interpret the estimate differently. In particular,
the estimated coefficient on ñ(i), 0.072, implies that θ = 1/(0.072 + (ηH − ηL )). Suppose ηH > ηL , meaning
that high-skill workers attach a higher valuation to amenities derived from the relative supply of
high-skill workers. Then, relative to our baseline estimates, we would infer a smaller θ. That
is, we would estimate more dispersion in unobserved location preferences. Intuitively, the labor
supply equations describe the way that the relative supply of high-skill workers reacts to the
relative wage level. This is informative about worker preferences over locations, because the
more indifferent they are between locations, the more population will react to the wage level. Since
endogenous amenities increase the incentive for high-skill workers to live in cities with a high
relative supply of high-skill workers, to explain the observed relationship between relative wages
and relative population we do not need workers to be as indifferent between locations as in our
baseline model.
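The reinterpretation of the supply coefficient can be made concrete with a small sketch. The coefficient 0.072 is the estimate quoted in the text; the values of ηH − ηL tried in the usage note are purely hypothetical, since the paper does not estimate them, and the function name is an assumption.

```python
def implied_theta(coef=0.072, eta_gap=0.0):
    """theta implied by eq. (29), where coef = 1/theta - (eta_H - eta_L)."""
    return 1.0 / (coef + eta_gap)
```

With `eta_gap=0.0` this returns the baseline value 1/0.072. A hypothetical gap of 0.028 would lower the implied θ to about 10, meaning more dispersion in location preferences, as the text argues.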
Since ηs does not enter into relative demand equation (23), we will have the same estimates
of ρ and ϕ as in our baseline. Substituting these expressions into equation (28), we find that the
exponent of nH in the extended model is exactly the same as that in equation (14) in our baseline
model. Hence, the estimated elasticity of the wage premium to high-skill employment will remain
unchanged. On the other hand, our estimates of the fundamental productivity and amenity
shifters would be somewhat different in a version of the model with endogenous amenities. The
key challenge in such an exercise would be to devise a model-consistent estimation method to
separate the elasticity parameters ηs from the location dispersion parameter θ.
Migration costs. In this paper we have developed a medium-run static model building on an
empirical spatial equilibrium literature. Geography enters our model purely through trade costs.
Another potential way in which geography might affect the distribution of wages is through
migration costs. For example, if it is more costly to move far away, then initial placement
will be an important determinant of final location choice. Since our model is static and we
are considering medium-run outcomes, in our baseline we abstract from dynamic frictions like
moving costs.37 Several studies on China using similar static models in the spatial equilibrium
tradition have developed methods by which to include some sense of moving cost (Fan, 2019;
Tombe and Zhu, 2019). This exercise is of critical importance in studying China, as the hukou
system permanently reduces the public services available to migrants, particularly people born
in the countryside wishing to move to the city. Consistent with the Chinese context, migration
costs in Tombe and Zhu (2019) are modeled not as a one-time cost of relocation, but rather as
a flow cost which substantially scales down welfare according to their estimates.38 If relocation
costs are less substantial and paid once at the time of moving, as we might presume about the
American context we study, then in the medium-run they are likely less important than they
are in China.
Although we do not include moving costs in our baseline model, we can speculate how large
flow costs as in Tombe and Zhu (2019) might affect our results. It is easiest to consider the case
in which moving costs are uniform across skill groups, and are paid only if a worker currently
lives in a location other than where he is born. There will be an intuitive trade off between the
dispersion of preferences over cities and moving costs. The higher are moving costs, the higher
will θ need to be in order to rationalize the observed spatial distribution of workers. Since
moving costs and preference dispersion both discourage movement, counterfactual population
flows in response to productivity or policy shocks may be similar in such an extension to our
baseline results.
7 Conclusion
We document that isolated cities tend to have less wage inequality. We develop a theory in which
the higher cost of tradeables in isolated cities makes them less attractive to live in, and high-skill
workers are less productive in smaller cities. We build a quantitative model to understand and
measure this mechanism. Our model bridges the gap between the spatial inequality literature
which abstracts from geography, and the economic geography literature, which abstracts from
inequality. We find that 16.5% of observed variation in the skill wage premium is due to the
geographic location of cities. In addition, we find that a uniform increase in domestic trade costs
causes inequality to rise due to the interaction between a higher concentration of population
and the agglomeration advantage of high-skill labor. In a counterfactual experiment, we find
that the rise of Silicon Valley increased the skill wage premium in California by 3.4% and welfare
inequality across the United States by 1.0%.
37 A recent literature has developed dynamic spatial models more suited to studying dynamic frictions. Kennan
and Walker (2011) use a dynamic discrete choice model to understand the role of learning and moving costs in US
interstate migration. Caliendo et al. (2017) develop a dynamic equilibrium model to understand how trade reforms
interacted with population movement in the European Union.
38 Tombe and Zhu (2019) specify migration costs as proportional to destination welfare, analogously to how iceberg
trade costs are specified as proportional to origin price. They estimate that these iceberg welfare costs of migration are
on average 2.8 in 2000 across Chinese province pairs. To get a sense of the magnitude of these costs we can compare
them with their trade counterpart. Anderson and Van Wincoop (2004) report iceberg trade costs across US states to
be around 1.7 on average.
References
Acemoglu, D. and Autor, D. (2011). Skills, tasks and technologies: Implications for employment
and earnings. In Handbook of labor economics, volume 4, pages 1043–1171. Elsevier.
Aguiar, M. and Bils, M. (2015). Has consumption inequality mirrored income inequality. Amer-
ican Economic Review, 105:2725–2756.
Albouy, D. (2012). Are big cities bad places to live? estimating quality of life across metropolitan
areas. mimeograph.
Allen, T. and Arkolakis, C. (2014). Trade and the topography of the spatial economy. The
Quarterly Journal of Economics, 129(3):1085–1140.
Allen, T., Arkolakis, C., and Li, X. (2016). Optimal city structure. mimeograph.
Allen, T. and Donaldson, D. (2018). The geography of path dependence. Technical report,
Working Paper.
Anderson, J. E. and Van Wincoop, E. (2004). Trade costs. Journal of Economic literature,
42(3):691–751.
Antràs, P., Garicano, L., and Rossi-Hansberg, E. (2006). Offshoring in a knowledge economy.
Quarterly Journal of Economics, 121(1).
Bacolod, M., Blum, B. S., and Strange, W. C. (2009). Skills in the city. Journal of Urban
Economics, 65(2):136–153.
Bartelme, D. (2015). Trade costs and economic geography: evidence from the us. Ann
Arbor, MI: Department of Economics, University of Michigan,
https://drive.google.com/file/d/0B fRktLO V0ncHNNcFowMlF0Yms/view.
Bartik, T. J. (1991). Boon or boondoggle? the debate over state and local economic development
policies.
Baum-Snow, N., Freedman, M., and Pavan, R. (2014). Why has urban inequality increased?
Working paper.
Baum-Snow, N. and Pavan, R. (2012). Understanding the city size wage gap. The Review of
economic studies, 79(1):88–127.
Broda, C. and Weinstein, D. E. (2006). Globalization and the gains from variety. The Quarterly
journal of economics, 121(2):541–585.
Caliendo, L., Opromolla, L. D., Parro, F., and Sforza, A. (2017). Goods and factor market
integration: a quantitative assessment of the eu enlargement. Technical report, National
Bureau of Economic Research.
Card, D. (2009). Immigration and inequality. The American Economic Review, 99(2):1.
Ciccone, A. and Peri, G. (2006). Identifying human-capital externalities: Theory with applica-
tions. The Review of Economic Studies, 73(2):381–412.
Combes, P.-P., Duranton, G., Gobillon, L., Puga, D., and Roux, S. (2012a). The productivity
advantages of large cities: Distinguishing agglomeration from firm selection. Econometrica,
80(6):2543–2594.
Combes, P.-P., Duranton, G., Gobillon, L., and Roux, S. (2012b). Sorting and local wage and
skill distributions in france. Regional Science and Urban Economics, 42(6):913–930.
Coşar, A. K. and Fajgelbaum, P. D. (2016). Internal geography, international trade, and regional
specialization. American Economic Journal: Microeconomics, 8(1):24–56.
Davis, D. R. and Dingel, J. I. (2014). The comparative advantage of cities. Technical report,
National Bureau of Economic Research.
Davis, D. R. and Mishra, P. (2007). Stolper-samuelson is dead: And other crimes of both theory
and data. In Globalization and poverty, pages 87–108. University of Chicago Press.
Desmet, K., Nagy, D. K., and Rossi-Hansberg, E. (2016). The geography of development.
Journal of Political Economy.
Diamond, R. (2015). The determinants and welfare implications of us workers diverging location
choices by skill: 1980-2000. American Economic Review.
Ellison, G. and Glaeser, E. L. (1999). The geographic concentration of industry: does natural
advantage explain agglomeration? American Economic Review, 89(2):311–316.
Fajgelbaum, P. and Gaubert, C. (2018). Optimal spatial policies, geography and sorting. Tech-
nical report, National Bureau of Economic Research.
Fajgelbaum, P., Morales, E., Suárez-Serrato, J. C., and Zidar, O. (2015). State taxes and spatial
misallocation. Technical report.
Fan, J. (2019). Internal geography, labor mobility, and the distributional impacts of trade.
American Economic Journal: Macroeconomics, forthcoming.
Farrokhi, F. (2018). Skill, agglomeration, and inequality in the spatial economy. Technical
report.
Fujita, M., Krugman, P. R., and Venables, A. (2001). The spatial economy: Cities, regions, and
international trade. MIT press.
Fujita, M. and Thisse, J.-F. (2006). Globalization and the evolution of the supply chain: Who
gains and who loses? International Economic Review, 47(3):811–836.
Giannone, E. (2017). Skilled-biased technical change and regional convergence. Technical report,
University of Chicago Working Paper, available at:
http://home.uchicago.edu/~elisagiannone/files/JMP ElisaG.pdf.
Glaeser, E. L. and Resseger, M. G. (2010). The complementarity between cities and skills.
Journal of Regional Science, 50(1):221–244.
Goldin, C. D. and Katz, L. F. (2009). The race between education and technology. Harvard
University Press.
Gould, E. D. (2007). Cities, workers, and wages: A structural analysis of the urban wage
premium. The Review of Economic Studies, 74(2):477–506.
Hummels, D., Jørgensen, R., Munch, J., and Xiang, C. (2014). The wage effects of off-
shoring: Evidence from danish matched worker-firm data. The American Economic Review,
104(6):1597–1629.
Hummels, D. and Lee, K. Y. (2017). The income elasticity of import demand: Micro evidence
and an application. Technical report, National Bureau of Economic Research.
Katz, L. F. et al. (1999). Changes in the wage structure and earnings inequality. Handbook of
labor economics, 3:1463–1555.
Kennan, J. and Walker, J. R. (2011). The effect of expected income on individual migration
decisions. Econometrica, 79(1):211–251.
Krugman, P. (1980). Scale economies, product differentiation, and the pattern of trade. The
American Economic Review, 70(5):950–959.
Krugman, P. (1991a). History and industry location: the case of the manufacturing belt. The
American Economic Review, 81(2):80–83.
Krugman, P. (1991b). Increasing returns and economic geography. The Journal of Political
Economy, 99(3):483–499.
Lindley, J. and Machin, S. (2014). Spatial changes in labour market inequality. Journal of
Urban Economics, 79:121–138.
Matano, A. and Naticchioni, P. (2011). Wage distribution and the spatial sorting of workers.
Journal of Economic Geography, 12(2):379–408.
McCarty, N., Poole, K. T., and Rosenthal, H. (2016). Polarized America: The dance of ideology
and unequal riches. MIT Press.
Monte, F., Rossi-Hansberg, E., and Redding, S. J. (2015). Commuting, migration, and local
employment elasticities.
Moretti, E. (2013). Real wage inequality. American Economic Journal: Applied Economics,
5(1):65–103.
Rosenthal, S. S. and Strange, W. C. (2004). Evidence on the nature and sources of agglomeration
economies. Handbook of regional and urban economics, 4:2119–2171.
Serrato, J. C. S. and Zidar, O. (2016). Who benefits from state corporate tax cuts? a local labor
markets approach with heterogeneous firms. The American Economic Review, 106(9):2582–
2624.
Simonovska, I. and Waugh, M. E. (2014). The elasticity of trade: Estimates and evidence.
Journal of international Economics, 92(1):34–50.
Tombe, T. and Zhu, X. (2019). Trade, migration and productivity: A quantitative analysis of
china. American Economic Review, forthcoming.
Wilkinson, R. G. and Pickett, K. E. (2006). Income inequality and population health: a review
and explanation of the evidence. Social science & medicine, 62(7):1768–1784.
Appendix For Online Publication
A Data appendix
In this appendix, we describe in detail the data we used, where we got it, and how we processed
it. The goal is that a researcher wishing to replicate our analysis will be able to use this section
and code available on our website to exactly replicate and understand our results.
A.3 Industry and Occupation Classification
Industries
1 Agriculture, Forestry, and Fisheries
2 Mining
3 Construction
4 Food and kindred products
5 Textile mill products
6 Apparel and other finished textile products
7 Paper and allied products
8 Printing, publishing, and allied industries
9 Chemicals and allied products
10 Petroleum and coal products
11 Rubber and miscellaneous plastics products
12 Leather and leather products
13 Lumber and wood products, except furniture
14 Stone, clay, glass, and concrete products
15 Metal industries
16 Machinery and computing equipment
17 Electrical machinery, equipment, and supplies
18 Transportation equipment
19 Professional and photographic equipment, and watches
20 Transportation
21 Communications
22 Utilities and sanitary services
23 Wholesale Trade
24 Retail Trade
25 Finance, Insurance, and Real Estate
26 Business and Repair Services
27 Personal Services
28 Entertainment and Recreation Services
29 Professional and Related Services
30 Public Administration
Occupations
1 Executive, Administrative, and Managerial Occupations
2 Management Related Occupations
3 Engineers, Architects, and Surveyors
4 Technical, Sales, and Administrative Support Occupations
5 Sales Occupations
6 Administrative Support Occupations, Including Clerical
7 Private Household Occupations
8 Protective Service Occupations
9 Service Occupations, Except Protective and Household
10 Farm Operators and Managers
11 Other Agricultural and Related Occupations
12 Mechanics and Repairers
13 Mechanics and Repairers, Except Supervisors
14 Construction Trades
15 Extractive Occupations
16 Precision Production Occupations
17 Machine Operators, Assemblers, and Inspectors
18 Transportation and Material Moving Occupations
19 Math and Computer Scientists, Natural Scientists, Teachers (Postsecondary), Social Scientists and Urban Planners
20 Health Diagnosing Occupations, Health Assessment and Treating Occupations, Therapists
21 Teachers (Except Postsecondary), Librarians, Archivists, and Curators, Social, Recreation, and Religious Workers
22 Lawyers and Judges
23 Writers, Artists, Entertainers, and Athletes
population. Table 13 documents the positive relationship between the skill wage premium and
the skill population ratio. The point estimates vary significantly, but the relationship is positive
across a number of specifications.
Table 14 documents the relationship between skill wage premium and remoteness. The
relationship is negative and statistically significant in the simple regression of skill premium
against remoteness (columns 1-2). The correlation is smaller in size, but still significant, when
we control for city population (columns 3-4). However, the coefficient on remoteness loses its
significance when in addition to city population we include state fixed effects (columns 5-6).
Table 12: Regressions documenting the relationship between wage, skill premium, skill ratio and population (dependent variables: log wage, log skill wage premium, log skill population ratio)
Table 13: Regressions documenting the relationship between skill wage premium and skill population ratio
Table 14: Regressions documenting the relationship between remoteness and the skill wage premium (dependent variable: log skill wage premium; columns 1-6)
with our structural model. Allen and Arkolakis (2014) use σ = 9. This difference alone
does not explain the different estimates. In this section, all reported estimates are for
σ = 9 whether it be our estimation or those of Allen and Arkolakis. Furthermore, in
our estimation, because our water map is somewhat different from that used in Allen and
Arkolakis (discussed below), we penalize off-water transport when shipping by water less
than Allen and Arkolakis do. Allen and Arkolakis assume that when shipping via water, it is
ten times as expensive to traverse a non-water pixel on the map as a water pixel. In
our baseline we assume it is only 3.5 times as expensive. For the purposes of comparison,
in this section all reported estimates use the Allen and Arkolakis value of ten.
2. Differences in data: The input value of truck transport in Allen and Arkolakis’ repli-
cation data is exactly twice what is reported in the 2007 Commodity Flow Survey data
we downloaded. It appears this is a bug. In column (3) in Table 15, we run the code of
Allen and Arkolakis with half the value of truck transportation in their replication data.
This change increases the estimated value of the variable cost of water and air transport,
bringing their estimates closer to ours (though still significantly lower).
In the Commodity Flow Survey data, pure water transport and pure rail transport
are separated from transport via water and truck and rail and truck.39 Allen and Arko-
lakis use only pure water and rail transport in their input data, whereas we count both
categories. In column (4) we run our code using only pure water and pure rail figures.
This changes our estimates, but does not bring them significantly closer to those in Allen
and Arkolakis.
The fast marching algorithm used to compute distances between locations uses maps
of the United States. The maps we use to compute distances for road and rail are visually
nearly exactly the same as those in Allen and Arkolakis. The water maps differ, however.
We allow (cheap) water transport only along common shipping routes in the ocean. Allen
and Arkolakis allow water transport along any part of the ocean. Column (5) reports the
results when we run our baseline code with a water map similar to that of Allen and Arkolakis.
That is, we use our map, but also allow cheap movement in the coastal waters around the
United States. This change hardly affects our baseline estimates. A difference that we did not
examine, but that could potentially affect estimates, is that our maps and those of Allen and
Arkolakis use different projections. Ours use the projection NAD83:4269.40
3. Differences in code: Allen and Arkolakis estimate the parameters for the shippers’
discrete choice of mode of transport by minimizing the following loss function. Let $\varepsilon(\beta)^m_{od}$
be the difference between the predicted and observed fraction of shipments of mode m
39 Explicit category definitions for CFS data can be found here: www.rita.dot.gov/bts/sites/rita.dot.gov.bts/files/publications/com
40 In addition to these differences, the coordinates used by Allen and Arkolakis for CFS areas appear to be rounded
as is typical when exporting data from Stata. Their coordinates range from -2.2 to 2.1 million on the x-axis and from
-1.2 to 1.4 million on the y-axis. All coordinates with absolute value above one million have the final five digits rounded
to zero. As we were unable to precisely link our data sets by CFS region, the extent to which this affects estimates is not
clear.
between origin o and destination d evaluated at parameter vector β. Let N be the total
number of bilateral pairs:
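$$
\sum_m \frac{1}{N} \sum_o \sum_d \left| \varepsilon(\beta)^m_{od} \right|
$$
We instead minimize the sum of squared differences:
$$
\sum_m \sum_o \sum_d \left( \varepsilon(\beta)^m_{od} \right)^2
$$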
The two loss functions deliver qualitatively different solutions to the problem both in Allen
and Arkolakis’ code and in ours. We show how this affects our results by first running
Allen and Arkolakis’ baseline code using our data, and then running their code using our
data as well as our objective function.41 In column (6) we see results that are more similar
to Allen and Arkolakis than in the baseline. Using our objective function in column (7), we
move the results much closer to our baseline. In column (8) we run Allen and Arkolakis’
code with their data but with our objective.42 Column (9) is our code and data, with
the Allen and Arkolakis objective. In all of these exercises, we see substantial, but not
complete convergence in our respective results. One caveat is that in column (8) the air
transport variable costs become even smaller than those estimated in Allen and Arkolakis.
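The difference between the two objectives in item 3 can be made concrete with a small sketch. It assumes the residuals are stored per mode as an origin-by-destination matrix and that every cell counts as one bilateral pair; the function names and the toy residuals are ours and are not taken from either code base.

```python
from typing import Dict, List

# mode -> matrix of residuals indexed [origin][destination] (hypothetical layout)
Residuals = Dict[str, List[List[float]]]


def mean_abs_loss(eps: Residuals) -> float:
    """Sum over modes of the mean absolute residual across bilateral pairs."""
    total = 0.0
    for matrix in eps.values():
        n_pairs = sum(len(row) for row in matrix)  # N, assuming every cell is a pair
        total += sum(abs(e) for row in matrix for e in row) / n_pairs
    return total


def sum_squared_loss(eps: Residuals) -> float:
    """Sum of squared residuals over modes, origins and destinations."""
    return sum(e * e for matrix in eps.values() for row in matrix for e in row)


# Toy residuals for two modes and two locations, for illustration only.
example: Residuals = {
    "road": [[0.1, -0.2], [0.0, 0.3]],
    "rail": [[-0.1, 0.2], [0.0, -0.3]],
}
print(mean_abs_loss(example), sum_squared_loss(example))
```

The absolute-value objective has kinks wherever a residual crosses zero, whereas the squared objective is smooth in the residuals, which is one reason a numerical optimizer can settle on different parameter vectors under the two criteria.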
To sum up, we find that the coefficients on all the variable costs except roads are quite
sensitive to changes in the exact data and specification used to perform the Allen and Arkolakis
bilateral trade cost estimation. Across specifications in Table 15, our estimates of the variable
cost of truck transport are fairly similar to, or at least of the same order of magnitude as, those
estimated by Allen and Arkolakis. Since almost all shipping in the continental United States is
by truck (more than 97% of the value in our CFS data), the final bilateral trade costs produced
by the original Allen and Arkolakis code and our code are quite similar. Table 16 presents
summary statistics for our estimates. We conjecture that the estimates of the cost of shipping
for other modes of transport are sensitive to specification and inputs, but ultimately matter
little for the final bilateral trade costs estimates. To thoroughly examine this conjecture would
require a more careful analysis which is out of the scope of this paper.
As a final comment, the results in our paper continue to be based on our baseline estimates.
We believe that it is proper to count water and truck as a water shipment and rail and truck as
a rail shipment since around 50% of the value of rail shipments in our data also involve trucking,
and around 30% of the value of water shipments involve trucking. We also believe that forcing
water shipments to be along trade routes to ports is a realistic assumption, since loading
and unloading cargo without a port is costly. Finally, we prefer the smoother least squares loss
41 Because our input data on demographics was not in the same format as in Allen and Arkolakis, in our final gravity
regressions we altered Allen and Arkolakis’ code to omit demographic similarity between locations. This may be
driving some of the results we report here.
42 Demographic similarity variables are included in the gravity regression here.
Transport Type   (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
Road var         0.5636  0.4702  0.5675  0.4764  0.4702  0.4760  0.4287  0.4377  0.4654
Rail var         0.1434  0.4174  0.1426  0.4529  0.4174  0.0599  0.3614  0.3541  0.3936
Water var        0.0779  0.7736  0.2153  0.7153  0.7799  0.1770  0.4322  0.6364  0.6049
Air var          0.0026  0.1744  0.0354  0.1200  0.1747  0.0059  0.2622  0.0000  0.4106
Rail fixed       0.4219  0.3729  0.3986  0.4719  0.3726  0.3564  0.1772  0.4217  0.6115
Water fixed      0.5407  0.4126  0.3986  0.5695  0.4175  0.3564  0.2480  0.4853  0.8737
Air fixed        0.5734  0.6769  0.5315  0.7843  0.6763  0.4603  0.3189  0.6691  0.8800
Table 16: Summary statistics for geographic component of bilateral trade costs (Tg )
1. AA Baseline
2. FJ baseline
3. AA Half Truck
4. FJ Double Truck / Only water,only rail
5. FJ with AA-like water map
6. AA code FJ data
7. AA code FJ data FJ mode obj
8. AA code/data FJ mode obj
9. FJ code/data AA mode obj
D Numerical algorithms
D.1 Solving for productivities and amenities
We treat data on wages and employment, wH , wL , nH , and nL as the outcome of a spatial
equilibrium. Given these data and our recovered productivity ratios β̄H /β̄L and amenity ratios
ūH /ūL from the residuals of relative labor demand and supply, the following algorithm solves
for productivities inclusive of spillovers A’s and amenities of high-skill workers ūH ’s. At these
calibrated values of A’s and ūH ’s, the model exactly predicts the data on wH , wL , nH , and nL .
Given that trade costs are symmetric, we reduce the two systems of equations described by
(25) using relation (26),
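$$
A(i)^{1-\sigma} = \lambda W_H^{\frac{1-\sigma}{\delta}} N_H^{\frac{\sigma-1}{\delta\theta}} \nu(i)^{1-\sigma} n_H(i)^{-1} w_H(i)^{-1} b(i) \int_J d(i,j)^{1-\sigma} \nu(j)^{1-\sigma} A(j)^{\sigma-1} \, dj \tag{30}
$$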
As long as (26) holds, the solution to (30) will be the solution to both systems of equations (25).
The following algorithm solves for amenities and productivities up to scale.
1. Start with an initial guess for productivity, $A^{(0)}(i)$.
2. Compute the kernel,
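3. Define $\kappa \equiv \lambda W_H^{\frac{1-\sigma}{\delta}} N_H^{\frac{\sigma-1}{\delta\theta}}$, and in iteration t, $f(i) \equiv A(i)^{1-\sigma}$. Define
$$
\tilde{f}(i) \equiv \frac{f(i)}{\int_J f(i)\,di}
$$
as a normalization that sets the integral over $\tilde{f}$ to one. Then, the system of integral
equations described by (30) is equivalent to:
$$
\tilde{f}(i) = \kappa \int_J K(j,i)\,\tilde{f}(j)^{-1}\,dj \tag{31}
$$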
Initial guess equals $f^{(0)}(i) = A^{(0)}(i)^{1-\sigma}$. In iteration t ≥ 1, update $f^{(t)}(i)$ according to
this updating rule:
Since we divide integrals in (32), we do not need to know κ to update our guess. If at
iteration t, $|\tilde{f}^{(t)}(i) - \tilde{f}^{(t-1)}(i)| < 10^{-12}$ for all i, stop updating and go to the next step.
Otherwise, continue iterating using the updating rule (32).
The output of this step is a vector of f˜(i)’s that satisfy (31), and so (30), and so the two
systems of equations (25).
4. As a check that the solutions are correct, the following must be a constant equal to κ for
all i,
$$
\kappa = \frac{\tilde{f}(i)}{\int_J K(j,i)\,\tilde{f}(j)^{-1}\,dj} = \frac{\tilde{f}(i')}{\int_J K(j,i')\,\tilde{f}(j)^{-1}\,dj}
$$
Normalize A(i0 ) = 1 for city i0 , and calculate all other A(i)’s.
5. Using equation (26),
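$$
\frac{\bar{u}_H(i)}{\bar{u}_H(j)} = \frac{C(i)^{1-\delta}\, n_H(i)^{\frac{\sigma-1-\delta\theta}{(\sigma-1)\theta}}\, w_H(i)^{\frac{1-\sigma-\delta}{\sigma-1}}\, b(i)^{\frac{\delta}{\sigma-1}}\, A(i)^{\delta}\, \nu(i)^{-\delta}}{C(j)^{1-\delta}\, n_H(j)^{\frac{\sigma-1-\delta\theta}{(\sigma-1)\theta}}\, w_H(j)^{\frac{1-\sigma-\delta}{\sigma-1}}\, b(j)^{\frac{\delta}{\sigma-1}}\, A(j)^{\delta}\, \nu(j)^{-\delta}}
$$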
Normalize the amenity value of city i0 , ūH (i0 ) = 1, and calculate all other ūH (i)’s.
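The iteration in steps 1 to 4 can be sketched on a discretized set of cities. This is only an illustration under stated assumptions: the kernel is a given positive matrix rather than one built from data, the update renormalizes the iterate to sum to one (our stand-in for dividing integrals), and σ = 9 and the 3-city kernel are hypothetical values.

```python
from typing import List

Matrix = List[List[float]]


def solve_inverse_system(K: Matrix, tol: float = 1e-12, max_iter: int = 10_000) -> List[float]:
    """Iterate f <- K f^{-1}, renormalised to sum to one, until f stops changing.

    K[i][j] stores the kernel value K(j, i), so row i collects the integrand for city i.
    """
    n = len(K)
    f = [1.0 / n] * n  # flat initial guess, already normalised
    for _ in range(max_iter):
        raw = [sum(K[i][j] / f[j] for j in range(n)) for i in range(n)]
        total = sum(raw)
        new_f = [value / total for value in raw]  # rescaling makes kappa unnecessary
        if max(abs(a - b) for a, b in zip(new_f, f)) < tol:
            return new_f
        f = new_f
    raise RuntimeError("fixed-point iteration did not converge")


def kappa_by_city(K: Matrix, f: List[float]) -> List[float]:
    """Step-4 style check: f(i) / sum_j K(j, i) f(j)^{-1} should not vary with i."""
    n = len(K)
    return [f[i] / sum(K[i][j] / f[j] for j in range(n)) for i in range(n)]


def relative_productivity(f: List[float], sigma: float, base: int = 0) -> List[float]:
    """Invert f = A^(1 - sigma) and normalise A to one in the base city."""
    return [(value / f[base]) ** (1.0 / (1.0 - sigma)) for value in f]


# Hypothetical 3-city kernel and an illustrative sigma.
K_EXAMPLE = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.4], [0.2, 0.4, 1.0]]
f_tilde = solve_inverse_system(K_EXAMPLE)
print(kappa_by_city(K_EXAMPLE, f_tilde))
print(relative_productivity(f_tilde, sigma=9.0))
```

If the three values printed by `kappa_by_city` agree to numerical precision, the iterate satisfies the discretized counterpart of (31) up to scale, which mirrors the check in step 4.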
where ν̃ and C̃ are replaced from equations 17-18. The pair of 34–35 (or equivalently the pair
of 19–20) give us two integral equations. The two systems can be reduced to one using the
following relation, that is equivalent to equation (21),
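$$
A(i)^{1-\sigma}\, \tilde{\nu}(i)^{\sigma-1}\, n_H(i)\, w_H(i)^{\sigma}\, b(i)^{-1} = \lambda\, \bar{u}_H(i)^{\frac{1-\sigma}{\delta}}\, n_H(i)^{\frac{\sigma-1}{\delta\theta}}\, w_H(i)^{1-\sigma}\, \tilde{C}(i)^{\frac{(\sigma-1)(1-\delta)}{\delta}} \tag{36}
$$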
Given exogenous parameters we can write every endogenous variable as a function of popu-
lation of high-skill workers nH (i). Our solution algorithm takes advantage of this feature of the
model to update our guess for nH (i) in each iteration. The algorithm is as follows:
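6. Let
$$
\tilde{w}_H(i) \equiv \lambda^{\frac{-1}{2\sigma-1}}\, w_H(i)
$$
7. Let $f(i) \equiv \tilde{w}_H(i)^{1-\sigma}$, $\kappa \equiv W_H^{\frac{1-\sigma}{\delta}} N_H^{\frac{\sigma-1}{\delta\theta}}$, and
Then, the system of integral equations (35) can be written as follows (notice that the scale
parameter λ cancels out):
$$
f(i) = \kappa \int_J K(j,i)\, f(j)\, dj \tag{37}
$$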
The solution to (37) is equivalently the solution to the pair of systems of equations 34–35.
In iteration t, update $f^{(t)}(i)$ according to
Equation (38) is our updating rule. Note that we do not need to know κ to update our
guess. If $f^{(t+1)}(i)$ is not close enough to $f^{(t)}(i)$, go to step 2 in order to continue iterations.
Otherwise, go to the next step.
The output of this step is a vector of f (i)’s that satisfy (37) and equivalently the systems
of equations 34–35.
8. As $\int_J w_H(j)\,dj = 1$ (the normalization defined in equilibrium), calculate wages:
$$
w_H(i) = \frac{\tilde{w}_H(i)}{\int_J \tilde{w}_H(j)\,dj}
$$
9. Calculate λ:
$$
1 = \int_J w_H(j)\,dj = \lambda^{\frac{1}{2\sigma-1}} \int_J \tilde{w}_H(j)\,dj
$$
So,
$$
\lambda = \left[\int_J \tilde{w}_H(j)\,dj\right]^{-(2\sigma-1)}
$$
10. Find κ,
$$
\kappa = \frac{f(i)}{\int_J K(j,i)\, f(j)\, dj} = \frac{f(\ell)}{\int_J K(j,\ell)\, f(j)\, dj}
$$
The above should hold for all i and ℓ. This step, thus, is also a check that the solutions to
integral equations are correct. Then, calculate:
$$
W_H = N_H^{\frac{1}{\theta}}\, \kappa^{\frac{\delta}{1-\sigma}}
$$
Once wH (i) and WH are known, it is straightforward to calculate all other equilibrium
objects.
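For a fixed kernel, equation (37) is a linear eigenvalue problem, and the later steps can be sketched as below. This is a simplified illustration, not the procedure itself: the kernel is held fixed rather than recomputed from the current guess for nH(i), the update is a plain power iteration, and all parameter values and function names are hypothetical. Because a linear system only pins down f up to scale, the sketch recovers normalized wages and κ but does not attempt to recover λ.

```python
from typing import List

Matrix = List[List[float]]


def solve_linear_system(K: Matrix, tol: float = 1e-12, max_iter: int = 10_000) -> List[float]:
    """Power iteration for f = kappa * K f, with f renormalised to sum to one.

    K[i][j] stores the kernel value K(j, i) and is treated as fixed here.
    """
    n = len(K)
    f = [1.0 / n] * n
    for _ in range(max_iter):
        raw = [sum(K[i][j] * f[j] for j in range(n)) for i in range(n)]
        total = sum(raw)
        new_f = [value / total for value in raw]
        if max(abs(a - b) for a, b in zip(new_f, f)) < tol:
            return new_f
        f = new_f
    raise RuntimeError("power iteration did not converge")


def kappa_by_city_linear(K: Matrix, f: List[float]) -> List[float]:
    """Step-10 style check: f(i) / sum_j K(j, i) f(j) should not vary with i."""
    n = len(K)
    return [f[i] / sum(K[i][j] * f[j] for j in range(n)) for i in range(n)]


def normalised_wages(f: List[float], sigma: float) -> List[float]:
    """Invert f = w_tilde^(1 - sigma) and rescale wages to sum to one."""
    w_tilde = [value ** (1.0 / (1.0 - sigma)) for value in f]
    total = sum(w_tilde)
    return [w / total for w in w_tilde]


def aggregate_welfare(kappa: float, n_high: float, theta: float, delta: float, sigma: float) -> float:
    """W_H = N_H^(1/theta) * kappa^(delta / (1 - sigma))."""
    return n_high ** (1.0 / theta) * kappa ** (delta / (1.0 - sigma))


# Hypothetical 2-city kernel and illustrative parameter values.
K_EXAMPLE = [[2.0, 1.0], [1.0, 3.0]]
f_star = solve_linear_system(K_EXAMPLE)
kappa = kappa_by_city_linear(K_EXAMPLE, f_star)[0]
print(normalised_wages(f_star, sigma=2.0))
print(aggregate_welfare(kappa, n_high=1.0, theta=2.0, delta=1.0, sigma=2.0))
```

In this fixed-kernel setting κ equals the inverse of the dominant eigenvalue of the kernel matrix, which is why the ratio in step 10 is the same for every city once the iteration has converged.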